Session 7

Dongchen Hou: The Brokenness in Mediation: Speech-to-Text Technology in Chinese

When technological media become omnipresent and function perfectly, their presence goes unnoticed. The popularity of speech-to-text technology testifies to the convenience it brings to human life: it can smoothly transfer vocal information into a visual, textual mode of representation. Speech-to-text technology, though it emerged only in the past decade, belongs to a historical genealogy that includes other mediating technologies such as stenography and the typewriter. Stenographers and typists occupied the mediating position in transcribing auditory information into visual written texts, whether handwritten or typed. With the development of Artificial Intelligence (AI) and deep learning, Google, Amazon (Alexa), Apple (Siri), and Microsoft have developed technologies that can achieve simultaneous transcription, as stenographers and typists once did. It is undeniable that speech-to-text technologies have significantly changed how humans communicate with both other humans and machines; however, the technology also poses challenges to users in different linguistic and social contexts. This research juxtaposes speech-to-text technology with other mediating technologies, including stenography and the typewriter, to explore the ontological significance of broken media/mediation. Morador (2015) defines “broken” technologies as “those activities, directed towards the satisficing of human wants that are intended to produce changes in the material world that either do not manage to satisfy these wants or do not produce changes in the material world, or both” (17). This definition, however, fails to depict and understand the non-human side of the picture. In this paper, I draw on Harman’s object-oriented ontology (2018), which offers another picture of the significance of non-human objects in discourse. I argue that mediating technologies gain presence and significance in their “brokenness.” In other words, only when technologies break do they appear.
I look at how digital technologies, having entirely replaced human mediators such as stenographers and typists, function in mediating different modalities of communication. Specifically, this research examines when speech-to-text technology breaks, and for whom it is broken. Because a strong alphanumeric mechanism has been built into the technology since its invention, the brokenness is more prominent and more visible in non-alphabetic linguistic practices, such as Chinese, and in translingual practices. Incorporating Chinese writing into techno-linguistic modernity has never been easy, owing to the Chinese language’s “incompatibility” with alphabets and alphanumeric technologies (Mullaney 2017; Tsu 2014; Hou 2019). In the Chinese language context, speech-to-text technologies are often “broken”: the technology cannot accurately identify the correct pronunciation of Chinese words, or it chooses the wrong Chinese characters or expressions in the text, or the voice recognition fails to capture non-standard, dialectal Chinese. In examining these failures, this research aims to provide an alternative picture of the significance of brokenness in media technology, with a general ontological concern, exemplified by speech-to-text problems in the context of the Chinese language.


Valeria Lopez Torres: On glitches as indicators of authenticity in AI-generated images 

Ever since technology was first introduced into daily life, people have learned to coexist with technological error. The cultural significance of these errors, also called glitches, has shifted along with technology, as they have been manipulated for various purposes: seeking and embracing them to create art; introducing them intentionally to improve interaction; or working endlessly towards their eradication. One definition of glitch is a “minor malfunction” (Merriam-Webster Dictionary); another relates to authenticity, as in “a false or spurious electronic signal” (ibid.). This paper examines the cultural role of the glitch as a marker of authenticity and as a device that helps viewers discern truth from fakery, in a context in which sophisticated algorithms can generate highly realistic (and convincing) images of human-looking faces through the use of Artificial Intelligence (AI). As these AI-generated images enter the current visual landscape, viewers must arm themselves against deception by looking for glitches, which usually manifest as irregularities in the background and in skin texture, interruptions of patterns, and unexplained blobs, among other inconsistencies. For the computer scientist who strives to refine these technologies to generate more convincing and realistic images, glitches are undesirable. Conversely, for the lay viewer confronted for the first time by these hyper-realistic AI-generated images, glitches are desirable, as they allow for verification of an image’s origin and authenticity (where authentic = human, while unauthentic = computer-generated).
In an image-driven society, where the boundaries between the virtual and non-virtual blur and traditional notions of truth and reality are challenged, these technologies are reminders of both the exciting possibilities of their positive applications and the dangers of their potential misuse, particularly in light of the emergence of deep fakes (videos in which AI algorithms are used to show someone saying or doing something that they did not in fact say or do). The ethical consequences of these images and their potential uses are largely related to surveillance, control, and ultimately, power. Thus, designers, scientists, artists and scholars alike must think seriously about how technological progress might enable higher degrees of control to the detriment of human rights and freedom. Similarly, educators must ponder how to respond to the many ethical issues brought about by these technologies, and teach emerging designers how to engage, view, and work critically with these images as they permeate our visual cultural landscape.

Zach Whalen: Breaking GIFs

The animated GIF is one of the most common forms of visual media, and many people share them daily on social media platforms or across communication platforms like Slack or Discord. A number of GIF services, such as Giphy, Gfycat and Tenor, exist mainly to serve GIFs within these other platforms. Others, like Reddit and Imgur, host communities of practitioners and consumers of GIF content. Several modalities or genres of GIF rely directly on the medium-specific affordances of the format: reaction GIFs, cinemagraphs, and memes tend to be short, relatively low-resolution video clips set to repeat indefinitely. Ironically, because true GIFs do not compress well, the majority of “GIFs” one encounters on a daily basis are actually videos encoded with the H.264 standard. I therefore argue that these constraints and their resulting modalities construct the “GIF” as a cultural artifact more reliably and usefully than the material realities of the file specification. In this presentation, I will focus on one specific genre of artistic GIF, the glitched or “broken GIF,” in which artists typically use video manipulation and compression glitching. As demonstrated in the work shared on the ‘/r/brokengifs’ subreddit, broken GIFs typically exploit video datamoshing techniques, mainly removing I-frames to create a “sticking” effect or duplicating P-frames to create a sense of smearing or “blooming”. More recently, some users have begun manipulating the parameters of videos’ container format, often by algorithmically modifying an AVI file’s motion vector parameters. By choosing the best moment to “break”, GIF artists develop the drama and rhythm of a video clip to transform it into something new and uncanny.
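The two datamoshing moves named above can be sketched in a toy model. This is only an illustration under stated assumptions: it works on a list of frame labels rather than on encoded video bytes, and the names `drop_iframes` and `duplicate_pframe` are hypothetical, not taken from any tool the artists use.

```python
from typing import List, Tuple

# Toy model (an assumption for illustration): a clip is a list of
# (frame_type, source_index) pairs, where frame_type is "I" or "P".
Frame = Tuple[str, int]


def drop_iframes(frames: List[Frame], keep_first: bool = True) -> List[Frame]:
    """Remove I-frames so later P-frames apply their motion to stale imagery.

    The first I-frame is kept by default so the clip still has a starting image.
    """
    result: List[Frame] = []
    first_seen = False
    for frame in frames:
        if frame[0] == "I":
            if keep_first and not first_seen:
                first_seen = True
                result.append(frame)
            continue  # every other I-frame is dropped
        result.append(frame)
    return result


def duplicate_pframe(frames: List[Frame], position: int, copies: int) -> List[Frame]:
    """Repeat the P-frame at `position` so its motion is applied again and again."""
    if frames[position][0] != "P":
        raise ValueError("frame at position is not a P-frame")
    repeated = [frames[position]] * copies
    return frames[: position + 1] + repeated + frames[position + 1 :]


clip: List[Frame] = [("I", 0), ("P", 1), ("P", 2), ("I", 3), ("P", 4)]
stuck = drop_iframes(clip)
bloomed = duplicate_pframe(clip, position=1, copies=2)
```

In this model, `stuck` no longer contains the second I-frame, so the final P-frame would be decoded against the earlier scene (the “sticking” effect), while `bloomed` repeats one P-frame's motion three times in a row (the “blooming” effect).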
In the Glitch Studies Manifesto, Rosa Menkman recognizes the uncanny and sublime in glitch art as “cool” (in the McLuhan sense), and calls to resist the formalization of an “avant-garde of mishaps.” There are indeed several commercially available plugins and toolkits for creating specific styles of glitch, so it is noteworthy that within the /r/brokengifs community, two of the most prolific and consistently popular creators, ‘denvolnov’ (Denis Volnov) and ‘chepertom’ (Thomas Collet), have developed signature styles of GIF breaking through more direct file manipulations. In this talk, I will analyze selections by these two artists alongside some of my own broken GIFs as I attempt to reverse-engineer their techniques. Pinning down a particular methodology of “hot” glitch art in this way will underscore the GIF as a cultural construct and contextualize the broken GIF as a contemporary, emerging aesthetic.