Crossposted to the Scholars’ Lab blog
This spring each Praxis Fellow will conduct a workshop on some aspect of digital humanities. Because I study the history of twentieth-century American mass media (especially the radio), I have increasingly found myself interested in sound. During the last few weeks, I have read about the possibilities of using digital tools to analyze and present sound. I plan to share some of my findings in a workshop called “An Introduction to Sound and Digital Humanities.”
I have several goals for this workshop. First and foremost, I want to create a learning environment that is welcoming, inclusive, and relevant. Second, I hope to convey why studying sound is important. Recent sound studies scholars such as Jonathan Sterne have critiqued the idea that a culture of seeing replaced a culture of hearing; yet this idea still lingers inside and outside of the academy. Third, I want to introduce some of the possibilities and limitations of using digital tools to analyze and present sound. I have divided this workshop (and the rest of this blog post!) into three parts: Introduction, Analyzing Sound Collections, and Manipulating and Presenting Sounds. This workshop design is still in progress, so I’d appreciate any comments or suggestions!
INTRODUCTION
My first priority as a teacher is to help cultivate an inclusive learning environment where all feel comfortable participating. The design of a space affects both how sounds are made and how people feel welcomed. I plan to rearrange the space into a circle if possible. As a teacher, I enjoy playing music related to the day’s content as students enter my classroom. Music can provide a memorable entry point into a topic. For this workshop, I imagine playing something like Miles Davis’s “Shhh / Peaceful” from his 1969 album In a Silent Way.
I want to begin the workshop with some questions that have a low barrier to entry. For instance, I plan to ask, “What is a sound you love and how does it make you feel?” and “What is a noise you dislike and how does it make you feel?” After giving attendees a chance to think through these questions on their own, I’ll have them share in pairs and eventually with the entire group. By beginning the conversation with reflective questions and a think-pair-share technique, I hope to encourage attendees to feel comfortable contributing verbally. I also hope this introduction will help attendees think about sound in their own lives and the connection between personal and scholarly knowledge about sound. As attendees share their answers, I will guide the conversation to tease out two ideas crucial to sound studies: 1) humans do not just hear with our ears but, as scholar Steph Ceraso has argued, with our whole bodies, and 2) the binary of “sound” and “noise” usually reflects a binary of listening to sounds that are “wanted” or “unwanted.”
ANALYZING SOUND COLLECTIONS
In this section, I hope to encourage attendees to think about the benefits and challenges of using digital tools to analyze sound collections. Much of my thinking here is informed by DH scholar Tanya Clement’s chapter in the edited volume Digital Sound Studies (2018). In this chapter, Clement describes how PennSound scholars thought about the potential of big data for studying large spoken-word audio collections through the NEH-funded High Performance Sound Technologies for Access and Scholarship (HiPSTAS) project. Adapting Adaptive Recognition with Layered Optimization (ARLO), software originally used to identify bird sounds, scholars working on HiPSTAS developed a standardized classification system to tag digital audio media so that it would be more searchable. However, as Clement notes, scholars soon came to realize that while classification systems were necessary, they were also limiting. Not everyone agreed on how particular sounds should be classified. Ultimately, Clement proposes that political and historical contexts are embedded in standardized classification systems, turning up the volume on certain ways of hearing while turning down the volume on alternate interpretations.
I plan to begin this section by reflecting on how sound files present challenges for scholars because they usually have to be listened to in real time. For instance, a well-prepared student may have recorded a lecture (ideally with permission!), but when that student later looks for a key detail, they may still have to listen through the entire fifty-minute recording just to find it. The longer the recording, the less inclined a busy researcher is to listen. Thus, while sound collections may hold tens of thousands of hours of audio, many of these sounds remain unheard. Thankfully, scholars have used machine learning to help organize sound files and make them more accessible to researchers.
I will then introduce an activity that helps attendees understand how this process of using machines to classify sounds works. I will start by playing three brief sound clips and asking attendees to write down five keywords they would use to describe each clip. Though I have not finalized which audio pieces I plan to use, I usually try to incorporate material that will seem relevant to my audience. I am currently planning to use a mix of clips that draw on local history and recent popular culture: a snippet from the 1970 anti-war protest at UVA, a soundbite from an oral history by a UVA women’s history professor, and a brief clip from Spike Jonze’s 2013 film Her.
After attendees have created their five keywords for each audio clip, I will give them time to share their results with a partner. Did they come up with the same keywords? Different keywords? Why or why not? I plan to reconvene the entire group to solicit feedback about the experience. What did they think about the sound clips? How did their keywords compare with their partners’? For those who had different keywords, why might this hinder searchability? How could we improve this process so as to limit these differences?
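One way to make the stakes of divergent keywords concrete is to compare two listeners’ tags computationally. Below is a minimal sketch in Python using Jaccard similarity, a standard measure of overlap between two sets; the tags themselves are invented for illustration, not drawn from the actual clips:

```python
# Compare two listeners' keywords for the same clip using Jaccard similarity:
# size of the intersection divided by size of the union. A low score suggests
# the clip would be hard to retrieve consistently if each listener's tags
# became the search metadata.

def jaccard(tags_a, tags_b):
    a = {t.lower() for t in tags_a}
    b = {t.lower() for t in tags_b}
    return len(a & b) / len(a | b)

# Hypothetical tags from two attendees describing the same protest clip.
listener_1 = ["protest", "crowd", "chanting", "Vietnam", "outdoors"]
listener_2 = ["demonstration", "chanting", "students", "anti-war", "outdoors"]

print(f"overlap: {jaccard(listener_1, listener_2):.2f}")  # only 2 of 8 tags shared
```

Here the two listeners plainly heard the same event, yet a search for “protest” would only surface one of their descriptions, which is exactly the retrieval problem that standardized tagging schemas try to address.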
After teasing out that standardizing keywords may help in this process, I hope to play the clips one more time. This time, I will have attendees classify each clip using the tagging schema developed by scholars working with the PennSound poetry collection. I hope this will demonstrate how standardization can reduce (though not completely solve) the problem of different scholars selecting different keywords. At the same time, it will also demonstrate how standardized classification systems can sometimes hide as much as they reveal. For instance, the PennSound tagging schema was designed for spoken-word poetry. As a historian, I selected clips that will likely invite different questions and categories. What classifications seem less relevant for these clips? What additional questions or classifications would be helpful for the three clips we heard? Finally, what sounds from these clips might be helpful for a computer to learn to recognize in other clips?
MANIPULATING AND PRESENTING SOUNDS
Analyzing sound collections is only one aspect of how digital humanities can augment sound studies. As time allows (I am envisioning the final 10-15 minutes, though this section could be cut if we run short), I hope to conclude by gesturing toward how digital tools can also allow us to manipulate and present sound. For instance, I plan to mention how attendees can use editing platforms such as the free and open-source Audacity to create and edit sound files. I hope to ask attendees how manipulating the three sound clips we heard might promote new insights (possibilities I’ve considered include editing the lengthy oral history to make it more accessible to the public, adding music to dramatize the anti-war protest clip, or manipulating Scarlett Johansson’s pitch in the clip from Her to stimulate conversations about pitch’s role in performing and hearing gender and sexuality). In this final conversation, I hope to convey how using audio technologies to manipulate sounds can be beneficial but also has limitations (see, for instance, Tina Tallon’s article on how audio technologies have distorted higher-pitched voices, or how deceptive editing of interviews shaped public perceptions of Planned Parenthood in 2015).
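For attendees curious about what pitch manipulation involves under the hood, here is a small sketch using only Python’s standard library. It synthesizes a one-second test tone (standing in for a real clip) and applies the naive “chipmunk” pitch shift: writing the same samples at a higher sample rate, which raises pitch but also speeds playback. Tools like Audacity do something more sophisticated, shifting pitch without changing duration via time-stretching algorithms; the filename and shift factor here are arbitrary choices for the example:

```python
import math
import struct
import wave

RATE = 44100   # original sample rate in Hz
SHIFT = 1.25   # raise pitch by a factor of 1.25 (roughly four semitones)

# Synthesize one second of a 440 Hz sine tone as 16-bit mono samples.
samples = [int(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / RATE))
           for n in range(RATE)]
frames = b"".join(struct.pack("<h", s) for s in samples)

# Writing identical frames at a faster rate makes playback both higher
# and shorter -- the naive pitch shift, unlike Audacity's Change Pitch effect.
with wave.open("shifted.wav", "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(int(RATE * SHIFT))
    out.writeframes(frames)
```

Even this crude transformation is enough to spark the gender-and-pitch conversation above: a modest change in one number audibly changes how a voice is heard.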
I plan to conclude my workshop by making a few final remarks using Wendy Hsu and Jonathan Zorn’s Paperphone. In this wrap-up, I will thank attendees, subtly remind them of some key takeaways, and invite them to play with the Paperphone tool after the workshop has ended.