Over on Vice Motherboard, Michael Byrne recently wrote about his desire for “an Instagram of sound.” He says:

What I want is a place to hear things that people record in the spaces around them. This seems reasonable to me: An app with just one button to record and another to share. I’d have fewer “friends” than on Instagram, in the realm of sound, but there would surely be some. And some who use the app would be pushed to find better and more interesting sounds, and to appreciate those sounds in new and different ways.

There are already such apps–Audioboo is the one I use (there are plenty of others, as summarized here). Audioboo is a social network for sound-sharing; people follow me on Audioboo, but I’ve also linked my account to Twitter so I can tweet sound clips and share them with my Twitter followers, just like I would with Instagram (if, that is, I used Instagram with any regularity). I wish it were as popular as Instagram, Snapchat, and Vine…but it’s not.

I don’t think this relative lack of popularity is primarily due to the fact that, as Byrne argues, we’re trained to use vision as our dominant sense. Certainly that’s part of it, but that’s not the only (and perhaps not even the primary) reason. I think sound recording is a different medium from photography, and even from Vine’s short-attention-span videography, and that maybe this medium isn’t as well-suited as photography and videography are to the kinds of tasks we generally want to accomplish on social media. So, the controlling factor here is social media, not auditory or visual content–they’re just means to the end of social mediation.

How do pictures and 6-second videos function socially? Though people do use social media in “off-label” ways, these platforms reward virality over comprehension/comprehensiveness. Twitter counts followers, favorites, retweets, and mentions, and Facebook counts likes, shares, comments, and friends. The design of these platforms prioritizes the volume and pace of interaction (i.e., the sociality) over the digestion of content as such. Content is just a means for socialization (an interesting parallel to commodity fetishism, in which commodities are means for transacting social relations). Because it is just a means, the best, most easily shareable content is the kind you can skim and scan very efficiently. I’m always scanning an article rapidly and then retweeting or sharing it on Facebook without really spending any significant time digesting it. It’s harder to fast forward through a sound than it is to scan a text (though the visible waveforms in SoundCloud make this easier–I can pick out the climax or the best part of a song by making educated guesses on the basis of the audio’s visualization). Sounds might therefore be a less efficient means of accomplishing the social work tasked to social media.

We do have something akin to a sonic or musical version of the Vine, at least at the level of format: ringtones. These are short snippets of songs or sounds, often looped, that we use to communicate something about ourselves to others–our favorite song, our business-minded, no-frills attitude, whatever. (FWIW, my text sound is the two chimes at the beginning of Depeche Mode’s “Personal Jesus” because I want to signal to other sound geeks and music fans…and maybe now the title of this post makes more sense?) But ringtones have a different social function than Vines, because they’re not shared or liked on social media. They might alert us to social media/SMS/telephonic activity, drawing us back into conversations (and, in this way, are a nice example of sonically augmented reality), but the ringtones are not the tokens passed around in conversation that facilitate that conversation. From another perspective, ringtones also have a different capitalist function: they’re not the means to generate social media activity, which is what Facebook et al. farm and sell as data. I think this is another important reason why Audioboo and its competitors aren’t as widely used as Instagram and Snapchat–it may not be as easy or as profitable to generate sellable data from sound-sharing.

Byrne suggests additional reasons why audio social media hasn’t taken off. I want to address two of them. First, he argues that most photographs are representational (they are “of something,” as Byrne puts it), whereas sounds are not necessarily representational. This is not exactly untrue, but it’s not entirely correct, either. I would argue that most people hear and listen primarily for content. In my experience, laypeople (i.e., my students) don’t just listen abstractly to sound as such; they try to figure out who or what is making the sound, or what the sound means. We also tend to reduce “sound” to either music or human speech/voice. For example, I was guest teaching a Sociology of Gender course at Luther College this past fall, and I asked the students to think of the ways in which non-musical sounds are gendered, or the ways in which we hear gender. The first round of responses was all about human vocalization and speech. Their assumption was that the main or only type of non-musical sound was human speech; music and speech are, on this view, the two types of sound that are about something, that have a content. The students didn’t even consider the possibility that there might be abstract, content-less sounds. All this is to say: I don’t think most people hear and listen that differently from how they see.

Second, Byrne suggests that photography tends to (literally) focus on a subject, picking it out of its ambient environment. He argues:

Ambience is a realm where the visual and the auditory match up pretty well in terms of ability to represent. It’s here that sound might even be able to beat out sight. This is because ambience is interested in a subject AND all of the junk around that subject equally. In other words, it’s interested in everything—and everything doesn’t have a subject. Put another way, there’s no primacy in “everything,” however that actually works out in reality. It’s a stew.

According to Byrne, Instagrammed pictures have a subject, but audio recordings can’t focus so narrowly and instead capture “everything.” But sound isn’t inherently any more ambient than photography is. The difference Byrne notices seems to be more an attribute of the uneven development of video and audio recording tech in the average consumer smartphone. Your average smartphone has a really great camera but a not-so-great microphone for picking out individual sonic phenomena/subjects. (You can get mini synth and mixer apps that will let you “filter” those sounds just like you filter your Instagram photo.) If every smartphone came with, say, a Bluetooth remote mic, we could easily focus our sound recording, too.

This idea of sonic ambience links Byrne’s post with Wayne Marshall’s great post on hours-long YouTube videos of ambient sounds, which he calls (brilliantly) vacuumtube. (And you should open that link in another tab and go read the post–it’s really great.) Vacuumtube is an example of the kind of sound recording and sharing that Byrne discusses in his post. The “9 Hours of Suck” video Marshall cites has 40,000+ views, 45 comments, 107 likes, and 12 dislikes. However, because these videos are sooooooo long, vacuumtube seems to be causally tied to YouTube’s platform. These videos are, as Marshall argues, “native” to YouTube:

This particular form’s nativity is, of course, directly related to one relatively big affordance: the unprecedented access to time that YouTube now provides. People, especially the non-Warhol sort, just didn’t typically make 7-12 hour films very frequently prior to the advent of unlimited time on YouTube. So one emerging “native” dimension of vernacular video we might lay at YouTube’s feet is the sudden desire to exploit the “platform” as something other than a visual medium — but not just as a jukebox, rather as a long duration white noise machine (or pink, if you prefer).

Because YouTube, unlike, say, SoundCloud or Audioboo, has no limitations on the length of the recordings you can post, it is a better medium for really, really long ambient sound recordings. I wonder: Is there no “Instagram of sound” because the Instagram platform isn’t best suited to the kinds of social things we want to do with (ambient) sound recording?

So, my question boils down to this: There may not yet be an “Instagram of sound”–but is this due to the nature of sound and/or photography, or is this due to the nature of social media, as a type of sociality, as a media platform, and as a business model?

Robin is on Twitter as @doctaj.