Visualizing 39 million Reddit comments and linking the users to their related subreddit interests. Then smoking a menthol at 7:41 a.m. in 32-degree weather, wearing a coat but also shorts, watching the sun rise and having a long think.

You gonna eat that?

A few quick notes as a preface. The GitHub repository for the tools used to create these visualizations is located here: https://github.com/anvaka/sayit/. Want to look through the data yourself? Here’s a link: Reddit Public Comments (2007–10 through 2015–05). Why does this focus on September of 2018? Because that’s what was easily available. The torrent took a while to finish. You can do this, too. It isn’t hard like living paycheck to paycheck with a four-year degree. The live demo is linked from that GitHub repository.

Commentary as a data set

The internet, while providing a profoundly comprehensive means to obtain staggering amounts of data, offers an outlet not nearly as readily available 30 years ago: the ability to inform a random stranger that they are a shit-eating subhuman. Maybe that’s just specific to my case but, unpacking that thought further, it offers the ability to provide commentary. That’s important to the human experience, for better or worse.

Take 39 million comments from the month of September 2018, tie each one to a unique per-user identifier, and a linking structure is often evident. It presents itself in one of several ways: it’s exactly what you’d expect and mundane; it’s exactly what you’d expect and provides insight on collective interests; there’s a conflict of narrative; or there’s an undercurrent intentionally driving the narrative. I understand “it’s exactly what you’d expect” is sort of subjective, but I’ll explain that better shortly.

Takeaway: Modeling propagation and narrative on Reddit based on the comments users make in subreddits outside of your focus can identify false narratives and subcultures pretty effectively.

‘It’s exactly what you’d expect.’

Before going any further, let’s take a look at what an absolutely mundane model looks like. A good example of a mundane structure is the set of links associated with subreddits focused on cities. I live in Knoxville. Take all comments made in r/Knoxville in September 2018, sort them by user ID, and then generate a cumulative total of which other subreddits those users posted in. Knoxville’s model looks like this.

I promise I’m not the r/espionage link.

This is, well… exactly what you’d expect. Users who posted to Knoxville in September also heavily posted in the college football, Titans, CFC and surrounding community/sports groups. You get the more fringe results to the right because the subreddit’s pool of unique users isn’t varied enough to create a more robust model. While I’d be totally down living in a town concerned with bioethics, espionage and Jungian Typology, Knoxville just allowed wine in grocery stores this year. We’re not quite to that stage of progressive thought.

Aside from an anomaly, we can absolutely provide a reasonable narrative for this: we’re a college town interested in sports with a football team that almost got a participation award last year and may be invited to play in the SEC next year. This is what an average model looks like in the community setting.
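If you want to try the counting step yourself, here’s a minimal sketch in Python. It assumes each comment has already been boiled down to an (author, subreddit) pair. The function name `related_subreddits` and the sample rows are made up for illustration, and this isn’t necessarily how the tool linked above does it.

```python
from collections import Counter


def related_subreddits(comments, focus):
    """Count how many distinct commenters from `focus` also commented elsewhere.

    `comments` is a list of (author, subreddit) pairs. This layout is an
    assumption for the sketch, not the raw format of the data dump.
    """
    focus_lower = focus.lower()

    # Step 1: everyone who commented in the focus subreddit at least once.
    focus_users = {
        author for author, sub in comments if sub.lower() == focus_lower
    }

    # Step 2: distinct (user, other subreddit) pairs, so one very chatty
    # user only counts once per subreddit.
    pairs = {
        (author, sub)
        for author, sub in comments
        if author in focus_users and sub.lower() != focus_lower
    }

    # Step 3: tally how many focus users showed up in each other subreddit.
    return Counter(sub for _, sub in pairs)


# Invented sample data, purely for illustration.
comments = [
    ("user_a", "Knoxville"), ("user_a", "CFB"), ("user_a", "CFB"),
    ("user_b", "Knoxville"), ("user_b", "CFB"), ("user_b", "Tennesseetitans"),
    ("user_c", "CFB"),  # never commented in the focus subreddit, so ignored
    ("user_d", "Knoxville"), ("user_d", "espionage"),
]

counts = related_subreddits(comments, "Knoxville")
print(counts.most_common())
```

With only a handful of users, one person wandering into an unrelated subreddit is enough to put it on the map, which is the same small-sample effect behind the fringe results above.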

So let’s make it weird

It’s not like I needed to write this to remind me, but there’s totally a fetish for this somewhere.

We’re going to be transitioning into some topics that go way into NSFW after this line. It’s just text but it seems reasonable to provide that disclaimer.

Now that you’ve established a baseline of what it looks like when individuals focused on a specific community interact with similar communities, we arrive at the obvious next step. No, that’s the one after this. Time to abuse that subhead feature again —

NSFW Subreddits: A clustering fuck

Subreddits based on pornographic content fall under “exactly what you expect and provides insight on collective interests.” Did I have to pick pornographic subreddits as my focus? No, automotive interests would have been a more rigidly clustered graphic; however, that wouldn’t allow me to point out nearly as many weird things. And I wouldn’t get to mention r/ClopClop. What kind of passive observation of grouping doesn’t include r/ClopClop? You’re right. Every one of them. With reason.

With the example of r/Knoxville complete, here’s what r/GoneWild looks like. Do you see the trends in the model?

The three people that made it this far just grinned and agreed to read at least two more paragraphs.

I’m not here to mansplain something that’s obvious but I’m also not into letting the reader come to a conclusion. I blame comment sections for that mostly.

Looking at the groups that users commenting in r/GoneWild also comment in, a blatantly obvious model presents itself: the top portion of GoneWild’s commenters actively comment in subreddits focused on male-centric media; the bottom, on female-centric media. The split is hard enough that the two groups barely interact at all. That isn’t to say that thousands of users aren’t posting in non-heteronormative NSFW subreddits: just that way more thousands are posting in the heteronormative ones. I’m not trying to marginalize anyone because the next sentence has a solid point.

None of this data is worth a damn.

“Wait, you’re telling me that users posting thirsty comments on media generated by users that also conveniently have their own subreddit also post thirsty comments on other NSFW subreddits? And it’s somehow a polar model focused on which genitals the user would prefer to see?” I know. I’m renowned for my groundbreaking research.

This is where numbers cause more of a problem than a solution. Given the profound amount of traffic the NSFW main subreddits see daily, the model above is really the only possible presentation. But what happens when you become obscurely specific?

From rags to tits-…age

One of the websites I try to hit up at least once every month or so is redditlist.com. While the subscribers value is pretty much moot, the “Growth (24 Hours)” column is interesting in evaluating buzz loud enough to be picked up through the ground. Let’s take a look at NSFW subreddits that have had the most growth in the last 24 hours.

I knew people had a thing for Flo but who knew car insurance was hot.

What interests me in this list is that a subreddit I’ve never seen has topped the list. To save you the trouble of purchasing two bottles of Dr. Bronner’s to sufficiently wash this post off your brain later, I’ll just grab the text from the sidebar.

What if… Women weren’t limited to the assets given to them at puberty? What if they could grow more later in life? A rare subset of women have! This subreddit celebrates and chronicles their growth.

The entire subreddit, happily chugging into year number three of existing, focuses specifically on before-after pictures of women who have larger breasts in one picture than they do in the other. No underage Before pictures allowed. And, uh. That’s about it. “Here’s me with A-cups in (20xx) and with God’s grace your upvotes will soothe the lower back pain my far larger breasts provide me on a daily basis.”

At this juncture, we can probably make a reasonable assessment that individuals gravitating to this subreddit are likely into the enhancement of female body features (hormonal, through pregnancy or Other). Let’s break down which other subreddits the people posting in this group most commonly interact with.

I’m not going to give you the satisfaction of a breast pun right now.

That’s… a lot to unpack. Opening Photoshop, I’ve included several poor flipper-baby excuses for circles around hubs that emerge in these models. In circumstances like this, where a broad topic is made up of increasingly niche subtopics (sexual content and cars are really the most defined topics), you can follow connected topics by associating them logically.

Let’s look at the breakdown in sections.

I would have never kept your attention this long if I went with cars.

The top and left of the model are pretty standard Thirsty Internet Dude characterizations. Asians and celebrities. While I thought r/NostalgiaFapping was a place where I could cry while masturbating to a time when I had no bills and enough money to genuinely enjoy my finite time on this spinning rock in empty space, it’s really just old pictures of Christina Applegate and comments from users feeling the necessity to inform the public they first obtained orgasm to thoughts of people on the television. I just wanted to save you time in case you saw the subreddit name and were on the fence about putting the rest of this on hold while you had a teargasm.

You’ve also got a solid camgirl arpeggio rocking out to the sound of transferred tokens and twerk slaps going on in the right corner.

Watching the Titty Drop at midnight is a mutually shared human experience.

The middle of the model begins to get into various linked interests with regard to the specific model and serial number of breast that the user both presumably likes to see and feels the need to leave comments on. The left side of this picture pretty clearly illustrates this: users are also interested in Stacked, a contingent of those users is active in 2busty2hide, and a smaller contingent of those users further focuses on cleavage, petite women with large breasts and juicy Asians. As a further clarification, the modifier of “juicy” has little to do with the preparation or grilling method of Asians. Or even cooking. It’s just Asians with large breasts.

The left side of the model is kind of a loosely connected outlier cluster of individuals with their own subreddits and broad breast-related content like AmazingTits.

Internet: Catering to specific fetishes since Usenet.

The bottom portion of the model is interesting because the clustering obviously begins to focus on specific aspects of breast-related fetishes. Individuals involved with EngorgedVeinyBreasts are heavily involved specifically with large breasts but not large people. This can be observed in the connections between BigBoobsGoneWild/homegrowntits/BigBoobsGW/voluptuous. While gonewildchubby is a related topic, note that it hardly interacts with BBW at all. It’s grouped close to the topic, but there’s little interaction between BBW commenters and users commenting in groups specifically focusing on large breasts mounted upon small to “chubby” (so like, Size 4 to Thirsty Internet Dudes) women.

To the lower right of that group are individuals interested in natural Progressive Enlargement through pregnancy. User interaction is focused more on media featuring lactation than pregnancy; however, related topics pretty much cover every specific way an erogenous zone can be altered while gestating or during breastfeeding.

Continuing right, the clustering changes to individuals who are looking for both large breasts and women who specifically fall into whatever weird rulebook a sidebar defines as BBW. This spans numerous subreddits that start with the prefix BBW and continues to more obscure content like FitToFat, a subreddit dedicated to appreciating side-by-side pictures of women gaining weight — and also a term I use for myself when describing what drinking a six-pack nightly did to my already ratchet dad bod.

Finally, the right cluster involves users looking for a stereotypical “bimbo.” That’s a pretty subjective, negative and chauvinistic term; however, considering the clustering of subreddits that are variations of BoltedOnTits, I’d venture to guess the preference in content involves women that have undergone cosmetic surgery, thus providing Thirsty Internet Dudes an opportunity to make slut shaming comments before furiously masturbating to the idea of a world where people interact with them willingly and for free.

( o ) ( o )

Now you’ve seen three pretty comprehensive examples. An environment where community focuses on community with little outside scatter. An environment focused on amateur pornography that displays a distinct split between individuals interested in the male and female forms. And an example where, deciding to write this around 4 a.m., I didn’t anticipate typing the phrase Bolted On Tits. That last one showed how niche begets niche.

The thing to notice in all three of these examples is that things span out. There’s some chat between related-of-related subreddits but so far it’s been pretty infrequent.
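If you’d rather put a number on how much “chat” there is between two subreddits than eyeball it, one option is to compare their sets of commenters. Here’s a minimal sketch, assuming Jaccard similarity (shared commenters divided by total distinct commenters) as the measure. That choice, the function name `commenter_overlap` and the toy user sets are all my assumptions for illustration; the tool linked above may weigh things differently.

```python
def commenter_overlap(users_a, users_b):
    """Jaccard similarity between two collections of commenter IDs.

    Returns 0.0 when the groups share nobody (or are both empty) and
    1.0 when they are exactly the same set of people.
    """
    a, b = set(users_a), set(users_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Toy data, invented for illustration.
sub_one = {"u1", "u2", "u3", "u4"}
sub_two = {"u3", "u4", "u5"}
sub_three = {"u6", "u7"}

print(commenter_overlap(sub_one, sub_two))    # 2 shared out of 5 distinct users
print(commenter_overlap(sub_one, sub_three))  # no shared users at all
```

Two subreddits that sit next to each other in a picture but score near zero here are the “close to the topic, barely any shared users” case.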

Which brings us to an example I find kind of interesting.

Like you didn’t see these Two Scoops coming

Trump excused himself shortly after this picture to locate a stick that would assist in the shedding process.

Before I get into this wheel of crazy, one other model would be helpful to explain. Numerous accounts on Reddit are simply bot/beacon posters. Here’s what that looks like.