I've long wondered how the proportion of shipping in different categories (F/M, M/M, F/F, etc) compares on Wattpad, Fanfiction.net (FFN), and Archive of Our Own (AO3). There have been a number of past analyses of shipping on AO3, where gathering stats is easy. Recently, Franzeska did a study investigating shipping categories on FFN. I decided to gather similar shipping data for Wattpad, and ended up analyzing 683 fanworks (and 2000 Wattpad works overall), with the help of a small army of fantastic volunteers. :) See acknowledgements at the end.

Categorizing these works was a difficult task. Similar to what Franzeska found when analyzing FFN works, many of the Wattpad works are extremely poorly labeled in terms of the kinds of ships and pairings involved. And on Wattpad sometimes even the fandom is unclear -- or whether it's a fanwork at all! Not only that, but many Wattpad works are not fanfic in the sense I am used to on the other platforms. There are some "stories" that are just collections of images or facts about celebrities. There are imagines, which contain a collection of short scenarios, some of which may be shippy and some not. There are "stories" that are requests for prompts, including pairing prompts, and occasionally later chapters contain some fanfic. There are also a lot of works that are tagged with fandoms or pairings, and/or classified as Fanfiction, but the works themselves contain only a single sentence/short passage that does not clearly connect to the metadata. So we did the best we could, but keep these uncertainties in mind as you read, and see further explanations below in the Methods section.

A couple notes to keep in mind about this graph:

For Wattpad and AO3, the categories in the above graph are not mutually exclusive -- a work can be in both the F/M and M/M categories, for instance, if there are multiple ships. Meanwhile, FFN categories are mutually exclusive, and works with poly/multi relationships end up in the "Other categories" bin (not pictured). It's definitely fudging things a bit to compare them side by side like this, given that discrepancy, but given that there's less than 1% of FFN works that fall into that "Other categories" bucket, it doesn't have much impact.

The graph above also omits many fanworks. In addition to the categories pictured above, 11.1% of Wattpad works were categorized as shippy, but the relationship category was unclear.

On AO3, these are not the only relationship categories available. In addition to those shown in the graph, there are also some works tagged "Other" (2.7% of works), some tagged "Multi" (4.8%), and some with no specified relationship category (4.9%).

Franzeska said about her FFN analysis that the distinction between het and gen is often unclear, and "someone else could redo this analysis and end up with a bigger gen category and a smaller het one." The same fuzziness between the categories and resulting uncertainty applies on Wattpad.

Okay, that said, what does the above graph show?

About 62% of Wattpad fanworks are shippy (i.e., 100% - Gen in the above graph).

Each platform has a unique distribution. Wattpad has more Gen than anything else, with F/M the next most common. FFN has far more F/M than anything else. AO3 has far more M/M than anything else (for reasons I've discussed some in previous posts).

There is significantly more Gen on Wattpad than either of the other two platforms. (But see above caveats; it's possible volunteers mistook as Gen some works that actually were shippy, or else categorized fuzzy fics inconsistently. A different analysis might show the amounts of Gen on Wattpad and FFN as closer than they appear here.) And there's significantly more Gen on FFN than AO3.

There is significantly more F/M on FFN than the other two platforms -- and Wattpad has significantly more than AO3.

AO3 has far more femslash than the other two platforms, though it's relatively rare everywhere. AO3 also has far more M/M than the others, clearly.

The amount of M/M is not significantly different on Wattpad vs. FFN. The same is true for F/F.

(Significant here means that p < 0.05 on a one-tailed Z test).
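For anyone who wants to rerun this kind of comparison, here is a minimal sketch of a one-tailed Z test for two proportions. It uses the usual pooled-variance formula, which is an assumption about the variant of the test and not necessarily the exact calculation behind the results above. The function name is made up, and the counts in the example are hypothetical placeholders, not the real platform data.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_z_test(hits_a, n_a, hits_b, n_b):
    """Test whether platform A's proportion is greater than platform B's.

    Returns (z, p_value) for a pooled two-proportion Z test.
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis that both are equal.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # One-tailed: probability of seeing a z at least this large by chance.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical counts for illustration only (NOT the real samples):
# 260 Gen works out of 683 on one platform vs. 300 out of 1000 on another.
z, p = one_tailed_z_test(260, 683, 300, 1000)
print(f"z = {z:.2f}, significant at 0.05: {p < 0.05}")
```

A difference counts as significant in the sense used above when the returned p-value is below 0.05.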

So how many of these ships include original characters, or feature the reader as a character?

For notes on assumptions I made about gender of Reader, see Methods section (but TL;DR, these numbers may be slightly off).

Within this sample of fanworks, 38% of the 211 Wattpad F/M works involved an original character in their ship, and 5% involved the reader. (In the remaining cases, either only canon characters were involved in the ships, or it was unclear.)

By contrast, there were very few M/M works involving OCs or Reader (3 total out of 132 M/M works).

There were only 21 F/F works in this sample, total, so there's not enough data to generalize from. One of those works was F/Reader (but also contained many other ship types; it was a collection of Hamilton ficlets).

I don't yet have similar percentages available on AO3 or FFN for comparison, though I hope to eventually.

As I was analyzing ships on Wattpad, I saw that a ton of them involved celebrities (real person fiction, or RPF). So I compared the amount of shipping that involves celebrities vs. fictional characters, across platforms:

Note: RPF is not allowed on FFN.

So, that's a HUGE difference between platforms. Interestingly, the percent of RPF is greater for shippy fanworks on Wattpad than for non-shippy ones (76.2% vs 61.6%).

What are the fandoms that contribute to this pattern? Well, I haven't fully finished analyzing the individual fandoms, but here's an initial rough breakdown of the most popular individual fandoms and the most popular types of RPF:

RPF fandoms are colored shades of red, while non-RPF fandoms are colored shades of blue. That final unlabeled blue stripe is Marvel. The works listed as "not classified" have fandoms that (mostly) don't fit into the existing categories; they are smaller fandoms. That is, I'm pretty sure I found all the BTS, One Direction, Harry Potter, etc. -- so those individual fandom slices probably won't change much in size as I normalize and analyze the rest of the fandoms.

I haven't yet done a full analysis of which individual pairings are most popular on Wattpad, but at least the fandom breakdown gives some hints of what we might expect.

Methods

There are over 20M works on Wattpad. I randomly sampled 2000 of those works. Each Wattpad work includes a unique number in the URL, so I generated URLs with random numbers in the correct range -- see previous Wattpad analysis for more details. For each work, I scraped all the metadata available, including title, summary, and tags. (Raw data here.) I used Google Translate to detect language and translate title/summary/tags into English, and put those in the spreadsheet as well. Then I recruited volunteers from Tumblr and Twitter to help hand-categorize the works (see endnotes for acknowledgments).
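The random-URL step might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the URL pattern, the ID bounds, and the function name are hypothetical, and the real range and URL format are described in the previous Wattpad analysis, not here.

```python
import random


def sample_work_urls(lowest_id, highest_id, k, seed=None):
    """Draw k distinct work IDs uniformly at random and build one URL per ID."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    ids = rng.sample(range(lowest_id, highest_id + 1), k)
    # ASSUMPTION: placeholder URL pattern, not confirmed by the post.
    return [f"https://www.wattpad.com/story/{work_id}" for work_id in ids]


# Hypothetical ID bounds; the true range is not given in this post.
urls = sample_work_urls(lowest_id=1, highest_id=200_000_000, k=2000, seed=42)
print(len(urls), urls[0])
```

Sampling without replacement (`random.Random.sample`) avoids drawing the same work twice. Some numbers in the range may not correspond to a live work, so in practice a sketch like this would probably need to draw extra IDs and skip the ones that don't resolve.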

I gave volunteers instructions for categorizing works. Since Wattpad allows all kinds of works, not just fanworks, I first asked whether the work appeared to be a fanwork. (Wattpad has genres, and those genres include Fanfiction, but it turns out that some fanworks get placed by their authors in other genres, like Romance, Teen Fiction, Mystery/Thriller, etc.). 683 of the sampled works were categorized as fanfic by the volunteers -- enough to give a 99% confidence level that these analyses generalize to Wattpad overall, with +/- 5% margin of error.
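As a rough check on that margin of error, here is a small sketch using the standard formula for a sample proportion, with the worst case p = 0.5 and no finite-population correction. Both of those are assumptions about how the figure was derived, and the function name is made up.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, confidence=0.99, p=0.5):
    """Half-width of the confidence interval for a proportion from n samples."""
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


moe = margin_of_error(683)
print(f"Margin of error for 683 works at 99% confidence: +/- {moe:.1%}")
```

With 683 fanworks this comes out just under 5 percentage points, in line with the figure quoted above.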

I then asked volunteers whether the work appeared to be shippy. If not, I categorized it as Gen. If so, I asked for category (e.g., F/M, M/Reader). Volunteers could reply in any way they wanted, but the most common responses, in order, were:

F/M

M/M

M/Reader

F/F

M/F

F/Reader

For all categories and frequencies, see raw data.

I counted fics with multiple shipping categories in all of those categories -- hence, the categories are not mutually exclusive. For works featuring a ship that was Reader x Male Character, I've classified them above as F/M works. There may be a few errors in doing so, but having skimmed some of these works, they almost all appear to assume a female reader (if they don't they usually seem to specify Male Reader in the metadata). There was one work that a volunteer identified as Male Reader x Male Character. I verified that the single Reader x Female Character ship assumes a female reader, and is thus F/F.

I counted everything that contained the string "F/M", "M/F", or "M/Reader" as an instance of F/M (as described above, "Reader" is usually assumed to be female). This included a few instances of multiple categories, such as "M/M, F/M" or "M/M, M/Reader", as well as an instance of "M/Female 1st person" and an instance of "multiple F/Ms". I counted everything that contained the string "M/M" as an instance of M/M, including multiple category fics and "M/M/Reader" (which was also counted as F/M, because it contained "M/Reader" as well). I counted everything that contained the string "F/F" or "F/Reader" as an instance of F/F, including "F/F?" and "F/F, M/M, F/M." I omitted from the above buckets anything that was judged to be shippy but where the person categorizing it couldn't identify the shipping category. Also omitted were some miscellaneous items that appeared once or twice each: "Sibling Relationship", "possibly poly?" "M/OFC" (which should have been counted as F/M, but I didn't find that error until later), and "N" (which appear to be cases where the volunteer entered data in the wrong column).
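The string-matching rules in this paragraph can be sketched as a small function. The substrings come straight from the description above; the function and constant names are hypothetical, and this is an illustration of the rules, not the code actually used for the analysis.

```python
# Substrings that put a volunteer's free-text label into each bucket.
BUCKET_RULES = {
    "F/M": ("F/M", "M/F", "M/Reader"),  # Reader is assumed female here
    "M/M": ("M/M",),
    "F/F": ("F/F", "F/Reader"),
}


def ship_buckets(label):
    """Return every bucket whose substrings appear in the label.

    Buckets are not mutually exclusive, so one label can land in several.
    Labels matching no rule (e.g. "M/OFC") return an empty set.
    """
    return {
        bucket
        for bucket, needles in BUCKET_RULES.items()
        if any(needle in label for needle in needles)
    }


for label in ["M/M/Reader", "F/F?", "M/Female 1st person", "M/OFC"]:
    print(label, "->", sorted(ship_buckets(label)))
```

Note that the matching is case-sensitive, and that a plain substring rule is exactly what lets "M/OFC" slip through uncounted, as described above.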

I counted things like "F/M, M/M" or "M/M/F" in both the relevant categories rather than making that combination its own category for two reasons: (1) AO3 returns a fic with both those tags for both a F/M search and an M/M search, so counting this way made it most analogous to AO3; and (2) Franzeska did split these multi/poly fics into their own categories, but found only 1-3 of each, so that is a negligible difference in the FFN breakdown (therefore, it's not unreasonable to compare all three side by side although we used slightly different methods of bucketing).

If we instead break out each separate category on Wattpad and do not double count (i.e., each of these is mutually exclusive), we get the following:

F/M: 235 works (34.4%)

M/M: 121 works (17.7%)

F/F: 20 works (2.9%)

F/M, M/M: 9 works (1.3%)

F/F, F/M: 3 works (0.44%)

F/F, F/M, M/M: 1 work (0.15%)

M/M/F: 1 work (0.15%)
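The percentages in this list appear to be shares of the 683 fanworks in the sample. A quick sketch to recompute them from the counts above (the variable names are made up, and the denominator is an inference from the numbers, not something stated outright):

```python
FANWORKS = 683  # sampled works that volunteers categorized as fanworks

# Mutually exclusive counts, copied from the list above.
exclusive_counts = {
    "F/M": 235,
    "M/M": 121,
    "F/F": 20,
    "F/M, M/M": 9,
    "F/F, F/M": 3,
    "F/F, F/M, M/M": 1,
    "M/M/F": 1,
}

shares = {label: 100 * count / FANWORKS for label, count in exclusive_counts.items()}

for label, share in shares.items():
    print(f"{label}: {exclusive_counts[label]} works ({share:.2f}%)")
```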

How accurate are the categorizations?

I don't have a quantified answer, but I can describe which aspects I'm more and less certain about and give an overall sense of which aspects may be more and less reliable. (If anyone wants to help evaluate further, please LMK!)

I categorized 36% of the works myself, and I went pretty in-depth in trying to identify shipping category. I looked first at the title, summary, and tags, then skimmed the first several chapters of the fanwork if needed to clarify. It actually turned out to be harder to judge whether something was a fanwork or not than to judge shipping category. A few of the things that the author had categorized as Fanfiction were actually just ads or other spam/mislabeled works. Others claimed to be fanworks, but were short (unfinished) fics that were so far just about OCs. In other cases, it seemed like maybe the author was just using images of celebrities to represent their OCs -- sometimes they would start with a cast of characters, where a celebrity would be listed next to an OC name. Is that an AU, or just fancasting? I generally did not consider those to be fanworks, but I tried to note them for future analysis. Despite these ambiguous and mislabeled cases, most works were fairly clear as to their shipping category. RPF was harder to be sure of, partly because of the ambiguities I mentioned around whether celebrities were appearing "as themselves" in stories, but also because for some fanworks written in other languages/regions of the world (especially in regions where Google Translate is not as good and I lacked any cultural familiarity), it was hard for me to identify which celebrities were being written about. There were also a number of nonfiction works about celebrities -- collections of facts or photos, which I considered to be fanworks, but neither shippy nor RPF.

Because I was asking for help from volunteers for the other 64%, and I was asking them a bunch of other questions besides shipping, I did not ask them to dive deep into the text of the work when title/summary/tags were ambiguous; I just told them to take their best guess based on the metadata (and I did not ask them to do a bunch of Googling to try to identify which celebrities were involved).

However, I left some of the instructions ambiguous, like how to determine whether something was "shippy" or "RPF" (I didn't share the reasoning that I used above, because I developed it only after looking through a bunch of works myself). Would a fanwork featuring a canon ship, or featuring a ship only incidentally, count as shippy? I didn't investigate how people answered, and whether the answers were consistent. (It would be lovely if I'd had enough patience and/or volunteers to ask multiple raters to categorize each work, so that I could check inter-rater consistency, but I didn't.)
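If a second set of ratings ever does get collected, one common way to quantify inter-rater consistency is Cohen's kappa, which measures how much two raters agree beyond what chance alone would produce. Here is a minimal sketch. The choice of kappa is a suggestion and was not part of this analysis, and the ratings in the example are invented.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labeled the same works, in order."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty lists of ratings")
    n = len(ratings_a)
    # Observed agreement: fraction of works given the same label by both.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement if each rater labeled at random with their own rates.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented example: two raters, four works.
rater_1 = ["Gen", "F/M", "M/M", "Gen"]
rater_2 = ["Gen", "F/M", "Gen", "Gen"]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. This version only handles one label per work, so works with multiple ship categories would need their combined label treated as a single category.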

It's clear from the notes that volunteers left that many went above and beyond, skimming the works and leaving notes about what they contained and where confusions arose. 11.4% of all 2000 works that were hand-categorized had such notes attached. (I left notes on 15.4% of the works I categorized, especially as I kept finding more things I might want to drill into further.) Many of the notes were about interesting characteristics of the work and did not affect categorization (e.g., "A photo of a handwritten sheet of paper by a child"). Some were about ambiguities relating to whether the work was a fanwork, or what fandom it was in. Some of the notes did say that it was unclear whether the work was shippy (e.g., "From skimming this it doesn't look shippy but its not finished so it might end up with a ship between the main OFC and one or more of the band memebrs"), and other notes said that shipping category was unclear (e.g., "Group chat transcripts. tagged girlxgirl and boyxboy and gay, but also tagged with a bunch of F/M and M/M relationships [ship names] but no F/F").

I have not done a comprehensive breakdown of the types of ambiguities described in the notes. However, I did look at how many notes contained the terms "ship" or "rpf" as a rough guide to how many of the fanworks analyzed here might be miscategorized. 23 of the notes (on 1.2% of the works analyzed, or 3.4% of fanworks) contained one or more of these terms. (No notes contained the term "pairing" or other synonyms I could think of.) All in all, I feel somewhat confident that the above graphs are not wildly inaccurate, but if anyone wants to help provide another set of ratings, we could check for inconsistencies more carefully, which would help increase (and quantify!) confidence levels.

Future work

There's still lots to do to analyze the Wattpad data in more detail. Among the future analyses I'm hoping to do:

fandoms

ships

languages

ratings (I was going to include an analysis of ratings in this post, but it looks like I scraped this info wrong, alas :P )

genres

AUs

freeform tags

common patterns in summaries and titles

popularity metrics

deletion rates

completion rates (including the rate at which works are officially marked "[ON HOLD]" in the title)

imagines, preferences, roleplay, fact lists, and other fanwork types that aren't traditional fanfic

other interesting patterns that volunteers noted

But if anyone wants to do these or related analyses, please help yourself to the data, and I'd love to see whatever you find! :)