Fig2: Free trial users every quarter; Source: Analysis by Srinivas Vadrevu of Trial users in each quarter from 2017Q1 to 2019Q4 based on data from 2019Q4 Financial statements

Interestingly, 2018Q4 saw a surge in trial users to 9.2 million, followed by an increase of 9.6 million in the paid subscriber base in 2019Q1. Did most of those trial users convert to paid users, creating the 2019Q1 peak in paid users?

While trial users decreased every quarter in 2019, paid net additions increased from 2019Q2 to 2019Q4. Are users creating paid accounts directly, without a trial period? Without user-level data, it is difficult to answer these questions.

How did the content spend per subscriber change over the quarters vis-à-vis the revenue per subscriber?

The revenue and marketing spend are reported in the historical segments section of the 2019Q4 financial statements. The calculation of content spend is a little more complicated because of the different types of content that Netflix streams.

Fig3: Type of Netflix Content; Source: Netflix Investor Relations- Jan 2018

The accounting treatment document as of Jan-2018 mentions that content spend is to be calculated as Additions to streaming content assets + Change in streaming content liabilities.
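As a minimal sketch of that formula in Python: the class name, the field names and the dollar figures below are illustrative placeholders of mine, not values from Netflix's statements. I also assume that additions are entered as a positive amount, so check the sign convention of the cash flow statement you are reading.

```python
from dataclasses import dataclass


@dataclass
class QuarterCashFlow:
    """Hypothetical container for two cash-flow line items, in $ millions."""
    quarter: str
    additions_to_content_assets: float    # assumed to be entered as a positive amount
    change_in_content_liabilities: float  # can be negative when liabilities shrink


def content_spend(q: QuarterCashFlow) -> float:
    """Content spend = additions to streaming content assets
    + change in streaming content liabilities."""
    return q.additions_to_content_assets + q.change_in_content_liabilities


# Placeholder numbers for illustration only.
example = QuarterCashFlow("2019Q1", 3000.0, -150.0)
print(example.quarter, content_spend(example))
```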

The following table gives a snapshot of the financials: paid user additions, revenue, content spend and marketing spend for each of the last 12 quarters.

Fig4: Revenue, Marketing and Content spend analysis; Source: Analysis by Srinivas Vadrevu based on data from Netflix financial statements 2019Q4

Fig5: Average revenue and spend per paid user; Source: Analysis by Srinivas Vadrevu based on data from Netflix financial statements 2019Q4

Average revenue per paid user has been increasing over the last four quarters, from $29 in 2018Q4 to $32 in 2019Q4.

Content spend per paid user on platform increased gradually from 2017Q1 to 2018Q4. It fell sharply, by 20% to $20, in 2019Q1 and has risen since then, reaching $27 in 2019Q4.

Marketing spend per paid user on platform has been almost constant over the last twelve quarters, at $2-$3.
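The per-user figures above are simple ratios of a quarterly total to the paid users on the platform. A rough sketch, assuming totals in $ millions and paid users in millions; the function name and all numbers are placeholders of mine, not the figures behind Fig5:

```python
def per_paid_user(total_millions: float, paid_users_millions: float) -> float:
    """Divide a quarterly total ($ millions) by paid users (millions) to get $ per user."""
    if paid_users_millions <= 0:
        raise ValueError("paid user count must be positive")
    return total_millions / paid_users_millions


# Placeholder quarter, for illustration only.
quarter = {
    "revenue": 5000.0,
    "content_spend": 4000.0,
    "marketing_spend": 480.0,
    "paid_users": 160.0,
}

metrics = {
    name: round(per_paid_user(quarter[name], quarter["paid_users"]), 2)
    for name in ("revenue", "content_spend", "marketing_spend")
}
print(metrics)
```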

Fig 6: Content cash spend and paid net member additions every quarter; Source: Analysis by Srinivas Vadrevu based on data from Netflix financial statements 2019Q4

To explain the fall in content spend per user in 2019Q1, let’s take a look at the relationship between total content spend and net paid user additions every quarter. In 2019Q1, content spending decreased for the first time while paid net additions were at an all-time high for the last 12 quarters. This explains why content spend per user was at an all-time low for the last 12 quarters.

Content spend had been increasing consistently by ~7% every quarter until 2018Q4. In 2019Q1, Netflix reduced content spend by about 15% (roughly $500 million). In the next quarter, 2019Q2, paid net additions fell drastically from 9.6 million to 2.7 million. It looks like the content spend in one quarter might influence the net paid user additions in the next.
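One way to look at this kind of lagged relationship is to compute the quarter-over-quarter change in content spend and pair it with the paid net additions of the following quarter. The sketch below uses made-up placeholder series that only mimic the shape described above; the function names are mine, and a single pair of quarters cannot establish cause and effect.

```python
def pct_change(series):
    """Quarter-over-quarter % change; the first quarter has no prior value."""
    changes = [None]
    for prev, cur in zip(series, series[1:]):
        changes.append(round((cur - prev) / prev * 100, 1))
    return changes


def lagged_pairs(spend_changes, additions, lag=1):
    """Pair each quarter's spend change with net additions `lag` quarters later."""
    return [
        (change, added)
        for change, added in zip(spend_changes, additions[lag:])
        if change is not None
    ]


# Placeholder series (four consecutive quarters), for illustration only.
content_spend_index = [100.0, 107.0, 91.0, 95.0]
paid_net_additions = [5.0, 8.8, 9.6, 2.7]

changes = pct_change(content_spend_index)
print(changes)
print(lagged_pairs(changes, paid_net_additions))
```

With a longer series, the same pairs could be fed into a correlation to check whether the pattern holds beyond one quarter.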

I computed the above metrics just to check whether there were any major shifts in content spending, and 2019Q1 saw such a shift. I am pretty sure that most of the stock analysts who track Netflix are busy right now with content analysis from its financial statements to prepare for the upcoming earnings announcement on 21st April, 2020. However, I am more interested in exploring the content choices that Netflix made over the years than in content spend and financial metrics. After all, the content choices it made not only delighted its 89 million users as of 2016Q4, but also resulted in an increase of 78 million paid users in the next three years (including me!).

“When we please our members, they watch more and we grow more” — Netflix 2019Q1 letter to shareholders

So what content has Netflix been adding to please its users?

The more I grappled with the above question, the more I wanted to find an approach to answer it. As chance would have it, I stumbled upon an excellent article on the Netflix Tech Blog which elaborates on how Netflix relies on predictive data modeling to assess its content consumption across languages. I think it is a great read for anyone who is passionate about both movies and analytics. Given such an emphasis on data-driven decisions about both the content and its creation life-cycle, I surmise that those decisions should surface as patterns in the titles that Netflix adds to its platform. I decided to use the additional time during the COVID-19 lockdown period to learn a bit of NLP (natural language processing) and check whether I could apply any basic NLP techniques to figure out the choices that Netflix made with regard to its content. Since Netflix relies on consumer-level data to arrive at those choices, we can maybe uncover a few customer preferences along the way.

DATA SOURCES

To explore this, I looked at two data sources to triangulate Netflix’s content strategy:

(1) Content section in its quarterly letter to shareholders: Every quarter, Netflix releases a letter to shareholders that includes a section on content. In this section, Netflix communicates its content choices, the reasons behind them and how they fared in that quarter.

Data source: Netflix letter to shareholders on its website https://www.netflixinvestor.com/financials/quarterly-earnings/default.aspx

(2) List of actual titles that Netflix has on its platform: The first data source is a top-down view of what Netflix communicates about its content strategy. Using the titles on Netflix and their descriptions, the actual content strategy can be deciphered bottom-up.

Data source: I used the dataset from kaggle.com, which is primarily based on the data collected from the website flixable.com. Flixable contains all the TV Shows and Movies currently available to stream on Netflix in the United States.

The above dataset does not classify titles into the three content types that Netflix uses. In this article, therefore, we will not break down the content choices into self-produced titles, branded and licensed titles, and licensed-only titles.

Upon merging the financial dataset with the Kaggle dataset, I find that 2018Q2 has the highest content spend per title added, $8.6 million, and 2019Q1 has the lowest, $5.6 million.
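The merge behind this number can be sketched as a join on the quarter followed by a division. This is only an illustration of the idea: the dictionaries, the function name and every number below are placeholders of mine, and the real title counts would come from the Kaggle data.

```python
# Placeholder inputs keyed by quarter, for illustration only.
content_spend_millions = {"2018Q2": 3000.0, "2019Q1": 2500.0}
titles_added = {"2018Q2": 400, "2019Q1": 500, "2019Q2": 450}


def spend_per_title(spend, titles):
    """Inner-join the two tables on quarter and divide spend by titles added."""
    shared_quarters = sorted(spend.keys() & titles.keys())
    return {
        q: spend[q] / titles[q]
        for q in shared_quarters
        if titles[q] > 0  # skip quarters with no titles to avoid dividing by zero
    }


print(spend_per_title(content_spend_millions, titles_added))
```

Quarters that appear in only one of the two sources (2019Q2 here) drop out of the result, which is worth keeping in mind when the two datasets cover different periods.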

Fig 7: Total content spend and spend per title per quarter; Source: Analysis by Srinivas Vadrevu based on data from Netflix financial statements 2019Q4

Before I dive into processing the texts from both data sets, a big shout-out to “Text Mining with R” by Julia Silge and David Robinson. I found the book super helpful for learning text mining and for plotting the charts below.

What does Netflix communicate about its content?

Photo by Miguel Henriques on Unsplash

Analysing the content section of its letter to shareholders (top-down)

To understand the key themes that Netflix uses in its content section, I collected the content sections from the letters to shareholders that Netflix publishes for its investors. I compiled the data for the last 30 quarters (2012Q3 to 2019Q4). From this corpus, I extracted bigram tokens (two-word combinations), after removing stopwords, to create a network visualisation. The graph below arranges the bigrams in a network structure. Say, “la” + “casa” = “la casa” is a bigram and “casa de” is another bigram. Both bigrams are joined at the node “casa”, forming the network structure “la casa de”.
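To make the bigram idea concrete, here is a small Python sketch (my charts were made in R, so this is an illustration of the technique, not the code behind the figures). The stopword list is a tiny stand-in, the sample text is invented, and the function names are mine. Bigrams are formed from adjacent words and dropped if either word is a stopword, which is one common way to do it.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real analysis would use a full lexicon.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "our", "we"}


def bigrams(text):
    """Adjacent word pairs, keeping only pairs where neither word is a stopword.
    Pairs may span sentence boundaries, since punctuation is ignored."""
    words = re.findall(r"[a-z']+", text.lower())
    return [
        (w1, w2)
        for w1, w2 in zip(words, words[1:])
        if w1 not in STOPWORDS and w2 not in STOPWORDS
    ]


def edges(counts, min_count):
    """Bigrams seen at least `min_count` times, as (from, to, count) graph edges."""
    return sorted((a, b, n) for (a, b), n in counts.items() if n >= min_count)


# Invented sample text, for illustration only.
sample = (
    "La Casa de Papel is our most watched series. "
    "Fans of La Casa de Papel loved the original series."
)
counts = Counter(bigrams(sample))
print(edges(counts, 2))
```

The edges "la" to "casa" and "casa" to "de" share the node "casa", which is exactly how the chain “la casa de” shows up in the network.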

Fig8 : Common bigrams of the content section in letter to shareholders from 2012Q3 to 2019Q4 ( 30 quarters); (Bigram count > 4); Source: Analysis by Srinivas Vadrevu based on data collected from Netflix’s letter to shareholders

I have grouped the above bigrams into broad themes. Let’s look at the key themes that Netflix communicated to its shareholders:

Original content (word: “original”, left of the red bubble): In the last thirty quarters, Netflix has almost always highlighted its original content: original + films, original + feature films, original + content, original + documentary, original + programming and original + series, among others.

Popular/hit series (yellow bubble): Breaking Bad, BoJack Horseman, Marvel’s Daredevil, La Casa de Papel (Money Heist in its English version), Turbo FAST, Arrested Development, Sacred Games and Hemlock Grove.

Popular comedy individuals (yellow bubble): actors Ricky Gervais and Adam Sandler, and writer Jenji Kohan.

Key genres in TV shows (node word: “series”, right part of the red bubble): drama series, comedy, kids, animated, scripted series, limited series and TV series, as well as the sci-fi genre.

Continuity: Netflix has mentioned the season names whenever it launched follow-on seasons: season 2, season 3 and returning seasons.

Variety: catering to a wide variety of diverse tastes; location focus on Latin America and North America.

Quality: awards and festivals (green bubbles), signalling high-quality content. Bigrams such as Academy Awards, Golden Globes, Emmy nominations, Emmy awards and film festivals are frequently mentioned.

Change in the key bigrams from 2014–2016 to 2017–2019

Fig9: Common bigrams of the content section in letter to shareholders in three-year windows- 2017Q1 to 2019Q4 and 2014Q1 to 2016Q4; (count >4); Source: Analysis by Srinivas Vadrevu based on data collected from Netflix’s letter to shareholders

When I plot the common bigrams in three-year windows, 2017–2019 and 2014–2016, there are a few shifts in content focus, apart from the key trends mentioned above.

Comedy series is no longer as frequently mentioned in the 2017–2019 window as compared to 2014–2016 window.

Drama series and Limited series have been mentioned more frequently in 2017–2019

Season 2 in 2014–2016 has given way to Season 3 and returning seasons in 2017–2019.

Of course, popular shows changed from 2014–2016 to 2017–2019

TV networks and Internet TV are not being frequently mentioned in 2017–2019

Emphasis on original content remains the same for the last six years

So what has changed in the last three years (2017, 2018, 2019) compared to the three-year window before that (2014, 2015, 2016)?

In recent years, there is more emphasis on drama, limited series and extending seasons, alongside the continued emphasis on original content. There is less emphasis on comedy series, TV networks and internet TV.

Photo by Charles Deluvio on Unsplash

B. Deciphering Content strategy from the actual content on Netflix

Triangulating what they said with what they did…

Analysing all the titles and their descriptions on Netflix (bottom-up approach)

The Kaggle dataset (the second data source described above) contains information on titles added over the last 12 years, from 2008 to 18 January 2020. The fields in the dataset are as follows: title’s name, type (TV show or movie), director, cast, countries involved in the production, date added on Netflix, original release year, rating (MPAA or TV Parental Guidelines), duration of the title, listed_in category, and a brief description of the title. I used the description of each title to extract key content choices and the listed_in category to extract the genres.

B.1 Genre

Based on the date added, I have added the quarter to the dataset to study what types (movies/TV shows), genres and themes are being added every quarter. As of Jan 2020, the dataset shows that Netflix has a total of 6,234 titles, of which 68% (4,265) are movies and the remaining 1,969 are classified as TV shows.
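Deriving the quarter from the date added and computing the share of movies per quarter could look roughly like this. I assume here that the date is a string such as "September 9, 2019" and that the type is either "Movie" or "TV Show"; the function names and the sample rows are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def to_quarter(date_added: str) -> str:
    """Convert a date string like 'September 9, 2019' (assumed format) to '2019Q3'."""
    d = datetime.strptime(date_added.strip(), "%B %d, %Y")
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"


def movie_share_by_quarter(rows):
    """rows: iterable of (type, date_added). Returns {quarter: fraction of movies}."""
    totals, movies = Counter(), Counter()
    for title_type, date_added in rows:
        quarter = to_quarter(date_added)
        totals[quarter] += 1
        if title_type == "Movie":
            movies[quarter] += 1
    return {q: movies[q] / totals[q] for q in sorted(totals)}


# Made-up rows, for illustration only.
rows = [
    ("Movie", "January 1, 2020"),
    ("TV Show", "November 15, 2019"),
    ("Movie", "December 31, 2019"),
]
print(movie_share_by_quarter(rows))
```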

Let’s take a quick look at the split of titles added every quarter from 2016Q1 to 2020Q1* (till Jan 18, 2020). While Netflix is adding more titles every quarter, the % of movies seems to follow a cyclical pattern. (See Fig 10 below)