What Time Should You Post to Reddit?

By Max Candocia

UPDATE: A more recent and thorough analysis can be found here.

When posting anything on social media, whether a news article, a picture of yourself, or a funny image (or a combination thereof), you usually want to reach the largest audience. On Reddit, I have noticed that a post's success depends heavily on the time of day and day of the week it is submitted. A few other factors matter too, such as whether the post is an image, an article, or a text-only submission.

I used a Python scraper I built to collect data on posts from the particular subreddits I wanted to analyze. Among the data collected, I have:

The score of a post (roughly upvotes - downvotes)

The subreddit it was posted in

The time at which a post was made

The domain of the post's link

Using this information, I can formulate a model that describes what attributes affect the score. Specifically, I am looking for a percent change in the score with respect to values such as time of day, day of week, whether a post is an image post, etc. In my case, this can be approximated with this formula:

sign(score) * log(abs(score) + 1) = time_of_day_and_day_of_week + is_image_post + is_text_post + length_of_submission_title + subreddit

I log-transform the score on the left side. Doing so ensures that the terms on the right side have a multiplicative effect on the score, as opposed to additive. The right side treats the time of day + day of week, the post being an image post, and its other attributes as independent factors that each scale the score by some value; i.e., I am controlling for other effects.
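To make the "multiplicative effect" concrete, here is a minimal Python sketch (not the R code used for the analysis) of how a coefficient on the log scale turns into a percent change, and how two such effects combine. The function names are hypothetical, the 74% figure is borrowed from the Sunday 6-8 am result reported in this post, and the 10% image effect is made up purely for illustration.

```python
import math


def percent_change(coefficient: float) -> float:
    """Convert a log-scale coefficient into a fractional change in (score + 1)."""
    return math.exp(coefficient) - 1.0


def combined_change(*coefficients: float) -> float:
    """Coefficients add on the log scale, so their effects multiply on the score scale."""
    return math.exp(sum(coefficients)) - 1.0


# Illustrative coefficients only; neither is read from the fitted model.
time_effect = math.log(1.74)   # a coefficient equivalent to a 74% increase
image_effect = math.log(1.10)  # hypothetical 10% increase for image posts

print(round(percent_change(time_effect), 2))
print(round(combined_change(time_effect, image_effect), 3))
```

Under these assumed numbers the two effects combine to 1.74 * 1.10 = 1.914 times the reference, i.e. roughly a 91% increase rather than the 84% you would get by adding the percentages.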

Below is a graph that estimates the effect of the time of day and day of week on post scores across six subreddits, which I sampled and pooled together. I use Monday from 8 to 10 am US Central Time as the reference, so each percentage is the percent change in score you can expect from posting at the given time rather than on Monday from 8 to 10 am.
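The time slots in the graph are two-hour buckets per day of the week. As a rough Python sketch of that bucketing (mirroring the R code at the end of this post, not replacing it), the snippet below assumes that `dow` is numbered with 0 = Sunday and that `hour` is the hour of day (0-23) already expressed in US Central Time. The names `time_bucket` and `REFERENCE` are hypothetical, and the labels follow the R code with spacing normalized.

```python
DAYS = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']

HOUR_LABELS = ['12-2 am', '2-4 am', '4-6 am', '6-8 am', '8-10 am', '10 am - 12 pm',
               '12-2 pm', '2-4 pm', '4-6 pm', '6-8 pm', '8-10 pm', '10 pm - 12 am']


def time_bucket(dow: int, hour: int) -> str:
    """Label a post with its day of week and two-hour slot.

    Assumes dow is 0 (Sunday) through 6 (Saturday) and hour is 0 through 23.
    """
    if not 0 <= dow <= 6:
        raise ValueError('dow must be between 0 and 6')
    if not 0 <= hour <= 23:
        raise ValueError('hour must be between 0 and 23')
    # Integer division by 2 maps hours 0-1 to slot 0, 2-3 to slot 1, and so on.
    return f'{DAYS[dow]} {HOUR_LABELS[hour // 2]}'


# The reference slot that every other slot is compared against.
REFERENCE = time_bucket(1, 8)

print(REFERENCE)
print(time_bucket(0, 7))
```

Every post in the same bucket shares one coefficient in the model, which is why the graph has 7 x 12 = 84 tiles, with the reference tile fixed at 0%.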

Monday morning is a relatively good time to post in these subreddits, especially from 6-8 am. Sunday is even better during that time frame, with an expected score that is 74% higher than our reference, Monday from 8 to 10 am. Saturday, however, seems fairly strong most of the day.

Because the above image is based on a relatively small amount of data, it helps to compare it to a different data set. Below I sampled default subreddits, as well as thread commenters' comment histories, so that this model generalizes better to Reddit as a whole.

This tells a similar story, except the tiles change a lot more smoothly. You could repeat the process, but the general takeaway is that the best time to post on Reddit is on Sunday, Monday, or Saturday from 6 to 8 am US Central Time. The next best times would be within 2 hours of that time range on those same days, or during that same time range on other days.

Additional Notes

Technically, the transformation I made to the score adds 1 to the score before calculating the percent change, and a negative score is treated as having 1/(1+abs(score)) points, a fraction that shrinks toward zero as the score becomes more negative.
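The note above can be checked with a few lines of Python. This is only a sketch of the transformation described in this post; the function names `signed_log` and `effective_points` are hypothetical.

```python
import math


def signed_log(score: int) -> float:
    """sign(score) * log(1 + abs(score)), the transformed score used in the model."""
    sign = (score > 0) - (score < 0)
    return sign * math.log(1 + abs(score))


def effective_points(score: int) -> float:
    """Undo the log to see what the model treats a score as on a multiplicative scale."""
    return math.exp(signed_log(score))


for score in (9, 0, -1, -9):
    print(score, round(effective_points(score), 3))
```

Non-negative scores come back as score + 1, while negative scores come back as 1/(1+abs(score)), so a score of -9 counts as about a tenth of a point.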

Code and Data

Below is the R code I used to generate the images. You can download the data here: constrasts_threadmode.csv.

library(plyr)
library(dplyr)
library(htmlTable)
library(ggplot2)
library(scales)

setwd('/mydirectory/reddit_posting')

#makes filenames possible/better
subslash <- function(x){
  x = (gsub(' ', '-', x))
  return(gsub('/', '-', x))
}

create_threads_plot <- function(threads, tname = 'none', subtitle_size = 18){
  #group times to increase significance of data
  threads$hour_ = cut(threads$hour, seq(0, 24, 2), include.lowest = TRUE, right = FALSE)
  source_hour_ = levels(threads$hour_)
  target_hour_ = c('12-2 am', '2-4 am', '4-6 am', '6-8 am', '8-10 am', '10 am - 12 pm',
                   '12- 2 pm', '2- 4 pm', '4-6 pm', '6-8 pm', '8-10 pm', '10 pm - 12 am')
  threads$hour_ = mapvalues(threads$hour_, from = source_hour_, to = target_hour_)
  threads$titlelen = nchar(as.character(threads$title)) / 100
  threads$logscore = sign(threads$score) * log(1 + abs(threads$score))
  threads$is_self = with(threads, ifelse(is_self == 't', 'Self Post', 'Link Post'))
  daysofweek = c('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday')
  threads$dow = factor(daysofweek[threads$dow + 1], levels = daysofweek)
  weekday_hour_grid = expand.grid(target_hour_, daysofweek)
  #make sure order is right
  weekday_hour_levels = paste(weekday_hour_grid[, 2], weekday_hour_grid[, 1])
  #for a better reference, ref=Monday 8-10 am
  weekday_hour_levels_ = c(weekday_hour_levels[17], weekday_hour_levels[-17])
  threads$weekday_hour = factor(paste(threads$dow, threads$hour_), levels = weekday_hour_levels_)

  #domain vars
  threads$image_submission = factor(
    c('Image Submission', 'Non-Image Submission')[
      2 - threads$domain %in% c('imgur.com', 'i.imgur.com', 'i.reddit.com')])
  threads$image_submission = relevel(threads$image_submission, ref = 'Non-Image Submission')

  #remove moderator posts, which will most likely be very high
  threads = threads %>% filter(is_distinguished == 'f', is_stickied == 'f')
  n_data_points = nrow(threads)

  #run linear model and extract coefficients
  model = lm(logscore ~ weekday_hour + titlelen + is_self + image_submission + subreddit,
             data = threads)
  model_summary = summary(model)
  coefs = model_summary$coefficients
  #round sig figs
  for (i in 1:4) coefs[, i] = signif(coefs[, i], 4)

  #used to produce HTML output of the model summary for display on web
  sink(subslash(paste0('reddit_thread_summary_table_', tname, '.html')))
  print(htmlTable(coefs))
  sink()

  #now format matrix to show results
  coefmat = as.data.frame(cbind(varname = rownames(coefs), coefs))[, 1:2]
  coefmat = coefmat %>% filter(grepl('weekday_hour.*', varname))
  coefmat = rbind(data.frame(varname = 'weekday_hourMonday 8-10 am', Estimate = 0), coefmat)
  coefmat$dow = factor(gsub('.*hour', '', gsub(' .*', '', coefmat$varname)),
                       levels = daysofweek)
  coefmat$hour = factor(gsub('^[^0-9-]*? ', '', coefmat$varname),
                        levels = rev(target_hour_))
  coefmat$`Percent Change` = (exp(as.numeric(coefmat$Estimate)) - 1)

  #save plot to png
  png(subslash(paste0('expected_reddit_score_', tname, '.png')), height = 720, width = 920)
  print(
    ggplot(coefmat, aes(x = hour, y = dow, fill = `Percent Change`)) +
      geom_tile() +
      xlab('') + ylab('') + #axes are self-explanatory with title
      ggtitle('Percent Change in Expected Reddit Submission Score Based on Time Posted',
              subtitle = paste('compared to Monday from 8 - 10 am & using',
                               comma(n_data_points), tname, 'submissions')) +
      theme_bw() +
      theme(plot.title = element_text(hjust = 0.5, size = 24),
            plot.subtitle = element_text(hjust = 0.5, size = subtitle_size),
            axis.text.x = element_text(size = 18, angle = 0, vjust = 0.8),
            axis.text.y = element_text(size = 18)) +
      scale_fill_gradient2(labels = scales::percent) +
      geom_text(aes(label = scales::percent(`Percent Change`)), size = 6) +
      coord_flip()
  )
  dev.off()
}

#load file and create a plot + html table for each
threads = read.csv('/mydirectory/contrasts_threadmode.csv')
create_threads_plot(threads,
                    'nintendo/boardgames/rap/classicalmusic/democrats/conservative',
                    subtitle_size = 12)