Extract Top Reddit Posts of #rstats in 3 lines of R Code using jsonlite

This post is kept (literally) minimal to demonstrate how simple this hack is in R (of course, it could be simple in other languages too). It also makes the point that R has use cases beyond statistics and data mining.

Objective The rstats subreddit is one of the popular sources of R-related information and discussion on the internet. We’re trying to extract the top posts of the rstats subreddit.

Data Format Luckily for us, Reddit offers a JSON file for every subreddit (and for every post), and we’ll use that here.
subreddit url: "https://www.reddit.com/r/rstats/"
subreddit json: "https://www.reddit.com/r/rstats/.json"

jsonlite @ Action The package that will help us in this endeavor is jsonlite (by Jeroen Ooms and Team), which parses JSON files and feeds. It has a sweet function, fromJSON(), that parses a JSON file and stores the result in a list object. Ultimately, we can find the required information in there: the title and url of each post in the subreddit.

library(jsonlite)
reddit <- fromJSON("https://www.reddit.com/r/rstats/.json")
(top10 <- reddit$data$children$data[1:10,c("title","url")])

## title
## 1 How does one fit a plm model?
## 2 Loading .arp files for analysis with diveRsity package
## 3 Finding "Optimal" Target Inventory for Parts
## 4 Can you change the limits on a scale in ggplot based on the data based to ggplot? Explanation inside
## 5 Error: Need Finite 'ylim'values
## 6 Why Machine Learning Beats Econometrics in the Real World
## 7 Help with reshape() Error
## 8 R & stats illustrations by @allison_horst
## 9 Time Series Qn
## 10 Flexdashboard runtime shiny renderPlot issue
## url
## 1 https://www.reddit.com/r/rstats/comments/cr48no/how_does_one_fit_a_plm_model/
## 2 https://www.reddit.com/r/rstats/comments/cr1064/loading_arp_files_for_analysis_with_diversity/
## 3 https://www.reddit.com/r/rstats/comments/cqxg5q/finding_optimal_target_inventory_for_parts/
## 4 https://www.reddit.com/r/rstats/comments/cqpdq2/can_you_change_the_limits_on_a_scale_in_ggplot/
## 5 https://www.reddit.com/r/rstats/comments/cqwac0/error_need_finite_ylimvalues/
## 6 https://medium.com/@adrianantico/machine-learning-vs-econometrics-in-the-real-world-4058095b1013
## 7 https://www.reddit.com/r/rstats/comments/cqq6r9/help_with_reshape_error/
## 8 https://github.com/allisonhorst/stats-illustrations
## 9 https://www.reddit.com/r/rstats/comments/cqkgcc/time_series_qn/
## 10 https://www.reddit.com/r/rstats/comments/cqd1u9/flexdashboard_runtime_shiny_renderplot_issue/
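To see why reddit$data$children$data can be indexed like a table, here is a small offline sketch. The JSON string below is a made-up stand-in that only mimics the nesting of a Reddit listing (the names toy_json, toy and toy_top, the example.com URLs and the score field are assumptions for illustration, not real Reddit data). With its default settings, fromJSON() simplifies an array of records into a data frame, and the nested "data" object of each record ends up as a data-frame column that is itself a data frame.

```r
library(jsonlite)

# Toy JSON mimicking the shape of a Reddit listing (illustrative, not real data)
toy_json <- '{
  "kind": "Listing",
  "data": {
    "children": [
      {"kind": "t3", "data": {"title": "First post",  "url": "https://example.com/1", "score": 10}},
      {"kind": "t3", "data": {"title": "Second post", "url": "https://example.com/2", "score": 7}}
    ]
  }
}'

# fromJSON() accepts a JSON string as well as a URL or file path
toy <- fromJSON(toy_json)

# "children" is an array of records, so it is simplified to a data frame;
# its "data" column is a nested data frame with one row per post
class(toy$data$children)
class(toy$data$children$data)

# The same row/column indexing used in the post works on the toy data
toy_top <- toy$data$children$data[1:2, c("title", "url")]
toy_top
```

The real feed carries many more columns per post, so names(reddit$data$children$data) is a quick way to see what else could be pulled out.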

3-lines Load the library

Retrieve and Parse the json file

Extract the relevant information from the list object
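The three steps above can also be wrapped into a small reusable function. This is only a sketch: extract_top() and top_posts() are hypothetical helper names, and the wrapper assumes that other subreddits expose the same /.json URL pattern and the same data$children$data layout as rstats.

```r
library(jsonlite)

# Hypothetical helper: take the first n rows of the chosen columns
# from an already-parsed listing
extract_top <- function(listing, n = 10, cols = c("title", "url")) {
  posts <- listing$data$children$data
  # Fail early if the parsed object does not have the expected shape
  stopifnot(is.data.frame(posts), all(cols %in% names(posts)))
  posts[seq_len(min(n, nrow(posts))), cols, drop = FALSE]
}

# Hypothetical wrapper: build the .json URL, then fetch, parse and extract
top_posts <- function(subreddit = "rstats", n = 10) {
  url <- paste0("https://www.reddit.com/r/", subreddit, "/.json")
  extract_top(fromJSON(url), n = n)
}

# Usage (needs network access):
# top_posts("rstats", n = 5)
```

Keeping the extraction separate from the download means the indexing logic can be tried on any parsed listing without hitting Reddit each time.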

Summary While this post is primarily intended to demonstrate the simplicity of R and jsonlite for JSON parsing, the same script can also be automated to email or send a notification about the top 10 rstats subreddit posts at a scheduled interval.
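As a rough sketch of the first step toward that kind of automation, the extracted table can be turned into a plain-text message body. format_digest() is a hypothetical helper, the demo rows are made up to stand in for the top10 object, and the file name in the last comment is an assumption. Actually sending the message and scheduling the script (for example with a system scheduler such as cron calling Rscript) is left out here.

```r
# Hypothetical helper: turn a title/url data frame into a plain-text digest
format_digest <- function(posts) {
  # One numbered entry per post, with the URL indented on the next line
  lines <- sprintf("%d. %s\n   %s", seq_len(nrow(posts)), posts$title, posts$url)
  paste(c("Top r/rstats posts", lines), collapse = "\n")
}

# Made-up rows standing in for the top10 object from the post
demo <- data.frame(
  title = c("First post", "Second post"),
  url = c("https://example.com/1", "https://example.com/2"),
  stringsAsFactors = FALSE
)

digest <- format_digest(demo)
cat(digest, "\n")

# The text could then be written out for a mailer or notifier to pick up:
# writeLines(digest, "rstats_digest.txt")
```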
