— Tools and Techniques —

Some Important Streaming Algorithms You Should Know About
Great overview of key streaming algorithms and how to work with them. This article was adapted from a talk given by Ted Dunning, the Chief Applications Architect for MapR.

Density-Based Clustering
Compared to centroid-based clustering like K-Means, density-based clustering works by identifying "dense" clusters of points, allowing it to learn clusters of arbitrary shape and identify outliers in the data. This is a good overview of popular density-based clustering algorithms with lots of sample code and screenshots.

Querying Craigslist with Python
If you've ever comparison shopped on Craigslist, you'll appreciate this. This tutorial shows how to build a Craigslist scraper and analyze the results. Web scraping isn't the most robust approach for finding things on Craigslist, but it's a nice tutorial and should work for personal use.

Deep Learning Libraries by Language
Overview of deep learning libraries for a wide range of languages including R, Python, Julia, Lisp, .NET, JavaScript, MATLAB, and Java.
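For a feel of the density-based clustering idea mentioned above, here is a minimal pure-Python sketch in the style of DBSCAN, one well-known density-based algorithm. It is not taken from the linked overview: the function names, the sample points, and the `eps` and `min_pts` values are all made up for illustration, and the brute-force neighbor search would be too slow for real data.

```python
import math

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        # points[i] is a "core" point: start a new cluster and grow it.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


# Hypothetical sample data: two tight squares plus one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

With these made-up points and parameters, each tight square ends up as its own cluster and the isolated point is labeled as noise, which is the outlier-detection behavior the blurb refers to. Unlike K-Means, nothing here asks for the number of clusters up front.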

— Resources —

Beautiful Code for Python
There was a great thread on Reddit this week about "beautiful code collections for Python." Here are the highlights, all free:

Python Cookbook

Awesome Python

Python 3 Patterns, Recipes and Idioms

Data Science MOOCs
It's MOOC season! Here's a nice overview of the best courses related to data science, including prerequisites, dates, and expected workloads. It's also worth checking out these new specializations from Coursera:

The Executive Data Science specialization focuses on leading a data science team.

The Machine Learning specialization is for mastering machine learning fundamentals.

The Data Science at Scale specialization offers hands-on experience with scalable SQL and NoSQL data management, data mining, and machine learning.

— Data Viz —

Podcast.__init__: Episode 22 - Bryan Van de Ven on Bokeh
Great interview with Bryan Van de Ven, the project maintainer for Bokeh. In this podcast, Bryan talks about Bokeh's history, some interesting use cases for it, what its near future looks like, and how it compares to other visualization libraries.

Mike Bostock - Ask Me Anything
Mike Bostock, the creator of the JavaScript data visualization library D3, did a Reddit AMA this week. Talk about popular! This AMA has nearly 3000 upvotes! For anyone working with D3, there are LOTS of useful insights and links here.