This long article with a lot of source code was posted by Suraj V Vidyadaran. Suraj is pursuing a Master's in Computer Science at Temple University, with a specialization in Data Science. His areas of interest are sentiment analysis, data visualization, big data, and machine learning.

The data comes from the UCI Machine Learning Repository. The purpose of the analysis is to evaluate the safety standard of the cars based on certain parameters and classify them. A detailed description of the dataset is provided in the original article and in Suraj's GitHub repository. For another article comparing different data science techniques (by a different author), read Performance From Various Predictive Models.

Clustering based on density peaks (source: click here)

This article provides source code and results for the data set in question, for the following classification techniques:

Logistic regression

Linear discriminant analysis

Mixture discriminant analysis

Quadratic Discriminant Analysis

Neural Network

Flexible Discriminant Analysis

Support Vector Machine

k-Nearest Neighbors

Naive Bayes

Classification and Regression Trees (CART)

C4.5

PART

Bagging CART

Random Forest

Gradient Boosted Machine
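To make one of the techniques above concrete, here is a minimal sketch of a categorical Naive Bayes classifier with Laplace smoothing. It is not the author's code (the original article uses R); it is written in Python with only the standard library. The rows in `TRAIN` are hypothetical toy examples shaped like the car evaluation data, not real records, and the function names `fit` and `predict` are illustrative.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy rows (NOT real records), shaped like the car evaluation data:
# (buying price, maintenance cost, persons, safety) -> acceptability class.
TRAIN = [
    (("vhigh", "vhigh", "2", "low"), "unacc"),
    (("high", "high", "2", "low"), "unacc"),
    (("vhigh", "med", "4", "low"), "unacc"),
    (("med", "med", "4", "med"), "acc"),
    (("low", "med", "4", "high"), "acc"),
    (("med", "low", "more", "high"), "acc"),
]


def fit(rows, alpha=1.0):
    """Count classes and per-class feature values; alpha is the Laplace smoothing constant."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value -> count
    values_seen = defaultdict(set)       # feature index -> distinct values in training data
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return {
        "class_counts": class_counts,
        "value_counts": value_counts,
        "values_seen": values_seen,
        "alpha": alpha,
        "n": len(rows),
    }


def predict(model, features):
    """Return the class with the highest smoothed log-posterior score."""
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, class_count in model["class_counts"].items():
        # Log prior for the class.
        score = math.log(class_count / model["n"])
        for i, value in enumerate(features):
            count = model["value_counts"][(i, label)][value]
            n_values = len(model["values_seen"][i])
            # Smoothed log likelihood of this feature value given the class.
            score += math.log((count + alpha) / (class_count + alpha * n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = fit(TRAIN)
print(predict(model, ("vhigh", "high", "2", "low")))
print(predict(model, ("low", "med", "4", "high")))
```

On real data, the same idea would be applied to the full dataset with a held-out test set, so that accuracy can be compared across the techniques listed above.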

I was surprised to see the overlap with our recent article on top 10 machine learning algorithms. You can read the full article (with voluminous source code in R) here.
