12/02/2014

This blog post is authored by Shahrokh Mortazavi, Partner Director of Program Management on the Microsoft Azure Machine Learning team.

Two languages are closely associated with Data Science today – R and Python. In Azure ML we’ve supported R for some time – and very soon we’ll add full Python support as well. This includes a world-class Python experience in Visual Studio, in Azure ML Studio and in the browser via Jupyter/IPython. As a first step, we’re excited to announce that the Python Tools for Visual Studio (PTVS) team has added features to integrate with Azure Machine Learning APIs hosted in the cloud.

I’m also happy to announce that PTVS 2.1 RTW was recently released and is available from CodePlex. Note that this is an officially supported OSS plug-in. When installed into the Community edition of Visual Studio (free, available here), you’ll have a powerful Python-centric Data Science IDE that is completely free. We believe powerful open source tools such as PTVS will greatly empower developers and help democratize frontier technologies such as machine learning and advanced analytics.

PTVS 2.1: A Quick Overview

Python Tools for Visual Studio offers an IDE experience for general scripting, web programming and Data Science. With integrated IPython REPL support for smart history, shell commands and inline images, PTVS provides a great exploratory coding environment. With unique features such as mixed mode debugging of Python with C++ and remote debugging of Linux servers in Azure, Visual Studio provides a productive development environment for Python developers:

1: Multi-lingual projects; 2: Editor with deep code intelligence; 3: VS debugger; 4: Integrated IPython REPL; 5: VS/Excel live bridge

For a quick walkthrough of PTVS 2.1 features, take a look at this video on YouTube.

PTVS “ML Pack” and Azure ML Web Service consumption

While the focus for the 2.1 release of PTVS was Web frameworks, the team has already created a “Machine Learning Pack” which can be downloaded from CodePlex to give you a taste of ML and Azure ML web services. The ML Pack has three starter templates that include everything you need, from data acquisition, cleaning and training all the way to visualization using matplotlib.

Simply select your template and hit F5 to get a sense of a typical ML workflow. Then browse through the code and customize it as you like for your particular scenario. As with everything else in PTVS, the code is open source (Apache 2.0), so feel free to send us your feedback and PRs.
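The templates themselves are not reproduced in this post. As a rough illustration of the stages they cover (acquisition, cleaning, training and evaluation), here is a minimal sketch that uses only the Python standard library. The synthetic data, the function names and the choice of a one-variable least-squares fit are all assumptions made for illustration; the actual templates would more likely lean on libraries such as scikit-learn and matplotlib, and plotting is left out here.

```python
import random
import statistics


def acquire(n=200, seed=0):
    """Hypothetical data source: noisy samples of y = 3x + 2, some targets missing."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = 3 * x + 2 + rng.gauss(0, 0.5)
        # Simulate roughly 10% missing readings.
        rows.append((x, None if rng.random() < 0.1 else y))
    return rows


def clean(rows):
    """Drop rows whose target value is missing."""
    return [(x, y) for x, y in rows if y is not None]


def train(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = statistics.mean(x for x, _ in rows)
    mean_y = statistics.mean(y for _, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(model, rows):
    """Root-mean-square error of the fitted line on the given rows."""
    slope, intercept = model
    squared = [(slope * x + intercept - y) ** 2 for x, y in rows]
    return (sum(squared) / len(squared)) ** 0.5


rows = clean(acquire())
split = int(0.8 * len(rows))          # simple 80/20 train/test split
train_rows, test_rows = rows[:split], rows[split:]
model = train(train_rows)
error = rmse(model, test_rows)
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} test RMSE={error:.2f}")
```

Swapping `acquire` for a real data source and `train` for a library estimator keeps the same overall shape, which is the point of the sketch.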

Azure ML Studio is a powerful, easy-to-use canvas that enables rapid composition of ML experiments along with 1-click operationalization. PTVS has full support for quickly building web apps and dashboards using frameworks such as Django, Flask and Bottle. The ML Pack now brings the two together via a wizard that enables easy consumption of published predictive APIs in your web app.

Simply fill out the form after you’ve published, and PTVS will generate a skeleton dashboard that you can deploy to Azure Web Sites.
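The code the wizard generates is not shown here, but consuming a published predictive API generally comes down to an authenticated HTTP POST with a JSON body. The sketch below shows that pattern with the standard library only. The endpoint URL, API key, column names and the exact payload layout are placeholders and assumptions; take the real values and request format from the help page of your own published service.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) a POST request for a published scoring endpoint.

    The payload layout is an assumption for illustration; check the
    documentation of your own service for the format it expects.
    """
    payload = {
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": rows}},
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def score(request, timeout=30):
    """Send the request and decode the JSON response (needs a live endpoint)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder values; nothing is sent over the network here.
req = build_scoring_request(
    "https://example.invalid/score",
    "PLACEHOLDER_KEY",
    ["age", "income"],
    [[42, 55000]],
)
print(req.get_method(), req.full_url)
```

Keeping request construction separate from the network call makes the payload easy to inspect and test before pointing `score` at a real service.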

IPython/Jupyter

Azure ML Studio provides a convenient drag-and-drop model for quickly building ML workflows and operationalizing them. PTVS provides a desktop Data Science workbench with excellent support for large projects, debugging, profiling, IntelliSense, Git, etc. The last piece that’s missing from this picture is IPython (now the polyglot “Jupyter”), which is a browser-based “notebook” REPL. Azure ML will be adding this third canvas in the near future, enabling a fully cloud-hosted, cross-platform, browser-based experience for data science. You’ll be able to use Jupyter on Azure ML with both Python and R. Each of these authoring environments has its own center of gravity. Our plan is to provide an integrated experience where you can use the right tool at the right time for your project.

Conclusion

Python, with its ecosystem of rich libraries, is a perfect fit for Data Science. You can pair PTVS with a scientific distro such as Anaconda or Canopy today, use scikit-learn, Pandas, matplotlib, etc. for analytics and Data Science work, and deploy to a VM or Cloud Service in Azure. In the near future we plan to bring you a fully integrated Visual Studio, Jupyter and Azure ML Studio experience to maximize your productivity as developers and Data Scientists. Stay tuned!

Shahrokh