Why I push for Python

My colleagues are puzzled by my relentless push for Python as the language to teach programming to our undergraduates. They look at me funny each time the subject comes up and I can't help vehemently insisting, "Python!"

It's natural to be skeptical of someone championing a programming language; we've all seen the language wars rage foolishly in other contexts—surely it's just a matter of preference? And many of my colleagues prefer Matlab, although some others insist that our undergraduates need to learn C. Does it matter?

Let me articulate my reasons to continue to advocate for Python. First, let's agree on the context of this discussion: I'm talking about a programming language to teach undergraduate students of engineering the computational skills that will make them more successful, both as students and as future STEM professionals.

Most entering freshmen have no programming experience whatsoever. Unfortunately, computer science is all but absent from the school curriculum: in 2011, only 5% of high schools in the US were certified to teach the computer science AP course, and only 0.6% of all AP tests taken were in CS [1].

I'm talking about teaching complete novices their first baby steps of programming, and rather swiftly bringing them to a level of skill where they can use computing in their other courses: to analyze data for those lab reports, to learn linear algebra, to solve problems in particle mechanics, problems involving differential equations, maybe even to write a program to control a robot. And by the way, these students are busy and have a social life and cannot abide courses that seem like a waste of time. Let's also make it fun, then, can we?

A programming language for beginners

What is a programming language? All computers really understand is machine language: instructions operating on binary bits. Programming languages are made for humans. Their goal is to allow humans to express what they want the computer to execute for them, in a way that other humans can understand.

Compare these two programs … which is more human-readable?

First, in C++:

#include <iostream.h>

void main()
{
    cout << "Hello world" << endl;
}



Now, in Python (v.2.7):

print "Hello world"

This simplest of code examples is used in the book "How to think like a computer scientist" [2] to explain why Python is good for teaching beginners. In the C++ version, there are too many elements ( #include, void, main ) that will be confusing for students. Explaining them takes time, can be intimidating for beginners, and serves no purpose in helping students get computing. Jeff Elkner explains in the preface to "How to think…" that the C++ version of the book had 13 paragraphs explaining the "Hello world" program, while the Python version had only two. The 11-paragraph difference dealt not with programming concepts, but with details of C++ syntax. Beginner programmers will just get frustrated by obscure syntax.

Quoting Elkner again…

Using Python has improved the effectiveness of our computer science program for all students … More students leave the course with the ability to create meaningful programs and with the positive attitude toward the experience of programming that this engenders.

From anecdote to data

Analyzing 30 programs written in Java and 30 written in Python by novice programmers (in Finland, aged 16–19), Mannila et al. (2006) studied the errors found in them to identify those that could be attributed to the language. The students in both groups had the same teacher and studied the same contents in the same environment; only the language changed.

The study categorized errors as relating to understanding (logic) or arising from features of the language (syntax). Four criteria were also applied to the programs as a whole: execution, satisfying specs, error handling, and structure.

Of all the syntax errors found, only two appeared in Python programs while 19 were found in the Java programs (missing brackets or semicolons, uninitialized variables, etc.). And the logic errors were also significantly fewer in the Python programs, compared to Java (17 to 40). Also, more Python programs ran correctly and satisfied the specifications, and more included error checking/handling, compared to the Java programs.

A second part of the Mannila et al. study looked at how the students who learned with Python transitioned to a second language in a later course. Critics of Python as a first language often claim that, because it is too simple, students run into problems when they later have to use a more advanced language. But after both analyzing the programs and interviewing the students, Mannila et al. concluded that students experienced no problems in the transition. (In particular, they had no problem adapting to static typing after having learned to code in Python.)

In conclusion, this study showed that students make fewer syntax and logic errors when learning in Python (compared to Java), and it found no pitfalls when they transitioned to a second language. Python makes it easier to focus on giving students a solid grounding in computational thinking.

Bonus reasons for Python

Watch the first 5 minutes of this hands-on introduction to Python for beginners, by Jessica McKellar (Director of the Python Software Foundation), to hear several reasons for learning Python:

Quoting Jessica …

Python is a versatile language: you can analyze data, build websites (*), maintain servers, make art or music.

Employers love Python: people will want to hire you.

Python is a great teaching language … a lot of educational institutions are switching to Python, e.g., MIT.

It reads very much like English (it has low syntactic overhead).

It is very easy to get useful work done quickly in Python.

You can do data analysis and graphing with Matplotlib (even 3D animations).

You can write games in Python (using PyGame).

(*) Websites are built with a Python framework called Django. Examples: Instagram, Firefox, Pinterest, even YouTube!

Continue watching the video from about 8:20 to learn a bit of Python right now! You can type along with Jessica and try the code right in your browser by going to PythonAnywhere.com

Why is it so hard to learn to program?

The well-known computer-science educator Mark Guzdial addressed just this question (Guzdial, 2010). There are many reports of high failure rates in introductory programming courses (worldwide). Why do students find it so hard?

Studying how people use a natural language to describe a task to another human gives clues. In such descriptions, people don’t define iterations, they instead put into words set operations; they are not explicit about loops terminating; people use constraints, event-driven tasks and imperative programming, but they never talk about objects. And when these natural-language instructions are given to other participants, they have no problem following them. Processing a set of data until it's finished is natural, but incrementing an index is not.

How is this related to Python? It so happens that the language's core looping idioms can often replace index manipulation, making the code read more like plain English. The following examples were given by Raymond Hettinger (core Python developer) in a keynote in 2013. To get the squares of the numbers from 0 to 5, you might write Python code like this:

for i in [0, 1, 2, 3, 4, 5]:
    print i**2

But the convenient function range() makes it easy to iterate over longer lists.

for i in range(6):
    print i**2

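In Python 2.7, range() builds the whole list in memory, which is wasteful if the list is very big. A better way there is the xrange() function, which produces the numbers one at a time as the loop asks for them. (In Python 3, range() itself behaves this way and xrange() no longer exists.)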

for i in xrange(6):
    print i**2
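To make the memory point concrete, here is a minimal sketch. It is written for Python 3, where range() is already lazy; in Python 2.7 the lazy version would be xrange(). The names lazy_numbers and full_list are made up for this illustration, and the byte counts reported by sys.getsizeof() vary between interpreters, so treat the numbers as indicative only.

```python
import sys

# In Python 3, range() hands out its values on demand
# (the same idea as xrange() in Python 2.7).
lazy_numbers = range(100000)

# Forcing the values into a list stores all of them at once.
full_list = list(lazy_numbers)

lazy_size = sys.getsizeof(lazy_numbers)
list_size = sys.getsizeof(full_list)

print("range object: %d bytes" % lazy_size)
print("full list:    %d bytes" % list_size)

# The loop itself looks the same whichever one you iterate over.
total = 0
for i in lazy_numbers:
    total += i**2
```

The range object stays small no matter how many numbers it stands for, while the list grows with its length.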

Now, suppose you want to print the colors in a list like this:

colors = ['red', 'green', 'blue', 'yellow']

You might write this to loop over all the colors:

for i in range(len(colors)):
    print colors[i]

But Python lets you do this instead, which looks more natural and, as Raymond says, more beautiful:

for color in colors:
    print color
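The same style carries over to two other common cases: going through a list backwards, and needing the position of each item as well as the item itself. The sketch below uses the built-in functions reversed() and enumerate() for this. The variable names backwards and labeled are invented for the example, and print() is called with a single argument so that the code runs under both Python 2.7 and Python 3.

```python
colors = ['red', 'green', 'blue', 'yellow']

# Loop backwards without computing indices by hand.
backwards = []
for color in reversed(colors):
    backwards.append(color)

# When the position is needed too, enumerate() supplies it with each item.
labeled = []
for i, color in enumerate(colors):
    labeled.append('%d --> %s' % (i, color))

print(backwards)
print(labeled)
```

In neither loop does the programmer write colors[i] or work out where the list ends.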

In summary, Python is a lot more like English than other programming languages, and reduces the cognitive load in learning to think computationally.

What about Matlab?

I used Matlab for years and I understand why many of my colleagues use it heavily. For many of them, it's awfully hard to imagine their workflow without Matlab. But there are reasons to at least think about switching, and I will let others speak for me here.

Luis Pedro Coelho is a computational biologist at EMBL. In "Why Python is Better than Matlab for Scientific Software," (Oct.'13) he offers these reasons:

Python has caught up with Matlab and is in the process of overtaking it.

Python is a real programming language

Python can easily interface with other languages

With Python, you can have a full open-source stack

Matlab Licensing issues are a pain. And expensive.

In "Why use Python for scientific computing" (July'13), Cyrille Rossant, a neuroscience researcher at University College London, offers similar reasons:

Python is free and open source, whereas Matlab is a closed-source commercial product.

The Python language is just far better than Matlab’s awkward language.

Python integrates better with other languages (e.g. C/C++).

Python includes natively an impressive number of general-purpose or more specialized libraries, and yet more external libraries are being developed by Python enthusiasts.

And, of course, nearly anything that is possible in Matlab is possible in Python, whereas the converse is not true.

Almar Klein, a developer and scientist in the Netherlands, has more concrete objections to Matlab in the long "Python vs. Matlab" essay:

"… the most fundamental problem with Matlab is its commercial nature":

Matlab is expensive.

The algorithms are proprietary

It makes portability more difficult.

He also cites these other issues:

Matlab has bad string manipulation

Indexing: Python indexing goes as it does in C … starting from 0

Python indexing is done using brackets, so you can see the difference between an indexing operation and a function call.
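The string-handling and indexing points can be seen in a few lines of Python. This is only an illustrative sketch: the helper first() and the file names are invented for the example, and it does not come from Klein's essay.

```python
def first(items):
    """Return the first element of a sequence (hypothetical helper)."""
    return items[0]  # square brackets mean indexing, and counting starts at 0

# Made-up file names, used only for illustration.
samples = ['run_a.csv', 'run_b.csv', 'run_c.csv']

name = first(samples)         # parentheses mean a function call
stem = name.split('.')[0]     # split the string at the dot, keep the first piece
label = stem.replace('_', ' ').upper()
summary = ', '.join(samples)  # glue the names together with commas

print(name)
print(label)
print(summary)
```

In Matlab, an expression like x(1) may be an index into x or a call to a function named x; here the brackets and parentheses tell the two apart at a glance.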

A trio of additional issues brought up by Hoyt Koepke of the University of Washington can be extracted from his "10 Reasons Python Rocks for Research," as follows:

becoming better at MATLAB leads to skill at quick-and-dirty scripting, but becoming better at Python leads to genuine programming skill.

when calling functions, Python allows named arguments – this universally promotes clarity and reduces stupid bookkeeping bugs, particularly with functions requiring more than one or two arguments;

with MATLAB, globally available functions are put in separate files, which discourages the use of smaller functions and, in practice, often promotes cut-and-paste programming, the bane of debugging.
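The point about named arguments is easy to show. In the sketch below, fall_distance() is a hypothetical function written for this illustration (it is not taken from Koepke's article), with a default value assumed for gravity.

```python
def fall_distance(time, gravity=9.81, initial_velocity=0.0):
    """Distance fallen after `time` seconds (hypothetical example function)."""
    return initial_velocity * time + 0.5 * gravity * time**2

# Positional call: the reader has to remember what each number stands for.
d1 = fall_distance(2.0, 9.81, 1.0)

# Named arguments: each value is labeled, and the order no longer matters.
d2 = fall_distance(time=2.0, initial_velocity=1.0, gravity=9.81)

print(d1 == d2)
```

Both calls compute the same thing, but in the second one a swapped pair of values would be visible on the page instead of hiding in the argument order.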

And there's more, if you still need more convincing:

References