Common pitfalls to avoid when writing Python software

Photo by Marius Masalar https://unsplash.com/collections/225/into-the-wild?photo=LN_gdbQtzvk

Some time ago Ilya Etingof posted a really good article about dangerous “pitfalls” found in Python. Every language has its own pitfalls (or “wats!?”) and it’s important to know about them in order to write robust and secure software. When overlooked, these pitfalls can introduce obscure bugs and even security vulnerabilities into your system. Our first intention was to share the article with our students, but we quickly realized that, even though it describes these common pitfalls, it doesn’t do much to explain how to avoid them. So we’ve taken Ilya’s excellent article and expanded it with our own suggestions and recommendations. We’re also planning to create one of our free special classes based on them, so stay tuned.

Input function

Whether this pitfall affects you depends on the Python version you’re using. If you’re using Python 2, you’re in danger. The default behavior of input in Python 2 is terrible, as it evaluates whatever the user types as Python code 😰. The way to avoid being bitten is to use raw_input instead, which returns the typed text as a plain string. This issue was fixed in Python 3: its input function behaves like Python 2’s raw_input. But that means raw_input doesn’t exist in Python 3 anymore. So, how do you write code that runs on both Python 2 and 3 and still reads input safely? This is one approach:
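A minimal sketch of such an approach, assuming you are free to pick a new name for the safe function (safe_input and ask_name are hypothetical names, not part of any library):

```python
# Pick the safe reader once, at import time.
try:
    safe_input = raw_input  # Python 2: raw_input returns the text unevaluated
except NameError:
    safe_input = input  # Python 3: raw_input is gone and input is already safe


def ask_name():
    """Prompt the user; the reply always comes back as a plain string."""
    return safe_input("What is your name? ")
```

Binding the choice to a new name, instead of overwriting input, keeps it obvious at every call site that the safe variant is being used.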

But an even better approach would be to use the input function from the six Python 2 and 3 compatibility package (available as six.moves.input):
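A short sketch, assuming the third-party six package is installed (for example with pip install six); ask_name is a hypothetical helper used only for illustration:

```python
# six.moves.input resolves to raw_input on Python 2 and to input on Python 3.
from six.moves import input


def ask_name():
    """Prompt the user; the reply is returned as text on both Python versions."""
    return input("What is your name? ")
```

The import deliberately shadows the built-in input, so the rest of the module can call input without caring which interpreter is running.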

Assert statement

This is not a technical pitfall but a concept to understand. I think Ilya makes it pretty clear in his post: you shouldn’t rely on assert to implement logic in your code, because assertions can be disabled (for example, by running Python with the -O option). The assert statement is commonly used in two scenarios: Design by Contract and testing. When writing software based on Design by Contract it’s pretty common to specify “preconditions” and “postconditions” that your software needs to fulfill. That’s a great use case for assertions. For example:
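A sketch of what such a function might look like. Only the great_discount_sent attribute comes from the discussion here; the User class, the is_active flag, the send_great_discount function and the sent_emails list are hypothetical stand-ins:

```python
class User:
    """Hypothetical user model, just enough for the example."""

    def __init__(self, email, is_active=True):
        self.email = email
        self.is_active = is_active
        self.great_discount_sent = False


def send_great_discount(user, sent_emails):
    """Record a discount email for the user at most once."""
    # Precondition: callers must only pass active users.
    assert user.is_active, "precondition failed: user must be active"

    # Business logic: a regular if statement, never an assertion.
    if user.great_discount_sent:
        return False

    sent_emails.append(user.email)  # stand-in for really sending the email
    user.great_discount_sent = True

    # Postcondition: the discount is now recorded as sent.
    assert user.email in sent_emails, "postcondition failed: email not recorded"
    return True
```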

If you pay attention, you’ll see that the if user.great_discount_sent part is not implemented with an assertion, as it’s part of the business logic of this function. If we relied on assertions for it and they were disabled, our function would basically be broken. On the other hand, the preconditions and postconditions are implemented with assertions, as they’re not part of the business logic. If you ever disable assertions, you’ll lose the pre/post condition checks, but the function will still work as expected (assuming the rest of the code works correctly!).

The second use case for assertions is testing. For example, the popular Python testing library pytest relies on plain assert statements to write simple, function-based tests:
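A small sketch of that style, assuming a hypothetical apply_discount function as the code under test. pytest collects functions whose names start with test_ and reports any failing assert:

```python
def apply_discount(price, percent):
    """Hypothetical function under test: reduce price by a percentage."""
    return price * (100 - percent) / 100


def test_apply_discount_reduces_price():
    assert apply_discount(200, 25) == 150


def test_apply_discount_with_zero_percent_keeps_price():
    assert apply_discount(80, 0) == 80
```

Saved in a file such as test_discounts.py (a hypothetical name), these tests run with the pytest command.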

Reusable integers

This point is more about the is operator than about integers themselves. As a general rule we recommend our students not to use the is operator (at least until they completely understand how it works). To properly understand it, we need to note that in Python everything is an object: strings, integers, booleans; even classes and modules are objects. When you write “objectA is objectB” you’re checking whether objectA is actually the same object as objectB. Objects are independent units of information living somewhere in memory. When an object is created, the Python interpreter assigns it an ID that is unique for as long as the object exists. You might have two objects that “look similar” but are actually different objects with different IDs. You can use the id function to get the ID of an object. When we say “look similar”, we mean that those two objects, despite being different and independent objects, might still be “equal”. In this context, “equal” means different things depending on the type of object; equality has higher-level semantics than identity. This is why we prefer to use the “==” (equals) operator. The following example might help understand this better:
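A minimal sketch using the names list1, list2 and list_alias discussed here (the list contents are arbitrary):

```python
list1 = [1, 2, 3]
list2 = [1, 2, 3]   # same contents, but a separate object
list_alias = list1  # no copy is made: just another name for list1

print(list1 == list2)                # True: equal contents
print(list1 is list2)                # False: two different objects
print(list1 is list_alias)           # True: one object, two names
print(id(list1) == id(list_alias))   # True: same object ID
print(id(list1) == id(list2))        # False: different object IDs

# Changing the object through the alias is visible through list1 as well.
list_alias.append(4)
print(list1)  # [1, 2, 3, 4]
print(list2)  # [1, 2, 3]
```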

In this example, when we create a new variable pointing to our previously created list1, we’re basically creating an alias. list1 and list_alias point to the exact same object; that’s why they have the same object ID. They ACTUALLY are the same object. In contrast, list1 and list2 are equal, but they’re different objects (residing in different memory locations, independently).

Again, try to avoid the is operator if you’re not entirely sure what you’re doing. For example: the is operator is often assumed to be safe for strings, because many of them are “interned”, but take a look at the following example code to see how you can still be bitten by it:
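A sketch of one way this can go wrong, assuming CPython on Python 3 (interning is an implementation detail, so the identity result may differ on other interpreters); the variable names are arbitrary:

```python
import sys

greeting = "hello world"
# Built at runtime, so CPython creates a brand new string object.
rebuilt = " ".join(["hello", "world"])

print(greeting == rebuilt)  # True: same characters
print(greeting is rebuilt)  # False on CPython: two separate objects

# Explicit interning maps equal strings to a single shared object.
print(sys.intern(greeting) is sys.intern(rebuilt))  # True
```

Comparing the strings with == gives the expected answer in every case, which is why it is the safer default.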

Floats comparison

Float issues are widely discussed for every programming language, not just Python. The best way to avoid the float pitfalls is to not use floats at all. Use the decimal module instead:
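A short sketch comparing both approaches with the classic 0.1 + 0.2 case. Note that the Decimal values are built from strings: building them from floats would carry the float inaccuracy along.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1, 0.2 or 0.3 exactly.
print(0.1 + 0.2 == 0.3)  # False
print(0.1 + 0.2)         # 0.30000000000000004

# Decimal stores the decimal digits exactly when built from strings.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
print(Decimal("0.1") + Decimal("0.2"))                    # 0.3

# Building a Decimal from a float keeps the float's inaccuracy.
print(Decimal(0.1) == Decimal("0.1"))  # False
```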

This is similar to our previous recommendation about the is operator: you can very well use floats if you’re sure of what you’re doing, or if the inaccuracy of floating point arithmetic is something you’re not worried about.

Private attributes

We don’t have much to add to this point. Something important to understand about Python is that we don’t try too hard to hide things. Usually we have students coming from the Java world and they try too hard to hide attributes or variables (common question: “how can I make a ‘private’ attribute?”). In Python, we trust our fellow coders. If I’ve named an attribute “__default_distance”, it’s pretty obvious that you shouldn’t be messing with it. And if you do want to mess with it, that’s up to you.
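To illustrate that nothing is really hidden, here is a small sketch. Only the __default_distance attribute name comes from the text above; the Vehicle class and its default_distance method are hypothetical. Python merely renames double-underscore attributes ("name mangling"), so they stay reachable for anyone who insists:

```python
class Vehicle:
    """Hypothetical class with a 'private' attribute."""

    def __init__(self):
        self.__default_distance = 100  # stored as _Vehicle__default_distance

    def default_distance(self):
        return self.__default_distance


car = Vehicle()
print(car.default_distance())  # 100

# The plain name is not visible from outside the class...
try:
    car.__default_distance
except AttributeError:
    print("no attribute named __default_distance")

# ...but the mangled name is, if you really want to mess with it.
print(car._Vehicle__default_distance)  # 100
```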

It’s also a really bad idea to set new attributes on an object after it’s been created. That violates the encapsulation of your objects. Consider the following example, in which we’re setting attributes “manually” (directly from outside the class):
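A minimal sketch of that pattern, assuming a hypothetical User class without a constructor and made-up attribute values:

```python
class User:
    """Hypothetical class that defines no attributes of its own."""


user = User()

# Attributes are attached from outside the class, one by one.
user.name = "Ada"
user.email = "ada@example.com"

print(user.email)  # ada@example.com
```

Nothing in the class says which attributes a User is supposed to have, so every piece of code that creates one has to remember the full list.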

What happens if we make a typo when setting the attribute directly? At some point we’ll get an error saying that the User object has no attribute ‘email’, and we’ll have to track down the typo through our entire code. Instead, by using the constructor, the issue becomes much easier to resolve:
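A sketch of the constructor-based version, assuming a hypothetical User class that takes name and email; the misspelled keyword emial is deliberate:

```python
class User:
    """Hypothetical class that declares its attributes in the constructor."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


user = User(name="Ada", email="ada@example.com")
print(user.email)  # ada@example.com

# A typo now fails immediately, on the exact line that contains it.
try:
    User(name="Ada", emial="ada@example.com")
except TypeError as error:
    print("Caught:", error)
```

The error is raised where the object is created and names the unexpected keyword argument, so there is no need to search the rest of the code for the typo.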