Anti-Patterns & Alternatives

Throughout this section, we’ll successively refactor some Python code to make it more robust and easier to understand.

Fragile code

Let’s take a look at the following is_paid function, which takes a dictionary and checks whether the "paid_date" value is set:

def is_paid(invoice):
    return invoice["paid_date"] is not None

Can you see the biggest problem with this function? Exactly! You don’t have any guarantee that "paid_date" is available in the dictionary. Calling the function might actually result in a KeyError: 'paid_date'. This piece of code is pretty fragile and you’d need to pay close attention while refactoring.

A safer approach would be to rewrite it as

def is_paid(invoice):
    return invoice.get("paid_date") is not None
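To make the difference concrete, here is a minimal sketch that runs both variants against an invoice dictionary without a "paid_date" key. The names is_paid_fragile and is_paid_safe and the sample dictionary are made up for this illustration.

```python
def is_paid_fragile(invoice):
    # Direct indexing raises KeyError when the key is missing.
    return invoice["paid_date"] is not None


def is_paid_safe(invoice):
    # dict.get returns None for a missing key, so the check yields False.
    return invoice.get("paid_date") is not None


invoice_without_date = {"amount": 100.0}

try:
    is_paid_fragile(invoice_without_date)
    fragile_result = "no error"
except KeyError as error:
    fragile_result = f"KeyError: {error}"

print(fragile_result)                      # KeyError: 'paid_date'
print(is_paid_safe(invoice_without_date))  # False
```

Note that the safe variant treats a missing key and an explicit None the same way. Whether that is what you want depends on your data.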

I won’t go into more detail here, but if you’re interested, make sure to check out returns. It’s a library that offers a very elegant solution to this sort of problem by introducing an Elm-like Maybe container to Python.

The typing module

Let’s take another look at is_paid: How do you know what invoice is? It’s not obvious at all. It could be a dictionary. It could also be a custom type implementing a .get method. You can only derive the answer from studying other pieces of the codebase that call this function.

A good way to be more explicit is to use Python’s typing module. We typically use a mixed approach where we don’t necessarily type our whole codebase but focus on the critical, complex or unobvious parts. Type hints document your code and save a lot of time, especially when new developers join your team.

An improved version would be:

from typing import Any, Dict

def is_paid(invoice: Dict[str, Any]) -> bool:
    return invoice.get("paid_date") is not None

Now it’s obvious that invoice is expected to be a dictionary and it becomes apparent how .get behaves.

If you use an IDE like PyCharm, this gives you the benefit of additional code inspection and warnings.
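One thing worth keeping in mind: type hints are not enforced when the code runs. The sketch below assumes nothing beyond the standard library and shows that a wrong argument still gets through to the function body. A static checker such as mypy, or an IDE inspection, would report the bad call before the program runs.

```python
from typing import Any, Dict


def is_paid(invoice: Dict[str, Any]) -> bool:
    return invoice.get("paid_date") is not None


# The annotations are stored on the function but not checked at runtime.
print(is_paid.__annotations__["return"])  # <class 'bool'>

try:
    # Passing a list violates the annotation. Python does not stop the
    # call itself; it only fails once .get is looked up on the list.
    is_paid(["not", "a", "dict"])
    outcome = "no error"
except AttributeError:
    outcome = "AttributeError"

print(outcome)  # AttributeError
```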

If you want to learn more about typing, I recommend reading realpython.com’s great guide on the subject.

Custom types

Knowing that invoice is a dictionary is better, but we’re still pretty clueless about its content. Which keys is invoice supposed to have? Is the "paid_date" key always available or not? We lack a clear definition of what an invoice consists of. This calls for a custom type!

Because Python’s class syntax can be pretty clunky, it’s easy to fall into the anti-pattern of defining too few custom types and relying too heavily on native types like dictionaries or tuples.

For simplicity, let our invoice consist of three attributes: amount, paid_date and status. When defining this as a regular class you’d end up with something like this:

class Invoice:
    def __init__(self, amount, paid_date, status):
        self.amount = amount
        self.paid_date = paid_date
        self.status = status

Meh, that’s quite a bit of clutter and repetition… and therefore pretty unpythonic. Luckily, there are ways to define this more concisely. Here’s an example using dataclasses, which also adds type annotations and an id field:

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Invoice:
    id: int
    amount: float
    paid_date: Optional[datetime]
    status: str

Using this, we can refactor our code to only accept our new custom type:

def is_paid(invoice: Invoice) -> bool:
    return invoice.paid_date is not None

Note: At this point, you could also just make it a property of Invoice, but I leave that exercise to the ambitious reader ;)
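For the less ambitious reader, one possible solution to that exercise might look like the sketch below. It assumes the Invoice fields from above; the sample invoices are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Invoice:
    id: int
    amount: float
    paid_date: Optional[datetime]
    status: str

    @property
    def is_paid(self) -> bool:
        # An invoice counts as paid once a payment date has been recorded.
        return self.paid_date is not None


open_invoice = Invoice(id=1, amount=99.0, paid_date=None, status="Sent")
paid_invoice = Invoice(
    id=2, amount=42.0, paid_date=datetime(2020, 1, 15), status="Paid"
)

print(open_invoice.is_paid)  # False
print(paid_invoice.is_paid)  # True
```

Since the property has no annotation in the class body, the dataclass decorator does not treat it as a field.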

Please note that the dataclasses module is only available since Python 3.7. Good alternatives are the attrs project or NamedTuples, though they also differ in other features.
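As a rough sketch of the NamedTuple alternative, the same invoice could be declared with typing.NamedTuple. The sample values are invented. One of the differences mentioned above shows up right away: instances are plain tuples, so they are immutable and support unpacking.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class Invoice(NamedTuple):
    id: int
    amount: float
    paid_date: Optional[datetime]
    status: str


invoice = Invoice(id=1, amount=99.0, paid_date=None, status="Draft")
print(invoice.amount)  # 99.0

# Fields cannot be reassigned; _replace returns a modified copy instead.
sent_invoice = invoice._replace(status="Sent")
print(invoice.status)       # Draft
print(sent_invoice.status)  # Sent

# Being a tuple, an invoice can also be unpacked positionally.
invoice_id, amount, paid_date, status = sent_invoice
```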

If you want to dive deeper into the subject, I recommend Raymond Hettinger’s talk about dataclasses from PyCon 2018 and realpython.com’s “Ultimate Guide”.

Mutability:

Another problem in a language like Python is mutability. Lists and dictionaries are mutable, which can cause very surprising bugs in your code. You have most likely encountered this at least once with mutable default arguments in functions or methods. I won’t go into the details here, but I at least want to mention that dataclasses and attrs both support an optional frozen keyword to make your classes (nearly) immutable.
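Here is a short sketch of both points. The add_item function and the FrozenInvoice class are hypothetical names chosen for this illustration.

```python
from dataclasses import FrozenInstanceError, dataclass


def add_item(item, items=[]):
    # The default list is created once, when the function is defined,
    # and is then shared by every call that omits `items`.
    items.append(item)
    return items


first = add_item("a")
second = add_item("b")
print(second)           # ['a', 'b']
print(first is second)  # True


@dataclass(frozen=True)
class FrozenInvoice:
    amount: float
    status: str


invoice = FrozenInvoice(amount=100.0, status="Draft")

try:
    invoice.status = "Paid"
    was_blocked = False
except FrozenInstanceError:
    # Assigning to a field of a frozen dataclass raises this error.
    was_blocked = True

print(was_blocked)  # True
```

The "nearly" matters: frozen only blocks attribute assignment. A mutable value stored in a field, such as a list, can still be changed in place.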

Enums:

In most applications, you’ll find use cases for enumerated values, i.e. an attribute that should only have a limited set of possible values. Our Invoice.status is a good example. Certainly, only a small set of values should be allowed for this field, for instance "Draft", "Sent", "Paid" and "Cancelled". In our current implementation, these would just be strings. While that works, it isn’t very robust against typos.

Let’s consider the following example:

def is_open(invoice: Invoice) -> bool:
    return invoice.status == "Send"

Do you see the problem? We typed "Send" instead of "Sent". In the worst case, this could lead to overdue invoices being marked as paid and your company losing quite a bit of money. Scary! When you use an enum instead, you’re better protected against this sort of bug:

...
from enum import Enum

class InvoiceStatus(Enum):
    DRAFT = "Draft"
    SENT = "Sent"
    PAID = "Paid"
    CANCELLED = "Cancelled"

@dataclass
class Invoice:
    ...
    status: InvoiceStatus

With the same typo, our is_open function becomes

def is_open(invoice: Invoice) -> bool:
    return invoice.status == InvoiceStatus.SEND

This raises an AttributeError (for example AttributeError: SEND; the exact message depends on the Python version) as soon as the function is called. So even if you have just a single test case that somehow invokes this function, you’ll surely find this bug before deployment.
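The following self-contained sketch puts the pieces together. The Invoice class is trimmed to two fields, and is_open_with_typo is a name invented here to keep the buggy and the fixed version side by side.

```python
from dataclasses import dataclass
from enum import Enum


class InvoiceStatus(Enum):
    DRAFT = "Draft"
    SENT = "Sent"
    PAID = "Paid"
    CANCELLED = "Cancelled"


@dataclass
class Invoice:
    amount: float
    status: InvoiceStatus


def is_open_with_typo(invoice: Invoice) -> bool:
    # Deliberate typo: SEND is not a member of InvoiceStatus.
    return invoice.status == InvoiceStatus.SEND


def is_open(invoice: Invoice) -> bool:
    return invoice.status == InvoiceStatus.SENT


invoice = Invoice(amount=100.0, status=InvoiceStatus.SENT)

try:
    is_open_with_typo(invoice)
    typo_detected = False
except AttributeError:
    # The misspelled member fails loudly instead of comparing as False.
    typo_detected = True

print(typo_detected)     # True
print(is_open(invoice))  # True
```

With plain strings, the misspelled comparison would have quietly returned False. The enum turns the typo into an error on the first call.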

Ambiguous types

Last but not least, I would like to highlight the issue of using ambiguous argument or return types. They will inevitably drive complexity in your code and can cause many bugs if you’re not very attentive when dealing with them.

Take the following example where we add a new InvoiceMeta type which carries some metadata for invoices (it’s a toy example, but I hope you get the point):

def get_meta(invoice, metas):
    for meta in metas:
        if invoice.id == meta.id:
            return meta

meta = get_meta(invoice, metas)
print(meta.recipient)

What’s the problem here? The code implicitly assumes that all invoices have a corresponding meta entity. But if this assumption doesn’t hold, we run into an exception:

AttributeError: 'NoneType' object has no attribute 'recipient'

Snap! OK, had we used type annotations thoroughly, we would have spotted this, since the return type must be Optional[InvoiceMeta]. But even then, you’ll find yourself writing more complex code further down the road, as you always need to check whether meta is an InvoiceMeta or None:

meta = get_meta(invoice, metas)
if meta is None:
    print("Recipient unknown as no InvoiceMeta was recorded.")
else:
    print(meta.recipient)

As suggested in the Python Anti-Patterns, a better approach would be to raise an exception directly in get_meta and then handle the error in a different layer of your code.
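A minimal sketch of that approach could look as follows. The InvoiceMetaNotFound exception, the trimmed-down Invoice and InvoiceMeta classes and the sample data are assumptions made for this illustration, not part of any library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Invoice:
    id: int


@dataclass
class InvoiceMeta:
    id: int
    recipient: str


class InvoiceMetaNotFound(Exception):
    """Hypothetical error for invoices without recorded metadata."""


def get_meta(invoice: Invoice, metas: List[InvoiceMeta]) -> InvoiceMeta:
    for meta in metas:
        if invoice.id == meta.id:
            return meta
    # No match: fail loudly instead of implicitly returning None.
    raise InvoiceMetaNotFound(f"No InvoiceMeta recorded for invoice {invoice.id}")


metas = [InvoiceMeta(id=1, recipient="ACME Corp")]

print(get_meta(Invoice(id=1), metas).recipient)  # ACME Corp

try:
    get_meta(Invoice(id=2), metas)
    message = ""
except InvoiceMetaNotFound as error:
    message = str(error)

print(message)  # No InvoiceMeta recorded for invoice 2
```

The return type is now simply InvoiceMeta, so callers no longer need a None check. The missing-metadata case is handled once, wherever the exception is caught.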