Python Programming, news on the Voidspace Python Projects and all things techie.

Expert Python Programming (Free Chapter)

Expert Python Programming is a new book from Packt Publishing, written by Tarek Ziade.

Based on the review by Michele Simionato I bought the book, and was then (entirely coincidentally) contacted by the publisher to see if I was interested in putting a free chapter online. The chapter they've sent me is chapter 10: Documenting Your Project.

I bought the book mainly because I was interested in its primer on setting up a Windows development environment using MinGW and MSYS, and because several of the other chapters sounded interesting (I always need a refresher on setuptools - I only use it once every few months and the knowledge leaks out of my head).

I was very pleasantly surprised by the book. It's relatively slim (only 350 pages) with a clear and straightforward writing style (not dry) and plenty of 'recipe style' examples. In general it isn't aimed at really advanced programmers: although it covers relatively advanced topics (metaclasses, descriptors etc), these will really only be a refresher (but a welcome refresher) to experienced Python programmers.

The book is ideal for experienced programmers coming to Python from other languages. Whilst it doesn't teach the basics of Python syntax, it does well at things like idiomatic Python. It will also be helpful for mid-level Python programmers looking to get into some of the deeper subjects.

In fact it seems that some topics that used to be considered 'deep magic' are becoming more widely used and talked about. Take descriptors, for example. I'm afraid I find the standard reference doc fairly impenetrable, but there have been some simple recipes using them in recent days.

It is especially a shame that the code illustrating the descriptor rules implemented by __getattribute__ is garbled as code examples like this are a great way of showing how Python works under the hood. I haven't found any other major errors in the book so far.
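As a rough illustration of the kind of rule involved (this is a minimal sketch and not the listing from the book; the class names are invented, and Python 3 is assumed), the default attribute lookup gives a data descriptor priority over the instance dictionary, while the instance dictionary wins over a non-data descriptor:

```python
class DataDescriptor:
    """Defines __get__ and __set__, which makes it a data descriptor."""

    def __get__(self, instance, owner=None):
        return "from data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only in this sketch")


class NonDataDescriptor:
    """Defines only __get__, which makes it a non-data descriptor."""

    def __get__(self, instance, owner=None):
        return "from non-data descriptor"


class Example:
    data = DataDescriptor()
    non_data = NonDataDescriptor()


obj = Example()
# Put entries with the same names straight into the instance dictionary,
# bypassing normal attribute assignment.
obj.__dict__["data"] = "from instance dict"
obj.__dict__["non_data"] = "from instance dict"

# A data descriptor on the class takes priority over the instance dict...
data_result = obj.data
# ...but the instance dict takes priority over a non-data descriptor.
non_data_result = obj.non_data
```

Here `data_result` comes from the descriptor and `non_data_result` comes from the instance dictionary.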

The book aims to cover topics useful for 'real world' development:

API design and naming conventions

Testing (including a tutorial on Test Driven Development)

Documentation

Development environment, deployment and distribution

I do take issue with the book's description of the class-based organisation of tests used by unittest as 'heavyweight'. I've heard this from other people, and whilst parts of unittest are hard to extend (which is more an issue of documentation and some API complexity than actual inflexibility), inheriting from TestCase and using assert methods is useful and not at all heavyweight. This opinion confounds me! Whilst working with doctest over the last few weeks I've really been feeling the pain of not having useful assert methods...
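For a sense of how little ceremony is involved, here is a minimal sketch of a class-based test. The `parse_version` function, the test names and the `run_tests` helper are all invented for illustration, and the code assumes Python 3:

```python
import unittest


def parse_version(text):
    """Hypothetical function under test: '4.6.0' -> (4, 6, 0)."""
    return tuple(int(part) for part in text.split("."))


class ParseVersionTest(unittest.TestCase):

    def test_three_parts(self):
        self.assertEqual(parse_version("4.6.0"), (4, 6, 0))

    def test_rejects_non_numeric(self):
        self.assertRaises(ValueError, parse_version, "4.x")


def run_tests():
    """Run the TestCase above and return the unittest result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(ParseVersionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

When an `assertEqual` fails it reports both values, which is exactly the diagnostic detail that a plain output comparison lacks.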

I'm particularly glad that the chapter Tarek and his publisher have chosen to give away is the one on documentation. I'm a big fan of reStructuredText and of projects having good documentation. I'm also looking forward to converting some of my projects over to using Sphinx as a doc tool, which I haven't yet used and which this chapter covers.

To give you a taste of the book, here is the introduction to chapter 10:

Documenting Your Project

Documentation is work that is often neglected by developers and sometimes by managers. This is often due to a lack of time towards the end of development cycles, and the fact that people think they are bad at writing. Some of them are bad, but the majority of them are able to produce fine documentation.

In any case, the result is a disorganized documentation made of documents that are written in a rush. Developers hate doing this kind of work most of the time. Things get even worse when existing documents need to be updated. Many projects out there are just providing poor, out-of-date documentation because the manager does not know how to deal with it.

But setting up a documentation process at the beginning of the project and treating documents as if they were modules of code makes documenting easier. Writing can even be fun when a few rules are followed.

This chapter provides a few tips to start documenting your project through:

The seven rules of technical writing that summarize the best practices

A reStructuredText primer, which is a plain text markup syntax used in most Python projects

A guide for building good project documentation

Writing good documentation is easier in many aspects than writing a code. Most developers think it is very hard, but by following a simple set of rules it becomes really easy.

We are not talking here about writing a book of poems but a comprehensive piece of text that can be used to understand a design, an API, or anything that makes up the code base.

The Seven Rules of Technical Writing

Every developer is able to produce such material, and this section provides seven rules that can be applied in all cases.

Write in two steps: Focus on ideas, and then on reviewing and shaping your text.

Target the readership: Who is going to read it?

Use a simple style: Keep it straight and simple. Use good grammar.

Limit the scope of the information: Introduce one concept at a time.

Use realistic code examples: Foos and bars should be dropped.

Use a light but sufficient approach: You are not writing a book!

Use templates: Help the readers to get habits.

These rules are mostly inspired and adapted from Agile Documenting, a book by Andreas Ruping that focuses on producing the best documentation in software projects.

Doctest, How I Loathe Thee

Andrew Bennets has written a series of blog entries on why doctest makes poor unit tests. I agree with a lot of what he has to say, and this was brought home to me again when working on ConfigObj, which is tested with doctest.

Note: doctest is a Python testing tool. It executes interactive code sessions (typically cut and pasted from an interactive interpreter) embedded in documentation or docstrings. It produces failure messages if the actual output from executing the embedded code differs from what is specified in the source. It is a great tool for checking that examples in documentation / docstrings still work, but in my opinion it makes a poor unit testing tool. Thankfully unittest is great at this.
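For readers who haven't met doctest, a minimal sketch of the idea follows. The `average` function and the `check_examples` helper are invented for illustration, and the expected output assumes Python 3 division:

```python
import doctest


def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    >>> average([10])
    10.0
    """
    return sum(values) / len(values)


def check_examples():
    """Run the docstring examples of average(); return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Arguments: docstring, globals for the examples, name, filename, lineno.
    test = parser.get_doctest(
        average.__doc__, {"average": average}, "average", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted
```

Each `>>>` line is executed and whatever it prints is compared, as text, with the line that follows it in the docstring.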

Andrew covered some of my pet peeves. In particular, because doctest compares the output of your code to a text source, you can't output arbitrarily ordered data like dictionaries. Instead you have to compare your data to known good data - and the result of this is either True or False. If it is False then that's all you get: you don't get to see what your data actually was, which is not helpful as a diagnostic tool.
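The difference in diagnostics can be sketched as follows. Everything here (`load_settings`, `EXPECTED`, and the two report helpers) is hypothetical and only meant to show what each tool tells you when the same comparison fails; Python 3 is assumed:

```python
import doctest
import unittest

EXPECTED = {"port": 8080, "debug": False}


def load_settings():
    """Hypothetical function under test; note the wrong port."""
    return {"port": 8000, "debug": False}


# The doctest workaround: compare against known good data.
DOCTEST_STYLE = """
>>> load_settings() == EXPECTED
True
"""


def doctest_report():
    """Run the doctest-style comparison and return doctest's failure text."""
    globs = {"load_settings": load_settings, "EXPECTED": EXPECTED}
    test = doctest.DocTestParser().get_doctest(
        DOCTEST_STYLE, globs, "settings", None, 0)
    lines = []
    doctest.DocTestRunner(verbose=False).run(test, out=lines.append)
    return "".join(lines)


def unittest_report():
    """Make the same comparison with assertEqual and return its message."""
    case = unittest.TestCase()
    try:
        case.assertEqual(load_settings(), EXPECTED)
    except AssertionError as error:
        return str(error)
    return ""
```

The doctest report only says that it expected True and got False, whereas the assertEqual message includes both dictionaries, so the wrong port is visible straight away.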

Even worse, try adding prints into relevant parts of your code for diagnostics. The prints mean extraneous output - so all your tests start failing... It's not as if you can copy your test case into a separate file either: every line starts with '>>>' or '...'.

Another nit, which could probably be fixed, is that when a line in a test fails, execution continues. Normally this means a cascade of failures and you have to dig through the output to find which is the real failure.
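One partial mitigation that doctest itself provides is the `REPORT_ONLY_FIRST_FAILURE` option flag: the remaining examples still run, but only the first failure is reported. A minimal sketch, in which the `SESSION` text and the `count_reports` helper are invented for illustration:

```python
import doctest

# The second example has the wrong expected value, and the later
# examples were written on the same mistaken assumption.
SESSION = """
>>> total = 1 + 1
>>> total
3
>>> total * 2
6
>>> total * 3
9
"""


def count_reports(optionflags=0):
    """Run SESSION and count how many failure reports doctest emits."""
    test = doctest.DocTestParser().get_doctest(
        SESSION, {}, "session", None, 0)
    lines = []
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    runner.run(test, out=lines.append)
    return "".join(lines).count("Failed example:")
```

With the default flags every failing example is reported; passing `doctest.REPORT_ONLY_FIRST_FAILURE` cuts the report down to the first one, which is usually the real failure.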

Chinese Metaclasses in Five Minutes

"The type of magical changes in the yuan than 99% of users are worried about more, when you really do not understand the need to use it, do not need that."

Someone has translated my Metaclasses in Five Minutes article into Chinese: Chinese Metaclasses in Five Minutes

Yuan category known as Python's "profound witchcraft." You need to use it very few places (unless you zope-based programming), the fact that it can be the basis of theory is in fact surprisingly easy to understand.

Garbled translations courtesy of Google Translate.

Changes to ConfigObj in SVN

There is a new version of ConfigObj nearly ready. ConfigObj is a powerful but simple to use module for reading, writing and validating 'ini style' configuration files.

All the changes, along with tests, are checked into Subversion:

There are several new features and bugfixes. If you use ConfigObj, particularly if you use configspecs and config file validation, then I would appreciate you trying out the latest version and letting me know if it breaks anything.

New features:

Pickle support: Pickling a ConfigObj instance is generally a pointless exercise as ConfigObj is intended for persistence anyway: you can always just call write ... Unfortunately pickle support is needed for use with libraries like Parallel Python that provide shared access between processes using serialization. Thanks to Christian Heimes for providing the code.

Hashes in configspec files broken: Thanks to Jeffrey Barish for reporting this. Configspecs are also parsed by ConfigObj but have an incompatible syntax with config files. In order to fix this I had to turn off the parsing of inline comments in configspecs. This will only affect you if you are using copy=True when validating and expecting inline comments to be copied from the configspec into the ConfigObj instance (all other comments will be copied as usual). If you create the configspec by passing in a ConfigObj instance (the usual way is to pass in a filename or list of lines) then you should pass in _inspec=True to the constructor to allow hashes in values. This is the magic that switches off inline comment parsing.

Repeated values now allowed in configspecs: You have always been able to specify that all sections (or sub-sections) in a config file are to be validated with the same specification by providing a __many__ section in the configspec. You can now specify that all values in a section / sub-section should be validated with the same specification by providing __many__ as a value. If you want to use this feature in a section that also has a __many__ sub-section then the repeated value can be called ___many___ (note the triple underscores) instead. As an example, the following configspec specifies that every value in a config file (and every section it contains) must be an integer:

    ___many___ = integer

    [__many__]
    __many__ = integer

Expected sections as scalars or vice-versa: Under the current version of ConfigObj, if your configspec specifies a section and a scalar value is supplied instead - or vice-versa - then ConfigObj will crash. This is now fixed (but the config file is still screwed).

In the process of working on the configspec handling code to implement the new feature and the bugfixes I ripped out and reimplemented most of it (thanks Nicola Larosa for making me write tests!). I discovered a couple of restrictions on how you can write config file specifications that were merely an artefact of how the code was written. The code is now simpler and several restrictions have been removed:

Previously a __many__ section had to be the only sub-section in a section - now __many__ can appear alongside other sections and will only be used to validate sub-sections that don't have an explicit validation section.

You can now create an empty ConfigObj with a configspec, programmatically set values and then validate. This would only partly work before.

Despite the rewrite and these changes the API is almost entirely unchanged. The major change is that when you create a ConfigObj instance and provide a configspec, the configspec attribute is only set on the ConfigObj instance - it isn't set on the sections until you call validate. You also can't set the configspec attribute to be a dictionary. This wasn't documented but did work previously.

The next release will be 4.6.0 and will follow soon. There are also minor changes to validate, made in order to get the tests passing under Python 2.5 & 2.6; they should be backwards compatible.
