As a learning exercise, I tried converting the Universal Feed Parser to Python 3.0. I picked it because it is a relatively self-contained code base that I am familiar with, one that is actively in use, and one that has seen the wear and tear of staying compatible with a number of Python releases (including the need to monkeypatch the occasional bug).

    $ svn co http://svn.python.org/projects/sandbox/trunk/2to3/
    $ cd 2to3
    $ python refactor.py -w ../feedparser/feedparser.py > feedparser.2to3.diff
    root: Generating grammar tables from /home/rubys/svn/2to3/PatternGrammar.txt
    root: Writing grammar tables to /home/rubys/svn/2to3/PatternGrammar.pickle
    ../feedparser/feedparser.py: At line 3091: You should use a for loop here
    RefactoringTool: Files that were modified:
    RefactoringTool: ../feedparser/feedparser.py
    $ python refactor.py -w ../feedparser/feedparsertest.py > feedparsertest.2to3.diff
    RefactoringTool: Files that were modified:
    RefactoringTool: ../feedparser/feedparsertest.py

A few manual changes later, and 91% of the tests pass. I’m confident that with a little more work, I could quickly get that to 99%, perhaps even to 100%.

The Good

Python 3.0 feels very natural and comfortable, at least to this Python programmer. The language feels clean and new again. No more “new-style” classes. Things that should have been iterators all along now are. Python 3.0 is full of those kinds of small changes.

No bugs were found in the language, though some minor issues (noted below) were found in various parts of the runtime.

The 2to3 conversion, even at this first alpha sandbox state, was painless and efficient. It also didn’t reduce the readability or break the functionality of the code produced.

The places where manual attention is required did generally seem to be places where human attention is required.

The Bad

The Unicode change (while VERY welcome) is going to hit people hard. While the UFP code has more than its fair share of such code, the fact that people will no longer simply be able to open(file).read() unless that file is UTF-8 is going to be a big shock.
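To make that concrete, here is a minimal sketch of reading the same file as bytes and as text in Python 3. The sample data (a Latin-1 encoded word written to a temporary file) is made up for illustration, and the encodings are passed explicitly rather than relying on whatever default a given platform picks.

```python
import os
import tempfile

# Hypothetical sample data: "café" encoded as Latin-1, which is not valid UTF-8.
raw = "café".encode("latin-1")

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Binary mode hands back the bytes untouched.
    with open(path, "rb") as f:
        data = f.read()

    # Text mode decodes; naming the right encoding yields a str.
    with open(path, encoding="latin-1") as f:
        text = f.read()

    # Decoding the same bytes as UTF-8 fails instead of passing them through.
    try:
        with open(path, encoding="utf-8") as f:
            f.read()
        utf8_ok = True
    except UnicodeDecodeError:
        utf8_ok = False
finally:
    os.remove(path)
```

Code that used to treat the result of read() as an opaque string now has to decide, at each call site, whether it wants bytes or decoded text.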

Some of the Python 3 libraries don’t seem to have internalized the Unicode changes yet. base64.decodestring can only handle bytes, not characters (why?). More troublesome to me is that I couldn't get xml.sax.xmlreader.InputSource to work with an io.BytesIO (a StringIO work-alike for bytes); it only seemed to work with characters, something that is at odds with proper handling of XML. However, I do realize that this is an alpha, and fully believe that these issues will be worked out.
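For comparison, a sketch of the bytes-oriented usage being aimed at, written against the standard library as it exists in later Python 3 releases rather than the alpha discussed here. The TitleHandler class and the feed fragment are hypothetical, and base64.b64decode stands in for decodestring.

```python
import base64
import io
import xml.sax
from xml.sax.xmlreader import InputSource

# base64 works on bytes: encode text first, decode the result afterwards.
encoded = base64.b64encode("héllo".encode("utf-8"))
decoded = base64.b64decode(encoded).decode("utf-8")


class TitleHandler(xml.sax.ContentHandler):
    """Hypothetical handler that collects text inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def startElement(self, name, attrs):
        if name == "title":
            self.in_title = True
            self.titles.append("")

    def characters(self, content):
        if self.in_title:
            self.titles[-1] += content

    def endElement(self, name):
        if name == "title":
            self.in_title = False


# Hypothetical feed fragment, kept as bytes so the parser can honor
# the encoding named in the XML declaration.
xml_bytes = (
    '<?xml version="1.0" encoding="iso-8859-1"?>'
    "<feed><title>caf\xe9</title></feed>"
).encode("iso-8859-1")

source = InputSource()
source.setByteStream(io.BytesIO(xml_bytes))

handler = TitleHandler()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
parser.parse(source)
```

Feeding the parser bytes matters because the encoding is declared inside the document itself, so only the XML parser is in a position to decode it correctly.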

The test code for UFP relies on eval, and some portion of those strings (ones that can never be automatically handled by a 2to3 migration tool) rely on Python 2.x specific syntax. Ultimately it might be worth considering introducing a python2 module with functions like eval that can be used to ease migration.
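A small sketch of why such strings are out of reach for a source-level tool: the Python 2 syntax only surfaces when eval runs. The expressions below are hypothetical examples of Python 2.x-only syntax, not taken from the UFP test suite, and evaluates is a helper invented for this illustration.

```python
def evaluates(expression):
    """Return True if eval() accepts the expression, False on a syntax error."""
    try:
        eval(expression)
        return True
    except SyntaxError:
        return False


# Hypothetical test expressions stored as strings, as a test suite might do.
portable = "len([1, 2, 3]) == 3"
long_literal = "10L == 10"      # Python 2 long-integer suffix
backtick_repr = "`1` == '1'"    # Python 2 shorthand for repr()
not_equal = "1 <> 2"            # Python 2 alternate inequality operator

results = {
    expr: evaluates(expr)
    for expr in (portable, long_literal, backtick_repr, not_equal)
}
```

Under Python 3 only the first expression evaluates; the other three raise SyntaxError, and nothing in the surrounding source code hints at that until the strings are actually run.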

The places that the 2to3 migration tool can’t currently handle, and perhaps never will be able to handle, will often require somebody who has an understanding of, and a history with, the code base. Such people aren’t always available. I’m not sure what can be done about that.

The Ugly