Documentation walk-through

On the day of the sprint, each participant was given a function or method to document. I was assigned pandas.Series.str.contains, so, just like Section 7, I’ll talk you through this example.

Step 1- Initial overview

Check out the current state of the documentation. Create a checklist of potential areas for improvement and get that much-needed dopamine fix from ticking each one off.

As you can see, some of the sections discussed above are absent, there are a couple of discrepancies in the parameter types, and there are a few formatting issues. Overall, not too shabby.

Step 2- Give the function a test drive

Before you can write about a function, you’ve got to understand it. This is essential. Think of as many reasons as you can for using it in day-to-day programming, then implement some basic use cases. These can then be used as the examples in Section 7. You could also firm up your understanding of the function by writing a couple of unit tests.
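To give you an idea, here is a rough sketch of what such a test drive might look like for pandas.Series.str.contains. The sample data and the test function name are made up for illustration; only the str.contains call and its regex, case and na parameters come from pandas itself.

```python
import pandas as pd

# Made-up sample data, purely for illustration.
animals = pd.Series(["Mouse", "dog", "house and parrot", None])

# Plain substring search; na=False treats missing values as "no match".
has_og = animals.str.contains("og", regex=False, na=False)

# Ignore case when matching.
has_mouse = animals.str.contains("MOUSE", case=False, na=False)

print(has_og.tolist())     # [False, True, False, False]
print(has_mouse.tolist())  # [True, False, False, False]


# A tiny, hypothetical unit test that pins down one behaviour:
# matching is case sensitive unless you say otherwise.
def test_contains_is_case_sensitive_by_default():
    result = pd.Series(["Mouse"]).str.contains("mouse")
    assert result.tolist() == [False]


test_contains_is_case_sensitive_by_default()
```

Small experiments like these tend to double up nicely as candidates for the Examples section later on.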

Step 3- Who else is in the same boat?

Remember, the reason we’re doing this is to help people understand code efficiently. Search for the function on Google, Stack Overflow or GitHub and see if anyone has had problems with it before. If similar questions keep popping up, tailor the documentation towards them to clear up any ambiguity. When searching for problems with pandas.Series.str.contains, I kept coming across questions involving regular expressions. To deal with this, I made sure to include a few of these in the related examples.
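As an illustration of the kind of regular expression behaviour that can trip people up, here is a small sketch. The sample Series is invented for this post and these are not necessarily the exact examples that ended up in the docstring; the one thing to take away is that the pattern is treated as a regular expression unless you pass regex=False.

```python
import pandas as pd

# Invented sample data, for illustration only.
s = pd.Series(["Mouse", "dog", "house and parrot", "23", "3.5"])

# The pattern is a regular expression by default, so "|" means "or".
house_or_dog = s.str.contains("house|dog")

# A regex character class: does the string contain any digit?
has_digit = s.str.contains(r"\d")

# Gotcha: as a regex, "." matches any character, so every
# non-empty string matches.
dot_as_regex = s.str.contains(".")

# With regex=False the dot is taken literally.
dot_as_literal = s.str.contains(".", regex=False)

print(house_or_dog.tolist())    # [False, True, True, False, False]
print(has_digit.tolist())       # [False, False, False, True, True]
print(dot_as_regex.tolist())    # [True, True, True, True, True]
print(dot_as_literal.tolist())  # [False, False, False, False, True]
```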

Step 4- Conformity

Different minds have different styles, and generally speaking variety is a good thing. However, as a code base grows, consistency is crucial. Most Python libraries try to follow a set of formatting guidelines, and pandas docstrings build on PEP 257. This covers everything from whitespace to the use of infinitive verbs. It might sound pedantic, but these guidelines have been modified and tweaked for almost two decades. They are there for a reason, so use them!
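To make that a little more concrete, here is a minimal sketch of a docstring laid out in the numpydoc style that pandas uses on top of PEP 257. The function contains_substring is a hypothetical stand-in written in plain Python for this post, not part of pandas, and the docstring only shows a handful of the possible sections.

```python
def contains_substring(values, pat, case=True):
    """
    Test whether each string contains a substring.

    Return one boolean per element of `values`, indicating whether
    `pat` occurs anywhere inside that element.

    Parameters
    ----------
    values : list of str
        Strings to search.
    pat : str
        Substring to look for.
    case : bool, default True
        If True, the comparison is case sensitive.

    Returns
    -------
    list of bool
        One flag per input string.

    Examples
    --------
    >>> contains_substring(["Mouse", "dog"], "og")
    [False, True]
    >>> contains_substring(["Mouse", "dog"], "MOUSE", case=False)
    [True, False]
    """
    if not case:
        # Lower-case both sides so the comparison ignores case.
        pat = pat.lower()
        values = [v.lower() for v in values]
    return [pat in v for v in values]
```

Notice the summary line starting with an infinitive verb ("Test", not "Tests"), the underlined section headers and the "name : type" layout for each parameter.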

Luckily, the core developers of pandas have built a script that checks whether your docstring complies with their style guidelines. It’s handy to run this every so often and correct your docstring accordingly.

python scripts/validate_docstrings.py <insert-your-function-here>

You should also run the well-known Python linter, flake8, to keep everything tidy.

git diff upstream/master -u -- "*.py" | flake8 --diff

Once you feel like the documentation is up to standard, give it a quick preview with:

python make.py html --single pandas.Series.str.contains

Looks good? Time for a pull request. Check out this page for pushing your suggestions.

This stage involves publishing your suggestions to the core developers. If they like your changes, they’ll accept them for the next release.

Just one more thing… the core developers will know more about good documentation than you, so they may have some suggestions about your suggestions. My advice is not to take it to heart. Just rinse and repeat: make the fixes and push them to your pull request.

Don’t be afraid of making your mistakes public. Feedback is your friend; without it you’ll never learn from your mistakes or, worse, you’ll compound a bad habit. Remember, the core developers are crying out for people like you who want to help!

So here it is, in all its glory: check out my improved version of pandas.Series.str.contains.

You can see I’ve given a clear and concise description of the method, added an extended summary, spruced up the parameters, tidied the returns section and added a tonne of typical use cases.

If this beautiful piece of documentation has inspired you to try your hand at contributing to pandas (or any other project), sift through the documentation here and try to scope out anything you think you can improve upon. If you are unsure what to delve into, you can get a list of documentation that needs improving by running this script:

python scripts/validate_docstrings.py

So there we go! I hope that gave you a bit of confidence to start your journey towards becoming an open source contributor. Thanks for reading, and if you have any questions, leave them in the comments section below.