Today’s post deals with what may be one of the hardest parts of data science, one that involves no analysis at all: simply making the backend of data science work. By backend I mean the database systems most data scientists will be working with on the job.

I will go over the following:

Building an absolute barebones Django app backed by a Relational Database Management System (RDBMS)

Using a PostgreSQL database attached to the Django app

Moving data in and out between different formats and platforms

While following this article doesn’t require any knowledge of Django, I think it’s important to appreciate the fact that a lot of data collection occurs through web apps.

For data scientists who are unfamiliar with Django, think of it as a framework for building web applications while adhering to the philosophy of “inversion of control”. This means Django takes care of the skeleton of the web app, and you’re responsible for fleshing out the actual content on top of the skeleton.
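To make “inversion of control” a little more concrete, here is a toy sketch in plain Python. It is not how Django is implemented internally; the names route, handle_request and home are made up for illustration. The point is only that the “framework” part decides when our code runs, and we just register functions with it.

```python
# Toy illustration of inversion of control (not Django's real internals).
# The "framework" owns the request handling; we only register functions.

ROUTES = {}  # maps a URL path to the function that handles it


def route(path):
    """Register the decorated function as the handler for `path`."""
    def decorator(func):
        ROUTES[path] = func
        return func
    return decorator


def handle_request(path):
    """Framework side: look up the handler and call it for us."""
    handler = ROUTES.get(path)
    if handler is None:
        return "404 Not Found"
    return handler()


# Application side: this is the only part we would write ourselves.
@route("/")
def home():
    return "Welcome to DoubleBagger"


print(handle_request("/"))         # Welcome to DoubleBagger
print(handle_request("/missing"))  # 404 Not Found
```

Notice that home() is never called directly by our own code; the framework calls it when a matching request comes in.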

Readers who don’t like Django can skip to the section titled “The Payoff: Django’s Object Relational Mapper” towards the end of this post.

Our Django App: “DoubleBagger”

The app that I’m interested in creating is going to be called “DoubleBagger”, an investment blog where people self-publish their buy/sell opinions on public companies like Apple (ticker: AAPL) or Microsoft (ticker: MSFT).

Instead of firing up a Jupyter Notebook as in my previous articles, this time we’ll mainly be working with the command line plus a text editor like Sublime Text.

And because this is aimed at data scientists, we’ll be using a conda environment:

# I like to do things on my desktop
# From the terminal:
$ cd desktop && mkdir doublebagger && cd doublebagger
$ conda create -n doublebagger
$ conda activate doublebagger
# You should now have the (doublebagger) conda environment activated

And now we install our two main packages: Django, and psycopg2 for connecting to a PostgreSQL database. A new Django project is configured to use SQLite by default, which may actually be suitable for many organizations and for hobbyists, but we’re going to use Postgres instead. Furthermore, we’ll be using an older version of Django (current version is Django 2.1).

$ (doublebagger) conda install Django==1.9.6 psycopg2

After verifying you have these packages along with their dependencies, create a source directory that will hold all of the source code for “DoubleBagger.”
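One way to do that verification from inside the activated environment is a short Python check like the sketch below. The helper name installed_version is my own; it relies only on the __version__ attribute that both Django and psycopg2 expose.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string.

    Returns None if the module cannot be imported, and "unknown" if it
    imports fine but does not define __version__.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


# None here means the package is not installed in the active environment.
for name in ("django", "psycopg2"):
    print(name, installed_version(name))
```

Save it as, say, check_env.py and run it with python check_env.py while (doublebagger) is active.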

$ (doublebagger) mkdir src && cd src

We start every Django project in pretty much the same way with the same command:

# Inside of src:
# don't forget the space and period at the end
$ (doublebagger) django-admin startproject doublebagger_blog .

The django-admin startproject command is what creates the skeleton or framework for our project. If you now check what’s inside of the src folder, you should see:

doublebagger_blog: contains the configuration for our project, including the settings.py file.

manage.py: command-line utility functions for managing the project
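If you would rather check the skeleton programmatically, the sketch below lists any files that are missing. It assumes the standard layout that startproject generates for this version of Django; the helper name missing_project_files is hypothetical.

```python
import os

# Files that `django-admin startproject doublebagger_blog .` is expected
# to generate (assumption: the standard layout for this Django version).
EXPECTED_FILES = [
    "manage.py",
    os.path.join("doublebagger_blog", "__init__.py"),
    os.path.join("doublebagger_blog", "settings.py"),
    os.path.join("doublebagger_blog", "urls.py"),
    os.path.join("doublebagger_blog", "wsgi.py"),
]


def missing_project_files(src_dir="."):
    """Return the expected files that are not present under src_dir."""
    return [
        path for path in EXPECTED_FILES
        if not os.path.isfile(os.path.join(src_dir, path))
    ]


if __name__ == "__main__":
    missing = missing_project_files(".")
    if missing:
        print("Missing: {}".format(missing))
    else:
        print("All expected files are present")
```

Run it from inside src; an empty result means the project skeleton was created where you expected it.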

Now we can open up our DoubleBagger project inside of Sublime Text or any other editor of your choice. You should see the exact same directory structure:

Assuming you already have PostgreSQL installed on your machine, we now need to create a Postgres database for our Django app:

# from the command line:
$ psql -d postgres
postgres=# CREATE DATABASE doublebagger;
# That's it!
# quit by:
postgres=# \q

*If you don’t have PostgreSQL you can follow these instructions.

Then inside of settings.py (using Sublime Text), we change the default configuration to account for the database we just created. Change this:

# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}

To this:

# Your 'USER' is probably different depending on how you set up
# postgres. In many cases, it's just 'postgres'.
# It also depends on whether you set up a password with your postgres.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'doublebagger',
        'USER': 'WhoeverOwnsTheDatabase',
        'PASSWORD': '',
        'HOST': '127.0.0.1',
        'PORT': '5432',
    }
}

*Make sure to save your changes within the text-editor
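If you do set a password, you may not want it sitting in settings.py in plain text. A common alternative, sketched below, is to read the credentials from environment variables. The variable names DOUBLEBAGGER_DB_USER and DOUBLEBAGGER_DB_PASSWORD are hypothetical, and the fallback values are only assumptions about a typical local setup.

```python
import os

# Sketch of the same DATABASES block for settings.py, with credentials
# read from environment variables instead of being hard-coded.
# The DOUBLEBAGGER_DB_* variable names are made up for this example.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'doublebagger',
        # Falls back to 'postgres' if the variable is not set.
        'USER': os.environ.get('DOUBLEBAGGER_DB_USER', 'postgres'),
        # Falls back to an empty password if the variable is not set.
        'PASSWORD': os.environ.get('DOUBLEBAGGER_DB_PASSWORD', ''),
        'HOST': '127.0.0.1',
        'PORT': '5432',
    }
}
```

You would then set the variables in the shell before running any manage.py command.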

Now if you go back to the command line, we can connect the app to the Postgres database by running Django’s initial migrations, which create its built-in tables in the new database:

# Still inside of your src where manage.py lives:
$ (doublebagger) python manage.py migrate
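If the migrate command fails with a connection error instead, a quick first check is whether anything is listening on the host and port from settings.py. The sketch below uses only the standard library; the helper name port_is_open is my own, and a True result only tells you that something accepts connections on that port, not that your user name or password is right.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
    conn.close()
    return True


# Same HOST and PORT as in the DATABASES setting.
print("Postgres reachable:", port_is_open("127.0.0.1", 5432))
```

If this prints False, the Postgres server is probably not running, or it is listening on a different port.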

If everything went okay, you should see something like this:

Now from the same command line:

$ (doublebagger) python manage.py runserver

And point your browser to:

127.0.0.1:8000

You should see something like this: