A SQLite Tutorial with Python

Introduction

This tutorial will cover using SQLite in combination with Python's sqlite3 interface. SQLite is a single file relational database bundled with most standard Python installs. SQLite is often the technology of choice for small applications, particularly those of embedded systems and devices like phones and tablets, smart appliances, and instruments. However, it is not uncommon to hear it being used for small to medium web and desktop applications.

Creating a Database and Making a Connection

Creating a new SQLite database is as simple as creating a connection using the sqlite3 module in the Python standard library. To establish a connection all you need to do is pass a file path to the connect(...) method in the sqlite3 module, and if the database represented by the file does not exists one will be created at that path.

import sqlite3 con = sqlite3.connect('/path/to/file/db.sqlite3')

You will find that in everyday database programming you will be constantly creating connections to your database, so it is a good idea to wrap this simple connection statement into a reusable generalized function.

# db_utils.py import os import sqlite3 # create a default path to connect to and create (if necessary) a database # called 'database.sqlite3' in the same directory as this script DEFAULT_PATH = os.path.join(os.path.dirname(__file__), 'database.sqlite3') def db_connect(db_path=DEFAULT_PATH): con = sqlite3.connect(db_path) return con

Creating Tables

In order to create database tables you need to have an idea of the structure of the data you are interested in storing. There are many design considerations that go into defining the tables of a relational database, which entire books have been written about. I will not be going into the details of this practice and will instead leave it up to reader to further investigate.

However, to aid in our discussion of SQLite database programming with Python I will be working off the premise that a database needs to be created for a fictitious book store that has the below data already collected on book sales.

customer date product price Alan Turing 2/22/1944 Introduction to Combinatorics 7.99 Donald Knuth 7/3/1967 A Guide to Writing Short Stories 17.99 Donald Knuth 7/3/1967 Data Structures and Algorithms 11.99 Edgar Codd 1/12/1969 Advanced Set Theory 16.99

Upon inspecting this data it is evident that it contains information about customers, products, and orders. A common pattern in database design for transactional systems of this type are to break the orders into two additional tables, orders and line items (sometimes referred to as order details) to achieve greater normalization.

In a Python interpreter, in the same directory as the db_utils.py module defined previously, enter the SQL for creating the customers and products tables follows:

>>> from db_utils import db_connect >>> con = db_connect() # connect to the database >>> cur = con.cursor() # instantiate a cursor obj >>> customers_sql = """ ... CREATE TABLE customers ( ... id integer PRIMARY KEY, ... first_name text NOT NULL, ... last_name text NOT NULL)""" >>> cur.execute(customers_sql) >>> products_sql = """ ... CREATE TABLE products ( ... id integer PRIMARY KEY, ... name text NOT NULL, ... price real NOT NULL)""" >>> cur.execute(products_sql)

The above code creates a connection object then uses it to instantiate a cursor object. The cursor object is used to execute SQL statements on the SQLite database.

With the cursor created I then wrote the SQL to create the customers table, giving it a primary key along with a first and last name text field and assign it to a variable called customers_sql . I then call the execute(...) method of the cursor object passing it the customers_sql variable. I then create a products table in a similar way.

You can query the sqlite_master table, a built-in SQLite metadata table, to verify that the above commands were successful.

To see all the tables in the currently connected database query the name column of the sqlite_master table where the type is equal to "table".

>>> cur.execute("SELECT name FROM sqlite_master WHERE type='table'") <sqlite3.Cursor object at 0x104ff7ce0> >>> print(cur.fetchall()) [('customers',), ('products',)]

To get a look at the schema of the tables query the sql column of the same table where the type is still "table" and the name is equal to "customers" and/or "products".

>>> cur.execute("""SELECT sql FROM sqlite_master WHERE type='table' … AND name='customers'""") <sqlite3.Cursor object at 0x104ff7ce0> >>> print(cur.fetchone()[0]) CREATE TABLE customers ( id integer PRIMARY KEY, first_name text NOT NULL, last_name text NOT NULL)

The next table to define will be the orders table which associates customers to orders via a foreign key and the date of their purchase. Since SQLite does not support an actual date/time data type (or data class to be consistent with the SQLite vernacular) all dates will be represented as text values.

>>> orders_sql = """ ... CREATE TABLE orders ( ... id integer PRIMARY KEY, ... date text NOT NULL, ... customer_id integer, ... FOREIGN KEY (customer_id) REFERENCES customers (id))""" >>> cur.execute(orders_sql)

The final table to define will be the line items table which gives a detailed accounting of the products in each order.

lineitems_sql = """ ... CREATE TABLE lineitems ( ... id integer PRIMARY KEY, ... quantity integer NOT NULL, ... total real NOT NULL, ... product_id integer, ... order_id integer, ... FOREIGN KEY (product_id) REFERENCES products (id), ... FOREIGN KEY (order_id) REFERENCES orders (id))""" >>> cur.execute(lineitems_sql)

Loading the Data

In this section I will be demonstrating how to INSERT our sample data into the tables just created. A natural starting place would be to populate the products table first because without products we cannot have a sale and thus would not have the foreign keys to relate to the line items and orders. Looking at the sample data I see that there are four products:

Introduction to Combinatorics ($7.99)

A Guide to Writing Short Stories ($17.99)

Data Structures and Algorithms ($11.99)

Advanced Set Theory ($16.99)

The workflow for executing INSERT statements is simply:

Connect to the database Create a cursor object Write a parameterized insert SQL statement and store as a variable Call the execute method on the cursor object passing it the sql variable and the values, as a tuple, to be inserted into the table

Given this general outline let us write some more code.

>>> con = db_connect() >>> cur = con.cursor() >>> product_sql = "INSERT INTO products (name, price) VALUES (?, ?)" >>> cur.execute(product_sql, ('Introduction to Combinatorics', 7.99)) >>> cur.execute(product_sql, ('A Guide to Writing Short Stories', 17.99)) >>> cur.execute(product_sql, ('Data Structures and Algorithms', 11.99)) >>> cur.execute(product_sql, ('Advanced Set Theory', 16.99))

The above code probably seems pretty obvious, but let me discuss it a bit as there are some important things going on here. The insert statement follows the standard SQL syntax except for the ? bit. The ? 's are actually placeholders in what is known as a "parameterized query".

Parameterized queries are an important feature of essentially all database interfaces to modern high level programming languages such as the sqlite3 module in Python. This type of query serves to improve the efficiency of queries that are repeated several times. Perhaps more important, they also sanitize inputs that take the place of the ? placeholders which are passed in during the call to the execute method of the cursor object to prevent nefarious inputs leading to SQL injection. The following is a comic from the popular xkcd.com blog describing the dangers of SQL injection.

To populate the remaining tables we are going to follow a slightly different pattern to change things up a bit. The workflow for each order, identified by a combination of customer first and last name and the purchase date, will be:

Insert the new customer into the customers table and retrieve its primary key id Create an order entry based off the customer id and the purchase date then retrieve its primary key id For each product in the order determine its primary key id and create a line item entry associating the order and the product

To make things simpler on ourselves let us do a quick look up of all our products. For now do not worry too much about the mechanics of the SELECT SQL statement as we will devote a section to it shortly.

>>> cur.execute("SELECT id, name, price FROM products") >>> formatted_result = [f"{id:<5}{name:<35}{price:>5}" for id, name, price in cur.fetchall()] >>> id, product, price = "Id", "Product", "Price" >>> print('

'.join([f"{id:<5}{product:<35}{price:>5}"] + formatted_result)) Id Product Price 1 Introduction to Combinatorics 7.99 2 A Guide to Writing Short Stories 17.99 3 Data Structures and Algorithms 11.99 4 Advanced Set Theory 16.99

The first order was placed on Feb 22, 1944 by Alan Turing who purchased Introduction to Combinatorics for $7.99.

Start by making a new customer record for Mr. Turing then determine his primary key id by accessing the lastrowid field of the cursor object.

>>> customer_sql = "INSERT INTO customers (first_name, last_name) VALUES (?, ?)" >>> cur.execute(customer_sql, ('Alan', 'Turing')) >>> customer_id = cur.lastrowid >>> print(customer_id) 1

We can now create an order entry, collect the new order id value and associate it to a line item entry along with the product Mr. Turing ordered.

>>> order_sql = "INSERT INTO orders (date, customer_id) VALUES (?, ?)" >>> date = "1944-02-22" # ISO formatted date >>> cur.execute(order_sql, (date, customer_id)) >>> order_id = cur.lastrowid >>> print(order_id) 1 >>> li_sql = """INSERT INTO lineitems ... (order_id, product_id, quantity, total) ... VALUES (?, ?, ?, ?)""" >>> product_id = 1 >>> cur.execute(li_sql, (order_id, 1, 1, 7.99))

The remaining records are loaded exactly the same except for the order made to Donald Knuth, which will receive two line item entries. However, the repetitive nature of such a task is crying out the need to wrap these functionalities into reusable functions. In the db_utils.py module add the following code:

def create_customer(con, first_name, last_name): sql = """ INSERT INTO customers (first_name, last_name) VALUES (?, ?)""" cur = con.cursor() cur.execute(sql, (first_name, last_name)) return cur.lastrowid def create_order(con, customer_id, date): sql = """ INSERT INTO orders (customer_id, date) VALUES (?, ?)""" cur = con.cursor() cur.execute(sql, (customer_id, date)) return cur.lastrowid def create_lineitem(con, order_id, product_id, qty, total): sql = """ INSERT INTO lineitems (order_id, product_id, quantity, total) VALUES (?, ?, ?, ?)""" cur = con.cursor() cur.execute(sql, (order_id, product_id, qty, total)) return cur.lastrowid

Awh, now we can work with some efficiency!

You will need to exit() your Python interpreter and reload it to get your new functions to become accessible in the interpreter.

>>> from db_utils import db_connect, create_customer, create_order, create_lineitem >>> con = db_connect() >>> knuth_id = create_customer(con, 'Donald', 'Knuth') >>> knuth_order = create_order(con, knuth_id, '1967-07-03') >>> knuth_li1 = create_lineitem(con, knuth_order, 2, 1, 17.99) >>> knuth_li2 = create_lineitem(con, knuth_order, 3, 1, 11.99) >>> codd_id = create_customer(con, 'Edgar', 'Codd') >>> codd_order = create_order(con, codd_id, '1969-01-12') >>> codd_li = create_lineitem(con, codd_order, 4, 1, 16.99)

I feel compelled to give one additional piece of advice as a student of software craftsmanship. When you find yourself doing multiple database manipulations (INSERTs in this case) in order to accomplish what is actually one cumulative task (ie, creating an order) it is best to wrap the subtasks (creating customer, order, then line items) into a single database transaction so you can either commit on success or rollback if an error occurs along the way.

This would look something like this:

try: codd_id = create_customer(con, 'Edgar', 'Codd') codd_order = create_order(con, codd_id, '1969-01-12') codd_li = create_lineitem(con, codd_order, 4, 1, 16.99) # commit the statements con.commit() except: # rollback all database actions since last commit con.rollback() raise RuntimeError("Uh oh, an error occurred ...")

I want to finish this section with a quick demonstration of how to UPDATE an existing record in the database. Let's update the Guide to Writing Short Stories' price to 10.99 (going on sale).

>>> update_sql = "UPDATE products SET price = ? WHERE id = ?" >>> cur.execute(update_sql, (10.99, 2))

Querying the Database

Generally the most common action performed on a database is a retrieval of some of the data stored in it via a SELECT statement. For this section I will be demonstrating how to use the sqlite3 interface to perform simple SELECT queries.

To perform a basic multirow query of the customers table you pass a SELECT statement to the execute(...) method of the cursor object. After this you can iterate over the results of the query by calling the fetchall() method of the same cursor object.

>>> cur.execute("SELECT id, first_name, last_name FROM customers") >>> results = cur.fetchall() >>> for row in results: ... print(row) (1, 'Alan', 'Turing') (2, 'Donald', 'Knuth') (3, 'Edgar', 'Codd')

Lets say you would like to instead just retrieve one record from the database. You can do this by writing a more specific query, say for Donald Knuth's id of 2, and following that up by calling fetchone() method of the cursor object.

>>> cur.execute("SELECT id, first_name, last_name FROM customers WHERE id = 2") >>> result = cur.fetchone() >>> print(result) (2, 'Donald', 'Knuth')

See how the individual row of each result is in the form of a tuple? Well while tuples are a very useful Pythonic data structure for some programming use cases many people find them a bit hindering when it comes to the task of data retrieval. It just so happens that there is a way to represent the data in a way that is perhaps more flexible to some. All you need to do is set the row_factory method of the connection object to something more suitable such as sqlite3.Row . This will give you the ability to access the individual items of a row by position or keyword value.

>>> import sqlite3 >>> con.row_factory = sqlite3.Row >>> cur = con.cursor() >>> cur.execute("SELECT id, first_name, last_name FROM customers WHERE id = 2") >>> result = cur.fetchone() >>> id, first_name, last_name = result['id'], result['first_name'], result['last_name'] >>> print(f"Customer: {first_name} {last_name}'s id is {id}") Customer: Donald Knuth's id is 2

Conclusion

In this article I gave a brief demonstration of what I feel are the most important features and functionalities of the sqlite3 Python interface to the lightweight single file SQLite database that comes pre-bundled with most Python installs. I also tried to give a few bits of advices regarding best practices when it comes to database programming, but I do caution the new-comer that the intricacies of database programming is generally one of the most prone to security holes at the enterprise level and further knowledge is necessary before such an undertaking.

As always I thank you for reading and welcome comments and criticisms below.