16 min read

Python 3 Web Development Beginner’s Guide

Nowadays, a wiki is a well-known tool to enable people to maintain a body of knowledge in a cooperative way. Wikipedia (http://wikipedia.org) might be the most famous example of a wiki today, but countless numbers of forums use some sort of wiki and many tools and libraries exist to implement a wiki application.

In this article, we will develop a wiki of our own, and in doing so, we will focus on two important concepts in building web applications. The first one is the design of the data layer. The second one is input validation. A wiki is normally a very public application that might not even employ a basic authentication scheme to identify users. This makes contributing to a wiki very simple, yet also makes a wiki vulnerable in the sense that anyone can put anything on a wiki page. It’s therefore a good idea to verify the content of any submitted change. You may, for example, strip out any HTML markup or disallow external links.

Enhancing user interactions in a meaningful way is often closely related with input validation. Client-side input validation helps prevent the user from entering unwanted input and is therefore a valuable addition to any application but is not a substitute for server-side input validation as we cannot trust the outside world not to try and access our server in unintended ways.

The data layer

A wiki consists of quite a number of distinct entities we can indentify. We will implement these entities and the relations that exist between them by reusing the Entity/Relation framework developed earlier.

Time for action – designing the wiki data model

As with any application, when we start developing our wiki application we must first take a few steps to create a data model that can act as a starting point for the development:

Identify each entity that plays a role in the application. This might depend on the requirements. For example, because we want the user to be able to change the title of a topic and we want to archive revisions of the content, we define separate Topic and Page entities. Identify direct relations between entities. Our decision to define separate Topic and Page entities imply a relation between them, but there are more relations that can be identified, for example, between Topic and Tag. Do not specify indirect relations: All topics marked with the same tag are in a sense related, but in general, it is not necessary to record these indirect relations as they can easily be inferred from the recorded relation between topics and tags.

The image shows the different entities and relations we can identify in our wiki application.

In the diagram, we have illustrated the fact that a Topic may have more than one Page while a Page refers to a single User in a rather informal way by representing Page as a stack of rectangles and User as a single rectangle. In this manner, we can grasp the most relevant aspects of the relations at a glance. When we want to show more relations or relations with different characteristics, it might be a good idea to use more formal methods and tools. A good starting point is the Wikipedia entry on UML: http://en.wikipedia.org/wiki/Unified_Modelling_Language.

What just happened?

With the entities and relations in our data model identified, we can have a look at their specific qualities.

The basic entity in a wiki is a Topic. A topic, in this context, is basically a title that describes what this topic is about. A topic has any number of associated Pages. Each instance of a Page represents a revision; the most recent revision is the current version of a topic. Each time a topic is edited, a new revision is stored in the database. This way, we can simply revert to an earlier version if we made a mistake or compare the contents of two revisions. To simplify identifying revisions, each revision has a modification date. We also maintain a relation between the Page and the User that modified that Page.

In the wiki application that we will develop, it is also possible to associate any number of tags with a topic. A Tag entity consists simply of a tag attribute. The important part is the relation that exists between the Topic entity and the Tag entity.

Like a Tag, a Word entity consists of a single attribute. Again, the important bit is the relation, this time, between a Topic and any number of Words. We will maintain this relation to reflect the words used in the current versions (that is, the last revision of a Page) of a Topic. This will allow for fairly responsive full text search facilities.

The final entity we encounter is the Image entity. We will use this to store images alongside the pages with text. We do not define any relation between topics and images. Images might be referred to in the text of the topic, but besides this textual reference, we do not maintain a formal relation. If we would like to maintain such a relation, we would be forced to scan for image references each time a new revision of a page was stored, and probably we would need to signal something if a reference attempt was made to a non-existing image. In this case, we choose to ignore this: references to images that do not exist in the database will simply show nothing:

Chapter6/wikidb.py

from entity import Entity

from relation import Relation

class User(Entity): pass

class Topic(Entity): pass

class Page(Entity): pass

class Tag(Entity): pass

class Word(Entity): pass

class Image(Entity): pass

class UserPage(Relation): pass

class TopicPage(Relation): pass

class TopicTag(Relation): pass

class ImagePage(Relation): pass

class TopicWord(Relation): pass

def threadinit(db):

User.threadinit(db)

Topic.threadinit(db)

Page.threadinit(db)

Tag.threadinit(db)

Word.threadinit(db)

Image.threadinit(db)

UserPage.threadinit(db)

TopicPage.threadinit(db)

TopicTag.threadinit(db)

ImagePage.threadinit(db)

TopicWord.threadinit(db)

def inittable():

User.inittable(userid=”unique not null”)

Topic.inittable(title=”unique not null”)

Page.inittable(content=””,

modified=”not null default CURRENT_TIMESTAMP”)

Tag.inittable(tag=”unique not null”)

Word.inittable(word=”unique not null”)

Image.inittable(type=””,data=”blob”,title=””,

modified=”not null default CURRENT_TIMESTAMP”,

description=””)

UserPage.inittable(User,Page)

TopicPage.inittable(Topic,Page)

TopicTag.inittable(Topic,Tag)

TopicWord.inittable(Topic,Word)

Because we can reuse the entity and relation modules we developed earlier, the actual implementation of the database layer is straightforward (full code is available as wikidb.py). After importing both modules, we first define a subclass of Entity for each entity we identified in our data model. All these classes are used as is, so they have only a pass statement as their body.

Likewise, we define a subclass of Relation for each relation we need to implement in our wiki application.

All these Entity and Relation subclasses still need the initialization code to be called once each time the application starts and that is where the convenience function initdb() comes in. It bundles the initialization code for each entity and relation (highlighted).

Many entities we define here are simple but a few warrant a closer inspection. The Page entity contains a modified column that has a non null constraint. It also has a default: CURRENT_TIMESTAMP (highlighted). This default is SQLite specific (other database engines will have other ways of specifying such a default) and will initialize the modified column to the current date and time if we create a new Page record without explicitly setting a value.

The Image entity also has a definition that is a little bit different: its data column is explicitly defined to have a blob affinity. This will enable us to store binary data without any problem in this table, something we need to store and retrieve the binary data contained in an image. Of course, SQLite will happily store anything we pass it in this column, but if we pass it an array of bytes (not a string that is), that array is stored as is.

The delivery layer

With the foundation, that is, the data layer in place, we build on it when we develop the delivery layer. Between the delivery layer and the database layer, there is an additional layer that encapsulates the domain-specific knowledge (that is, it knows how to verify that the title of a new Topic entity conforms to the requirements we set for it before it stores it in the database):

Each different layer in our application is implemented in its own file or files. It is easy to get confused, so before we delve further into these files, have a look at the following table. It lists the different files that together make up the wiki application and refers to the names of the layers.

We’ll focus on the main CherryPy application first to get a feel for the behavior of the application.

Time for action – implementing the opening screen

The opening screen of the wiki application shows a list of all defined topics on the right and several ways to locate topics on the left. Note that it still looks quite rough because, at this point, we haven’t applied any style sheets:

Let us first take a few steps to identify the underlying structure. This structure is what we would like to represent in the HTML markup:

Identify related pieces of information that are grouped together. These form the backbone of a structured web page. In this case, the search features on the left form a group of elements distinct from the list of topics on the right.

Identify distinct pieces of functionality within these larger groups. For example, the elements (input field and search button) that together make up the word search are such a piece of functionality, as are the tag search and the tag cloud.

Try to identify any hidden functionality, that is, necessary pieces of information that will have to be part of the HTML markup, but are not directly visible on a page. In our case, we have links to the jQuery and JQuery UI JavaScript libraries and links to CSS style sheets.

Identifying these distinct pieces will not only help to put together HTML markup that reflects the structure of a page, but also help to identify necessary functionality in the delivery layer because each of these functional pieces is concerned with specific information processed and produced by the server.

What just happened?

Let us look in somewhat more detail at the structure of the opening page that we identified.

Most notable are three search input fields to locate topics based on words occurring in their bodies, based on their actual title or based on tags associated with a topic. These search fields feature auto complete functionality that allows for comma-separated lists. In the same column, there is also room for a tag cloud, an alphabetical list of tags with font sizes dependent on the number of topics marked with that tag.

The structural components

The HTML markup for this opening page is shown next. It is available as the file basepage.html and the contents of this file are served by several methods in the Wiki class implementing the delivery layer, each with a suitable content segment. Also, some of the content will be filled in by AJAX calls, as we will see in a moment:

Chapter6/basepage.html





Wiki

href=”http://ajax.googleapis.com/ajax/libs/jqueryui/1.8.3/themes/smoothness/jquery-ui.css”type=”text/css” media=”all” />type=”text/css” media=”all” />





Wiki Home





Search topic





Search







Search word





Search







Search tag







Tag cloud





%s

The element contains both links to CSS style sheets and elements that refer to the jQuery libraries. This time, we choose again to retrieve these libraries from a public content delivery network.