The database abstraction layer

Dependencies

web2py comes with a Database Abstraction Layer (DAL), an API that maps Python objects into database objects such as queries, tables, and records. The DAL dynamically generates the SQL in real time using the specified dialect for the database back end, so that you do not have to write SQL code or learn different SQL dialects (the term SQL is used generically), and the application will be portable among different types of databases. A partial list of supported databases is shown in the table below. Please check on the web2py web site and mailing list for more recent adapters. Google NoSQL is treated as a particular case in Chapter 13.

The Gotchas section at the end of this chapter has some more information about specific databases.

The Windows binary distribution works out of the box with SQLite, MSSQL, PostgreSQL and MySQL. The Mac binary distribution works out of the box with SQLite. To use any other database back-end, run from the source distribution and install the appropriate driver for the required back end.

Once the proper driver is installed, start web2py from source, and it will find the driver. Here is a list of the drivers web2py can use:

Supported databases include SQLite, MySQL, PostgreSQL, Oracle, MSSQL, FireBird, DB2, Informix, Sybase, Teradata, MongoDB, CouchDB, SAPDB, and Cubrid.

database drivers (source):

SQLite      sqlite3 or pysqlite2 or zxJDBC [zxjdbc] (on Jython)
PostgreSQL  psycopg2 [psycopg2] or zxJDBC [zxjdbc] (on Jython)
MySQL       pymysql [pymysql] or MySQLdb [mysqldb]
Oracle      cx_Oracle [cxoracle]
MSSQL       pyodbc [pyodbc] or pypyodbc [pypyodbc]
FireBird    kinterbasdb [kinterbasdb] or fdb or pyodbc
DB2         pyodbc [pyodbc]
Informix    informixdb [informixdb]
Ingres      ingresdbi [ingresdbi]
Cubrid      cubriddb [cubridb]
Sybase      Sybase [Sybase]
Teradata    pyodbc [Teradata]
SAPDB       sapdb [SAPDB]
MongoDB     pymongo [pymongo]
IMAP        imaplib [IMAP]

sqlite3, pymysql, and imaplib ship with web2py. Support for MongoDB is experimental. The IMAP option allows you to use the DAL to access IMAP mailboxes.

The DAL: A quick tour

web2py defines the following classes that make up the DAL:

The DAL object represents a database connection. For example:

db = DAL('sqlite://storage.sqlite')

Table represents a database table. You do not directly instantiate Table; instead, DAL.define_table instantiates it.

db.define_table('mytable', Field('myfield'))

The most important methods of a Table are insert, truncate, drop, and import_from_csv_file.

Field represents a database field. It can be instantiated and passed as an argument to DAL.define_table .

Rows is the object returned by a database select. It can be thought of as a list of Row rows:

rows = db(db.mytable.myfield != None).select()

Row contains field values:

for row in rows:
    print(row.myfield)

Query is an object that represents a SQL "where" clause:

myquery = (db.mytable.myfield != None) | (db.mytable.myfield > 'A')

Set is an object that represents a set of records. Its most important methods are count , select , update , and delete . For example:

myset = db(myquery)
rows = myset.select()
myset.update(myfield='somevalue')
myset.delete()

Expression is something like an orderby or groupby expression. The Field class is derived from Expression. Here is an example:

myorder = db.mytable.myfield.upper() | db.mytable.id
db().select(db.mytable.ALL, orderby=myorder)

Using the DAL "stand-alone"

The DAL can be used in a non-web2py environment via

from pydal import DAL, Field

DAL constructor

Basic use:

>>> db = DAL('sqlite://storage.sqlite')

The database is now connected and the connection is stored in the global variable db .

At any time you can retrieve the connection string.

>>> db._uri
sqlite://storage.sqlite

and the database name

>>> db._dbname
sqlite

The connection string is called a _uri because it is an instance of a Uniform Resource Identifier.

The DAL allows multiple connections with the same database or with different databases, even databases of different types. For now, we will assume the presence of a single database since this is the most common situation.

DAL signature

DAL(uri='sqlite://dummy.db', pool_size=0, folder=None, db_codec='UTF-8', check_reserved=None, migrate=True, fake_migrate=False, migrate_enabled=True, fake_migrate_all=False, decode_credentials=False, driver_args=None, adapter_args=None, attempts=5, auto_import=False, bigint_id=False, debug=False, lazy_tables=False, db_uid=None, do_connect=True, after_connection=None, tables=None, ignore_field_case=True, entity_quoting=False, table_hash=None)

Connection strings (the uri parameter)

A connection with the database is established by creating an instance of the DAL object:

db = DAL('sqlite://storage.sqlite', pool_size=0)

db is not a keyword; it is a local variable that stores the DAL connection object. You are free to give it a different name. The constructor of DAL requires a single argument, the connection string. The connection string is the only web2py code that depends on a specific back-end database. Here are examples of connection strings for specific types of supported back-end databases (in all cases, we assume the database is running from localhost on its default port and is named "test"):

SQLite            sqlite://storage.sqlite
MySQL             mysql://username:password@localhost/test?set_encoding=utf8mb4
PostgreSQL        postgres://username:password@localhost/test
MSSQL (legacy)    mssql://username:password@localhost/test
MSSQL (>=2005)    mssql3://username:password@localhost/test
MSSQL (>=2012)    mssql4://username:password@localhost/test
FireBird          firebird://username:password@localhost/test
Oracle            oracle://username/password@test
DB2               db2://username:password@test
Ingres            ingres://username:password@localhost/test
Sybase            sybase://username:password@localhost/test
Informix          informix://username:password@test
Teradata          teradata://DSN=dsn;UID=user;PWD=pass;DATABASE=test
Cubrid            cubrid://username:password@localhost/test
SAPDB             sapdb://username:password@localhost/test
IMAP              imap://user:password@server:port
MongoDB           mongodb://username:password@localhost/test
Google/SQL        google:sql://project:instance/database
Google/NoSQL      google:datastore
Google/NoSQL/NDB  google:datastore+ndb

Notice that in SQLite the database consists of a single file. If it does not exist, it is created. This file is locked every time it is accessed. In the case of MySQL, PostgreSQL, MSSQL, FireBird, Oracle, DB2, Ingres and Informix the database "test" must be created outside web2py. Once the connection is established, web2py will create, alter, and drop tables appropriately.

In the MySQL connection string, the ?set_encoding=utf8mb4 at the end sets the encoding to UTF-8 and avoids an Invalid utf8 character string: error on Unicode characters that consist of four bytes, as by default, MySQL can only handle Unicode characters that consist of one to three bytes. [mathiasbyensbe]

In the Google/NoSQL case the +ndb option turns on NDB. NDB uses a Memcache buffer to read data that is accessed often. This is completely automatic and done at the datastore level, not at the web2py level.

It is also possible to set the connection string to None . In this case DAL will not connect to any back-end database, but the API can still be accessed for testing.

Sometimes you may need to generate SQL as if you had a connection but without actually connecting to the database. This can be done with

db = DAL('...', do_connect=False)

In this case you will be able to call _select , _insert , _update , and _delete to generate SQL but not call select , insert , update , and delete . In most of the cases you can use do_connect=False even without having the required database drivers.

Notice that by default web2py uses utf8 character encoding for databases. If you work with existing databases that behave differently, you have to change it with the optional parameter db_codec like

db = DAL('...', db_codec='latin1')

Otherwise you'll get UnicodeDecodeError tickets.

Connection pooling

A common argument of the DAL constructor is the pool_size ; it defaults to zero.

As it is rather slow to establish a new database connection for each request, web2py implements a mechanism for connection pooling. Once a connection is established and the page has been served and the transaction completed, the connection is not closed but goes into a pool. When the next HTTP request arrives, web2py tries to recycle a connection from the pool and use that for the new transaction. If there are no available connections in the pool, a new connection is established.

When web2py starts, the pool is always empty. The pool grows up to the minimum between the value of pool_size and the max number of concurrent requests. This means that if pool_size=10 but our server never receives more than 5 concurrent requests, then the actual pool size will only grow to 5. If pool_size=0 then connection pooling is not used.

Connections in the pools are shared sequentially among threads, in the sense that they may be used by two different but not simultaneous threads. There is only one pool for each web2py process.

The pool_size parameter is ignored by SQLite and Google App Engine. Connection pooling is ignored for SQLite, since it would not yield any benefit.

Connection failures (attempts parameter)

If web2py fails to connect to the database it waits 1 second and by default tries again up to 5 times before declaring a failure. In case of connection pooling it is possible that a pooled connection that stays open but unused for some time is closed by the database end. Thanks to the retry feature web2py tries to re-establish these dropped connections. The number of attempts is set via the attempts parameter.

Lazy Tables

Setting lazy_tables=True provides a major performance boost. See the Lazy Tables section below.

Model-less applications

Using web2py's model directory for your application models is very convenient and productive. With lazy tables and conditional models, performance is usually acceptable even for large applications. Many experienced developers use this in production environments.

However, it is possible to define DAL tables on demand inside controller functions or modules. This may make sense when the number or complexity of table definitions overloads the use of lazy tables and conditional models.

This is referred to as "model-less" development by the web2py community. It means less use of the automatic execution of Python files in the model directory. It does not imply abandoning the concept of models, views and controllers.

Web2py's auto-execution of Python code inside the model directory does this for you:

models are run automatically every time a request is processed models access web2py's global scope.

Models also make for useful interactive shell sessions when web2py is started with the -M commandline option.

Also, remember maintainability: other web2py developers expect to find model definitions in the model directory.

To use the "model-less" approach, you take responsibility for doing these two housekeeping tasks. You call the table definitions when you need them, and provide necessary access to global scope via the current object (as described in Chapter 4).

For example, a typical model-less application may leave the definitions of the database connection objects in the model file, but define the tables on demand per controller function.

The typical case is to move the table definitions to a module file (a Python file saved in the modules directory).

If the function to define a set of tables is called define_employee_tables() in a module called "table_setup.py", your controller that wants to refer to the tables related to employee records in order to make an SQLFORM needs to call the define_employee_tables() function before accessing any tables. The define_employee_tables() function needs to access the database connection object in order to define tables. This is why you need to correctly use the current object in the module file containing define_employee_tables() (as mentioned above).

Replicated databases

The first argument of DAL(...) can be a list of URIs. In this case web2py tries to connect to each of them. The main purpose for this is to deal with multiple database servers and distribute the workload among them. Here is a typical use case:

db = DAL(['mysql://...1', 'mysql://...2', 'mysql://...3'])

In this case the DAL tries to connect to the first and, on failure, it will try the second and the third. This can also be used to distribute load in a database master-slave configuration. We will talk more about this in Chapter 13 in the context of scalability.

Reserved keywords

check_reserved tells the constructor to check table names and column names against reserved SQL keywords in target back-end databases. check_reserved defaults to None.

This is a list of strings that contain the database back-end adapter names.

The adapter name is the same as used in the DAL connection string. So if you want to check against PostgreSQL and MSSQL then your connection string would look as follows:

db = DAL('sqlite://storage.sqlite', check_reserved=['postgres', 'mssql'])

The DAL will scan the keywords in the same order as the list.

There are two extra options "all" and "common". If you specify all, it will check against all known SQL keywords. If you specify common, it will only check against common SQL keywords such as SELECT , INSERT , UPDATE , etc.

For supported back-ends you may also specify if you would like to check against the non-reserved SQL keywords as well. In this case you would append _nonreserved to the name. For example:

check_reserved = ['postgres', 'postgres_nonreserved']

The following database backends support reserved words checking.

PostgreSQL  postgres(_nonreserved)
MySQL       mysql
FireBird    firebird(_nonreserved)
MSSQL       mssql
Oracle      oracle

Database quoting and case settings

Quoting of SQL entities is enabled by default in DAL, that is:

entity_quoting = True

This way identifiers are automatically quoted in SQL generated by DAL. At SQL level keywords and unquoted identifiers are case insensitive, thus quoting an SQL identifier makes it case sensitive.

Notice that unquoted identifiers should always be folded to lower case by the back-end engine according to SQL standard but not all engines are compliant with this (for example PostgreSQL default folding is upper case).

By default DAL ignores field case too; to change this use:

ignore_field_case = False

To be sure of using the same names in python and in the DB schema, you must arrange for both settings above. Here is an example:

db = DAL(ignore_field_case=False)
db.define_table('table1', Field('column'), Field('COLUMN'))
query = db.table1.COLUMN != db.table1.column

Making a secure connection

Sometimes it is necessary (and advised) to connect to your database using a secure connection, especially if your database is not on the same server as your application. In this case you need to pass additional parameters to the database driver. You should refer to the database driver documentation for details.

For PostgreSQL with psycopg2 it should look like this:

DAL('postgres://user_name:user_password@server_addr/db_name',
    driver_args={'sslmode': 'require',
                 'sslrootcert': 'root.crt',
                 'sslcert': 'postgresql.crt',
                 'sslkey': 'postgresql.key'})

where parameters sslrootcert , sslcert and sslkey should contain the full path to the files. You should refer to PostgreSQL documentation on how to configure PostgreSQL server to accept secure connections.

Other DAL constructor parameters

Database folder location

folder sets the place where migration files will be created (see the Migrations section in this chapter for details). It is also used for SQLite databases. It is set automatically within web2py; set a path when using the DAL outside web2py.

Default migration settings

The DAL constructor migration settings are booleans affecting defaults and global behaviour.

migrate = True sets the default migrate behavior for all tables

fake_migrate = False sets the default fake_migrate behavior for all tables

migrate_enabled = True if set to False, disables ALL migrations

fake_migrate_all = False if set to True, fake migrates ALL tables

Experiment with the web2py shell

You can experiment with the DAL API using the web2py shell, that is available using the -S command line option (read more in Chapter 4).

You need to choose an application to run the shell on, and bear in mind that database changes may be persistent. So be careful, and do not hesitate to create a new application for testing instead of tampering with an existing one.

Start by creating a connection. For the sake of example, you can use SQLite. Nothing in this discussion changes when you change the back-end engine.

Note that most of the code snippets that contain the python prompt >>> are directly executable via a plain shell, which you can obtain using -PS command line options.

Table constructor

Tables are defined in the DAL via define_table .

define_table signature

The signature for define_table method is:

define_table(tablename, *fields, **kwargs)

It accepts a mandatory table name and an optional number of Field instances (even none). You can also pass a Table (or subclass) object instead of a Field one; this clones and adds all its fields (except "id") to the table being defined. Other optional keyword args are: rname, redefine, common_filter, fake_migrate, fields, format, migrate, on_define, plural, polymodel, primarykey, sequence_name, singular, table_class, and trigger_name, which are discussed below.

For example:

>>> db.define_table('person', Field('name'))
<Table person (id, name)>

It defines, stores and returns a Table object called "person" containing a field (column) "name". This object can also be accessed via db.person , so you do not need to catch the value returned by define_table.

id : Notes about the primary key

Do not declare a field called "id", because one is created by web2py anyway. Every table has a field called "id" by default. It is an auto-increment integer field (usually starting at 1) used for cross-reference and for making every record unique, so "id" is a primary key. (Note: the id counter starting at 1 is back-end specific. For example, this does not apply to the Google App Engine NoSQL.)

Optionally you can define a field of type='id' and web2py will use this field as the auto-increment id field. This is not recommended except when accessing legacy database tables which have a primary key under a different name. With some limitations, you can also use different primary keys via the primarykey parameter.

plural and singular

Smartgrid objects may need to know the singular and plural name of the table. The defaults are smart but these parameters allow you to be specific. Smartgrid is described in Chapter 7.

redefine

Tables can be defined only once but you can force web2py to redefine an existing table:

db.define_table('person', Field('name'))
db.define_table('person', Field('name'), redefine=True)

The redefinition may trigger a migration if the table definition changes.

format : Record representation

It is optional but recommended to specify a format representation for records with the format parameter.

db.define_table('person', Field('name'), format='%(name)s')

or

db.define_table('person', Field('name'), format='%(name)s %(id)s')

or even more complex ones using a function:

db.define_table('person', Field('name'), format=lambda r: r.name or 'anonymous')

The format attribute will be used for two purposes:

To represent referenced records in select/option drop-downs.

To set the db.othertable.otherfield.represent attribute for all fields referencing this table. This means that SQLTABLE will not show references by id but will use the format preferred representation instead. (Look at Serializing Rows in views section in this chapter to learn more about SQLTABLE.)

rname : Real name

rname sets a database backend name for the table. This makes the web2py table name an alias, and rname is the real name used when constructing the query for the backend. To illustrate just one use, rname can be used to provide MSSQL fully qualified table names accessing tables belonging to other databases on the server:

rname = 'db1.dbo.table1'

primarykey : Support for legacy tables

primarykey helps support legacy tables with existing primary keys, even multi-part. See Legacy databases and keyed tables section in this chapter.

migrate, fake_migrate

migrate sets migration options for the table. Refer to Migrations section in this chapter for details.

table_class

If you define your own Table class as a sub-class of pydal.objects.Table, you can provide it here; this allows you to extend and override methods. Example:

from pydal.objects import Table

class MyTable(Table):
    ...

db.define_table(..., table_class=MyTable)

sequence_name

sequence_name is the name of a custom table sequence (if supported by the database). web2py can create the SEQUENCE (starting at 1 and incrementing by 1), or you can use this parameter for legacy tables with custom sequences.

Note that when necessary, web2py will create sequences automatically by default.

trigger_name

Relates to sequence_name . Relevant for some backends which do not support auto-increment numeric fields.

polymodel

For Google App Engine

on_define

on_define is a callback triggered when a lazy_table is instantiated, although it is called anyway if the table is not lazy. This allows dynamic changes to the table without losing the advantages of delayed instantiation.

Example:

db = DAL(lazy_tables=True)
db.define_table('person',
    Field('name'),
    Field('age', 'integer'),
    on_define=lambda table: [
        table.name.set_attributes(requires=IS_NOT_EMPTY(), default=''),
        table.age.set_attributes(requires=IS_INT_IN_RANGE(0, 120), default=30)])

Note this example shows how to use on_define but it is not actually necessary. The simple requires values could be added to the Field definitions and the table would still be lazy. However, requires which take a Set object as the first argument, such as IS_IN_DB, will make a query like

db.sometable.somefield == some_value

which would cause sometable to be defined early. This is the situation saved by on_define.

Lazy Tables, a major performance boost

web2py models are executed before controllers, so all tables are defined at every request. Not all tables are needed to handle each request, so it is possible that some of the time spent defining tables is wasted. Conditional models (see Chapter 4) can help, but web2py offers a big performance boost via lazy_tables. This feature means that table creation is deferred until the table is actually referenced. Lazy tables are enabled when initialising a database via the DAL constructor, by setting the lazy_tables parameter:

DAL(..., lazy_tables=True)

This is one of the most significant response-time performance boosts in web2py.

Adding attributes to fields and tables

If you need to add custom attributes to fields, you can simply do this:

db.table.field.extra = {}

"extra" is not a keyword ; it's a custom attributes now attached to the field object. You can do it with tables too but they must be preceded by an underscore to avoid naming conflicts with fields:

db.table._extra = {}

Field constructor

These are the default values of a Field constructor:

Field(fieldname, type='string', length=None, default=DEFAULT, required=False, requires=DEFAULT, ondelete='CASCADE', notnull=False, unique=False, uploadfield=True, widget=None, label=None, comment=None, writable=True, readable=True, searchable=True, listable=True, update=None, authorize=None, autodelete=False, represent=None, uploadfolder=None, uploadseparate=None, uploadfs=None, compute=None, filter_in=None, filter_out=None, custom_qualifier=None, map_none=None, rname=None)

where DEFAULT is a special value used to allow the value None for a parameter.

Not all of them are relevant for every field. length is relevant only for fields of type "string". uploadfield , authorize , and autodelete are relevant only for fields of type "upload". ondelete is relevant only for fields of type "reference" and "upload".

length sets the maximum length of a "string", "password" or "upload" field. If length is not specified a default value is used but the default value is not guaranteed to be backward compatible. To avoid unwanted migrations on upgrades, we recommend that you always specify the length for string, password and upload fields.

default sets the default value for the field. The default value is used when performing an insert if a value is not explicitly specified. It is also used to pre-populate forms built from the table using SQLFORM. Note, rather than being a fixed value, the default can instead be a function (including a lambda function) that returns a value of the appropriate type for the field. In that case, the function is called once for each record inserted, even when multiple records are inserted in a single transaction.

required tells the DAL that no insert should be allowed on this table if a value for this field is not explicitly specified.

requires is a validator or a list of validators. This is not used by the DAL, but it is used by SQLFORM. The default validators for the given types are shown in the next section.

Notice that requires=... is enforced at the level of forms, required=True is enforced at the level of the DAL (insert), while notnull , unique and ondelete are enforced at the level of the database. While they sometimes may seem redundant, it is important to maintain the distinction when programming with the DAL.

rname provides the field with a "real name", a name for the field known to the database adapter; when the field is used, it is the rname value which is sent to the database. The web2py name for the field is then effectively an alias.

ondelete translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set ondelete to "NO ACTION" or "SET NULL".

translates into the "ON DELETE" SQL statement. By default it is set to "CASCADE". This tells the database that when it deletes a record, it should also delete all records that refer to it. To disable this feature, set to "NO ACTION" or "SET NULL". notnull=True translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field.

translates into the "NOT NULL" SQL statement. It prevents the database from inserting null values for the field. unique=True translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level.

translates into the "UNIQUE" SQL statement and it makes sure that values of this field are unique within the table. It is enforced at the database level. uploadfield applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If uploadfield is set to True, then the file is stored in a blob field within the same table and the value of uploadfield is the name of the blob field. This will be discussed in more detail later in the More on uploads section in this chapter.

applies only to fields of type "upload". A field of type "upload" stores the name of a file saved somewhere else, by default on the filesystem under the application "uploads/" folder. If is set to True, then the file is stored in a blob field within the same table and the value of is the name of the blob field. This will be discussed in more detail later in the More on uploads section in this chapter. uploadfolder sets the folder for uploaded files. By default, an uploaded file goes into the application's "uploads/" folder, that is into os.path.join(request.folder, 'uploads') (this seems not the case for MongoAdapter at present). For example: Field ( ... , uploadfolder = os . path . join ( request . folder , 'static/temp' ))

uploadseparate if set to True will upload files under different subfolders of the uploadfolder folder. This is optimized to avoid too many files under the same folder/subfolder. ATTENTION: You cannot change the value of uploadseparate from True to False without breaking links to existing uploads. web2py either uses the separate subfolders or it does not. Changing the behavior after files have been uploaded will prevent web2py from being able to retrieve those files. If this happens it is possible to move files and fix the problem but this is not described here.

uploadfs allows you to specify a different file system to upload files to, including Amazon S3 storage or remote SFTP storage. You need to have PyFileSystem installed for this to work. uploadfs must point to a PyFileSystem object.

autodelete determines if the corresponding uploaded file should be deleted when the record referencing the file is deleted. For "upload" fields only. However, records deleted by the database itself due to a CASCADE operation will not trigger web2py's autodelete. The web2py Google group has workaround discussions.

widget must be one of the available widget objects, including custom widgets, for example: SQLFORM.widgets.string.widget. A list of available widgets will be discussed later. Each field type has a default widget.

label is a string (or a helper or something that can be serialized to a string) that contains the label to be used for this field in auto-generated forms.

comment is a string (or a helper or something that can be serialized to a string) that contains a comment associated with this field, and will be displayed to the right of the input field in the autogenerated forms.

writable declares whether a field is writable in forms.

readable declares whether a field is readable in forms. If a field is neither readable nor writable, it will not be displayed in create and update forms.

searchable declares whether a field is searchable in grids ( SQLFORM.grid and SQLFORM.smartgrid are described in Chapter 7). Notice that a field must also be readable to be searched.

listable declares whether a field is visible in grids (when listing multiple records).

update contains the default value for this field when the record is updated.

compute is an optional function. If a record is inserted or updated, the compute function will be executed and the field will be populated with the function result. The record is passed to the compute function as a dict , and the dict will not include the current value of that, or any other compute field.

authorize can be used to require access control on the corresponding field, for "upload" fields only. It will be discussed in more detail in the context of Authentication and Authorization.

represent can be None or can point to a function that takes a field value and returns an alternate representation for the field value. Examples:

db.mytable.name.represent = lambda name, row: name.capitalize()
db.mytable.other_id.represent = lambda oid, row: row.myfield
db.mytable.some_uploadfield.represent = lambda val, row: \
    A('get it', _href=URL('download', args=val))

filter_in and filter_out can be set to callables for further processing of the field's value. filter_in is passed the field's value to be written to the database before an insert or update, while filter_out is passed the value retrieved from the database before field assignment. The value returned by the callable is then used. See the filter_in and filter_out section in this chapter.

custom_qualifier is a custom SQL qualifier for the field, to be used at table-creation time (it cannot be used for fields of type "id", "reference", or "big-reference").
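As an illustration of the filter_in / filter_out pattern described above, the following plain-Python sketch (the callables here are stand-alone illustrations, not attached to a real Field) serializes a value on the way into the database and deserializes it on the way out:

```python
import json

# filter_in runs on the value before it is written to the database;
# filter_out runs on the value read back from the database.
filter_in = lambda obj: json.dumps(obj)
filter_out = lambda text: json.loads(text)

stored = filter_in({'a': 1, 'b': [2, 3]})   # what the database sees (a string)
restored = filter_out(stored)               # what the application sees again
```

In an actual model these would be passed as Field(..., filter_in=..., filter_out=...) as shown in the dedicated section later in this chapter.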

Field types

field types

field type              default field validators
string                  IS_LENGTH(length), default length 512
text                    IS_LENGTH(length), default length 32768
blob                    None, default length 2**31 (2 GiB)
boolean                 None
integer                 IS_INT_IN_RANGE(-2**31, 2**31)
double                  IS_FLOAT_IN_RANGE(-1e100, 1e100)
decimal(n,m)            IS_DECIMAL_IN_RANGE(-10**10, 10**10)
date                    IS_DATE()
time                    IS_TIME()
datetime                IS_DATETIME()
password                IS_LENGTH(length), default length 512
upload                  None, default length 512
reference <table>       IS_IN_DB(db, table.field, format)
list:string             None
list:integer            None
list:reference <table>  IS_IN_DB(db, table._id, format, multiple=True)
json                    IS_EMPTY_OR(IS_JSON()), default length 512
bigint                  IS_INT_IN_RANGE(-2**63, 2**63)
big-id                  None
big-reference           None

Decimal requires and returns values as Decimal objects, as defined in the Python decimal module. SQLite does not handle the decimal type, so internally it is treated as a double . The (n,m) are the total number of digits and the number of digits after the decimal point, respectively.
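The reason to prefer decimal over double for money-like data is Decimal's exact base-10 arithmetic, which the stdlib decimal module shows directly:

```python
from decimal import Decimal

# Binary floats accumulate representation error...
float_sum = 0.1 + 0.2                        # not exactly 0.3
# ...while Decimal('0.1') is represented exactly in base 10.
exact_sum = Decimal('0.1') + Decimal('0.2')  # exactly Decimal('0.3')
```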

The big-id and big-reference types are only supported by some of the database engines and are experimental. They are not normally used as field types except for legacy tables; however, the DAL constructor has a bigint_id argument that, when set to True, makes the id fields and reference fields big-id and big-reference respectively.

The list:<type> fields are special because they are designed to take advantage of certain denormalization features on NoSQL (in the case of Google App Engine NoSQL, the field types ListProperty and StringListProperty ) and back-port them to all the other supported relational databases. On relational databases, lists are stored as a text field. The items are separated by a | and each | in a string item is escaped as || . They are discussed in the list:<type> and contains section in this chapter.
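A minimal sketch of this storage convention (an illustration of the escaping rule, not pydal's actual implementation; the function names are made up) looks like:

```python
def bar_encode(items):
    # items are joined with '|' and wrapped in bars;
    # any '|' inside an item is escaped as '||'
    return '|%s|' % '|'.join(str(item).replace('|', '||') for item in items)

def bar_decode(text):
    # undo the escaping: temporarily protect '||', split, then restore
    return [chunk.replace('\0', '|')
            for chunk in text.replace('||', '\0').strip('|').split('|')
            if chunk]

encoded = bar_encode(['red', 'blue|green'])   # '|red|blue||green|'
decoded = bar_decode(encoded)                 # ['red', 'blue|green']
```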

The json field type is pretty much self-explanatory. It can store any JSON-serializable object. It is designed to work specifically with MongoDB and is backported to the other database adapters for portability.

blob

blob fields are also special. By default, binary data is encoded in base64 before being stored into the actual database field, and it is decoded when extracted. This has the negative effect of using about 33% more storage space than necessary in blob fields, but has the advantage of making the communication independent of back-end-specific escaping conventions.
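The 33% figure follows from base64 itself, which emits 4 output characters for every 3 input bytes; the stdlib shows it directly:

```python
import base64

raw = bytes(range(256))              # 256 bytes of arbitrary binary data
encoded = base64.b64encode(raw)      # 4 output characters per 3 input bytes
overhead = len(encoded) / len(raw)   # about 1.34 here
decoded = base64.b64decode(encoded)  # round-trips losslessly
```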

Run-time field and table modification

Most attributes of fields and tables can be modified after they are defined:

>>> db.define_table('person', Field('name', default=''), format='%(name)s')
<Table person (id, name)>
>>> db.person._format = '%(name)s / %(id)s'
>>> db.person.name.default = 'anonymous'

Notice that attributes of tables are usually prefixed by an underscore to avoid conflicts with possible field names.

You can list the tables that have been defined for a given database connection:

tables

>>> db.tables
['person']

You can query for the type of a table:

Table

>>> type(db.person)
<class 'pydal.objects.Table'>

You can access a table using different syntaxes:

>>> db.person is db['person']
True

You can also list the fields that have been defined for a given table:

fields

>>> db.person.fields
['id', 'name']

Similarly you can access fields from their name in multiple equivalent ways:

>>> type(db.person.name)
<class 'pydal.objects.Field'>
>>> db.person.name is db.person['name']
True

Given a field, you can access the attributes set in its definition:

>>> db.person.name.type
string
>>> db.person.name.unique
False
>>> db.person.name.notnull
False
>>> db.person.name.length
32

including its parent table, tablename, and parent connection:

>>> db.person.name._table == db.person
True
>>> db.person.name._tablename == 'person'
True
>>> db.person.name._db == db
True

A field also has methods. Some of them are used to build queries and we will see them later. A special method of the field object is validate , which calls the validators for the field.

>>> db.person.name.validate('John')
('John', None)

which returns a tuple (value, error) . error is None if the input passes validation.

Migrations

migrations

define_table checks whether or not the corresponding table exists. If it does not, it generates the SQL to create it and executes the SQL. If the table does exist but differs from the one being defined, it generates the SQL to alter the table and executes it. If a field has changed type but not name, it will try to convert the data (if you do not want this, you need to redefine the table twice: the first time letting web2py drop the field by removing it, and the second time adding the newly defined field so that web2py can create it). If the table exists and matches the current definition, it will leave it alone. In all cases it will create the db.person object that represents the table.

We refer to this behavior as a "migration". web2py logs all migrations and migration attempts in the file "sql.log".

Notice that by default web2py uses the "app/databases" folder for the log file and all other migration files it needs. You can change this by setting the folder argument to DAL. To set a different log file name, for example "migrate.log", you can do:

db = DAL(..., adapter_args=dict(logfile='migrate.log'))

The first argument of define_table is always the table name. The other unnamed arguments are the fields (Field). The function also takes an optional keyword argument called "migrate":

db.define_table('person', ..., migrate='person.table')

The value of migrate is the filename where web2py stores internal migration information for this table. These files are very important and should never be removed while the corresponding tables exist. In cases where a table has been dropped and the corresponding file still exists, it can be removed manually. By default, migrate is set to True. This causes web2py to generate the filename from a hash of the connection string. If migrate is set to False, the migration is not performed, and web2py assumes that the table exists in the datastore and that it contains (at least) the fields listed in define_table .
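For illustration, the default filename is derived roughly like this (a sketch of the idea, not web2py's exact internal code; the helper name is made up):

```python
import hashlib

def migrate_filename(connection_uri, tablename):
    # hash the connection string so the same table name on different
    # databases gets a distinct migration metadata file
    uri_hash = hashlib.md5(connection_uri.encode('utf-8')).hexdigest()
    return '%s_%s.table' % (uri_hash, tablename)

name = migrate_filename('sqlite://storage.sqlite', 'person')
```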

No two tables in the same application may have the same migrate filename.

The DAL class also takes a "migrate" argument, which determines the default value of migrate for calls to define_table . For example,

db = DAL('sqlite://storage.sqlite', migrate=False)

will set the default value of migrate to False whenever db.define_table is called without a migrate argument.

Notice that web2py only migrates new columns, removed columns, and changes in column type (except in SQLite). web2py does not migrate changes in attributes such as changes in the values of default , unique , notnull , and ondelete .

Migrations can be disabled for all tables at once:

db = DAL(..., migrate_enabled=False)

This is the recommended behavior when two apps share the same database. Only one of the two apps should perform migrations; the other should disable them.

Fixing broken migrations

fake_migrate

There are two common problems with migrations and there are ways to recover from them.

One problem is specific to SQLite. SQLite does not enforce column types and cannot drop columns. This means that if you have a column of type string and you remove it, it is not really removed. If you add the column again with a different type (for example datetime), you end up with a datetime column that contains strings (junk for practical purposes). web2py does not complain about this because it does not know what is in the database, until it tries to retrieve records and fails.

If web2py returns an error in some parse function when selecting records, most likely this is due to corrupted data in a column because of the above issue.

The solution consists of updating all records of the table, setting the values of the column in question to None.

The other problem is more generic but typical of MySQL. MySQL does not allow more than one ALTER TABLE in a transaction. This means that web2py must break complex transactions into smaller ones (one ALTER TABLE at a time) and commit one piece at a time. It is therefore possible that part of a complex transaction gets committed while another part fails, leaving web2py in a corrupted state. Why would part of a transaction fail? Because, for example, the migration may involve converting a string column into a datetime column; web2py tries to convert the data, but the data cannot be converted. What happens to web2py? It gets confused about the table structure actually stored in the database.

The solution consists of enabling fake migrations:

db.define_table(...., migrate=True, fake_migrate=True)

This will rebuild web2py's metadata about the table according to the table definition. Try multiple table definitions to see which one works (the one before the failed migration and the one after the failed migration). Once successful, remove the fake_migrate=True parameter.

Before attempting to fix migration problems it is prudent to make a copy of "applications/yourapp/databases/*.table" files.

Migration problems can also be fixed for all tables at once:

db = DAL(..., fake_migrate_all=True)

This also fails if the model describes tables that do not exist in the database, but it can help narrow down the problem.

Migration control summary

The logic of the various migration arguments are summarized in this pseudo-code:

if DAL.migrate_enabled and table.migrate:
    if DAL.fake_migrate_all or table.fake_migrate:
        perform fake migration
    else:
        perform migration

insert

insert

Given a table, you can insert records

>>> db . person . insert ( name = "Alex" ) 1 >>> db . person . insert ( name = "Bob" ) 2

Insert returns the unique "id" value of each record inserted.

You can truncate the table, i.e., delete all records and reset the counter of the id.

truncate

>>> db.person.truncate()

Now, if you insert a record again, the counter starts again at 1 (this is back-end specific and does not apply to Google NoSQL):

>>> db.person.insert(name="Alex")
1

Notice you can pass a parameter to truncate , for example you can tell SQLite to restart the id counter.

>>> db.person.truncate('RESTART IDENTITY CASCADE')

The argument is in raw SQL and therefore engine specific.

bulk_insert

web2py also provides a bulk_insert method

>>> db.person.bulk_insert([{'name': 'Alex'}, {'name': 'John'}, {'name': 'Tim'}])
[3, 4, 5]

It takes a list of dictionaries of fields to be inserted and performs multiple inserts at once. It returns the list of "id" values of the inserted records. On the supported relational databases there is no advantage in using this function as opposed to looping and performing individual inserts, but on Google App Engine NoSQL there is a major speed advantage.

commit and rollback

commit

The insert, truncate, delete, and update operations aren't actually committed until web2py issues the commit command. The create and drop operations may be executed immediately, depending on the database engine. Calls to web2py actions are automatically wrapped in transactions. If you execute commands via the shell, you are required to commit manually:

>>> db.commit()

To check it let's insert a new record:

>>> db.person.insert(name="Bob")
2

and roll back, i.e., ignore all operations since the last commit:

rollback

>>> db.rollback()

If you now insert again, the counter will again be set to 2, since the previous insert was rolled back.

>>> db.person.insert(name="Bob")
2

Code in models, views and controllers is enclosed in web2py code that looks like this (pseudocode):

try:
    execute models, controller function and view
except:
    rollback all connections
    log the traceback
    send a ticket to the visitor
else:
    commit all connections
    save cookies, sessions and return the page

So in models, views and controllers there is no need to ever call commit or rollback explicitly in web2py unless you need more granular control. However, in modules you will need to use commit() .
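The same commit/rollback semantics can be seen with Python's stdlib sqlite3 module, independent of web2py:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)')

conn.execute('INSERT INTO person (name) VALUES (?)', ('Alex',))
conn.commit()                     # Alex is now permanent

conn.execute('INSERT INTO person (name) VALUES (?)', ('Bob',))
conn.rollback()                   # Bob is discarded

count = conn.execute('SELECT COUNT(*) FROM person').fetchone()[0]  # 1
```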

Raw SQL

Timing queries

All queries are automatically timed by web2py. The variable db._timings is a list of tuples. Each tuple contains the raw SQL query as passed to the database driver and the time it took to execute in seconds. This variable can be displayed in views using the toolbar:

{{=response.toolbar()}}

executesql

executesql

The DAL allows you to explicitly issue SQL statements.

>>> db.executesql('SELECT * FROM person;')
[(1, u'Massimo'), (2, u'Massimo')]

In this case, the return values are not parsed or transformed by the DAL, and the format depends on the specific database driver. This usage with selects is normally not needed, but it is more common with indexes.

executesql takes five optional arguments: placeholders , as_dict , fields , colnames , and as_ordered_dict .

placeholders is an optional sequence of values to be substituted into the SQL or, if supported by the DB driver, a dictionary with keys matching named placeholders in your SQL.

If as_dict is set to True, the results cursor returned by the DB driver will be converted to a sequence of dictionaries keyed with the db field names. Results returned with as_dict = True are the same as those returned when applying .as_list() to a normal select:

[{'field1': val1_row1, 'field2': val2_row1},
 {'field1': val1_row2, 'field2': val2_row2}]

as_ordered_dict is pretty much like as_dict , but it ensures that the order of the resulting fields (OrderedDict keys) reflects the order in which they are returned from the DB driver:

[OrderedDict([('field1', val1_row1), ('field2', val2_row1)]),
 OrderedDict([('field1', val1_row2), ('field2', val2_row2)])]

The fields argument is a list of DAL Field objects that match the fields returned from the DB. The Field objects should be part of one or more Table objects defined on the DAL object. The fields list can include one or more DAL Table objects in addition to or instead of including Field objects, or it can be just a single table (not in a list). In that case, the Field objects will be extracted from the table(s).

Instead of specifying the fields argument, the colnames argument can be specified as a list of field names in tablename.fieldname format. Again, these should represent tables and fields defined on the DAL object.

It is also possible to specify both fields and the associated colnames . In that case, fields can also include DAL Expression objects in addition to Field objects. For Field objects in "fields", the associated colnames must still be in tablename.fieldname format. For Expression objects in fields , the associated colnames can be any arbitrary labels.

Notice, the DAL Table objects referred to by fields or colnames can be dummy tables and do not have to represent any real tables in the database. Also, note that the fields and colnames must be in the same order as the fields in the results cursor returned from the DB.

_lastsql

_lastsql

Whether SQL was executed manually using executesql or was SQL generated by the DAL, you can always find the SQL code in db._lastsql . This is useful for debugging purposes:

>>> rows = db().select(db.person.ALL)
>>> db._lastsql
SELECT person.id, person.name FROM person;

web2py never generates queries using the "*" operator. web2py is always explicit when selecting fields.

drop

drop

Finally, you can drop tables and all data will be lost:

db.person.drop()

Indexes

Currently the DAL API does not provide a command to create indexes on tables, but this can be done using the executesql command. This is because the existence of indexes can make migrations complex, and it is better to deal with them explicitly. Indexes may be needed for those fields that are used in recurrent queries.

Here is an example of how to create an index using SQL in SQLite:

db = DAL('sqlite://storage.sqlite')
db.define_table('person', Field('name'))
db.executesql('CREATE INDEX IF NOT EXISTS myidx ON person (name);')

Other database dialects have very similar syntaxes but may not support the optional "IF NOT EXISTS" directive.
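A stdlib sqlite3 sketch (independent of web2py) showing why "IF NOT EXISTS" matters: the statement can be re-run safely, for example from a model file that executes on every request:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)')

# running the same CREATE INDEX twice is harmless with IF NOT EXISTS
for _ in range(2):
    conn.execute('CREATE INDEX IF NOT EXISTS myidx ON person (name)')

indexes = [row[0] for row in
           conn.execute("SELECT name FROM sqlite_master WHERE type='index'")]
```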

Legacy databases and keyed tables

web2py can connect to legacy databases under some conditions.

The easiest way is when these conditions are met:

Each table must have a unique auto-increment integer field called "id"

Records must be referenced exclusively using the "id" field.

When accessing an existing table, i.e., a table not created by web2py in the current application, always set migrate=False .

If the legacy table has an auto-increment integer field but it is not called "id", web2py can still access it, but the table definition must declare the auto-increment field with 'id' type (that is, using Field('...', 'id') ).

keyed table

Finally if the legacy table uses a primary key that is not an auto-increment id field it is possible to use a "keyed table", for example:

db.define_table('account',
    Field('accnum', 'integer'),
    Field('acctype'),
    Field('accdesc'),
    primarykey=['accnum', 'acctype'],
    migrate=False)

primarykey is a list of the field names that make up the primary key.

All primarykey fields have NOT NULL set even if not specified.

Keyed tables can only reference other keyed tables.

Referencing fields must use the reference tablename.fieldname format.

The update_record function is not available for Rows of keyed tables.

Currently keyed tables are only supported for DB2, MSSQL, Ingres and Informix, but other engines will be added.

At the time of writing, we cannot guarantee that the primarykey attribute works with every existing legacy table and every supported database backend. For simplicity, we recommend, if possible, creating a database view that has an auto-increment id field.

Distributed transaction

distributed transactions

At the time of writing this feature is only supported by PostgreSQL, MySQL and Firebird, since they expose API for two-phase commits.

Assuming you have two (or more) connections to distinct PostgreSQL databases, for example:

db_a = DAL('postgres://...')
db_b = DAL('postgres://...')

In your models or controllers, you can commit them concurrently with:

DAL.distributed_transaction_commit(db_a, db_b)

On failure, this function rolls back and raises an Exception .

In controllers, when one action returns, if you have two distinct connections and you do not call the above function, web2py commits them separately. This means there is a possibility that one of the commits succeeds and one fails. The distributed transaction prevents this from happening.

More on uploads

Consider the following model:

db.define_table('myfile', Field('image', 'upload', default='path/to/file'))

In the case of an "upload" field, the default value can optionally be set to a path (an absolute path or a path relative to the current app folder); the default value is then assigned to each new record that does not specify an image.

Notice that this way multiple records may end up referencing the same default image file, and this could be a problem on a Field having autodelete enabled. When you do not want to allow duplicates for the image field (i.e. multiple records referencing the same file) but still want to set a default value for the "upload" field, then you need a way to copy the default file for each new record that does not specify an image. This can be obtained by passing a file-like object referencing the default file as the default argument to Field, or even with:

Field('image', 'upload', default=dict(data='<file_content>', filename='<file_name>'))

Normally an insert is handled automatically via a SQLFORM or a crud form (which is a SQLFORM) but occasionally you already have the file on the filesystem and want to upload it programmatically. This can be done in this way:

with open(filename, 'rb') as stream:
    db.myfile.insert(image=db.myfile.image.store(stream, filename))

It is also possible to insert a file in a simpler way and have the insert method call store automatically:

with open(filename, 'rb') as stream:
    db.myfile.insert(image=stream)

In this case the filename is obtained from the stream object if available.

The store method of the upload field object takes a file stream and a filename. It uses the filename to determine the extension (type) of the file, creates a new temp name for the file (according to web2py upload mechanism) and loads the file content in this new temp file (under the uploads folder unless specified otherwise). It returns the new temp name, which is then stored in the image field of the db.myfile table.
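To make the description concrete, here is a rough sketch of the kind of name store generates (the exact scheme is internal to web2py; this helper and its details are illustrative assumptions):

```python
import base64
import os
import uuid

def make_upload_name(tablename, fieldname, original_filename):
    # keep the extension so the file type is preserved, encode the
    # original name so it can be recovered later, and add a unique
    # key so two uploads of the same file never collide
    ext = os.path.splitext(original_filename)[1]
    encoded = base64.b16encode(
        original_filename.encode('utf-8')).decode('ascii').lower()
    return '%s.%s.%s.%s%s' % (tablename, fieldname, uuid.uuid4().hex,
                              encoded, ext)

new_name = make_upload_name('myfile', 'image', 'dog.jpg')
```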

Note, if the file is to be stored in an associated blob field rather than the file system, the store method will not insert the file in the blob field (because store is called before the insert), so the file must be explicitly inserted into the blob field:

db.define_table('myfile',
    Field('image', 'upload', uploadfield='image_file'),
    Field('image_file', 'blob'))
with open(filename, 'rb') as stream:
    db.myfile.insert(image=db.myfile.image.store(stream, filename),
                     image_file=stream.read())

The retrieve method does the opposite of store .

When uploaded files are stored on filesystem (as in the case of a plain Field('image', 'upload') ) the code:

row = db(db.myfile).select().first()
(filename, fullname) = db.myfile.image.retrieve(row.image, nameonly=True)

retrieves the original file name (filename) as seen by the user at upload time and the name of stored file (fullname, with path relative to application folder). While in general the call:

(filename, stream) = db.myfile.image.retrieve(row.image)

retrieves the original file name (filename) and a file-like object ready to access uploaded file data (stream).

Notice that the stream returned by retrieve is a real file object when uploaded files are stored on the filesystem. In that case, remember to close the file when you are done by calling stream.close() .

Here is an example of safe usage of retrieve :

from contextlib import closing
import shutil
row = db(db.myfile).select().first()
(filename, stream) = db.myfile.image.retrieve(row.image)
with closing(stream) as src, closing(open(filename, 'wb')) as dest:
    shutil.copyfileobj(src, dest)

Query , Set , Rows

Query

Set

Let's consider again the table defined (and dropped) previously and insert three records:

>>> db.define_table('person', Field('name'))
<Table person (id, name)>
>>> db.person.insert(name="Alex")
1
>>> db.person.insert(name="Bob")
2
>>> db.person.insert(name="Carl")
3

You can store the table in a variable. For example, with variable person , you could do:

Table

>>> person = db.person

You can also store a field in a variable such as name . For example, you could also do:

Field

>>> name = person.name

You can even build a query (using operators like ==, !=, <, >, <=, >=, like, belongs) and store the query in a variable q such as in:

>>> q = name == 'Alex'

When you call db with a query, you define a set of records. You can store it in a variable s and write:

>>> s = db(q)

Notice that no database query has been performed so far. DAL + Query simply define a set of records in this db that match the query. web2py determines from the query which table (or tables) are involved and, in fact, there is no need to specify that.

select

select

Given a Set, s , you can fetch the records with the command select :

Rows

>>> rows = s.select()

Row

It returns an iterable object of class pydal.objects.Rows whose elements are Row objects. pydal.objects.Row objects act like dictionaries, but their elements can also be accessed as attributes, like gluon.storage.Storage . The former differs from the latter because its values are read-only.

The Rows object allows looping over the result of the select and printing the selected field values for each row:

>>> for row in rows:
...     print row.id, row.name
...
1 Alex

You can do all the steps in one statement:

>>> for row in db(db.person.name == 'Alex').select():
...     print row.name
...
Alex

ALL

The select command can take arguments. All unnamed arguments are interpreted as the names of the fields that you want to fetch. For example, you can be explicit on fetching field "id" and field "name":

>>> for row in db().select(db.person.id, db.person.name):
...     print row.name
...
Alex
Bob
Carl

The table attribute ALL allows you to specify all fields:

>>> for row in db().select(db.person.ALL):
...     print row.id, row.name
...
1 Alex
2 Bob
3 Carl

Notice that there is no query string passed to db. web2py understands that if you want all fields of the table person without additional information then you want all records of the table person.

An equivalent alternative syntax is the following:

>>> for row in db(db.person).select():
...     print row.id, row.name
...
1 Alex
2 Bob
3 Carl

and web2py understands that if you ask for all records of the table person without additional information, then you want all the fields of table person.

Given one row

>>> row = rows[0]

you can extract its values using multiple equivalent expressions:

>>> row.name
Alex
>>> row['name']
Alex
>>> row('person.name')
Alex

The latter syntax is particularly handy when selecting an expression instead of a column. We will show this later.

You can also do

rows.compact = False

to disable the notation

rows[i].name

and enable, instead, the less compact notation:

rows[i].person.name

Yes this is unusual and rarely needed.

Row objects also have two important methods:

row.delete_record()

and

row.update_record(name="new value")

Using an iterator-based select for lower memory use

Python "iterators" are a type of "lazy-evaluation". They 'feed' data one step at time; traditional Python loops create the entire set of data in memory before looping.

The traditional use of select is:

for row in db(db.table).select():
    ...

but for large numbers of rows, using an iterator-based alternative has dramatically lower memory use:

for row in db(db.table).iterselect():
    ...

Testing shows this is around 10% faster as well, even on machines with large RAM.
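The laziness that iterselect relies on is ordinary Python generator behavior, which can be seen without a database at all:

```python
# a list comprehension materializes every element up front...
eager = [n * n for n in range(1000)]

# ...while a generator expression produces elements only on demand
lazy = (n * n for n in range(1000))

first = next(lazy)        # only now is the first element computed
total = sum(lazy)         # consumes the remaining elements one at a time
```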

Rendering rows using represent

You may wish to rewrite rows returned by select to take advantage of the formatting information contained in the represent setting of the fields.

rows = db(query).select()
repr_row = rows.render(0)

If you don't specify an index, you get a generator to iterate over all the rows:

for row in rows.render():
    print row.myfield

Can also be applied to slices:

for row in rows[0:10].render():
    print row.myfield

If you only want to transform selected fields via their "represent" attribute, you can list them in the "fields" argument:

repr_row = rows.render(0, fields=[db.mytable.myfield])

Note, it returns a transformed copy of the original Row, so there's no update_record (which you wouldn't want anyway) or delete_record.

Shortcuts

DAL shortcuts

The DAL supports various code-simplifying shortcuts. In particular:

myrecord = db.mytable[id]

returns the record with the given id if it exists. If the id does not exist, it returns None . The above statement is equivalent to

myrecord = db(db.mytable.id == id).select().first()

You can delete records by id:

del db.mytable[id]

and this is equivalent to

db(db.mytable.id == id).delete()

and deletes the record with the given id , if it exists.

Note: this delete shortcut syntax does not currently work if versioning is activated

You can insert records:

db.mytable[None] = dict(myfield='somevalue')

It is equivalent to

db.mytable.insert(myfield='somevalue')

and it creates a new record with field values specified by the dictionary on the right hand side.

Note: insert shortcut was previously db.table[0] = ... . It has changed in PyDAL 19.02 to permit normal usage of id 0.

You can update records:

db.mytable[id] = dict(myfield='somevalue')

which is equivalent to

db(db.mytable.id == id).update(myfield='somevalue')

and it updates an existing record with field values specified by the dictionary on the right hand side.

Fetching a Row

Yet another convenient syntax is the following:

record = db.mytable(id)
record = db.mytable(db.mytable.id == id)
record = db.mytable(id, myfield='somevalue')

While apparently similar to db.mytable[id], the above syntax is more flexible and safer. First of all it checks whether id is an int (or str(id) is an int) and returns None if not (it never raises an exception). It also allows you to specify multiple conditions that the record must meet. If they are not met, it also returns None.
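The behaviour just described can be sketched in plain Python (a hand-written stand-in, not the DAL source; the function name and dict-backed table are illustrative):

```python
# Sketch of db.mytable(id, **conditions) semantics: coerce the id
# safely, check every extra condition, return None on any mismatch.
def fetch_record(records, rec_id, **conditions):
    try:
        rec_id = int(rec_id)            # accepts 3 as well as "3"
    except (TypeError, ValueError):
        return None                     # never raises on bad input
    record = records.get(rec_id)
    if record is None:
        return None
    for field, value in conditions.items():
        if record.get(field) != value:  # all conditions must match
            return None
    return record

people = {1: {"id": 1, "name": "Alex"}}
```

For example, fetch_record(people, "1") finds the record, while fetch_record(people, "x") or fetch_record(people, 1, name="Bob") quietly returns None.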

Recursive select s

recursive selects

Consider the previous table person and a new table "thing" referencing a "person":

db.define_table('thing',
    Field('name'),
    Field('owner_id', 'reference person'))

and a simple select from this table:

things = db(db.thing).select()

which is equivalent to

things = db(db.thing._id != None).select()

_id

where _id is a reference to the primary key of the table. Normally db.thing._id is the same as db.thing.id and we will assume that in most of this book.

For each Row of things it is possible to fetch not just fields from the selected table (thing) but also from linked tables (recursively):

for thing in things:
    print thing.name, thing.owner_id.name

Here thing.owner_id.name requires one database select for each thing in things, and is therefore inefficient. We suggest using joins whenever possible instead of recursive selects; nevertheless, this is convenient and practical when accessing individual records.
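The cost can be made concrete with a small simulation (plain Python, with a counter standing in for database round-trips; names and data are illustrative):

```python
# Why recursive selects are inefficient: each owner_id.name access
# issues its own query, so N things cost N extra round-trips
# (the classic "N+1 query" pattern); a join would cost one.
query_count = 0

def fake_select_owner(owner_id, people):
    global query_count
    query_count += 1            # one simulated round-trip per access
    return people[owner_id]

people = {1: "Alex", 2: "Bob"}
things = [("Boat", 1), ("Chair", 1), ("Shoes", 2)]
owners = [fake_select_owner(oid, people) for _, oid in things]
```

Three things produce three extra lookups here; with a join the database would return owners and things together in a single query.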

You can also do it backwards, by selecting the things referenced by a person:

person = db.person(id)
for thing in person.thing.select(orderby=db.thing.name):
    print person.name, 'owns', thing.name

In this last expression person.thing is a shortcut for

db(db.thing.owner_id == person.id)

i.e. the Set of thing s referenced by the current person . This syntax breaks down if the referencing table has multiple references to the referenced table. In this case one needs to be more explicit and use a full Query.

Serializing Rows in views

Given the following action containing a query

SQLTABLE

def index():
    return dict(rows=db(query).select())

The result of a select can be displayed in a view with the following syntax:

{{extend 'layout.html'}}
<h1>Records</h1>
{{=rows}}

Which is equivalent to:

{{extend 'layout.html'}}
<h1>Records</h1>
{{=SQLTABLE(rows)}}

SQLTABLE converts the rows into an HTML table with a header containing the column names and one row per record. The rows are marked as alternating class "even" and class "odd". Under the hood, Rows is first converted into a SQLTABLE object (not to be confused with Table) and then serialized. The values extracted from the database are also formatted by the validators associated to the field and then escaped.

Yet it is possible and sometimes convenient to call SQLTABLE explicitly.

The SQLTABLE constructor takes the following optional arguments:

linkto lambda function or an action to be used to link reference fields (default to None).

If you assign it a string with the name of an action, it will generate a link to that function passing it, as args, the name of the table and the id of each record (in this order). Example:

linkto = 'pointed_function'
# generates something like <a href="pointed_function/table_name/id_value">

If you want a different link to be generated, you can specify a lambda, which will receive as parameters the value of the id, the type of the object (e.g. table), and the name of the object. For example, if you want to receive the args in reverse order:

linkto = lambda id, type, name: URL(f='pointed_function', args=[id, name])

upload the URL or the download action to allow downloading of uploaded files (default to None)

headers a dictionary mapping field names to their labels to be used as headers (default to {} ). It can also be an instruction; currently headers='fieldname:capitalize' is supported.

truncate the number of characters for truncating long values in the table (default is 16)

columns the list of fieldnames to be shown as columns (in tablename.fieldname format). Those not listed are not displayed (defaults to all).

**attributes generic helper attributes to be passed to the outermost TABLE object.

Here is an example:

{{extend 'layout.html'}}
<h1>Records</h1>
{{=SQLTABLE(rows,
            headers='fieldname:capitalize',
            truncate=100,
            upload=URL('download'))}}

SQLFORM.grid

SQLFORM.smartgrid

SQLTABLE is useful but there are times when one needs more. SQLFORM.grid is an extension of SQLTABLE that creates a table with search features and pagination, as well as ability to open detailed records, create, edit and delete records. SQLFORM.smartgrid is a further generalization that allows all of the above but also creates buttons to access referencing records.

Here is an example of usage of SQLFORM.grid :

def index():
    return dict(grid=SQLFORM.grid(query))

and the corresponding view:

{{extend 'layout.html'}}
{{=grid}}

For working with multiple rows, SQLFORM.grid and SQLFORM.smartgrid are preferred to SQLTABLE because they are more powerful. Please see Chapter 7.

orderby , groupby , limitby , distinct , having , orderby_on_limitby , join , left , cache

The select command takes a number of optional arguments.

orderby

You can fetch the records sorted by name:

orderby

groupby

having

>>> for row in db().select(db.person.ALL, orderby=db.person.name):
...     print row.name
...
Alex
Bob
Carl

You can fetch the records sorted by name in reverse order (notice the tilde):

>>> for row in db().select(db.person.ALL, orderby=~db.person.name):
...     print row.name
...
Carl
Bob
Alex

You can have the fetched records appear in random order:

>>> for row in db().select(db.person.ALL, orderby='<random>'):
...     print row.name
...
Carl
Alex
Bob

The use of orderby='<random>' is not supported on Google NoSQL. However, to overcome this limit, sorting can be accomplished on selected rows:

import random
rows = db(...).select().sort(lambda row: random.random())

You can sort the records according to multiple fields by concatenating them with a "|":

>>> for row in db().select(db.person.name,
...                        orderby=db.person.name | db.person.id):
...     print row.name
...
Alex
Bob
Carl

groupby, having

Using groupby together with orderby, you can group records with the same value for the specified field (this is back-end specific, and is not available on Google NoSQL):

>>> for row in db().select(db.person.ALL,
...                        orderby=db.person.name,
...                        groupby=db.person.name):
...     print row.name
...
Alex
Bob
Carl

You can use having in conjunction with groupby to group conditionally (only groups meeting the condition are returned):

>>> print db(query1).select(db.person.ALL,
...                         groupby=db.person.name,
...                         having=query2)

Notice that query1 filters the records to be displayed, while query2 filters the records to be grouped.
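The SQL that a groupby/having select translates into resembles the following hand-written sqlite3 sketch (the table and data are illustrative, not from this chapter):

```python
import sqlite3

# GROUP BY collapses rows sharing a value; HAVING filters the groups.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO person (name) VALUES (?)",
                 [("Alex",), ("Bob",), ("Bob",), ("Carl",)])
# keep only the groups with more than one record
rows = conn.execute(
    "SELECT name, COUNT(*) AS n FROM person "
    "GROUP BY name HAVING COUNT(*) > 1 ORDER BY name").fetchall()
```

Here only the "Bob" group survives the HAVING clause, because it is the only name with more than one row.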

distinct

distinct

With the argument distinct=True , you can specify that you only want to select distinct records. This has the same effect as grouping using all specified fields except that it does not require sorting. When using distinct it is important not to select ALL fields, and in particular not to select the "id" field, else all records will always be distinct.

Here is an example:

>>> for row in db().select(db.person.name, distinct=True):
...     print row.name
...
Alex
Bob
Carl

Notice that distinct can also be an expression, for example:

>>> for row in db().select(db.person.name, distinct=db.person.name):
...     print row.name
...
Alex
Bob
Carl

limitby

With limitby=(min, max), you can select a subset of the records, from offset=min up to, but not including, offset=max. In the next example we select the first two records starting at zero:

limitby

>>> for row in db().select(db.person.ALL, limitby=(0, 2)):
...     print row.name
...
Alex
Bob
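In SQL terms, limitby=(min, max) corresponds to LIMIT max-min OFFSET min. A sqlite3 sketch with illustrative data (the helper function name is an assumption for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO person (name) VALUES (?)",
                 [("Alex",), ("Bob",), ("Carl",)])

def limitby(conn, lo, hi):
    # rows from offset lo up to, but not including, offset hi
    return conn.execute(
        "SELECT name FROM person ORDER BY id LIMIT ? OFFSET ?",
        (hi - lo, lo)).fetchall()

first_two = limitby(conn, 0, 2)
```

limitby(conn, 0, 2) returns Alex and Bob; limitby(conn, 2, 3) returns just Carl.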

orderby_on_limitby

orderby_on_limitby

Note that the DAL defaults to implicitly adding an orderby when using a limitby. This ensures that the same query returns the same results each time, which is important for pagination, but it can cause performance problems. Use orderby_on_limitby=False to change this behavior (it defaults to True).

join, left

These are involved in managing one-to-many relations. They are described in the Inner join and Left outer join sections respectively.

cache, cacheable

An example use which gives much faster selects is:

rows = db(query).select(cache=(cache.ram, 3600), cacheable=True)

Look at the Caching selects section in this chapter to understand the trade-offs.

Logical operators

and

or

not

Queries can be combined using the binary AND operator " & ":

>>> rows = db((db.person.name == 'Alex') & (db.person.id > 3)).select()
>>> for row in rows: print row.id, row.name
>>> len(rows)
0

and the binary OR operator " | ":

>>> rows = db((db.person.name == 'Alex') | (db.person.id > 3)).select()
>>> for row in rows: print row.id, row.name
1 Alex

You can negate a sub-query by inverting its operator:

>>> rows = db((db.person.name != 'Alex') | (db.person.id > 3)).select()
>>> for row in rows: print row.id, row.name
2 Bob
3 Carl

or by explicit negation with the " ~ " unary operator:

>>> rows = db(~(db.person.name == 'Alex') | (db.person.id > 3)).select()
>>> for row in rows: print row.id, row.name
2 Bob
3 Carl

Due to Python restrictions in overloading " and " and " or " operators, these cannot be used in forming queries. The binary operators " & " and " | " must be used instead. Note that these operators (unlike " and " and " or ") have higher precedence than comparison operators, so the "extra" parentheses in the above examples are mandatory. Similarly, the unary operator " ~ " has higher precedence than comparison operators, so ~ -negated comparisons must also be parenthesized.
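The precedence issue is easy to demonstrate with plain integers, where "&" is Python's bitwise AND (this is a generic Python sketch, not DAL code):

```python
# "&" binds tighter than comparison operators, so without the extra
# parentheses Python groups the expression the wrong way.
a, b = 1, 3
# parsed as: a == (1 & b) > 2, a chained comparison that here is False
wrong = a == 1 & b > 2
# what a DAL-style query actually needs: compare first, then combine
right = (a == 1) & (b > 2)
```

The same grouping rules apply when the operands are DAL fields instead of integers, which is why the parentheses in the query examples above are mandatory.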

It is also possible to build queries using in-place logical operators:

>>> query = db.person.name != 'Alex'
>>> query &= db.person.id > 3
>>> query |= db.person.name == 'John'

count , isempty , delete , update

You can count records in a set:

count

isempty

>>> db(db.person.name != 'William').count()
3

Notice that count takes an optional distinct argument, which defaults to False, and works very much like the same argument of select. count also has a cache argument that works very much like the equivalent argument of the select method.

Sometimes you may need to check if a table is empty. A more efficient way than counting is using the isempty method:

>>> db(db.person).isempty()
False

You can delete records in a set:

delete

>>> db(db.person.id > 3).delete()
0

The delete method returns the number of records that were deleted.

And you can update all records in a set by passing named arguments corresponding to the fields that need to be updated:

update

>>> db(db.person.id > 2).update(name='Ken')
1

The update method returns the number of records that were updated.

Expressions

The value assigned in an update statement can be an expression. For example, consider this model:

db.define_table('person',
    Field('name'),
    Field('visits', 'integer', default=0))
db(db.person.name == 'Massimo').update(visits=db.person.visits + 1)

The values used in queries can also be expressions:

db.define_table('person',
    Field('name'),
    Field('visits', 'integer', default=0),
    Field('clicks', 'integer', default=0))
db(db.person.visits == db.person.clicks + 1).delete()

case

case

An expression can contain a case clause, for example:

>>> condition = db.person.name.startswith('B')
>>> yes_or_no = condition.case('Yes', 'No')
>>> for row in db().select(db.person.name, yes_or_no):
...     print row.person.name, row[yes_or_no]  # could be row(yes_or_no) too
...
Alex No
Bob Yes
Ken No

update_record

update_record

web2py also allows updating a single record that is already in memory using update_record:

>>> row = db(db.person.id == 2).select().first()
>>> row.update_record(name='Curt')
<Row {'id': 2L, 'name': 'Curt'}>

update_record should not be confused with

>>> row.update(name='Curt')

because for a single row, the method update updates the row object but not the database record, unlike update_record.

It is also possible to change the attributes of a row (one at a time) and then call update_record() without arguments to save the changes:

>>> row = db(db.person.id > 2).select().first()
>>> row.name = 'Philip'
>>> row.update_record()  # saves above change
<Row {'id': 3L, 'name': 'Philip'}>

Note, you should avoid using row.update_record() with no arguments when the row object contains fields that have an update attribute (e.g., Field('modified_on', update=request.now) ). Calling row.update_record() will retain all of the existing values in the row object, so any fields with update attributes will have no effect in this case. Be particularly mindful of this with tables that include auth.signature .

The update_record method is available only if the table's id field is included in the select, and cacheable is not set to True .

Inserting and updating from a dictionary

A common issue consists of needing to insert or update records in a table where the name of the table, the field to be updated, and the value for the field are all stored in variables. For example: tablename , fieldname , and value .

The insert can be done using the following syntax:

db[tablename].insert(**{fieldname: value})

The update of a record with a given id can be done with:

_id

db(db[tablename]._id == id).update(**{fieldname: value})

Notice we used table._id instead of table.id. In this way the query works even for tables with a primary key field of a type other than "id".
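The **{fieldname: value} idiom is ordinary Python keyword-argument unpacking, nothing DAL-specific. A minimal sketch with a stand-in insert function (the function name is illustrative):

```python
# **{name: value} expands a dict into keyword arguments, so a field
# name held in a variable can be passed as if it were written literally.
def insert(**fields):
    # stand-in for db[tablename].insert: just echoes the fields back
    return dict(fields)

fieldname, value = "name", "Alex"
record = insert(**{fieldname: value})  # same as insert(name="Alex")
```

This is why the table name, field name, and value can all live in variables: the dict is built at runtime and unpacked at the call site.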

first and last

first

last

Given a Rows object containing records:

rows = db(query).select()
first_row = rows.first()
last_row = rows.last()

are equivalent to

first_row = rows[0] if len(rows) else None
last_row = rows[-1] if len(rows) else None

Notice that first() and last() return the first and last record present in the result of your query, which is not necessarily the first or last record inserted. If you want the first or last record entered into a given table, remember to use orderby=db.table_name.id; otherwise you will get the first and last record returned by your query, which are often in an arbitrary order determined by the backend query optimiser.

as_dict and as_list

as_list

as_dict

A Row object can be serialized into a regular dictionary using the as_dict() method and a Rows object can be serialized into a list of dictionaries using the as_list() method. Here are some examples:

rows = db(query).select()
rows_list = rows.as_list()
first_row_dict = rows.first().as_dict()

These methods are convenient for passing Rows to generic views and/or for storing Rows in sessions (since Rows objects themselves cannot be serialized because they contain a reference to an open DB connection):

rows = db(query).select()
session.rows = rows  # not allowed!
session.rows = rows.as_list()  # allowed!

Combining rows

Rows objects can be combined at the Python level. Here we assume:

>>> print rows1
person.name
Max
Tim
>>> print rows2
person.name
John
Tim

You can do a union of the records in two sets of rows:

>>> rows3 = rows1 + rows2
>>> print rows3
person.name
Max
Tim
John
Tim

You can do a union of the records, removing duplicates:

>>> rows3 = rows1 | rows2
>>> print rows3
person.name
Max
Tim
John

You can do an intersection of the records in two sets of rows:

>>> rows3 = rows1 & rows2
>>> print rows3
person.name
Tim
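These three combinations can be sketched on plain Python lists (a stand-in for Rows, with the helper names chosen for the example):

```python
# "+" concatenates, "|" unions dropping duplicates (keeping first-seen
# order), "&" keeps only records present in both sets.
rows1 = ["Max", "Tim"]
rows2 = ["John", "Tim"]

concat = rows1 + rows2            # duplicates are kept

def union(a, b):
    seen, out = set(), []
    for name in a + b:
        if name not in seen:      # drop later duplicates
            seen.add(name)
            out.append(name)
    return out

def intersection(a, b):
    return [name for name in a if name in b]
```

With the data above, union gives Max, Tim, John and intersection gives just Tim, matching the Rows examples.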

find , exclude , sort

find

exclude

sort

Sometimes you need to perform two selects and one contains a subset of a previous select. In this case it is pointless to access the database again. The find, exclude and sort methods allow you to manipulate a Rows object and generate another one without accessing the database. More specifically:

find returns a new set of Rows filtered by a condition and leaves the original unchanged.

exclude returns a new set of Rows filtered by a condition and removes them from the original Rows.

sort returns a new set of Rows sorted by a condition and leaves the original unchanged.

All these methods take a single argument, a function that acts on each individual row.

Here is an example of usage:

>>> db.define_table('person', Field('name'))
<Table person (id, name)>
>>> db.person.insert(name='John')
1
>>> db.person.insert(name='Max')
2
>>> db.person.insert(name='Alex')
3
>>> rows = db(db.person).select()
>>> for row in rows.find(lambda row: row.name[0] == 'M'):
...     print row.name
...
Max
>>> len(rows)
3
>>> for row in rows.exclude(lambda row: row.name[0] == 'M'):
...     print row.name
...
Max
>>> len(rows)
2
>>> for row in rows.sort(lambda row: row.name):
...     print row.name
...
Alex
John

They can be combined:

>>> rows = db(db.person).select()
>>> rows = rows.find(lambda row: 'x' in row.name).sort(lambda row: row.name)
>>> for row in rows:
...     print row.name
...
Alex
Max

sort takes an optional argument reverse=True with the obvious meaning.

The find method has an optional limitby argument with the same syntax and functionality as the Set select method.
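The semantics of the three methods can be sketched on a plain list of dicts (stand-in functions, not the Rows implementation): as in the DAL, exclude mutates the original while find and sort leave it unchanged.

```python
def find(rows, cond):
    # new filtered list; original untouched
    return [r for r in rows if cond(r)]

def exclude(rows, cond):
    # returns the removed records and shrinks the original in place
    removed = [r for r in rows if cond(r)]
    rows[:] = [r for r in rows if not cond(r)]
    return removed

def sort(rows, key, reverse=False):
    # new sorted list; original untouched
    return sorted(rows, key=key, reverse=reverse)

people = [{"name": "John"}, {"name": "Max"}, {"name": "Alex"}]
starts_m = find(people, lambda r: r["name"][0] == "M")
```

After find, people still has three entries; after exclude with the same condition it would shrink to two, mirroring the len(rows) values in the transcript above.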

Other methods

update_or_insert

update_or_insert

Sometimes you need to perform an insert only if there is no record with the same values as those being inserted. This can be done with:

db.define_table('person',
    Field('name'),
    Field('birthplace'))
db.person.update_or_insert(name='John', birthplace='Chicago')

The record will be inserted only if there is no other user called John born in Chicago.

You can specify which values to use as a key to determine if the record exists. For example:

db.person.update_or_insert(db.person.name == 'John',
                           name='John', birthplace='Chicago')

and if there is a John, his birthplace will be updated, else a new record will be created.

The selection criterion in the example above is a single field. It can also be a query, such as:

db.person.update_or_insert((db.person.name == 'John') &
                           (db.person.birthplace == 'Chicago'),
                           name='John', birthplace='Chicago', pet='Rover')
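The keyed variant (update if a match exists, otherwise insert) can be sketched by hand over sqlite3; this is an illustrative re-implementation of the idea, not the DAL's actual code path:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, "
             "name TEXT, birthplace TEXT)")

def update_or_insert(conn, name, birthplace):
    # try the update first; rowcount tells us whether anything matched
    cur = conn.execute("UPDATE person SET birthplace = ? WHERE name = ?",
                       (birthplace, name))
    if cur.rowcount == 0:        # no matching record: insert instead
        conn.execute("INSERT INTO person (name, birthplace) VALUES (?, ?)",
                     (name, birthplace))

update_or_insert(conn, "John", "Chicago")   # inserts
update_or_insert(conn, "John", "Boston")    # updates, no duplicate
```

After both calls there is exactly one John, with the later birthplace, which is the behaviour the DAL provides in a single method call.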

validate_and_insert , validate_and_update

validate_and_insert

validate_and_update

The function

ret = db.mytable.validate_and_insert(field='value')

works very much like

id = db.mytable.insert(field='value')

except that it calls the validators for the fields before performing the insert and bails out if the validation does not pass. If validation fails, the errors can be found in ret.errors. ret.errors holds a key-value mapping where each key is the name of the field whose validation failed and the value is the resulting validation error (much like form.errors). If validation passes, the id of the new record is in ret.id. Mind that normally validation is done by the form processing logic, so this function is rarely needed.

Similarly

ret = db(query).validate_and_update(field='value')

works very much the same as

num = db(query).update(field='value')

except that it calls the validators for the fields before performing the update. Notice that it only works if query involves a single table. The number of updated records can be found in ret.updated and errors will be in ret.errors .

smart_query (experimental)

There are times when you need to parse a query expressed in natural language, such as:

name contains m and age greater than 18

The DAL provides a method to parse this type of queries:

search = 'name contains m and age greater than 18'
rows = db.smart_query([db.person], search).select()

The first argument must be a list of tables or fields that should be allowed in the search. It raises a RuntimeError if the search string is invalid. This functionality can be used to build RESTful interfaces (see chapter 10) and it is used internally by the SQLFORM.grid and SQLFORM.smartgrid .

In the smart_query search string, a field can be identified by fieldname alone or by tablename.fieldname. Strings may be delimited by double quotes if they contain spaces.

Computed fields

compute

DAL fields may have a compute attribute. This must be a function (or lambda) that takes a Row object and returns a value for the field. When a record is modified, including both insertions and updates, if a value for the field is not provided, web2py tries to compute it from the other field values using the compute function. Here is an example:

>>> db.define_table('item',
...     Field('unit_price', 'double'),
...     Field('quantity', 'integer'),
...     Field('total_price',
...           compute=lambda r: r['unit_price'] * r['quantity']))
<Table item (id, unit_price, quantity, total_price)>
>>> rid = db.item.insert(unit_price=1.99, quantity=5)
>>> db.item[rid]
<Row {'total_price': '9.95', 'unit_price': 1.99, 'id': 1L, 'quantity': 5L}>

Notice that the computed value is stored in the db and it is not computed on retrieval, as in the case of virtual fields, described next. Two typical applications of computed fields are:

in wiki applications, to store the processed input wiki text as HTML, to avoid re-processing on every request

for searching, to compute normalized values for a field, to be used for searching.

Computed fields are evaluated in the order in which they are defined in the table definition. A computed field can refer to previously defined computed fields (new after v 2.5.1)
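The mechanism can be sketched in a few lines of plain Python (an illustrative stand-in for the insert path, not web2py's implementation):

```python
# On insert, any field with a compute function that was not given a
# value is filled in by calling that function on the record.
computes = {
    "total_price": lambda r: r["unit_price"] * r["quantity"],
}

def insert_with_compute(record):
    for field, fn in computes.items():
        if field not in record:
            record[field] = fn(record)  # computed once, stored with the row
    return record

row = insert_with_compute({"unit_price": 1.99, "quantity": 5})
```

The computed value is stored alongside the record, just as the DAL stores total_price in the database rather than recomputing it on every read.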

Virtual fields

virtual fields

Virtual fields are also computed fields (as in the previous subsection), but they differ in that they are not stored in the db and are computed each time records are extracted from the database. They can be used to simplify the user's code without using additional storage, but they cannot be used for searching.

New style virtual fields (experimental)

web2py provides a new and easier way to define virtual fields and lazy virtual fields. This section is marked experimental because the APIs may still change a little from what is described here.

Here we will consider the same example as in the previous subsection. In particular we consider the following model:

db.define_table('item',
    Field('unit_price', 'double'),
    Field('quantity', 'integer'))

One can define a total_price virtual field as

db.item.total_price = Field.Virtual(
    lambda row: row.item.unit_price * row.item.quantity)

i.e. by simply defining a new field total_price to be a Field.Virtual . The only argument of the constructor is a function that takes a row and returns the computed values.

A virtual field defined as the one above is automatically computed for all records when the records are selected:

for row in db(db.item).select():
    print row.total_price

It is also possible to define method fields which are calculated on-demand, when called. For example:

db.item.discounted_total = Field.Method(
    lambda row, discount=0.0:
        row.item.unit_price * row.item.quantity * (100.0 - discount) / 100)

In this case row.discounted_total is not a value but a function. The function takes the same arguments as the function passed to the Method constructor except for row which is implicit (think of it as self for objects).

The lazy field in the example above allows one to compute the total price for each item :

for row in db(db.item).select():
    print row.discounted_total()

And it also allows one to pass an optional discount percentage (say 15%):

for row in db(db.item).select():
    print row.discounted_total(15)

Virtual and Method fields can also be defined in place when a table is defined:

db.define_table('item',
    Field('unit_price', 'double'),
    Field('quantity', 'integer'),
    Field.Virtual('total_price', lambda row: ...),
    Field.Method('discounted_total', lambda row, discount=0.0: ...))

Mind that virtual fields do not have the same attributes as regular fields (length, default, required, etc). They do not appear in the list of db.table.fields and in older versions of web2py they require a special approach to display in SQLFORM.grid and SQLFORM.smartgrid. See the discussion on grids and virtual fields in Chapter 7.

Old style virtual fields

In order to define one or more virtual fields, you can also define a container class, instantiate it and link it to a table or to a select. For example, consider the following table:

db.define_table('item',
    Field('unit_price', 'double'),
    Field('quantity', 'integer'))

One can define a total_price virtual field as

class MyVirtualFields(object):
    def total_price(self):
        return self.item.unit_price * self.item.quantity

db.item.virtualfields.append(MyVirtualFields())

Notice that each method of the class that takes a single argument (self) is a new virtual field. self refers to each row of the select. Field values are referred to by full path, as in self.item.unit_price. The table is linked to the virtual fields by appending an instance of the class to the table's virtualfields attribute.

Virtual fields can also access recursive fields, as in:

db.define_table('item',
    Field('unit_price', 'double'))
db.define_table('order_item',
    Field('item', 'reference item'),
    Field('quantity', 'integer'))

class MyVirtualFields(object):
    def total_price(self):
        return self.order_item.item.unit_price * self.order_item.quantity

db.order_item.virtualfields.append(MyVirtualFields())

Notice the recursive field access self.order_item.item.unit_price where self is the looping record.

They can also act on the result of a JOIN:

rows = db(db.order_item.item == db.item.id).select()

class MyVirtualFields(object):
    def total_price(self):
        return self.item.unit_price * self.order_item.quantity

rows.setvirtualfields(order_item=MyVirtualFields())
for row in rows:
    print row.order_item.total_price

Notice how in this case the syntax is different. The virtual field accesses both self.item.unit_price and self.order_item.quantity which belong to the join select. The virtual field is attached to the rows of the table using the setvirtualfields method of the rows object. This method takes an arbitrary number of named arguments and can be used to set multiple virtual fields, defined in multiple classes, and attach them to multiple tables:

class MyVirtualFields1(object):
    def discounted_unit_price(self):
        return self.item.unit_price * 0.90

class MyVirtualFields2(object):
    def total_price(self):
        return self.item.unit_price * self.order_item.quantity
    def discounted_total_price(self):
        return self.item.discounted_unit_price * self.order_item.quantity

rows.setvirtualfields(item=MyVirtualFields1(),
                      order_item=MyVirtualFields2())
for row in rows:
    print row.order_item.discounted_total_price

Virtual fields can be lazy; all they need to do is return a function, and they are then accessed by calling that function:

db.define_table('item',
    Field('unit_price', 'double'),
    Field('quantity', 'integer'))

class MyVirtualFields(object):
    def lazy_total_price(self):
        def lazy(self=self):
            return self.item.unit_price * self.item.quantity
        return lazy

db.item.virtualfields.append(MyVirtualFields())

for item in db(db.item).select():
    print item.lazy_total_price()

or, shorter, using a lambda function:

class MyVirtualFields(object):
    def lazy_total_price(self):
        return lambda self=self: self.item.unit_price * self.item.quantity

One to many relation

one to many

To illustrate how to implement one to many relations with the DAL, define another table "thing" that refers to the table "person" which we redefine here:

>>> db.define_table('person',
...                 Field('name'))
<Table person (id, name)>
>>> db.person.insert(name='Alex')
1
>>> db.person.insert(name='Bob')
2
>>> db.person.insert(name='Carl')
3
>>> db.define_table('thing',
...                 Field('name'),
...                 Field('owner_id', 'reference person'))
<Table thing (id, name, owner_id)>

Table "thing" has two fields, the name of the thing and the owner of the thing. The "owner_id" field is a reference field: it is intended that the field reference the other table by its id. A reference type can be specified in two equivalent ways, either:

Field('owner_id', 'reference person')

or:

Field('owner_id', db.person)

The latter is always converted to the former. They are equivalent except in the case of lazy tables, self references or other types of cyclic references where the former notation is the only allowed notation.

Now, insert three things, two owned by Alex and one by Bob:

>>> db.thing.insert(name='Boat', owner_id=1)
1
>>> db.thing.insert(name='Chair', owner_id=1)
2
>>> db.thing.insert(name='Shoes', owner_id=2)
3

You can select as you did for any other table:

>>> for row in db(db.thing.owner_id == 1).select():
...     print row.name
...
Boat
Chair

Because a thing has a reference to a person, a person can have many things. A record of table person thus acquires a new attribute, thing, which is a Set that defines the things of that person. This allows looping over all persons and fetching their things easily:

referencing

>>> for person in db().select(db.person.ALL):
...     print person.name
...     for thing in person.thing.select():
...         print ' ', thing.name
...
Alex
  Boat
  Chair
Bob
  Shoes
Carl

Inner joins

Another way to achieve a similar result is by using a join, specifically an INNER JOIN. web2py performs joins automatically and transparently when the query links two or more tables as in the following example:

Rows

inner join

join

>>> rows = db(db.person.id == db.thing.owner_id).select()
>>> for row in rows:
...     print row.person.name, 'has', row.thing.name
...
Alex has Boat
Alex has Chair
Bob has Shoes

Observe that web2py did a join, so each row now contains two records, one from each table, linked together. Because the two records may have fields with conflicting names, you need to specify the table when extracting a field value from a row. This means that while before you could do:

row.name

and it was obvious whether this was the name of a person or a thing, in the result of a join you have to be more explicit and say:

row.person.name

or:

row.thing.name

There is an alternative syntax for INNER JOINs:

>>> rows = db(db.person).select(
...     join=db.thing.on(db.person.id == db.thing.owner_id))
>>> for row in rows:
...     print row.person.name, 'has', row.thing.name
...
Alex has Boat
Alex has Chair
Bob has Shoes

While the output is the same, the generated SQL in the two cases can be different. The latter syntax removes possible ambiguities when the same table is joined twice and aliased:

db.define_table('thing',
    Field('name'),
    Field('owner_id1', 'reference person'),
    Field('owner_id2', 'reference person'))

rows = db(db.person).select(
    join=[db.person.with_alias('owner_id1').on(db.person.id == db.thing.owner_id1),
          db.person.with_alias('owner_id2').on(db.person.id == db.thing.owner_id2)])

The value of the join argument can be a single db.table.on(...) expression or a list of them, one per table to join.
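At the SQL level, aliasing resolves the ambiguity by giving each copy of the joined table its own name. The following sketch, using sqlite3 and hypothetical data, shows roughly the query such an aliased double join produces; the SQL and data here are illustrative, not the DAL's literal output:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name CHAR(512));
    CREATE TABLE thing (id INTEGER PRIMARY KEY, name CHAR(512),
                        owner_id1 INTEGER, owner_id2 INTEGER);
    INSERT INTO person VALUES (1, 'Alex'), (2, 'Bob');
    INSERT INTO thing VALUES (1, 'Boat', 1, 2);
""")
# person is joined twice, once per reference field, each copy aliased,
# so the two ON conditions are unambiguous.
rows = conn.execute("""
    SELECT thing.name, owner_id1.name, owner_id2.name
    FROM thing
    JOIN person AS owner_id1 ON owner_id1.id = thing.owner_id1
    JOIN person AS owner_id2 ON owner_id2.id = thing.owner_id2
""").fetchall()
print(rows)  # [('Boat', 'Alex', 'Bob')]
```

Without the aliases, a condition such as person.id = thing.owner_id1 could not distinguish which copy of person it constrains.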

Left outer join


Notice that Carl did not appear in the list above because he has no things. If you intend to select on persons (whether they have things or not) and their things (if they have any), then you need to perform a LEFT OUTER JOIN. This is done using the argument "left" of the select. Here is an example:

>>> rows = db().select(db.person.ALL, db.thing.ALL,
...     left=db.thing.on(db.person.id == db.thing.owner_id))
>>> for row in rows:
...     print row.person.name, 'has', row.thing.name
...
Alex has Boat
Alex has Chair
Bob has Shoes
Carl has None

where:

left=db.thing.on(...)

does the left join query. Here the argument of db.thing.on is the condition required for the join (the same one used above for the inner join). In the case of a left join, it is necessary to be explicit about which fields to select.

Multiple left joins can be combined by passing a list or tuple of db.mytable.on(...) to the left parameter.
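To see why the left join brings Carl back, here is a sketch of approximately the SQL the DAL generates for this select, run directly with sqlite3 on the same data as above (the exact SQL text varies by back end; the ORDER BY is added here only to make the output deterministic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name CHAR(512));
    CREATE TABLE thing (id INTEGER PRIMARY KEY, name CHAR(512),
                        owner_id INTEGER REFERENCES person (id));
    INSERT INTO person VALUES (1, 'Alex'), (2, 'Bob'), (3, 'Carl');
    INSERT INTO thing VALUES (1, 'Boat', 1), (2, 'Chair', 1), (3, 'Shoes', 2);
""")
# LEFT OUTER JOIN keeps every person, padding missing things with NULL
rows = conn.execute("""
    SELECT person.name, thing.name
    FROM person
    LEFT OUTER JOIN thing ON person.id = thing.owner_id
    ORDER BY person.id, thing.id
""").fetchall()
print(rows)
# [('Alex', 'Boat'), ('Alex', 'Chair'), ('Bob', 'Shoes'), ('Carl', None)]
```

The NULL in the last row is what the DAL presents as "Carl has None" in the example above.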

Grouping and counting

When doing joins, sometimes you want to group rows according to certain criteria and count them. For example, count the number of things owned by every person. web2py allows this as well. First, you need a count operator. Second, you want to join the person table with the thing table by owner. Third, you want to select all rows (person + thing), group them by person, and count them while grouping:

>>> count = db.person.id.count()
>>> for row in db(db.person.id == db.thing.owner_id).select(
...         db.person.name, count, groupby=db.person.name):
...     print row.person.name, row[count]
...
Alex 2
Bob 1

Notice that the count operator (which is built-in) is used as a field. The only issue here is how to retrieve the information. Each row clearly contains a person and the count, but the count is not a field of a person, nor is it a table. So where does it go? It goes into the Storage object representing the record, with a key equal to the query expression itself.

The count method of the Field object has an optional distinct argument. When set to True it specifies that only distinct values of the field in question are to be counted.
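The grouped count corresponds roughly to SQL like the following, sketched here with sqlite3 on the same data as the earlier examples (the ORDER BY is added only for deterministic output; with distinct=True the DAL would emit COUNT(DISTINCT ...) instead of COUNT(...)):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name CHAR(512));
    CREATE TABLE thing (id INTEGER PRIMARY KEY, name CHAR(512),
                        owner_id INTEGER REFERENCES person (id));
    INSERT INTO person VALUES (1, 'Alex'), (2, 'Bob'), (3, 'Carl');
    INSERT INTO thing VALUES (1, 'Boat', 1), (2, 'Chair', 1), (3, 'Shoes', 2);
""")
# Inner join person with thing, group by person, count rows per group.
# Carl owns nothing, so he produces no group at all.
rows = conn.execute("""
    SELECT person.name, COUNT(person.id)
    FROM person
    JOIN thing ON person.id = thing.owner_id
    GROUP BY person.name
    ORDER BY person.name
""").fetchall()
print(rows)  # [('Alex', 2), ('Bob', 1)]
```

Because this is an inner join, persons with no things simply do not appear; combining GROUP BY with a left join would be needed to report a zero count for Carl.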

Many to many

In the previous examples, we allowed a thing to have one owner but one person could h