13 min read

Over 70 simple but incredibly effective recipes for taking control of automated testing using powerful Python testing tools

Introduction

A coverage analyzer can be used while running a system in production, but what are the pros and cons, if we used it this way? What about using a coverage analyzer when running test suites? What benefits would this approach provide compared to checking systems in production?

Coverage helps us to see if we are adequately testing our system. But it must be performed with a certain amount of skepticism. This is because, even if we achieve 100 percent coverage, meaning every line of our system was exercised, in no way does this guarantee us having no bugs. A quick example involves a code we write and what it processes is the return value from a system call. What if there are three possible values, but we only handle two of them? We may write two test cases covering our handling of it, and this could certainly achieve 100 percent statement coverage. However, it doesn’t mean we have handled the third possible return value; thus, leaving us with a potentially undiscovered bug. 100 percent code coverage can also be obtained by condition coverage but may not be achieved with statement coverage. The kind of coverage we are planning to target should be clear.

Another key point is that not all testing is aimed at bug fixing. Another key purpose is to make sure that the application meets our customer’s needs. This means that, even if we have 100 percent code coverage, we can’t guarantee that we are covering all the scenarios expected by our users. This is the difference between ‘building it right’ and ‘building the right thing’.

In this article, we will explore various recipes to build a network management application, run coverage tools, and harvest the results. We will discuss how coverage can introduce noise, and show us more than we need to know, as well as introduce performance issues when it instruments our code. We will also see how to trim out information we don’t need to get a concise, targeted view of things.

This article uses several third-party tools in many recipes.

Spring Python (http://springpython.webfactional.com) contains many useful abstractions. The one used in this article is its DatabaseTemplate, which offers easy ways to write SQL queries and updates without having to deal with Python’s verbose API. Install it by typing pip install springpython .

. Install the coverage tool by typing pip install coverage . This may fail because other plugins may install an older version of coverage. If so, uninstall coverage by typing pip uninstall coverage , and then install it again with pip install coverage .

. This may fail because other plugins may install an older version of coverage. If so, uninstall coverage by typing , and then install it again with . Nose is a useful test runner.

Building a network management application

For this article, we will build a very simple network management application, and then write different types of tests and check their coverage. This network management application is focused on digesting alarms, also referred to as network events. This is different from certain other network management tools that focus on gathering SNMP alarms from devices.

For reasons of simplicity, this correlation engine doesn’t contain complex rules, but instead contains simple mapping of network events onto equipment and customer service inventory. We’ll explore this in the next few paragraphs as we dig through the code.

How to do it…

With the following steps, we will build a simple network management application.

Create a file called network.py to store the network application. Create a class definition to represent a network event. class Event(object):

def __init__(self, hostname, condition, severity, event_time):

self.hostname = hostname

self.condition = condition

self.severity = severity

self.id = -1

def __str__(self):

return “(ID:%s) %s:%s – %s” % (self.id, self.hostname,

self.condition, self.severity) hostname : It is assumed that all network alarms originate from pieces of equipment that have a hostname.

: It is assumed that all network alarms originate from pieces of equipment that have a hostname. condition : Indicates the type of alarm being generated. Two different alarming conditions can come from the same device.

: Indicates the type of alarm being generated. Two different alarming conditions can come from the same device. severity : 1 indicates a clear, green status; and 5 indicates a faulty, red status.

: indicates a clear, green status; and indicates a faulty, red status. id: The primary key value used when the event is stored in a database. Create a new file called network.sql to contain the SQL code. Create a SQL script that sets up the database and adds the definition for storing network events. CREATE TABLE EVENTS (

ID INTEGER PRIMARY KEY,

HOST_NAME TEXT,

SEVERITY INTEGER,

EVENT_CONDITION TEXT

); Code a high-level algorithm where events are assessed for impact to equipment and customer services and add it to network.py. from springpython.database.core import*

class EventCorrelator(object):

def __init__(self, factory):

self.dt = DatabaseTemplate(factory)

def __del__(self):

del(self.dt)

def process(self, event):

stored_event, is_active = self.store_event(event)

affected_services, affected_equip = self.impact(event)

updated_services = [

self.update_service(service, event)

for service in affected_services] updated_equipment = [

self.update_equipment(equip, event)

for equip in affected_equip] return (stored_event, is_active, updated_services,

updated_equipment) The __init__ method contains some setup code to create a DatabaseTemplate. This is a Spring Python utility class used for database operations. See http://static.springsource.org/spring- python/1.2.x/sphinx/html/dao.html for more details. We are also using sqlite3 as our database engine, since it is a standard part of Python. The process method contains some simple steps to process an incoming event. We first need to store the event in the EVENTS table. This includes evaluating whether or not it is an active event, meaning that it is actively impacting a piece of equipment.

table. This includes evaluating whether or not it is an active event, meaning that it is actively impacting a piece of equipment. Then we determine what equipment and what services the event impacts.

Next, we update the affected services by determining whether it causes any service outages or restorations.

Then we update the affected equipment by determining whether it fails or clears a device.

Finally, we return a tuple containing all the affected assets to support any screen interfaces that could be developed on top of this. Implement the store_event algorithm. def store_event(self, event):

try:

max_id = self.dt.query_for_int(“””select max(ID)

from EVENTS”””)

except DataAccessException, e:

max_id = 0

event.id = max_id+1

self.dt.update(“””insert into EVENTS

(ID, HOST_NAME, SEVERITY,

EVENT_CONDITION)

values

(?,?,?,?)”””,

(event.id, event.hostname,

event.severity, event.condition))

is_active =

self.add_or_remove_from_active_events(event)

return (event, is_active) This method stores every event that is processed. This supports many things including data mining and post mortem analysis of outages. It is also the authoritative place where other event-related data can point back using a foreign key. The store_event method looks up the maximum primary key value from the EVENTS table.

method looks up the maximum primary key value from the table. It increments it by one.

It assigns it to event.id .

. It then inserts it into the EVENTS table.

table. Next, it calls a method to evaluate whether or not the event should be add to the list of active events, or if it clears out existing active events. Active events are events that are actively causing a piece of equipment to be unclear.

Finally, it returns a tuple containing the event and whether or not it was classified as an active event. For a more sophisticated system, some sort of partitioning solution needs to be implemented. Querying against a table containing millions of rows is very inefficient. However, this is for demonstration purposes only, so we will skip scaling as well as performance and security. Implement the method to evaluate whether to add or remove active events. def add_or_remove_from_active_events(self, event):

“””Active events are current ones that cause equipment

and/or services to be down.”””

if event.severity == 1:

self.dt.update(“””delete from ACTIVE_EVENTS

where EVENT_FK in (

select ID

from EVENTS

where HOST_NAME = ?

and EVENT_CONDITION = ?)”””,

(event.hostname,event.condition))

return False

else:

self.dt.execute(“””insert into ACTIVE_EVENTS

(EVENT_FK) values (?)”””,

(event.id,))

return True When a device fails, it sends a severity 5 event. This is an active event and in this method, a row is inserted into the ACTIVE_EVENTS table, with a foreign key pointing back to the EVENTS table. Then we return back True, indicating this is an active event. Add the table definition for ACTIVE_EVENTS to the SQL script. CREATE TABLE ACTIVE_EVENTS (

ID INTEGER PRIMARY KEY,

EVENT_FK,

FOREIGN KEY(EVENT_FK) REFERENCES EVENTS(ID)

); This table makes it easy to query what events are currently causing equipment failures.

Later, when the failing condition on the device clears, it sends a severity 1 event. This means that severity 1 events are never active, since they aren’t contributing to a piece of equipment being down. In our previous method, we search for any active events that have the same hostname and condition, and delete them. Then we return False, indicating this is not an active event. Write the method that evaluates the services and pieces of equipment that are affected by the network event. def impact(self, event):

“””Look up this event has impact on either equipment

or services.”””

affected_equipment = self.dt.query(

“””select * from EQUIPMENT

where HOST_NAME = ?”””,

(event.hostname,),

rowhandler=DictionaryRowMapper())

affected_services = self.dt.query(

“””select SERVICE.*

from SERVICE

join SERVICE_MAPPING SM

on (SERVICE.ID = SM.SERVICE_FK)

join EQUIPMENT

on (SM.EQUIPMENT_FK = EQUIPMENT.ID

where EQUIPMENT.HOST_NAME = ?”””,

(event.hostname,),

rowhandler=DictionaryRowMapper())

return (affected_services, affected_equipment) We first query the EQUIPMENT table to see if event .hostname matches anything.

table to see if event matches anything. Next, we join the SERVICE table to the EQUIPMENT table through a many-to many relationship tracked by the SERVICE_MAPPING table. Any service that is related to the equipment that the event was reported on is captured.

table to the table through a many-to many relationship tracked by the table. Any service that is related to the equipment that the event was reported on is captured. Finally, we return a tuple containing both the list of equipment and list of services that are potentially impacted. Spring Python provides a convenient query operation that returns a list of objects mapped to every row of the query. It also provides an out-of-the-box DictionaryRowMapper that converts each row into a Python dictionary, with the keys matching the column names. Add the table definitions to the SQL script for EQUIPMENT, SERVICE, and SERVICE_MAPPING. CREATE TABLE EQUIPMENT (

ID INTEGER PRIMARY KEY,

HOST_NAME TEXT UNIQUE,

STATUS INTEGER

);

CREATE TABLE SERVICE (

ID INTEGER PRIMARY KEY,

NAME TEXT UNIQUE,

STATUS TEXT

);

CREATE TABLE SERVICE_MAPPING (

ID INTEGER PRIMARY KEY,

SERVICE_FK,

EQUIPMENT_FK,

FOREIGN KEY(SERVICE_FK) REFERENCES SERVICE(ID),

FOREIGN KEY(EQUIPMENT_FK) REFERENCES EQUIPMENT(ID)

); Write the update_service method that stores or clears service-related even and then updates the service’s status based on the remaining active events. def update_service(self, service, event):

if event.severity == 1:

self.dt.update(“””delete from SERVICE_EVENTS

where EVENT_FK in (

select ID

from EVENTS

where HOST_NAME = ?

and EVENT_CONDITION = ?)”””,

(event.hostname,event.condition))

else:

self.dt.execute(“””insert into SERVICE_EVENTS

(EVENT_FK, SERVICE_FK)

values (?,?)”””,

(event.id,service[“ID”]))

try:

max = self.dt.query_for_int(

“””select max(EVENTS.SEVERITY)

from SERVICE_EVENTS SE

join EVENTS

on (EVENTS.ID = SE.EVENT_FK)

join SERVICE

on (SERVICE.ID = SE.SERVICE_FK)

where SERVICE.NAME = ?”””,

(service[“NAME”],))

except DataAccessException, e:

max = 1

if max > 1 and service[“STATUS”] == “Operational”:

service[“STATUS”] = “Outage”

self.dt.update(“””update SERVICE

set STATUS = ?

where ID = ?”””,

(service[“STATUS”], service[“ID”]))

if max == 1 and service[“STATUS”] == “Outage”:

service[“STATUS”] = “Operational”

self.dt.update(“””update SERVICE

set STATUS = ?

where ID = ?”””,

(service[“STATUS”], service[“ID”]))

if event.severity == 1:

return {“service”:service, “is_active”:False}

else:

return {“service”:service, “is_active”:True} Service-related events are active events related to a service. A single event can be related to many services. For example, what if we were monitoring a wireless router that provided Internet service to a lot of users, and it reported a critical error? This one event would be mapped as an impact to all the end users. When a new active event is processed, it is stored in SERVICE_EVENTS for each related service.

Then, when a clearing event is processed, the previous service event must be deleted from the SERVICE_EVENTS table. Add the table defnition for SERVICE_EVENTS to the SQL script. CREATE TABLE SERVICE_EVENTS (

ID INTEGER PRIMARY KEY,

SERVICE_FK,

EVENT_FK,

FOREIGN KEY(SERVICE_FK) REFERENCES SERVICE(ID),

FOREIGN KEY(EVENT_FK) REFERENCES EVENTS(ID)

); It is important to recognize that deleting an entry from SERVICE_EVENTS doesn’t mean that we delete the original event from the EVENTS table. Instead, we are merely indicating that the original active event is no longer active and it does not impact the related service. Prepend the entire SQL script with drop statements, making it possible to run the script for several recipes DROP TABLE IF EXISTS SERVICE_MAPPING; DROP TABLE IF EXISTS SERVICE_EVENTS; DROP TABLE IF EXISTS ACTIVE_EVENTS; DROP TABLE IF EXISTS EQUIPMENT; DROP TABLE IF EXISTS SERVICE; DROP TABLE IF EXISTS EVENTS; Append the SQL script used for database setup with inserts to preload some equipment and services. INSERT into EQUIPMENT (ID, HOST_NAME, STATUS) values (1, 'pyhost1', 1); INSERT into EQUIPMENT (ID, HOST_NAME, STATUS) values (2, 'pyhost2', 1); INSERT into EQUIPMENT (ID, HOST_NAME, STATUS) values (3, 'pyhost3', 1); INSERT into SERVICE (ID, NAME, STATUS) values (1, 'service-abc', 'Operational'); INSERT into SERVICE (ID, NAME, STATUS) values (2, 'service-xyz', 'Outage'); INSERT into SERVICE_MAPPING (SERVICE_FK, EQUIPMENT_FK) values (1,1); INSERT into SERVICE_MAPPING (SERVICE_FK, EQUIPMENT_FK) values (1,2); INSERT into SERVICE_MAPPING (SERVICE_FK, EQUIPMENT_FK) values (2,1); INSERT into SERVICE_MAPPING (SERVICE_FK, EQUIPMENT_FK) values (2,3); Finally, write the method that updates equipment status based on the current active events. def update_equipment(self, equip, event):

try:

max = self.dt.query_for_int(

“””select max(EVENTS.SEVERITY)

from ACTIVE_EVENTS AE

join EVENTS

on (EVENTS.ID = AE.EVENT_FK)

where EVENTS.HOST_NAME = ?”””,

(event.hostname,))

except DataAccessException:

max = 1

if max != equip[“STATUS”]:

equip[“STATUS”] = max

self.dt.update(“””update EQUIPMENT

set STATUS = ?”””,

(equip[“STATUS”],))

return equip

Here, we need to find the maximum severity from the list of active events for a given host name. If there are no active events, then Spring Python raises a DataAccessException and we translate that to a severity of 1.

We check if this is different from the existing device’s status. If so, we issue a SQL update. Finally, we return the record for the device, with its status updated appropriately.

How it works…

This application uses a database-backed mechanism to process incoming network events, and checks them against the inventory of equipment and services to evaluate failures and restorations. Our application doesn’t handle specialized devices or unusual types of services. This real-world complexity has been traded in for a relatively simple application, which can be used to write various test recipes.

Events typically map to a single piece of equipment and to zero or more services. A service can be thought of as a string of equipment used to provide a type of service to the customer. New failing events are considered active until a clearing event arrives. Active events, when aggregated against a piece of equipment, define its current status. Active events, when aggregated against a service, defines the service’s current status.