I was recently looking into using a neural network for a project, so I started exploring some of the available Python libraries. The one I ended up using was Pylearn2, a fast and powerful machine learning library that is mainly built upon Theano.

Pylearn2 is under development and still a bit rough around the edges. The documentation is limited and in some instances incorrect. The recommended way of using it is by writing YAML scripts, and if you are OK with that you can probably manage with the existing documentation. But if you, like me, want to use it as a standard Python library, you had better be prepared to read the code. One thing that would have saved me some time was a complete example of how to use Pylearn2 as a standalone library, so what follows is a simple example of creating a neural network that solves the XOR problem.

The XOR problem is stated as follows: create a neural network that, given two binary inputs (each 0 or 1), outputs 1 if exactly one of the inputs is 1 and 0 otherwise.
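To make the target function concrete, here is a small sketch in plain Python that lists all four input combinations. The helper name `xor` is my own and is not part of Pylearn2.

```python
def xor(a, b):
    """Return 1 if exactly one of the binary inputs a, b is 1, else 0."""
    return 1 if a + b == 1 else 0


# Enumerate all four possible input pairs with their expected output.
truth_table = [(a, b, xor(a, b)) for a in (0, 1) for b in (0, 1)]

for a, b, out in truth_table:
    print(a, b, "->", out)
```

This is the function the network is supposed to learn from examples.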

Pylearn2 has a dataset implementation that in its simplest form needs a collection of data points in a 2D NumPy array named X and a 2D array named y containing the expected answers. We can create a dataset by creating a new class that inherits from DenseDesignMatrix:

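# Imports used throughout this example
import numpy as np
import theano
from random import randint
from pylearn2.datasets.dense_design_matrix import DenseDesignMatrix
from pylearn2.models import mlp
from pylearn2.termination_criteria import EpochCounter
from pylearn2.training_algorithms import sgd

class XOR(DenseDesignMatrix):
    def __init__(self):
        self.class_names = ['0', '1']
        # 1000 random input pairs, each value either 0 or 1
        X = [[randint(0, 1), randint(0, 1)] for _ in range(1000)]
        y = []
        for a, b in X:
            if a + b == 1:
                y.append([0, 1])  # exactly one input is 1: output 1
            else:
                y.append([1, 0])  # otherwise: output 0
        X = np.array(X)
        y = np.array(y)
        super(XOR, self).__init__(X=X, y=y)

ds = XOR()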

Note that we are using two columns in the target variable y: a 1 in the first column signifies an output of 0, and a 1 in the second column signifies an output of 1.
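The two-column target is a one-hot encoding of the class label. As a rough illustration in plain Python (the helpers `one_hot` and `decode` are hypothetical names of my own, not Pylearn2 functions):

```python
def one_hot(label, n_classes=2):
    """Encode an integer class label as a row with a single 1."""
    row = [0] * n_classes
    row[label] = 1
    return row


def decode(row):
    """Recover the class label: the index of the largest entry."""
    return row.index(max(row))


# Label 0 maps to [1, 0] and label 1 maps to [0, 1],
# matching the columns of y in the dataset class.
targets = [one_hot(label) for label in (0, 1)]
print(targets)
```

Encoding the answer this way gives one target column per output node, which lines up with the two-node output layer used later.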

Next we need to create the layers in the neural net. To be able to solve the XOR problem we need a hidden layer with at least two neurons:

hidden_layer = mlp.Sigmoid(layer_name='hidden', dim=2, irange=.1, init_bias=1.)

The hidden layer uses a standard sigmoid activation function, and the weights are initialized in the range -0.1 to 0.1 (using the irange argument). The init_bias argument sets the initial bias of both neurons to 1.0.
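Here is a rough sketch of what this initialization amounts to, written in plain Python rather than Pylearn2. I am assuming the weights are drawn uniformly from the range; the function names `init_layer` and `hidden_activations` are made up for the illustration.

```python
import math
import random


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_inputs, n_units, irange, init_bias, rng):
    """Draw weights uniformly from [-irange, irange]; set every bias to init_bias."""
    weights = [[rng.uniform(-irange, irange) for _ in range(n_units)]
               for _ in range(n_inputs)]
    biases = [init_bias] * n_units
    return weights, biases


def hidden_activations(x, weights, biases):
    """Sigmoid of the weighted input sum plus bias, for each hidden unit."""
    return [sigmoid(sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j])
            for j in range(len(biases))]


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
weights, biases = init_layer(2, 2, irange=0.1, init_bias=1.0, rng=rng)
h = hidden_activations([0, 1], weights, biases)
print(h)
```

Because the weights start out small, the initial activations are dominated by the bias, so both hidden units start close to sigmoid(1.0) regardless of the input.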

We use a softmax layer with two nodes as the output layer. The output of each node is between 0 and 1, and the outputs of all nodes in the layer sum to 1, so they can be read as class probabilities.

output_layer = mlp.Softmax(2, 'output', irange=.1)
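For reference, the softmax function itself can be sketched in a few lines of plain Python. This is the textbook definition, not Pylearn2's implementation, and the input values are arbitrary.

```python
import math


def softmax(z):
    """Map a list of real numbers to positive values that sum to 1."""
    m = max(z)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([-2.0, 3.0])
print(probs, sum(probs))
```

The larger input gets the larger share, and the outputs always sum to 1.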

To train the network we use a Stochastic Gradient Descent (SGD) method which we initialize like this:

trainer = sgd.SGD(learning_rate=.05, batch_size=10, termination_criterion=EpochCounter(400))

We use a simple termination criterion that stops training after 400 epochs. More advanced termination criteria are of course available.
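To get a feel for how much work these settings imply, here is a back-of-the-envelope sketch. It assumes that one epoch is a single pass over all 1000 examples in batches of 10; `iter_batches` is a hypothetical helper, not a Pylearn2 function.

```python
def iter_batches(n_examples, batch_size):
    """Yield (start, stop) index pairs covering the dataset once."""
    for start in range(0, n_examples, batch_size):
        yield start, min(start + batch_size, n_examples)


n_examples = 1000   # size of the XOR dataset above
batch_size = 10     # as passed to the SGD trainer
max_epochs = 400    # as passed to EpochCounter

updates_per_epoch = sum(1 for _ in iter_batches(n_examples, batch_size))
total_updates = updates_per_epoch * max_epochs
print(updates_per_epoch, total_updates)
```

Under that assumption the weights are updated 100 times per epoch, for 40,000 updates in total.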

To initialize the neural network and set up the training, we do the following:

layers = [hidden_layer, output_layer]
ann = mlp.MLP(layers, nvis=2)
trainer.setup(ann, ds)

We put the layers in the Multi-Layer Perceptron (MLP) class, which takes two inputs (nvis=2), and then set up the trainer with the network and the dataset.
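To show what a network of this shape computes, and that two hidden neurons really are enough for XOR, here is a plain-Python forward pass with hand-picked weights. The weights are my own choice for illustration; they are not what Pylearn2 learns, and the code does not use Pylearn2 at all.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hand-picked (not learned) parameters for a 2-2-2 network.
HIDDEN_W = [[20.0, 20.0], [20.0, 20.0]]    # HIDDEN_W[i][j]: input i -> hidden j
HIDDEN_B = [-10.0, -30.0]                  # hidden 0 acts like OR, hidden 1 like AND
OUTPUT_W = [[-20.0, 20.0], [20.0, -20.0]]  # OUTPUT_W[j][k]: hidden j -> output k
OUTPUT_B = [10.0, -10.0]


def layer(x, w, b):
    """Weighted sum of the inputs plus bias for each unit in the layer."""
    return [sum(x[i] * w[i][k] for i in range(len(x))) + b[k]
            for k in range(len(b))]


def forward(x):
    """Sigmoid hidden layer followed by a softmax output layer."""
    hidden = [sigmoid(v) for v in layer(x, HIDDEN_W, HIDDEN_B)]
    return softmax(layer(hidden, OUTPUT_W, OUTPUT_B))


predictions = {}
for a in (0, 1):
    for b in (0, 1):
        probs = forward([a, b])
        predictions[(a, b)] = probs.index(max(probs))
print(predictions)
```

The second output node wins only when the OR-like unit is on and the AND-like unit is off, which is exactly the XOR condition. Training has to find weights that play a similar role on its own.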

We then train the neural network until the termination criterion is met:

while True:
    trainer.train(dataset=ds)
    ann.monitor.report_epoch()
    ann.monitor()
    if not trainer.continue_learning(ann):
        break

After the training is complete we of course want to test that it works. We do this by using the fprop method, which takes the inputs as Theano variables:

inputs = np.array([[0, 1]])
print(ann.fprop(theano.shared(inputs, name='inputs')).eval())

This should yield an answer like this:

[[ 0.00526688 0.99473312]]

This means that the network correctly predicts that the output should be 1, since almost all of the probability lands on the second node.
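If you want a plain 0 or 1 instead of two probabilities, one simple approach is to pick the index of the largest output. A minimal sketch, using the numbers printed above and a hypothetical helper `predict_class`:

```python
def predict_class(probabilities):
    """Return the index of the output node with the highest probability."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Output copied from the example run above (one row per input pair).
output = [[0.00526688, 0.99473312]]
predictions = [predict_class(row) for row in output]
print(predictions)
```

Index 1 corresponds to the second column of y, that is, an output of 1.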

See here for the complete source code of the example.