Data classification is a central data-mining technique used for sorting data, understanding data, and predicting outcomes. In this short blog post we will use Smile, a library that includes many methods for supervised and unsupervised data classification. We will write a small Python-like script in Jython to build a complex Multilayer Perceptron Neural Network for data classification. It will have a large number of inputs and several outputs, and it can easily be extended to cases with many hidden layers. The script takes only a few lines of Jython code (most of our coding will deal with preparing an interface for reading data, rather than with Neural Network programming).

First of all, let us copy the data samples. One sample is for training, and the other is for testing. Copy these files to your local directory. We will import the necessary classes for this example using the usual Python style:



from smile.data import AttributeDataset,NominalAttribute

from smile.data.parser import DelimitedTextParser,IOUtils

from smile.classification import NeuralNetwork

from smile.math import Math

from jarray import zeros,array

from jhplot import *

import java



We import several classes from the "smile" and "jhplot" packages, as well as from Java. An additional package, "jarray", is used to work with Java arrays in Jython. The NeuralNetwork class is the most important for our example. It creates a multilayer perceptron neural network that consists of several layers of nodes, interconnected through weighted acyclic arcs from each preceding layer to the following one. We will call it as follows:

nn=[Nr_input, Nr_hidden, Nr_output] # define structure of NN

net=NeuralNetwork(NeuralNetwork.ErrorFunction.LEAST_MEAN_SQUARES, NeuralNetwork.ActivationFunction.LOGISTIC_SIGMOID,nn)

net.learn(x, y) # train neural net on input double array x, with the outcome given by y

Here we use the logistic sigmoid activation function and the "LEAST_MEAN_SQUARES" error function. Before using this NN, we will read the data from the files and create the input arrays x and y. We will write a small function, "getJavaArrays", that returns two arrays, double[][] and int[] (in Java style). After the NN is trained, we will use the test sample and call the method "predict(x)" to verify that our predictions are close to the expected values from the test sample.
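To make the roles of the sigmoid activation and the squared error more concrete, here is a minimal plain-Python sketch of what a multilayer perceptron computes for one input row. It does not use Smile and is not Smile's implementation: the function names, the 2-3-2 layout and the hand-picked weights are all assumptions made for illustration, and the factor 0.5 in the error is just one common convention.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One fully connected layer: weighted sum for each node, then sigmoid."""
    return [sigmoid(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Pass the inputs through every layer in order (no cycles)."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases)
    return values

def squared_error(outputs, label):
    """Half the sum of squared differences from a one-hot target."""
    target = [1.0 if k == label else 0.0 for k in range(len(outputs))]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(outputs, target))

# Hypothetical 2-3-2 network; the weights are made up, not trained.
layers = [
    ([[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]], [0.0, 0.1, -0.1]),  # input -> hidden
    ([[0.7, -0.2, 0.4], [-0.5, 0.9, 0.3]], [0.05, -0.05]),       # hidden -> output
]

sample = [1.0, 2.0]                       # one row of x
outputs = forward(sample, layers)         # one value per class
predicted = outputs.index(max(outputs))   # class with the largest output
print("predicted class: %d" % predicted)
print("error if the true class is 1: %f" % squared_error(outputs, 1))
```

Training adjusts the weights so that this error becomes small over the whole training sample. In this sketch, prediction simply picks the output node with the largest value.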

The full code looks like this:

from smile.data import AttributeDataset,NominalAttribute