Neural Networks

Neural networks are one approach to machine learning that attempts to deal with the problem of large data dimensionality. In contrast to methods such as support vector machines, which adapt the number of basis functions to the data, the neural network approach uses a fixed number of basis functions that are themselves parameterized by the model parameters. This is a significant departure from linear regression and logistic regression, where the models consisted of linear combinations of fixed basis functions, $\phi(\mathbf{x})$, that depended only on the input vector, $\mathbf{x}$. In neural networks, the basis functions can depend on both the model parameters and the input vector and thus take the form $\phi(\mathbf{x} | \mathbf{w})$.
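As a concrete illustration of such an adaptive basis function, one common choice (the notation $h$, $w_{ji}$, and $w_{j0}$ is assumed here, not defined above) is a nonlinear activation applied to a linear combination of the inputs:

$$\phi_j(\mathbf{x} \,|\, \mathbf{w}) = h\!\left( \sum_{i=1}^{D} w_{ji} x_i + w_{j0} \right)$$

Because the weights $w_{ji}$ and bias $w_{j0}$ are adjusted during training, the basis functions themselves change shape as the model learns, rather than remaining fixed as in linear regression.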

Here we will cover only feed-forward neural networks. One can envision a neural network as a series of layers, where each layer has some number of nodes and each node is a function. Each layer resembles a linear regression model in that each node forms a linear combination of its inputs, with the result typically passed through a nonlinear activation function. Nodes are connected to one another through the inputs they accept from and the outputs they pass to other nodes. A feed-forward network is one in which these connections do not form any directed cycles.
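To make the layered picture concrete, here is a minimal sketch of a forward pass through a feed-forward network with one hidden layer, assuming a $\tanh$ hidden activation and a linear output; the names (`forward`, `W1`, and so on) are illustrative, not from any particular library.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass through a two-layer feed-forward network.

    Each hidden node forms a linear combination of the inputs
    (the linear-regression-like step) and applies a nonlinear
    activation; the output layer then combines the hidden values.
    """
    a1 = W1 @ x + b1    # hidden-layer pre-activations
    z1 = np.tanh(a1)    # adaptive basis functions phi(x | w)
    y = W2 @ z1 + b2    # linear output layer
    return y

# Example: 3 inputs -> 4 hidden nodes -> 2 outputs
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), np.zeros(4)
W2, b2 = rng.standard_normal((2, 4)), np.zeros(2)
print(forward(rng.standard_normal(3), W1, b1, W2, b2))
```

Since information flows strictly from inputs through the hidden layer to the outputs, the connections form no directed cycles, which is exactly the feed-forward property described above.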

As a matter of convention, we will refer to the model as an $N$-layer model, where $N$ is the number of layers for which adaptive parameters, $\mathbf{w}$, must be determined. Thus, for a model consisting of an input layer, one hidden layer, and an output layer, we take $N$ to be 2, since parameters are determined only for the hidden and output layers.
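Written out, a two-layer network of this kind maps inputs to outputs as follows (a standard form; the superscripts labeling the two parameter sets are notation assumed here, not defined above):

$$y_k(\mathbf{x} \,|\, \mathbf{w}) = \sigma\!\left( \sum_{j=1}^{M} w_{kj}^{(2)} \, h\!\left( \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)} \right) + w_{k0}^{(2)} \right)$$

where $h$ and $\sigma$ denote the hidden and output activation functions. The two sets of adaptive parameters, $\mathbf{w}^{(1)}$ for the hidden layer and $\mathbf{w}^{(2)}$ for the output layer, are what make this an $N = 2$ layer model under the convention above.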