Functions

Functions are at the core of machine learning.

At the beginning of a project, the business objectives are specified in terms of functions.

During the project, functions are used at just about every step to process the data, to discover patterns and to evaluate the performance of the system.

At the end of the project, what is ultimately delivered to the client is another function. The deliverable may be coded in a programming language or take the form of a composition of several lower-level functions, but it is a function nonetheless.

Functions map inputs to outputs

Many readers will be familiar with functions from mathematics and/or programming.

Mathematically, a function specifies how elements in one set are related to elements of another set.

Programmatically, a function processes an input to generate an output.

The following graphic combines aspects from both definitions:

Fig. 4.1: Edited version of a graphic released into the public domain by Wvbailey

These two definitions are consistent with each other.[1] The acceptable input to a function belongs to one set and the output that the function generates belongs to another set. Functions define how we get from one to the other.

And that’s all there is to know about the essence of functions.

As a preview of what is to come, I will mention that the inputs to the functions at the center of machine learning are the representations of objects that were discussed last time (e.g., pixel intensities and the presence or absence of certain words). The outputs of these functions are the predicted values for the targets that we are interested in.

Notation

To define a function mathematically, we need to specify two things:

the sets that are involved

the rule by which the elements from one set are mapped on to elements of the other set.

Sets in functions

The fact that a function f maps elements from set A to elements in set B is written as f: A → B.

The two sets, A and B, are called the domain and the co-domain, respectively. They can and, in many cases, do refer to the same set.

The letters used in this notation (f, A and B) can be understood as placeholders. We will often use different names.
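As an illustration of this notation, the sketch below uses Python type hints to play the role of the domain and the co-domain. Both functions and their rules are illustrative choices: is_even maps integers to truth values (two different sets), while increment maps integers to integers (the same set on both sides).

```python
def is_even(n: int) -> bool:
    """is_even: Z -> {True, False}; domain and co-domain differ."""
    return n % 2 == 0


def increment(n: int) -> int:
    """increment: Z -> Z; domain and co-domain are the same set."""
    return n + 1
```

Python does not enforce these hints at run time; they only document the sets that the author of the function has in mind.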

Mapping

The input to a function f is usually denoted with the letter x. In the context of machine learning, x usually refers to the array that represents an object.

The letter y designates the output of a function. We will use the letter y to denote predictions.

Combining these building blocks, the following equation simply states that y is the output of the application of function f to the input x: f(x) = y.
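To make f(x) = y concrete, here is a minimal sketch in Python. The rule inside f (the mean of the values in x) and the sample values are assumptions chosen only for illustration; they do not come from any particular model.

```python
from typing import Sequence


def f(x: Sequence[float]) -> float:
    """Hypothetical rule: the mean of the values in x."""
    return sum(x) / len(x)


# x is the array that represents an object, e.g. three pixel intensities.
x = [0.0, 0.5, 1.0]

# y is the output of applying f to x, i.e. f(x) = y.
y = f(x)
```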

Function names

In many cases, we will need to refer to more than one function. To keep them apart, one of the following naming strategies can be employed:

Other letters in the alphabet, especially g and h

Subscripts (or superscripts), such as f_1, f_2, etc.

Words or abbreviations of words: e.g., increment or inc
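These three strategies carry over to code almost unchanged. In the sketch below, the rules attached to each name are arbitrary placeholders, and subscripts are spelled with an underscore because Python identifiers cannot contain them.

```python
# Strategy 1: other letters of the alphabet.
def g(x: float) -> float:
    return 2 * x


def h(x: float) -> float:
    return x - 3


# Strategy 2: subscripts, spelled with an underscore in code.
def f_1(x: float) -> float:
    return x ** 2


def f_2(x: float) -> float:
    return x ** 3


# Strategy 3: a descriptive word, optionally with an abbreviation as an alias.
def increment(x: float) -> float:
    return x + 1


inc = increment  # both names refer to the same function
```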

A simple example

Consider the self-descriptive example of a function named fahrenheit_to_celsius.

Given that temperatures of trillions of degrees Fahrenheit have been achieved in the laboratory[2], I think it’s fair to use the set of real numbers for this function. At the level of sets, we can write:

fahrenheit_to_celsius: ℝ → ℝ

At the level of individual elements, we have:

fahrenheit_to_celsius(x) = (x - 32) × 5/9

Using a style that emphasizes the equivalence between mathematics and programming, this function corresponds to the following Python code[3]:
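A minimal sketch of such a function, assuming the standard conversion rule and using Python's float type as a stand-in for the real numbers:

```python
def fahrenheit_to_celsius(x: float) -> float:
    """Map a temperature in degrees Fahrenheit to degrees Celsius."""
    # Same shape as f(x) = y: compute the output y from the input x.
    y = (x - 32) * 5 / 9
    return y
```

The type hints mirror the set-level statement (a real number in, a real number out), and the body mirrors the element-level rule. For example, fahrenheit_to_celsius(212) returns 100.0, the boiling point of water at standard pressure.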