Learning (Subsystem of AIMA Code)

attribute-name function (attribute)

attribute-values function (attribute)

attribute-value function (attribute example)

random-examples function (n attributes)

classify function (unclassified-examples goals h performance-element)

consistent function (examples goals h performance-element)

Coded examples have goal values (in a single list) followed by attribute values, both in fixed order.
code-examples function (examples attributes goals)

code-example function (example attributes goals)

code-unclassified-example function (example attributes goals)

print-learning-problem function (problem &optional stream depth)

File learning/algorithms/learning-curves.lisp
Functions for testing induction algorithms. Tries to be as generic as possible. Mainly for NN purposes, allows multiple goal attributes. A prediction is correct if it agrees on ALL goal attributes. An induction-algorithm maps examples -> hypothesis; a performance-element maps hypothesis + example -> prediction.
learning-curve function (induction-algorithm performance-element examples attributes goals trials training-size-increment &optional error-fn)
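A minimal Python sketch (illustrative only; names are hypothetical, not the Lisp API) of the measurement loop this function implements: train on increasingly large subsets and record test accuracy, counting a prediction correct only if it agrees on all goal attributes.

```python
import random

def learning_curve(induce, predict, examples, trials, increment):
    """Average test accuracy as a function of training-set size.

    induce:  list of examples -> hypothesis
    predict: (hypothesis, inputs) -> prediction (tuple of goal values)
    Each example is an (inputs, goals) pair; a prediction counts as
    correct only if it agrees on ALL goal attributes.
    """
    curve = []
    for size in range(increment, len(examples), increment):
        total = 0.0
        for _ in range(trials):
            random.shuffle(examples)
            train, test = examples[:size], examples[size:]
            h = induce(train)
            correct = sum(1 for x, goals in test if predict(h, x) == goals)
            total += correct / len(test)
        curve.append((size, total / trials))
    return curve
```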

This version uses incremental data sets rather than a new batch each time.
incremental-learning-curve function (induction-algorithm performance-element examples attributes goals trials training-size-increment &optional error-fn)

accuracy function (h performance-element test-set goals &optional error-fn)

File learning/algorithms/dtl.lisp
Decision tree learning algorithm - the standard "induction algorithm". Returns a tree in the format (a1 (v11 . tree11) (v12 . tree12) ...), bottoming out with goal values. Currently handles only a single goal attribute.
decision-tree-learning function (problem)

dtl function (examples attributes goal &optional prior)

distribution function (examples goal)

majority function (examples goal)

select-attribute function (examples attributes goal)

information-value function (a examples goal)

bits-required function (d)

dtpredict is the standard "performance element" that interfaces with the example-generation and learning-curve functions.
dtpredict function (dt example)
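The tree format above can be walked recursively until a leaf is reached. A Python sketch (dict-based, hypothetical names, not the Lisp implementation) of what dtpredict does:

```python
def dt_predict(dt, example):
    """Walk a decision tree until a leaf (goal value) is reached.

    Here a tree is either a goal value (leaf) or a pair
    (attribute, {value: subtree, ...}), mirroring the Lisp format
    (a1 (v11 . tree11) (v12 . tree12) ...).
    """
    while isinstance(dt, tuple):
        attribute, branches = dt
        dt = branches[example[attribute]]
    return dt
```

For example, a two-level restaurant-style tree first tests patrons, then hunger on the "full" branch.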

File learning/algorithms/dll.lisp
Decision list learning algorithm (Rivest). Returns a decision list, each element of which is a test of the form (x . term), where each term is of the form ((a1 . v1) (a2 . v2) ... (an . vn)). The last element is the test (0). Only works for purely boolean attributes.
decision-list-learning function (k problem)

dll function (k examples attributes goal)

select-test finds a test of size at most k that picks out a set of examples with uniform classification. Returns test and subset.
select-test function (k examples attributes goal)

select-k-test function (k examples attributes goal test-attributes)

generate-terms function (attributes)

uniform-classification function (examples goal)

passes function (example test)

dlpredict is the standard "performance element" that interfaces with the example-generation and learning-curve functions.
dlpredict function (dl example)
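Prediction with a decision list just returns the classification attached to the first test the example passes. A Python sketch (hypothetical names, not the Lisp code), where the empty term plays the role of the final "(0)" test that every example passes:

```python
def dl_predict(dl, example):
    """Return the classification of the first test the example passes.

    A decision list is [(term, value), ...] where a term is a list of
    (attribute, value) pairs; the final element has the empty term,
    which every example passes.
    """
    for term, value in dl:
        if all(example.get(a) == v for a, v in term):
            return value
    raise ValueError("decision list lacks a default test")
```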

File learning/algorithms/nn.lisp
Code for layered feed-forward networks. Network is represented as a list of lists of units. Inputs assumed to be the ordered attribute values in examples. Every unit gets input 0 set to -1.
unit type (parents   sequence of indices of units in previous layer
           children  sequence of indices of units in subsequent layer
           weights   weights on links from parents
           g         activation function
           dg        activation gradient function g' (if it exists)
           a         activation level
           in        total weighted input
           gradient  g'(in_i))

make-connected-nn returns a multi-layer network with layers given by sizes.
make-connected-nn function (sizes &optional previous g dg)

step-function function (threshold x)

sign-function function (threshold x)

sigmoid function (x)

nn-learning establishes the basic epoch structure for updating. It calls the desired updating mechanism to improve the network until either all examples are correct or it runs out of epochs.
nn-learning function (problem network learning-method &key tolerance limit)

nn-error function (examples network)

network-output function (inputs network)

nn-output is the standard "performance element" for neural networks and interfaces to example-generating and learning-curve functions. Since performance elements are required to take only two arguments (hypothesis and example), nn-output is used in an appropriate lambda-expression.
nn-output function (network unclassified-example attributes goals)

unit-output computes the output of a unit given a set of inputs. It always adds a bias input of -1 as the zeroth input.
unit-output function (inputs unit)
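A Python sketch of this computation (illustrative; the unit here is just a weight vector, not the Lisp unit structure): the zeroth weight multiplies the fixed bias input of -1, and the activation function g is applied to the total.

```python
import math

def unit_output(inputs, weights, g=lambda x: 1.0 / (1.0 + math.exp(-x))):
    """Output of one unit: g applied to the weighted sum of a bias
    input of -1 (the zeroth input) plus the real inputs."""
    total = weights[0] * -1.0
    for w, x in zip(weights[1:], inputs):
        total += w * x
    return g(total)
```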

get-unit-inputs function (inputs parents)

random-weights function (n low high)

print-nn prints out the network relatively prettily.
print-nn function (network)

File learning/algorithms/perceptron.lisp

Perceptron learning - single-layer neural networks.
make-perceptron returns a one-layer network with m units, n inputs each.
make-perceptron function (n m &optional g)

majority-perceptron function (n &optional g)

perceptron-learning is the standard "induction algorithm" and interfaces to the learning-curve functions.
perceptron-learning function (problem)

Perceptron updating - simple version without lower bound on delta. Hertz, Krogh, and Palmer, eq. 5.19 (p. 97).
function (perceptron actual-inputs predicted target &optional learning-rate)
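The cited rule moves each weight by learning-rate * (target - predicted) * input. A Python sketch for a single unit (illustrative, not the Lisp code; the fixed bias input of -1 occupies position 0, as in the network representation above):

```python
def perceptron_update(weights, inputs, predicted, target, learning_rate=0.1):
    """One step of the perceptron rule:
        w_i <- w_i + learning_rate * (target - predicted) * x_i
    where x_0 is the fixed bias input of -1."""
    xs = [-1.0] + list(inputs)
    error = target - predicted
    return [w + learning_rate * error * x for w, x in zip(weights, xs)]
```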

File learning/algorithms/multilayer.lisp

Back-propagation learning - multi-layer neural networks.
backprop-learning is the standard "induction algorithm" and interfaces to the learning-curve functions.
backprop-learning function (problem &optional hidden)

Backprop updating - Hertz, Krogh, and Palmer, p. 117.
function (network actual-inputs predicted target &optional learning-rate)

rnetwork is the network in reverse order; inputs are the inputs to the network; deltas are the "errors" for the current layer.
backpropagate function (rnetwork inputs deltas learning-rate)

function (layer all-inputs deltas learning-rate)

compute-deltas propagates the deltas back from layer i to layer j. Pretty ugly, partly because the weights Wji are stored only at layer i.
compute-deltas function (jlayer ilayer ideltas)
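The propagation step computes delta_j = g'(in_j) * sum_i W_ji * delta_i. A Python sketch (illustrative, hypothetical names), reflecting the storage convention above: each unit in layer i holds the weights from its parents, with index 0 reserved for the bias weight, so the weight from unit j sits at position j+1.

```python
def compute_deltas(j_ins, i_weights, i_deltas, dg):
    """Propagate errors from layer i back to layer j:
        delta_j = g'(in_j) * sum_i W_ji * delta_i
    i_weights[i] holds unit i's weight vector; index 0 is the bias
    weight, so the weight from unit j is i_weights[i][j + 1]."""
    deltas = []
    for j, in_j in enumerate(j_ins):
        back = sum(w[j + 1] * d for w, d in zip(i_weights, i_deltas))
        deltas.append(dg(in_j) * back)
    return deltas
```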

File learning/algorithms/q-iteration.lisp
Data structures and algorithms for calculating the Q-table for an MDP. Q(a,i) is the value of doing action a in state i.
q-entry function (q a i)

all-q-entries function (i q)

q-actions function (s q)

Given an MDP, determine the Q-values of the states. Q-iteration iterates on the Q-values instead of the U-values. Basic equation is Q(a,i) <- R(i) + sum_j M(a,i,j) max_a' Q(a',j), where Q(a',j) MUST be the old value, not the new.
q-iteration function (mdp &optional qold &key epsilon)
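A Python sketch of this iteration (illustrative; the MDP here is plain dicts rather than the Lisp mdp structure): the right-hand side always reads the previous iteration's table, and iteration stops when the largest change falls below epsilon.

```python
def q_iteration(reward, model, actions, states, epsilon=1e-6):
    """Iterate Q(a,i) <- R(i) + sum_j M[a][i][j] * max_a' Qold(a',j)
    until convergence.  model[a][i] maps successor states j to
    transition probabilities; Qold is the previous table."""
    q = {(a, i): 0.0 for a in actions for i in states}
    while True:
        delta = 0.0
        q_old = dict(q)
        for a in actions:
            for i in states:
                q[(a, i)] = reward[i] + sum(
                    p * max(q_old[(a2, j)] for a2 in actions)
                    for j, p in model[a][i].items())
                delta = max(delta, abs(q[(a, i)] - q_old[(a, i)]))
        if delta < epsilon:
            return q
```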

average-successor-q function (a i q m)

Compute the optimal policy from a Q table.
q-optimal-policy function (q)

Choice functions select an action under specific circumstances.
Pick a random action.
q-random-choice function (s q)

Pick the currently best action.
q-dmax-choice function (s q)

Pick the currently best action with tie-breaking.
q-max-choice function (s q)

File learning/domains/restaurant-multivalued.lisp
Restaurant example from chapter 18, encoded using multivalued input attributes suitable for decision-tree learning.
*restaurant-multivalued* variable

*restaurant-multivalued-problem* variable

File learning/domains/restaurant-real.lisp
Restaurant example from chapter 18, encoded using real-valued input attributes suitable for decision-tree learning or neural network learning.
*restaurant-real* variable

*restaurant-real12-problem* variable

*restaurant-real100-problem* variable

File learning/domains/restaurant-boolean.lisp
Restaurant learning problem encoded using boolean attributes only, as appropriate for decision-list learning. Target is encoded as a decision list.
*restaurant-boolean* variable

*restaurant-boolean-problem* variable

File learning/domains/majority-boolean.lisp
*majority-boolean* variable

*majority-boolean-problem* variable

File learning/domains/ex-19-4-boolean.lisp
Inductive learning example from exercise 19.4.
*ex-19-4-boolean-problem* variable

File learning/domains/and-boolean.lisp
Data for the Boolean AND function.
*and-boolean-problem* variable

File learning/domains/xor-boolean.lisp
Data for the Boolean XOR function.
*xor-boolean-problem* variable

File learning/domains/4x3-passive-mdp.lisp
Passive, stochastic 4x3 environment for chapter 20. Just one possible action (no-op); uniformly arrives at a neighbouring square.
*4x3-passive-m-data* variable

*4x3-passive-r-data* variable

*4x3-passive-mdp* variable

File learning/agents/passive-lms-learner.lisp
Passive LMS learning agent. When a given training sequence terminates, update the utility of each state visited in the sequence to reflect the rewards received from then on.
make-passive-lms-learner function ()

function (u e percepts n)

File learning/agents/passive-adp-learner.lisp
Reinforcement learning agent that uses dynamic programming to solve the Markov process that it learns from its experience. Thus, the main job is to update the model over time. Being a passive agent, it simply does no-op at each step, watching the world go by.
make-passive-adp-learner function ()

Updating the transition model according to an observed transition i->j. Fairly tedious because of initializing new transition records. j is the current state (destination of the transition); percepts are in reverse chronological order; m is the transition model, indexed by state.
function (j percepts m)

(passive-policy M) makes a policy of no-ops for use in value determination.
passive-policy function (m)

File learning/agents/passive-td-learner.lisp
Passive temporal-difference learning agent. After each transition, update the utility of the source state i to make it agree more closely with that of the destination state j.
*alpha* variable (initial learning rate parameter)

make-passive-td-learner function ()

function (u e percepts n)

current-alpha function (n)
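The update described above is U(i) <- U(i) + alpha * (R(i) + U(j) - U(i)). A Python sketch (illustrative; the decay schedule for alpha is an assumption, not necessarily what current-alpha computes):

```python
def td_update(u, i, j, reward_i, alpha):
    """Passive TD update: move U(i) toward R(i) + U(j).
        U(i) <- U(i) + alpha * (R(i) + U(j) - U(i))"""
    u[i] = u[i] + alpha * (reward_i + u[j] - u[i])
    return u

def current_alpha(n, alpha0=1.0):
    # Assumed decaying learning-rate schedule; the Lisp code's
    # actual schedule may differ.
    return alpha0 / (1.0 + n)
```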

File learning/agents/active-adp-learner.lisp
Reinforcement learning agent that uses dynamic programming to solve the Markov decision process that it learns from its experience. Thus, the main job is to update the model over time.
make-random-adp-learner function (actions)

make-maximizing-adp-learner function (actions)

make-active-adp-learner function (actions choice-function)

Update the current model to reflect the evidence from the most recent action. mdp is the current description of the environment; percepts are in reverse chronological order; action is the last action taken.
function (mdp percepts action)

File learning/agents/active-qi-learner.lisp
Reinforcement learning agent that uses dynamic programming to solve the Markov decision process that it learns from its experience. Thus, the main job is to update the model over time.
make-random-qi-learner function (actions)

make-maximizing-qi-learner function (actions)

make-active-qi-learner function (actions choice-function)

File learning/agents/exploring-adp-learner.lisp
Reinforcement learning agent that uses dynamic programming to solve the Markov decision process that it learns from its experience. Thus, the main job is to update the model over time. Unlike the active-adp-learner, this agent uses an "intelligent" exploration policy to make sure it explores the state space reasonably quickly.
*r+* variable

*ne* variable

exploration-function function (u n)

make-exploring-adp-learner function (actions)

Given an environment model M, determine the values of states U. Use value iteration, with initial values given by U itself. Basic equation is U(i) <- R(i) + max_a f(sum_j M(a,i,j) U(j), N(a,i)), where f is the exploration function. Does not apply to terminal states.
exploratory-value-iteration function (mdp &optional uold &key epsilon)

exploration-choice function (s u m r)
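The exploration function f(u, n) returns an optimistic value while a state-action pair is under-explored and the utility estimate afterwards. A Python sketch (illustrative; the particular values for *r+* and *ne* are assumptions, not the ones in the Lisp code):

```python
R_PLUS = 2.0   # assumed optimistic reward estimate (the *r+* variable)
N_E = 5        # assumed minimum trial count per pair (the *ne* variable)

def exploration_function(u, n):
    """f(u, n): return the optimistic value R+ while the pair has been
    tried fewer than N_E times, and the utility estimate u afterwards."""
    return R_PLUS if n < N_E else u
```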

File learning/agents/exploring-tdq-learner.lisp
Exploratory reinforcement learning agent using temporal differences. Works without a model by using stochastic sampling to mirror the effect of averaging using the model.
make-exploring-tdq-learner function (actions)

function (q a i j n ri)
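The model-free update behind this agent is Q(a,i) <- Q(a,i) + alpha(n) * (R(i) + max_a' Q(a',j) - Q(a,i)); because successor states j occur with their true frequencies, sampling stands in for the model average. A Python sketch (illustrative; the alpha decay schedule is an assumption):

```python
def tdq_update(q, a, i, j, n, reward_i, actions):
    """Model-free Q update via temporal differences:
        Q(a,i) <- Q(a,i) + alpha(n) * (R(i) + max_a' Q(a',j) - Q(a,i))
    The stochastic occurrence of j mirrors averaging over the model."""
    alpha = 1.0 / (1.0 + n)          # assumed decay schedule
    best_next = max(q[(a2, j)] for a2 in actions)
    q[(a, i)] += alpha * (reward_i + best_next - q[(a, i)])
    return q
```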