In this post, I begin to outline an intelligent agent architecture that utilizes concepts from intuitionistic type theory to learn, retain, and reason about its knowledge of the world.

1. Information

Composite types are sets of strings that identify primitive constructors: functions that return objects of built-in data types (e.g. integers, strings, booleans in a programming language). Unlike primitive objects, composite objects can only be formed by interpreters: functions that return nested objects, i.e. sets whose elements are themselves sets, with the deepest layer containing the primitive objects from which all composite objects are built.

A composite type is therefore a hierarchical tree structure in which each leaf holds a primitive type and each branch a composite type. Interpreting such a structure yields a new object: each leaf becomes a primitive object created by its primitive constructor, and each branch a composite object created by recursively interpreting its elements (i.e. by depth-first execution of every primitive constructor the object contains).
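
To make this concrete, here is a minimal Python sketch of the scheme just described. It takes some liberties: types are modeled as nested lists rather than sets (so the tree shape stays visible), and the names PRIMITIVES and interpret are my own illustrative choices, not part of the architecture.

```python
# Primitive constructors: functions that return built-in objects.
PRIMITIVES = {
    "int": lambda: 0,
    "str": lambda: "",
    "bool": lambda: False,
}

def interpret(type_tree):
    """Depth-first interpretation of a composite type.

    A leaf is a string naming a primitive constructor; a branch is a
    list of sub-types. Interpretation returns a nested object whose
    deepest layer contains primitive objects.
    """
    if isinstance(type_tree, str):             # leaf: run the primitive constructor
        return PRIMITIVES[type_tree]()
    return [interpret(t) for t in type_tree]   # branch: recurse on each element

# A composite type: a branch holding a primitive leaf and a nested branch.
point_type = ["str", ["int", "int"]]
print(interpret(point_type))  # ['', [0, 0]]
```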

2. Construction

A sufficiently capable memory system with access to the kinds of information described above (types and objects), along with the ability to perform specific computations on that information (construction and interpretation), could hypothetically manage an intelligent subsystem whose functions include forming new types from those previously created. Such a system requires specialized functions for combining known types into new, larger types. The first is subsume(A, B), which merges types A and B such that A contains B as one of its elements. The second is combine(A, B), which yields a new type containing both A and B as its elements.

subsume(A, B)

combine(A, B)
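
Under the same list-based reading as the earlier sketch, the two merge functions might look like the following; this is a hedged sketch, not a definitive implementation.

```python
def subsume(A, B):
    """Merge B into A: the result is A with B stored as one of its elements."""
    return A + [B]          # A's elements, plus B as a subtree

def combine(A, B):
    """Create a new type containing both A and B as its elements."""
    return [A, B]

A = ["int", "int"]
B = ["bool"]
print(subsume(A, B))   # ['int', 'int', ['bool']]
print(combine(A, B))   # [['int', 'int'], ['bool']]
```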

These two functions allow stored types to be merged into larger trees, which are then stored and used in even larger structures. The system places no theoretical limit on the depth of a given type or the number of elements it can contain, which can lead to the generation of infinite hierarchies if it runs without constraints. Such constraints are in fact necessary, for a few reasons. The most obvious is the physical limitation on energy flow, which places an inherent restriction on the size of memory, i.e. the amount of information one is allowed to store. The second reason is more subtle: constraints present the very problems by which to teach, adapt, and ultimately improve the performance of an intelligent agent acting to achieve some set of goals. The balance between an agent’s objectives and the worldly limitations that restrict its behavior is found by adapting actions in response to given scenarios, such that each change is believed to increase the agent’s probability of success.

3. Adaptation

But what is the process of adaptation? What concrete changes allow an agent to alter its behavior? To answer these questions, we need some kind of object that gives the agent instructions when building new types. Then we need a way to mutate these objects so that the instructions guiding the agent’s behavior (i.e. the creation of types) differ from what they were before, leading to new sequences of actions executed by the agent.

But then what is an action? Let’s consider an agent to be an autonomous structure with interfaces to the external world that take in information and produce “physical” events (i.e. any change that occurs externally, in the agent’s environment). The specific outputs that lead to change are “chosen” by activation signals flowing between nodes in the brain. When a given “motor” node receives input, it has a certain probability of firing, effectively triggering a physical event by executing a motor function. Although “motor” is typically associated with the material world, we are discussing in the abstract what defines an action, an event, or even the environment.

So, for instance, a “motor” in the system we’ve been discussing would execute subsume(A, B) or combine(A, B), each of which requires two filled “slots”, i.e. places in which data can be held. The system therefore needs some ability to sense when the variables A and B contain nonempty data, which means that one of the sensory nodes providing input to the system indicates whether A and B are nonempty. We then have a basic rule: if A and B are filled, then create a new type C. This if-then rule states that whenever the two variables are assigned types, a merge function is performed to create a new type from those stored in A and B.
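
Here is a sketch of that slot-and-sensor loop; slots, sense_filled, and step are hypothetical names I’m introducing for illustration.

```python
def combine(A, B):
    """Create a new type containing both A and B as its elements."""
    return [A, B]

slots = {"A": None, "B": None}   # places in which data can be held

def sense_filled(name):
    """Sensory node: reports whether a slot holds nonempty data."""
    return slots[name] is not None

def step():
    """Fire the merge 'motor' only when both slots are filled."""
    if sense_filled("A") and sense_filled("B"):
        C = combine(slots["A"], slots["B"])   # create a new type C
        slots["A"] = slots["B"] = None        # empty the slots after use
        return C
    return None

slots["A"] = ["int"]
print(step())          # None: B is still empty, so the motor stays quiet
slots["B"] = ["bool"]
print(step())          # [['int'], ['bool']]
```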

4. Motivation

We now have something to work with that makes logical sense in the way it results in behavior, and that can be stated in simple language. Rules are objects that guide behavior and that can be changed so that the agent’s actions are instructed differently. For a simple example, take the rule given above. It requires that A and B are nonempty, making it a conjunction. But what if we switched the “and” for an “or”, making it a disjunction? The change is immediately evident in the agent’s behavior. It no longer waits for both variables to be filled: if just one contains a type, the merge action is called, causing an attempt to form a new type from only one type. The subsume function would do nothing, since an empty variable cannot be stored within a type. The only option is therefore combine, which simply generates a new type with one element containing whatever is stored in the nonempty variable.
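
A sketch of the mutated, disjunctive behavior; step_disjunctive is again an illustrative name of mine, and the rule here fires on any nonempty slot.

```python
def combine(*types):
    """Create a new type whose elements are the given types."""
    return list(types)

def step_disjunctive(slots):
    """With 'or' in place of 'and', one filled slot is enough to fire."""
    filled = [t for t in slots.values() if t is not None]
    if filled:
        return combine(*filled)   # one element if only one slot was filled
    return None

print(step_disjunctive({"A": ["int"], "B": None}))  # [['int']]
```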

This example shows how a change in the rules an agent follows is directly manifested in the actions it carries out. The rule requiring that A and B are nonempty immediately reveals its own structure: it has two variables and a relation, the variables being A and B and the relation being and. Relations are represented by propositional functions that return either true or false. Just as types use strings to indicate constructor functions, relations use strings to indicate a wide array of propositional functions which, when stored with respect to a set of variables, are effectively executed by an interpreter. In this sense a relation is very similar to a type: it is a representation of something that is created in concrete form via interpretation.
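
The parallel between relations and types can be sketched as follows; the RELATIONS table and interpret_relation are my own illustrative stand-ins.

```python
# A string names a propositional function, just as a string names a
# primitive constructor in the type sketch earlier.
RELATIONS = {
    "and":      lambda a, b: a and b,
    "or":       lambda a, b: a or b,
    "nonempty": lambda a: a is not None,
}

def interpret_relation(relation):
    """Turn a stored relation (name plus variables) into a concrete truth value."""
    name, args = relation
    return RELATIONS[name](*args)

print(interpret_relation(("and", (True, False))))    # False
print(interpret_relation(("nonempty", (["int"],))))  # True
```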

Rules are used in all kinds of ways, from guiding behavior to defining requirements on the elements of a type. They allow structures to directly influence the functioning of agents, and they enable those structures to change over time. A system of rules can then be built up from successful applications of smaller rules, where success can be determined in a variety of ways. The most basic definition of success is increased performance, given by a utility function that triggers automatic adaptation of rules, driven by the desire to maximize cumulative reward over time. This is called reinforcement.

Another, less direct way of determining success is to test whether a given rule holds true over multiple examples. If a rule tends to hold true in its context, it is said to be fit (i.e. successful). If it fails in many cases, it is said to be unfit and is less likely to stick around in memory. Over time, bad rules are removed and good rules are maintained, maximizing success as defined by the absence of contradictory examples to a given rule.
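
One way this fitness bookkeeping could work, assuming a simple exponential update and a pruning threshold that are my own choices rather than anything specified above:

```python
class Rule:
    def __init__(self, name):
        self.name = name
        self.fitness = 0.5           # prior belief that the rule holds

    def observe(self, held_true, lr=0.1):
        """Move fitness toward 1 on confirming examples, toward 0 on failures."""
        target = 1.0 if held_true else 0.0
        self.fitness += lr * (target - self.fitness)

rules = [Rule("good"), Rule("bad")]
for _ in range(50):
    rules[0].observe(True)           # consistently holds
    rules[1].observe(False)          # consistently contradicted

rules = [r for r in rules if r.fitness > 0.2]   # unfit rules decay out of memory
print([r.name for r in rules])                  # ['good']
```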

5. Evolution

As structures form on top of the group, structure depth increases until it hits a ceiling. At this point, the root node of a given structure is stored as a base variable in a group one level up. Observations can then be made on it in isolation rather than with respect to its lower-level nodes.
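
A toy sketch of the depth ceiling and the promotion step; the ceiling value of 3 and the two-level list are assumptions made purely for illustration.

```python
DEPTH_CEILING = 3   # an assumed constant

def depth(tree):
    """Depth of a list-based type tree; a string leaf has depth 1."""
    return 1 if isinstance(tree, str) else 1 + max(map(depth, tree))

levels = [[], []]   # groups of base variables at level 0 and level 1

def store(tree):
    """Promote a structure that hits the ceiling to the group one level up."""
    if depth(tree) >= DEPTH_CEILING:
        levels[1].append(tree)   # now treated as a single node, in isolation
    else:
        levels[0].append(tree)

store("int")                     # depth 1: stays in the base group
store(["str", ["int", "int"]])   # depth 3: promoted one level up
print(len(levels[0]), len(levels[1]))   # 1 1
```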

As structures develop, certain elements will be chosen that fit the real-world observations of the environment better than others. Those with less correspondence to reality will eventually decay, as more and more contradictions are detected between the actual world and the functional identity of the template. The fitness of such a structure decays until the information is discarded, or rather deconstructed to make room for new templates. This is important because it increases the likelihood that any structure large enough to reach the depth ceiling has a relatively consistent record of correspondence with the actual world. The group at each level thus contains elements that have spent more time developing than those at the previous level, effectively maximizing the efficiency of memory use by discriminating more and more between the success and failure of objects at each new level.

The development of structures between a group of elements and the next level up is remarkably similar, as a process, to the way plants in a forest compete for sunlight. The most successful plant is likely to be taller than all the rest: having had a more stable growing experience, it was able to live long enough to increase in height, giving it a greater likelihood of receiving more sunlight, since it towers over the plants around it. Weaker plants can’t make it as far, and may fall behind the rest until they receive so little sunlight that they begin to wither away.

This inherently competitive process is driven by natural selection acting on genetic mutations that increase or decrease a given organism’s fitness with respect to its environment, and it is intuitively clear how the most successful beings would eventually replace those less fit for the current surroundings. The same phenomenon occurs in the development of templates, an idea that may seem odd given the abstract, logical nature of these structures, in contrast to the natural biology behind the evolutionary processes from which we are drawing inspiration. But hidden behind the complex systems that drive life forward in nature is an elegant logic, where the rules in and of themselves lead to the creation and adaptation of new structures more complex than those previously in existence.

Through careful planning and execution, such a process can be used to adapt a memory system so that the information stored within its structures serves the agent’s benefit, i.e. helps accomplish a set of goals. Inferential reasoning, I believe, can emerge naturally if the rules determining construction behavior (i.e. the building of templates) are set up such that repeatedly observed relations are preserved in a template through its structural properties. In other words, an agent creates new templates directly in response to observable patterns. For instance, if two variables are repeatedly seen activating together, that relationship can be described by a conjunctive function, meaning one that returns true when both variables are active.

How does this occur in a natural setting? Well, an agent must first be equipped with some a priori knowledge. Specifically, it must have some notion of equivalence between patterns of observable data and the structural properties of a template. For example, it must know that the appearance of two variables activating simultaneously somehow “maps” to the idea of a conjunction.

This knowledge can be obtained by a kind of reverse engineering of the function, where we take a set of examples (i.e. objects, or instantiated templates) from previous time steps and observe the features that are constant across all samples. For instance, a conjunction relation might be learned by observing that whenever a valid conjunction exists between a and b, both equal ‘true’ all of the time.
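
A sketch of this reverse engineering, assuming boolean samples and a simple exhaustive check across the stored examples:

```python
# Examples from previous time steps: which variables were active together.
samples = [
    {"a": True,  "b": True},
    {"a": True,  "b": True},
    {"a": True,  "b": False},
]

def conjunction_holds(samples, x, y):
    """Propose a conjunction only if x and y were simultaneously active
    in every stored sample (the constant feature across all examples)."""
    return all(s[x] and s[y] for s in samples)

print(conjunction_holds(samples, "a", "b"))       # False: third sample breaks it
print(conjunction_holds(samples[:2], "a", "b"))   # True
```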

Our observable variables are also boolean, and are considered active when they equal true. Given that access to memory is finite and time-dependent, it makes sense that conjunctions would be used to represent the link between two simultaneous events, since such events are more likely to appear together in the subset of elements that are active and readily accessible to the agent at any given time. When two variables spend time together in this working memory, they are perceived as more statistically significant than two variables that infrequently co-occur. This probabilistic knowledge is adapted to fit the most accurate depiction of the relevance that exists between sets of variables. We can therefore say, without additional consideration, whether two variables are “relevant” with respect to one another, simply based on the topological structure of an associative network.
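
One plausible reading of that associative network, with co-occurrence counts as edge weights and relevance read directly off them; the threshold value is an assumption of mine.

```python
from collections import Counter
from itertools import combinations

cooccur = Counter()   # edge weights of the associative network

def observe_working_memory(active_vars):
    """Count every pair of variables that shares time in working memory."""
    for pair in combinations(sorted(active_vars), 2):
        cooccur[pair] += 1

for active in [{"a", "b"}, {"a", "b", "c"}, {"c"}]:
    observe_working_memory(active)

def relevant(x, y, threshold=2):
    """Relevance read off the network's topology, with no further analysis."""
    return cooccur[tuple(sorted((x, y)))] >= threshold

print(relevant("a", "b"))   # True: co-active twice
print(relevant("b", "c"))   # False: co-active only once
```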

Clusters of variables are then likely to depict groups of highly interrelated concepts. We can extract groups of variables using cluster analysis, providing a foundation on which to build templates.

Each group is analyzed independently from the rest, effectively reducing the search space of possible relations that must be explored when building new structures. Each variable is compared only to those with which it shares a strong enough statistical relationship, and with which it is therefore more likely to share other connections, such as logical or semantic ones, which require more computation to analyze but also generate far more meaningful information from which the agent can learn.
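
A sketch of group extraction using connected components over strong associative edges; this is a simple stand-in for whatever cluster analysis one might actually use.

```python
def clusters(edges, nodes):
    """Group variables reachable through strong associative edges
    (a minimal stand-in for proper cluster analysis)."""
    adj = {n: set() for n in nodes}
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    groups, seen = [], set()
    for n in nodes:
        if n in seen:
            continue
        group, stack = set(), [n]
        while stack:                 # depth-first walk of one component
            v = stack.pop()
            if v not in group:
                group.add(v)
                stack.extend(adj[v])
        seen |= group
        groups.append(group)
    return groups

strong_edges = [("a", "b"), ("b", "c")]   # e.g. pairs above the relevance threshold
print(clusters(strong_edges, ["a", "b", "c", "d"]))
# [{'a', 'b', 'c'}, {'d'}] (set display order may vary)
```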

6. Decision-making

Thought is always guided by certain rules, of which we are the creators. These rules determine which concepts are (perceived as) necessary and which should be avoided, effectively laying out a blueprint for making decisions, such as choosing between two or more conceptual “paths” in order to reach a goal. Decision-making at this stage manifests as directed excitatory and inhibitory signals: if path a is chosen over path b, then a is stimulated and b is inhibited, causing a’s activity to increase and b’s to decrease. This leads to greater activity around the selected path, which drives behavior in that direction and, hopefully, toward a goal.
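
A sketch of this push-pull selection between two paths; the initial activity values and the update strength are illustrative assumptions.

```python
activity = {"a": 0.5, "b": 0.5}   # current activity of each conceptual path

def choose(selected, other, strength=0.1):
    """Stimulate the chosen path and inhibit the alternative."""
    activity[selected] = min(1.0, activity[selected] + strength)
    activity[other]    = max(0.0, activity[other] - strength)

for _ in range(3):
    choose("a", "b")   # path a is repeatedly chosen over path b

print(activity)        # a has climbed toward 1.0, b has decayed toward 0.0
```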