There is an email discussion group at yahoo.com that you may subscribe to if you wish to discuss Latejami with other people. To subscribe, send an email message to: Latejami-subscribe@yahoogroups.com

The latest version of this document can be found at http://www.rickmor.x10.mx/lexical_semantics.html . A tutorial (including audio wav files) and dictionaries for the interlingua can be found at http://www.rickmor.x10.mx/Latejami/index.html . These files are also works in progress. As time goes by, the dictionaries will be expanded and more self-study lessons will be added.

This monograph is a reference manual for a machine translation interlingua. It is still in the draft stage, and will be undergoing continuous revision as the software based on it is developed and tested. If you have any helpful comments or suggestions, please feel free to contact me. If you do contact me, please quote sparingly from the monograph.

[ Table of Contents ]

In this monograph, I would like to discuss word design for an artificial language designed specifically for use as an interlingua in machine translation. Such a language must be designed to meet two primary goals: first, it must be easier to accurately translate from the source natural language into the interlingua than into another natural language; and, second, it must be almost trivially easy (i.e., requiring simple computer programming) to accurately translate from the interlingua into the target language. In other words, mapping between natural languages and the interlingua must be both accurate and made as easy as possible.

The interlingua achieves these goals by means of its simple but powerful derivational morphology, which makes word design rigorous yet straightforward while greatly reducing the number of basic morphemes (i.e. primitives) required by the language.

Initially, I will not try to describe this method in abstract terms, since this discussion is intended for the non-linguist. Instead, I will present the reader with many examples of various kinds of linguistic constructions, discuss the semantics of these constructions, introduce linguistic terminology where and as needed, and finally, try to derive some useful generalizations.


I'll start this exposition by looking first at verbs. Specifically, I will look at two of the most important criteria that go into defining a verb: its valency (i.e. the number of basic arguments that it requires) and its case requirements (i.e. the semantic roles played by the basic arguments). When combined, the valency and case requirements of a verb are usually referred to as the argument structure of the verb.

Before proceeding, though, let me give you a quick review of valency and case. Consider the following English sentence:

In this example, the verb "break" has a valency of two, since it requires two arguments: the subject "the chimpanzee" and the object "the window". The arguments are required because, if either were missing, the resulting sentence would be ungrammatical (or, in the case of some verbs, would have a different meaning):

[Please note that I am using the standard linguistic convention of indicating an unacceptable item by preceding it with an asterisk.]

But the following is okay:

For the verb "break", the case role of the subject is agent, and indicates the entity responsible for the event. The case role of the object is patient, and indicates the entity which experiences the state or change of state described by the verb. In other words, the argument structure of the English verb "break" requires two arguments: the first argument (i.e. the subject) must be a semantic agent, and the second argument (i.e. the object) must be a semantic patient.

Arguments required by a verb are called core arguments.

The phrase "with a coconut" is what is called an oblique argument since it is not essential for the sentence to be grammatical. It simply provides additional peripheral information about what happened. In this sentence, it indicates the instrument of the event. In other words, "a coconut" is the instrument used in carrying out the act indicated by the verb. If the sentence had been:

then "in Boston" would be a locative oblique argument, and "on Tuesday" would be a temporal oblique argument.
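For readers who find data structures easier to follow than prose, the distinction between core and oblique arguments can be sketched as a small record. This is only an illustration under stated assumptions: the names `ArgumentStructure`, `core_roles`, and `missing_core_arguments` are hypothetical helpers invented for this sketch and are not part of the interlingua.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgumentStructure:
    """Valency plus case requirements of a verb (hypothetical helper)."""
    verb: str
    core_roles: tuple  # case role of each core argument, subject first

    @property
    def valency(self) -> int:
        # Valency is simply the number of core (required) arguments.
        return len(self.core_roles)


# "break": the subject is the agent and the object is the patient.
BREAK = ArgumentStructure("break", ("agent", "patient"))


def missing_core_arguments(structure, supplied_roles):
    """Return the core roles that a sentence fails to supply.

    Oblique roles (instrument, locative, temporal, ...) are ignored,
    because they are never required by the verb.
    """
    return [role for role in structure.core_roles if role not in supplied_roles]


print(BREAK.valency)
# An oblique instrument is optional, so nothing is missing here:
print(missing_core_arguments(BREAK, {"agent", "patient", "instrument"}))
# Dropping the object leaves the patient role unfilled:
print(missing_core_arguments(BREAK, {"agent"}))
```

A non-empty result corresponds to the kind of sentence that would be marked with an asterisk above.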

[The case terminology that I am using here is fairly common, but not universal. Linguists who work with case grammar and thematic relations have yet to agree on the number and nature of case roles needed to adequately describe natural language. As it turns out, this lack of agreement is irrelevant to what we are trying to accomplish here. We will, in effect, create our own internally consistent, semantically precise, and easily expandable implementation of a case system.]

In English, oblique arguments are usually marked by preceding them with a preposition. Thus, the preposition is the marker which tells us the case role of whatever follows it. Agent and patient are usually unmarked. The most common exception to this in English is in passive constructions, where the original subject is preceded by the preposition "by", as in "the window was broken BY the chimpanzee" or "the thieves were seen BY the children". Some verbs, such as English "put", have a third, required argument (i.e., it is part of the valency of the verb), which is marked by a preposition. For example:

Here, the preposition "on" marks a destination case role.

Incidentally, natural languages often allow a speaker to omit a core argument if it is obvious from context. For example, a Japanese speaker often omits the agent of a verb as a sign of politeness. This usage, however, performs a discourse function - not a grammatical function - and the omitted argument is still assumed to be present.

An additional case role that occurs within the valency of many verbs is what I will call focus. Linguists often call this case role theme, object, or topic, but there is no consensus, and their definitions often overlap other roles, especially patient. In all of the following examples, the direct object is the focus:

Note that in each of the above sentences, the direct object provides a reference point or focus for the event, without causing or being changed by the event. It does this by pinpointing, narrowing down, or providing a reference for (i.e. 'focusing') the state or change of state indicated by the verb. In other words, a focus does not play an active role in the event described by the verb, and is not obviously changed by it. Thus, a focus can be best described as one of the following:

Note that the concepts can overlap, as in "to need", "to avoid", "to know", and "to hate", since the object of such verbs can be considered the focus of a relationship or of a mental state. In fact, without stretching the second definition too much, one could say that it applies to all focused events, even those involving perception or elaboration. For example, the sentence "John sees the forest" describes a relationship between "John" and "the forest", and the sentence "Louise sang a little ditty" describes a relationship between "Louise" and "a little ditty".

Thus, we can say that the patient experiences a relationship whose referent is the focus. If the verb has an agent, then the agent is responsible for the relationship. The nature of the relationship is indicated by the meaning of the verb. It is important to keep in mind that the focus does not directly modify or interact with the patient. Perhaps the best and most useful generalization we can make is that the focus is the referent of a relationship with the patient, it is not affected by the event, and it is not responsible for the event. However, the precise meaning of the focus will ultimately depend on the meaning of the verb itself.

Thus, it would appear that focus is not really a pure case role. Both agent and patient can be defined with semantic precision, while focus seems somewhat vague or even 'out-of-focus'. The reason for the vagueness is that focus covers several senses that could, in principle, be differentiated; e.g. the perceived entity ("to see"), the missing/lacking entity ("to need"), the locative reference point ("to surround"), an elaboration of the event itself ("to sing"), etc. But these senses never overlap for a particular verbal concept, so if we split them into separate case roles we would end up making distinctions that are never made in natural languages. Thus, focus is a vague and general-purpose case role, but it is an essential one.

In summary, the three major case roles that are capable of being included within the valency of a verb are:

Thus, the agent is responsible for the event, the patient experiences the event, and the focus provides the referent for the state or change of state indicated by the event. [We will discuss the semantics of focus in more detail later on. First, though, we need to acquire a more substantial background in the semantics of verbs.]

Note that an argument does not have to be a physical entity. It can also be an event. In the following examples, the direct object is the patient:

There are other case roles in addition to the ones I just mentioned, but they are all oblique (i.e., they are never required by a verb). I will discuss them as the need arises. For now, though, we have enough background to proceed with the discussion.

In the following sections, I will discuss and classify a large number of English verbs, based on their semantics and their argument structures. While doing so, I will also introduce some of the terminology and the formal descriptive notation that I will be using throughout the remainder of this monograph.


Probably the largest group of verbs in English (or any language, for that matter) consists of what are called state verbs, since they describe either an unchanging state of affairs or a change of state. Verbs which describe an unchanging or static situation are often called stative verbs (do not confuse "stative" verbs with "state" verbs). Verbs which describe a changing or dynamic situation are often called either process or accomplishment verbs. Because linguists do not agree on the precise meanings of these terms, I will immediately abandon them and use the more generic expressions "static state verbs" and "dynamic state verbs".

Let's start by looking at some static state verbs; i.e. verbs which describe a steady or ongoing state:

These verbs are all intransitive; i.e. they have a subject but no object. Also, each one describes the steady, ongoing state of the subject. Thus, the subject is the patient. From now on, I will refer to verbs of this type as "P-s", where "P" represents "patient" and "-s" indicates that the verb is a static verb.

Here are some more static state verbs with the form P-s:

English speakers may be surprised to see adjectives and past participles being treated as descriptive verbs. However, words which describe steady states have just as much of a verbal nature as words which describe changes of state. The English verbs "to sleep", "to stink", "to twinkle", etc. illustrate this very well. In fact, many natural languages (e.g. Japanese, Korean, several Sino-Tibetan languages such as Mandarin Chinese, some Siouan languages, several Austronesian languages, and many native languages of Africa, Central America and South America) do not have true adjectives. Instead, these languages use words that are essentially intransitive verbs, and which can be inflected or otherwise used in the same way as any other intransitive verb.

Now, the above examples represent intransitive static state verbs. Here are some examples of intransitive dynamic state verbs:

The only difference between these and the previous examples is that the patient experiences a change of state rather than a steady state. Thus, these verbs are the dynamic counterparts of the intransitive static state verbs.

From now on, I will refer to these verbs as "P-d", where "-d" indicates that the verb is a dynamic verb.

Next, let's look at some verbs which describe events in which the subject causes something to happen to the object. These verbs are all transitive; i.e. they have both a subject and an object. Here are a few examples:

In all of the above, the subject "He" is responsible for the event described by the verb. Also, in all cases, the event causes a change of state to occur in the object. Thus, the subject is the agent and the object is the patient. In other words, these verbs are transitive dynamic state verbs.

For verbs like these, I will use the notation "A/P-d", where "A" represents "agent", "P" represents "patient", a slash "/" separates subject from object, and "-d" indicates that the verb is a dynamic verb.

Note that English, unlike almost all other languages, uses exactly the same word for some of its P-d and A/P-d verbs:

Note though, that this usage is highly idiosyncratic, and many words that you would expect to follow the pattern do not:

So far, we've seen P-s, P-d, and A/P-d verbs. Thus, an obvious question is: are there such things as A/P-s verbs?

Yes. And as the designation implies, these verbs always indicate that the agent maintains the patient in some kind of steady state. Thus, all of these verbs imply that the agent somehow "controls" the patient. Here are some examples:

Note that, although these verbs may imply both an entry into and an exit from the event or situation, the major emphasis is on the process BETWEEN the endpoints. For these reasons, these verbs are static rather than dynamic.

Now, for states that are normally rendered using adjectives, English uses the particle "keep" to distinguish between A/P-s and A/P-d verbs. Here are some examples:

All of the above are effectively A/P-s verbs. English simply uses the particle "keep" to achieve the desired effect. A good paraphrase of these 'verbs' is "agent causes patient to remain in a steady state".

Next, let's look at some verbs that use the focus case role that we discussed earlier. Here are some examples:

In all of the above, the subject experiences a steady state relative to the object. Thus, the subject is a patient, the object is a focus, and the verb is a static state verb. For these verbs, I will use the notation "P/F-s", where "F" represents the focus.

It is also possible to have verbs like these which also have an agent. Here are some examples:

In the above examples, the subject not only experiences the steady state indicated by the verb, but is also responsible for the state; i.e., the subject is also in control. Thus, the subject is both the agent and the patient, and the object is the focus. I will refer to these verbs as AP/F-s.

Incidentally, notice how some of the above complex verbs become simple verbs when they are defocused:

Thus, the unfocused verbs would be described as AP-s.

It is also possible for AP/F verbs to indicate a change of state. Here are some examples:

These verbs describe a situation in which the agent causes himself to undergo a change of state relative to the focus. Thus, they are all AP/F-d.

Since all of this may be confusing, let me paraphrase the relationships in a way that illustrates the states and how they are focused:

Overall then, verbs in this group can be generalized as follows:

Note that in all of the above paraphrases, the words "focused on" could be replaced by the words "relative to", emphasizing that the focus is the referent of a relationship with the patient.
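The notation introduced so far ("P-s", "A/P-d", "AP/F-s", and so on) is regular enough to be read mechanically. The following is a minimal sketch, assuming only what has been stated above: the letters A, P, and F stand for agent, patient, and focus, a slash separates the subject from the object, and the ending is "-s" (static) or "-d" (dynamic). The names `VerbClass` and `parse_verb_class` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

ROLE_NAMES = {"A": "agent", "P": "patient", "F": "focus"}


@dataclass(frozen=True)
class VerbClass:
    arguments: tuple  # one tuple of role names per argument, subject first
    dynamic: bool     # True for "-d", False for "-s"


def parse_verb_class(notation: str) -> VerbClass:
    """Parse a label such as "AP/F-s" into its roles and its dynamism."""
    roles_part, separator, flag = notation.rpartition("-")
    if not separator or flag not in ("s", "d"):
        raise ValueError(f"expected a '-s' or '-d' ending: {notation!r}")
    arguments = []
    for slot in roles_part.split("/"):
        # Each slot holds the role letters of one argument, e.g. "AP".
        if not slot or any(letter not in ROLE_NAMES for letter in slot):
            raise ValueError(f"unknown role letters in {notation!r}")
        arguments.append(tuple(ROLE_NAMES[letter] for letter in slot))
    return VerbClass(tuple(arguments), flag == "d")


for label in ("P-s", "P-d", "A/P-d", "A/P-s", "P/F-s", "AP/F-s", "AP/F-d"):
    print(label, parse_verb_class(label))
```

For example, "AP/F-s" comes out as a subject that is both agent and patient, an object that is the focus, and a static verb, which matches the prose description given earlier.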

Now, some verbs involve the exchange of one item for another, usually between two people. Here are some examples:

In each case, two transfers of possession take place. John loses possession of one item while gaining possession of another, and the reverse change of possession occurs for Bill. Thus, we have, in effect, two patients and two foci, where the foci are the items being exchanged.

We can also regard these verbs as composites; i.e. useful abbreviated versions of two distinct verbs, as in "John gave me his apple and I gave him my orange".

Since both patients are equally responsible for the exchange, each one functions as both agent and patient. However, the subject in the above exchanges plays a more important or 'primary' role as agent than the other patient, and the first item plays a more important or 'primary' role as focus. Thus, for example, in the case of "sell", the seller is the primary agent-patient, while the buyer is the 'secondary' agent-patient. The object sold is the primary focus, and the amount paid is the 'secondary' focus.

[This is not the only possible analysis, but I feel that it is the most practical. It also eliminates the need for any special treatment of exchange verbs that do not need a secondary focus, such as "to lend/borrow".]

Finally, there are some cases where the subject is the only agent-patient, as in "John swapped his brown tie for a blue one". Here, John causes himself to undergo a change of relationship with two different items, without the involvement of anyone else. In this example, "a blue one" is the secondary focus.

There are also state verbs which are used to describe the weather and other environmental phenomena. Here are some examples:

In this group of verbs, the subject is the null place holder "it". English verbs always require a subject in the indicative, but this is not true of most languages.

Note that verbs in this class can be either static or dynamic. Also note that, since these verbs describe states or changes of state, they have an implied patient which is obvious from the context (i.e. the local environment or current situation). In effect, English uses the pronoun "it" to represent the implied patient.

I will not describe the argument structure of these verbs right now, because we do not yet have a sufficient background to treat them properly. Instead, I will postpone their discussion until after we discuss grammatical voice changes.


So far, all of the verbs we have discussed are state verbs. That is, the basic concept represented by such a verb is some kind of state, and this state applies only to the patient. The states can be focused or unfocused, and they can be brought about or maintained with or without an agent.

Also, the states themselves can be categorized by their dynamism; i.e. a state can be "energetic" (e.g. 'alive', 'twinkling', 'sleeping', 'smelly', etc.) or "non-energetic" (e.g. 'dead', 'green', 'tall', etc.). In general, an energetic state can be described using an English present participle, and a non-energetic state can be described using an English adjective or past participle, but there are many exceptions.

Verbs may also be categorized according to their telicity. Dynamic verbs that have a built-in endpoint are called telic, as in "The violinist played a dirge". Dynamic verbs that do not have a built-in endpoint are called atelic, as in "The violinist played with the local orchestra".

Unfortunately, distinctions in dynamism and telicity are not very useful, and I know of no natural languages that mark these distinctions. Whether a concept is energetic or not is a basic part of the nature of the concept and has nothing to do with how the concept is applied. In other words, it is an inherent part of the meaning of the verb root, and there is no need to mark it or express it externally.

Also, the telicity of a verb often depends on the meaning of its arguments rather than on the form of the verb. Thus, in a derivational system such as I am presenting here, telic distinctions are useless.

[Incidentally, this entire section is 'for your information only'. I felt that it was important to mention dynamism and telicity only because linguists attribute so much importance to these concepts in their theoretical discussions about verbs. In my opinion, distinctions in dynamism and telicity are interesting but useless for our purposes. And, as I will illustrate below, there is a much more important and useful distinction: the distinction between agent-oriented concepts and patient-oriented concepts.]


State verbs are not the only kind of verb that languages employ. There is one other class of verbs, which I will refer to as action verbs, that differs significantly from state verbs. Let's look at a few examples and then see if we can deduce some useful generalizations:

In each of the above examples, the subject "Louise" is clearly the agent. Also, in the first example, the second object is clearly the focus. But what is the object "Bill"?

In each case, Louise is trying to have some kind of effect on Bill, but the final result is not clear. For example, when Louise kicks Bill, we know that something happens to Bill, but Bill's final state depends on many things that are left unstated, such as how hard she kicked, what kind of shoes she was wearing, where she kicked Bill, and so on. This is quite different from state verbs, where the final state is always clearly indicated by the meaning of the verb. For example, the sentence "He broke the window" makes it very clear what the final state of the window is; i.e. 'broken'. It doesn't tell us anything about the act itself or how it was accomplished. Now, we could say that Bill's final state is 'kicked', but this does not tell us about his condition - it simply tells us how it was accomplished.

The final outcome of the above examples is not clear because these verbs tell us about the act itself rather than the outcome of the act. In other words, these verbs emphasize what the agent is doing rather than emphasizing what is happening to the patient. Another way of putting it is that an action verb tells us how a patient was affected, but does not tell us what the resulting state is. A state verb is exactly the opposite - it tells us the state of the patient without telling us how the state was achieved.

Thus, state verbs are patient-oriented, since they highlight what the patient experiences. Action verbs are agent-oriented, since they emphasize what the agent is doing.

If a root concept is patient-oriented, then the verb will indicate what the patient experiences. Patient-oriented verbs may or may not have agents. If the root concept is agent-oriented, then the verb will indicate what the agent is doing. An agent-oriented verb must have an agent. All patient-oriented verbs are state verbs. All agent-oriented verbs are action verbs.

The most common action verbs are speech acts. Here are some examples:

In all of the above the first object is the patient, since it is the entity which the agent is trying to affect. For the verbs which have two objects, the second object is the focus. Thus, in the sentence "He told me a joke", "He" is the agent, "me" is the patient, and "a joke" is the focus.

Verbs which have two objects are called ditransitive.

Finally, we mentioned earlier that the focus of a verb can be one of the following:

There is another group of action verbs that are typically referred to as activities. Here are some examples:

These verbs describe situations in which the agent maintains itself in an ongoing, energetic state. As a result, these verbs are all static AP/F-s verbs, and can be paraphrased as "Agent does something to maintain itself in a steady, active state". In effect, since the agent and the patient are the same, and since an action verb tells us what the agent is doing, it also tells us the state of the patient. In other words, the action and the state are essentially the same.

Now, many activity verbs can take an explicit patient that is not also the agent. Here are some examples:

In these examples, we are still saying what the agent is doing while placing more emphasis on what is being done to someone/something else. Thus, these verbs are the A/P versions of the basic activities. And in all of them, the patient takes a direct part in the activity.

[Incidentally, the word "threadbare" in the "run" example, and the expressions "into a stupor" in the "dance" example and "out of the house" in the "smoke" example are called resultatives, since they indicate the final or 'result' state of the patient. Also, the first example using "play" could also be analyzed as a reciprocal construction. We'll have more to say about resultatives and reciprocals later.]

It's important to emphasize that, when dealing with action concepts, we cannot treat AP derivations as we did with state verbs. In an AP state derivation, the agent is causing itself to experience the state that normally applies only to the patient. In an AP action derivation, the agent is causing the patient to perform the action that is normally performed only by the agent.

In other words, in an AP state derivation, the agent experiences the same thing (i.e. state) as the patient. In an AP action derivation, the patient does the same thing (i.e. action) as the agent.

Thus, an AP-s version of a verb such as "to kick" does not mean that the agent kicks himself. Instead, it means that the agent is simply "kicking"; i.e., he is involved in the activity of "kicking" with no specified or discernible target. This is a subtle distinction, but it is an extremely important one.

[Incidentally, this distinction could also be handled by designating the above verb as simply A-s rather than AP-s. However, I have chosen to keep the AP notation because of the inherent symmetry of the distinction, and because it emphasizes that the agent is causing itself to experience what is essentially an energetic "state".]


Now, let's look at some of the distinctions that exist among these categories, and see if we can make some generalizations about verbs. In looking over the above groupings, we can draw the following conclusions:

As mentioned earlier, there are a few odd-balls which have unusual argument structures, but these are rare and tend to be irregular or idiosyncratic. For the time being, we will limit our discussion to the larger, more regular categories. [Actually, as we will see throughout this monograph, the so-called 'odd-balls' can always be derived from more regular verbs via some form of grammatical voice change or derivational modification.]

From the above list, we might be tempted to create a matrix of 2x2x4x3x2 = 96 elements. However, most combinations never appear. Note, for example, that the orientation of the verb is an inherent part of the meaning of the root, and we will never find two verbs that differ only in this characteristic. Also, a patient can be the subject OR the object - not both - which, of course, makes sense. And if the first argument is both agent and patient, then the second argument cannot be a patient. Also, it serves no useful purpose to have a verb with an object but with no subject. And so on.

With all of the above in mind, we can construct a chart of the possible forms that verbs can take:

Note that I have excluded verbs that take instrumental subjects (e.g. "The hammer broke the window"). English is one of the very few languages that allow constructions like this, and the few others that do generally mark the verb to indicate that the subject is instrumental (e.g. Malagasy, many Bantu languages, many Philippine languages, etc.).


So, how do we apply these generalizations to the practical problem of verb design? Answer: we do it by classifying and marking our verbs (in some way or other) to indicate their valency, case requirements, and whether or not they reflect a steady state or change of state. The easiest way to do this is to design the morphology of the language to reflect these differences. For example, the following English verbs will all be derived from the same root but will have different markers to indicate their different argument structures:

And so on. For all of the above, we can use a state root with the meaning 'free/unrestrained', and can apply a different marker to indicate whether the result is AP-s, A/P-d, etc.


[If you have difficulty understanding the formal description that follows, I suggest that you read my separate essay entitled "Morphology". The essay provides a brief and simple tutorial on how to describe the shapes of words and morphemes. However, it is not necessary to understand how words are shaped in order to understand the lexical semantic system discussed in this monograph.]

Here is a formal description of the morphology of the interlingua:

Definitions:

A vocalic nucleus N has the following form:

More precisely, a vocalic nucleus can consist of one or more vowels, and, if there is more than one vowel, then 'i' or 'u' is converted to the corresponding semi-vowel 'y' or 'w'. For example, "eua" becomes "ewa". I'll have more to say about this later.

A prefix has the form:

A suffix has the form:

A suffix changes the syntax and semantics of a word in a precise (i.e., totally predictable) way. For example, if we add the A/P-d suffix "-ap" and the final verb marker "-a" to the root "bodam" (meaning 'duck'), the result "bodamapa" means 'to turn P into a duck', which is a dynamic state verb. In other words, we have changed both the syntax and meaning from a 'duck' noun to a 'change-of-state' verb.

In summary, a prefix modifies the meaning of the entire word that follows it without changing its syntax. A suffix changes both meaning and syntax of the root plus any intervening suffixes. In other words, we start with the root, add the suffixes, and then add the prefixes to obtain the final meaning.
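The order of composition just described (root first, then suffixes, then prefixes) can be sketched in a few lines. This sketch assumes that morphemes are joined by plain concatenation with no sound changes at the boundaries, which is consistent with the "bodamapa" example but is otherwise an assumption. The function name `derivation_steps` is hypothetical, and the final verb marker "-a" is simply treated as the last element of the suffix list.

```python
def derivation_steps(root, suffixes=(), prefixes=()):
    """List the intermediate forms in the order in which meaning is composed.

    `prefixes` is given in left-to-right written order, so the prefix
    nearest the root is applied first and the leftmost prefix last.
    """
    steps = [root]
    form = root
    for suffix in suffixes:            # suffixes apply first, inner to outer
        form = form + suffix.strip("-")
        steps.append(form)
    for prefix in reversed(prefixes):  # prefixes then wrap the whole word
        form = prefix.strip("-") + form
        steps.append(form)
    return steps


# 'duck' -> A/P-d derivation -> verb 'to turn P into a duck'
print(derivation_steps("bodam", suffixes=["-ap", "-a"]))
# ['bodam', 'bodamap', 'bodamapa']
```

No prefix is shown in the usage line because no concrete prefix has been introduced at this point in the text.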

There are two kinds of root morphemes: modifiers and classifiers.

A classifier has the form:

A modifier has the form:

Thus, a root morpheme and a root are defined as follows:

Note that a classifier may be preceded by zero or more modifiers but may not be followed by one. Thus it automatically terminates a root.

Finally, a word has the following form:

As for pronunciation, vowels are cardinal, although laxer versions are acceptable (i.e., pronounce vowels as in Italian or Swahili). Pronounce /w/ as in "awake", /y/ as in "soybean", /c/ like "ch" in "chin", /j/ as in "judge", /x/ like "sh" in "ship", /q/ like "s" in "measure", and /r/ as any rhotic (flap, trill, retroflex, uvular, etc.). The consonant /h/ may be pronounced like 'h' in "house", as a glottal stop (i.e., like "tt" in "button"), or as [x] (i.e., like "ch" in German "acht"). [More generally, /h/ may be pronounced as a glottal stop or as any unvoiced velar, uvular, pharyngeal, or glottal fricative.]

Geminates (i.e., two or more consecutive, identical vowels, semivowels, or consonants) are not allowed. For example, "kk", "bb", "uu", and "yy" are not allowed. The sequences /uw/, /wu/, /iy/, /yi/, /ou/, /ow/, /ei/, /ey/, /ao/, /ae/, /wy/, and /yw/ are also not allowed. However, it is always legal to pronounce /e/ as either [e] or [ey], and /o/ as either [o] or [ow]. For example, /ea/ may be pronounced [ea] or [eya], and /oa/ may be pronounced [oa] or [owa].
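These restrictions are easy to check mechanically. The sketch below assumes that every phoneme is written with a single letter (as the pronunciation guide above suggests), so that a scan over adjacent letter pairs is sufficient. The names `FORBIDDEN_PAIRS` and `sequence_violations` are hypothetical, and "bakka" is a made-up test string, not a word of the interlingua.

```python
# The non-geminate sequences listed above as not allowed.
FORBIDDEN_PAIRS = {
    "uw", "wu", "iy", "yi", "ou", "ow",
    "ei", "ey", "ao", "ae", "wy", "yw",
}


def sequence_violations(word: str) -> list:
    """Return every adjacent letter pair that breaks the rules above."""
    violations = []
    for first, second in zip(word, word[1:]):
        pair = first + second
        # A geminate is any pair of identical adjacent letters.
        if first == second or pair in FORBIDDEN_PAIRS:
            violations.append(pair)
    return violations


print(sequence_violations("bodamapa"))  # []
print(sequence_violations("bakka"))     # ['kk']
```

Note that this checks spelling only; the permitted pronunciations of /e/ as [ey] and /o/ as [ow] are a matter of phonetics and are not affected.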

The vowels 'i' and 'u' may never appear adjacent to another vowel - use 'y' or 'w' instead. For example, the roots "foidam" and "kuentis" are illegal, but "foydam" and "kwentis" are legal. If 'i' and 'u' are adjacent, convert the first to a semi-vowel. Thus, "ui" becomes "wi" and "iu" becomes "yu".
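The conversion rule can be written as a single left-to-right pass. This is a minimal sketch rather than a definitive implementation: the name `to_semivowels` is hypothetical, and the behaviour for runs of three or more 'i'/'u' letters is not specified above, so the sketch makes no claim about such cases.

```python
VOWELS = set("aeiou")
SEMIVOWEL = {"i": "y", "u": "w"}


def to_semivowels(text: str) -> str:
    """Replace 'i'/'u' with 'y'/'w' wherever they touch another vowel.

    The scan runs left to right and updates the letters in place, so when
    'i' and 'u' are adjacent the first one is converted and the second,
    now next to a semi-vowel, is left alone.
    """
    letters = list(text)
    for index, letter in enumerate(letters):
        if letter not in SEMIVOWEL:
            continue
        before = letters[index - 1] if index > 0 else ""
        after = letters[index + 1] if index + 1 < len(letters) else ""
        if before in VOWELS or after in VOWELS:
            letters[index] = SEMIVOWEL[letter]
    return "".join(letters)


for raw in ("foidam", "kuentis", "ui", "iu", "eua"):
    print(raw, "->", to_semivowels(raw))
```

Run on the examples from the text, this yields "foydam", "kwentis", "wi", "yu", and "ewa" respectively.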

Although stress is not necessary, we will adopt the following convention for the sake of consistency:

The above provides almost all of the morphotactic system that I will be using throughout this monograph. (One additional feature will be introduced later in the chapter on Anaphora.) The appendices contain a complete description of the morphology and a list of all of the morphemes that will be created and used in this monograph.

Note that with these word-formation rules, every morpheme and every word is unambiguously started and terminated. Thus, any word with this morphology can always be parsed unambiguously into its component morphemes, and a stream of words can always be divided unambiguously into individual words even if there are no spaces or pauses between words. In fact, even spaces or pauses within a word cannot confuse the parser. Thus, the boundaries between morphemes and words are never in doubt.

This feature of word morphology is usually called either self-segregation or auto-isolation.

As we will see later in Appendix E, the syntax of the interlingua will also ensure self-segregation at the constituent and sentence levels.

In the interlingua described in this monograph, each root will have a default argument structure associated with its classifier. (For a complete list of classifiers, refer to Appendix C.) We can change the default by using a suffix that will indicate the new argument structure.

Here are the suffixes used to change the argument structure of a word:

The above suffixes should only be used if the default argument structure of the root is being changed. To change just the part-of-speech of a root without changing its default argument structure, use an appropriate part-of-speech marker instead (see below).

Now, before proceeding, let's briefly review the semantics behind the notation we are using.

All verbs have a patient, whether stated or implied. If a verb has an agent, then the agent is responsible for the event described by the verb. If a verb has a focus, then the focus is the referent of a relationship with the patient. This referent can be either another entity, as in "John needs a pencil", or an elaboration of the event itself, as in "John told a joke".

A verb is either an agent-oriented action verb or a patient-oriented state verb. An action verb emphasizes what the agent is doing rather than what the patient is experiencing. A state verb emphasizes the ongoing or final state of the patient rather than how it came about or how the agent, if any, brought it about. An action verb must have an agent. A state verb may or may not have an agent.

For these examples, I'm going to start with an English verb, analyze it to determine its argument structure, and create a word for it in the interlingua. I will then try to create as many other verbs as possible from the same root by using different suffixes.

Let's start with the verb "to know", in the sense of 'having knowledge'. Typical sentences using this verb could be:

Here, the subject is the patient and the object is the focus. The subject experiences a steady state of 'knowledgeable' focused on the object. Thus, this verb is a patient-oriented state verb and its argument structure is P/F-s.

Now, in the interlingua, the root "kop" will represent the state concept that means 'knowing' or 'knowledgeable'. And since 'knowing' is inherently relational, its argument structure will be P/F-s by default. In addition, the final marker "-a" will set the part-of-speech to verb. Thus, the word "kopa" is the P/F-s verb meaning 'to know'.

Note that we are not using the P/F-s suffix "-unz", even though it is technically correct (i.e., it has the correct argument structure). For the sake of consistency, we will only use an argument structure suffix to change an argument structure. And since the default argument structure of "kop" is already P/F-s, we can use it without "-unz".

Next, let's take the same root and see what happens when we apply different argument structure suffixes to it. We will deal first with focused verbs, since the concept of 'knowing' is inherently focused:

Keep in mind that the above English glosses are approximations, and that the real meaning should be determined from the root plus its argument structure. With the precisely defined semantics used above, there is no doubt. Also, keep in mind that the paraphrases cannot capture the immediacy of the involvement of the participants. This immediacy can only be represented by the single word - not by the paraphrase. For example, a paraphrase of the verb "to kill" is 'to cause to die', even though the two are not synonymous. The paraphrase is simply the closest we can get to the true meaning using multiple words. Please keep this in mind, since we will be using paraphrases throughout this monograph.

Note that all of the above derivations are focused. Focused derivations are the most useful simply because the concept 'knowing' is most often applied this way. But the unfocused derivations are also very useful, as we'll see later when we discuss Grammatical Voice . Before we can discuss these differences, though, we need to acquire a little more background in verbal semantics.

The semantics of a verb that is converted to a noun will be as follows:

Now, in the interlingua, we will use the final marker "-i" to change the part-of-speech of a word to 'noun' without changing its argument structure. For example, the noun form of the P/F-s verb "kopa" is simply "kopi". If the argument structure must also be changed, then an argument structure suffix and a part-of-speech suffix will be needed.

Here are some examples:

The semantics of verbs that are converted to adjectives will be as follows:

In the interlingua, the word-final "-o" will indicate that the part-of-speech of a word is 'adjective'. Here are some sample derivations from the root "kop":

It is important to note that the use of present participles (e.g. "informing") to represent the actual meanings is somewhat misleading, because English participles have strong implications of tense and aspect. For non-participial renderings, this is not a problem as in "the man in the know". Also, for similar reasons, do not confuse adjectives with relative clauses. For example, a "learning geologist" is not quite the same as a "geologist who is learning" since the relative clause definitely specifies tense and aspect, whereas "learning geologist" could also be used if the learning occurred in the past or future.

To continue along the same lines as above, we will use final "-e" to indicate that the part-of-speech of a word is 'adverb'. However, before we can put this to use, we must first digress for a while and discuss the semantics of case tags and adverbs.
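The four final markers introduced so far ("-a", "-i", "-o", and "-e") can be collected in a small lookup table. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that the part-of-speech marker is always the last letter of the word and that no markers other than these four need to be handled.

```python
# Final markers named in the text: -a verb, -i noun, -o adjective, -e adverb.
# As discussed in the text that follows, "-e" forms that take objects act as case tags.
FINAL_MARKERS = {
    "a": "verb",
    "i": "noun",
    "o": "adjective",
    "e": "adverb or case tag",
}
MARKER_FOR = {pos: marker for marker, pos in FINAL_MARKERS.items()}


def part_of_speech(word: str) -> str:
    """Read the part of speech off the final letter of a word."""
    marker = word[-1]
    if marker not in FINAL_MARKERS:
        raise ValueError(f"unknown final marker in {word!r}")
    return FINAL_MARKERS[marker]


def with_part_of_speech(word: str, pos: str) -> str:
    """Swap the final marker; the argument structure is left untouched."""
    part_of_speech(word)  # validates the existing marker
    return word[:-1] + MARKER_FOR[pos]


print(with_part_of_speech("kopa", "noun"))
```

For instance, the sketch turns the verb "kopa" into the noun "kopi", matching the example given earlier.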

In this section, I would like to discuss the semantics of adverbs (especially those that correspond to English adverbs that end in "-ly") and most case tags (such as English prepositions, Japanese post-positions, Hungarian case inflections, etc.), and I will try to show how verbs can be converted to adverbs and case tags. The final result will be a system that can replace many complex, idiosyncratic and periphrastic constructions of natural languages with constructions that are syntactically simple and semantically transparent.

First, let me illustrate how verbs can, in fact, represent the semantics of English prepositions, adverbs, and particles by giving examples from other languages. In these languages, some verbs are actually used in the same way as English prepositions, adverbs, and particles. Consider the following from Vietnamese:

In the first example, the word "lai" is actually the verb 'to come'. When used transitively, it takes a destination as a direct object (like the English verb 'to enter'). In the second example, the word "o" is actually the verb 'to be located at' and takes a location as a direct object. (Thus, the second example could also stand alone as a complete sentence meaning 'The bank is in Hanoi'.) Many other languages, such as Igbo, Ewe, Twi, and Yoruba (Niger-Congo languages of west Africa), Indonesian, Chinese, Cambodian, and many pidgins and creoles have similar constructions. Also, these constructions are not limited to locatives. In Chinese, for example, the word "yung" is the verb meaning 'to use'. It is also the preposition meaning instrumental 'with', as in the sentence "He broke the window with a hammer".

It's also possible to create adverbs, particles, and completely new verbs in this manner. In Hindi, for example, "to run go" means 'to run away', and "to cook take" means 'to cook for oneself'. In Yoruba, "to carry come" means 'to bring', and "to carry go" means 'to take away'.

Linguists have a name for this type of construction, in which two or more verbs are linked without the use of coordinating conjunctions or subordinators. They are called serial verbs.

There are two major types of serial verb constructions: the events indicated by the verbs are either simultaneous or consecutive. In this discussion, we are only interested in the first category, where the two verbs represent events that occur simultaneously.

Other useful serial verb constructions are those in which two or more verbs are linked, all taking the same subject and object. In these cases, the lack of a conjunction or subordinator often implies a certain 'immediacy'; i.e., that the event is a single entity, rather than a combination of unrelated or sequential events. Some languages, such as Chinese and Yoruba, allow any combinations that make semantic sense, and even allow noun phrases to split the verbs, creating an effect similar to relative clauses, but where the events indicated by the verbs are often much more tightly linked. Note that these types of constructions are not idiomatic - they are actually quite productive and their meanings are predictable from syntax and context. What most serial verb constructions have in common is that they are taken by speakers as representing parts of the same event.

English has a few verbs that can be used in this way, such as "to go visit", "to come play", "to let go", "to stir-fry", and "to test-fly" but note that the first two represent consecutive events, which is not what we are interested in here. Most of the time, English uses participles to achieve a simultaneous effect. Here are some examples, where the first sentence of each triplet indicates simultaneity:

What is happening here is that the participial phrase is more closely linked to the verb rather than to the noun it ostensibly modifies. As a result, we can create what are essentially compound verbs without subjects, and the results make perfectly good sense:

In effect, the words "screaming" and "shivering" behave exactly like adverbs, and the words "knocking over" and "dreaming of" behave exactly like case tags (i.e. English prepositions) that introduce phrases that modify the verb.

Thus, we should be able to create adverbs and case tags from verbs by applying the same semantic logic. Here are some examples:

Additionally, if English had a verb like Vietnamese "o", Chinese "zai", Cambodian "niw", or Hausa "yana" (all of which mean 'to be located at or in'), we could create the locative senses of the prepositions "in" and "at" from it. For example, if the English word "bain" meant 'to be located in/at', we would have:

In summary, speakers of languages with serial verb constructions effectively make up new 'prepositions' as they are needed. If a preposition with a desired literal meaning is not available, English speakers will either use existing prepositions metaphorically, or will use participial constructions as illustrated above. In this monograph, we will implement a system that has the flexibility of the serial verb constructions (but which is semantically and morphologically precise), and thus avoid the need for potentially untranslatable metaphor.

As an example of the adverb/case tag creation process, let's continue where we left off when we started this digression, and create a set of adverbs and case tags from the state concept of 'knowledgeable'. As mentioned earlier, we will use the part-of-speech marker "-e" to mark the part-of-speech. Those whose verb forms do not take objects (i.e. intransitive verbs) will become adverbs, and those which do take objects (i.e. transitive verbs) will become case tags (i.e. English prepositions) adding a new oblique argument to the main verb. Thus, in effect, the case tag will link its argument to the verb. In the following examples, I will use English for all words except the new case tag/adverb. I will also use English word order. Here are the results:

In all cases, note how the derived case tag modifies the whole sentence, just as if it were an oblique argument of the main verb. Note also that, in the above examples, the case tag is tightly bound to the subject of the main verb. For example, in the sentence:

the subject of the case tag "kope" is P and links to the subject of the main verb "to leave" which itself is AP/F-d. Thus, the effective subject of the case tag "kope" is "Joe". And in the sentence:

the subject of the case tag "kopambe" links to the subject of the main verb "to stand" which is AP-s. Thus, the effective subject of the case tag "kopambe" is "the policeman".

[Incidentally, note that "kopambe" is A/P/F-d and must be followed by two arguments, "us" and "the robbery". No preposition can appear between them. The English translation, however, requires the preposition "about" or "of" to precede the focus of the verb "inform".]
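The rule described in this section (intransitive verb forms become adverbs, transitive ones become case tags) depends only on the number of argument slots in the argument structure. A minimal Python sketch, assuming the "A/P/F-d" style notation used in this monograph and using hypothetical function names, might look like this:

```python
def roles(structure: str) -> list[str]:
    """Split a structure such as "A/P/F-d" into its argument slots."""
    args, _, _kind = structure.rpartition("-")
    return args.split("/")


def e_form_kind(structure: str) -> str:
    """A one-slot verb yields an adverb; anything larger yields a case tag."""
    return "adverb" if len(roles(structure)) == 1 else "case tag"


def following_arguments(structure: str) -> int:
    """Number of arguments that must follow the "-e" form.

    The first slot (the subject) is not expressed after the case tag;
    it links to the subject of the main verb.
    """
    return len(roles(structure)) - 1


# "kope" is P/F-s and "kopambe" is A/P/F-d, as stated in the text.
for word, structure in [("kope", "P/F-s"), ("kopambe", "A/P/F-d")]:
    print(word, e_form_kind(structure), following_arguments(structure))
```

This agrees with the observation above that "kopambe" must be followed by two arguments.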

In this section, we discussed how to convert existing verbs into adverbs and case tags. Later, we will discuss how to systematically create the many case tags required by a language, such as those needed to represent English prepositions.

In the interlingua, we will use the root "xum" to represent a vague but useful relational state, with the meaning 'having an unspecified relationship with', 'having something to do with', and so on.

Note that "xum" is the 'other' classifier for the scalar relational state group and that it is P/F-s by default. [See Appendix C for a complete list of classifiers in the interlingua.]

The P/F-s verb form "xuma" will indicate that a relationship exists between patient and focus, but will imply nothing about the nature of the relationship. Thus, its meaning can be paraphrased as 'to have an unspecified relationship with' or 'to have something to do with'.

Here are a few other derivations using "xum":

Now, we can also derive several useful but vague action verbs using the action classifier "bus". Note that "bus" is the 'other' classifier for the action classifier group and that it is A/P-d by default.

Here are some of the more useful derivations using "bus":

As we will see later, many of the above verbs can undergo additional derivations to produce some very useful words.

Since actions always imply agents, non-agentive derivations will not be very useful.

So far, we've only talked about verbs in the active voice; i.e., where all of the arguments of a verb are present and appear in the proper order. For example, the A/P-d verb "to break" has an agent subject and a patient direct object. However, natural languages have many ways of changing the relative importance or topicality of a verb's arguments. Languages can also remove arguments from the argument structure, while implying that they still exist, and make the missing arguments either obliquely expressible or not expressible at all. Finally, languages can also incorporate normally oblique arguments, making them part of the argument structure of the verb. For example, consider the following:

Different languages handle these distinctions in different ways. As you can see from the above examples, English uses combinations of syntax, morphology, periphrasis, and even poetic license. Other languages are more regular, some using inflections for some voices, while others may use derivations or a combination of both. In addition, some languages allow the incorporation of other case roles into the argument structure of a verb. In fact, the number of possible voice variations among the world's languages is quite large.

Since grammatical voice has different meanings to different people (with middle voice being the most confused/confusing), let me precisely define the meaning that I am using here. Specifically,

An argument that increases in relative topicality is said to be promoted, and an argument that decreases in relative topicality is said to be demoted. Demoted arguments continue to play their original semantic roles, but are somehow less important or less involved. The following examples illustrate this effect:

Although the number of possible voice combinations is large, there are a few that crop up often among the world's languages. Here are the most common ones:

Active - transitive: The subject is slightly more important or topical than the object. Both must be expressed. This is by far the most common form used in almost all languages. [The only exceptions I know of are Fijian and the Salish languages of northwestern North America. In these languages, all transitive verbs are derived by addition of an affix to the intransitive form. Also, in Fijian, the most commonly used verb form is active INTRANSITIVE.]

Passive: The original object becomes the subject and becomes considerably more topical than the original subject. The original subject is no longer part of the verb's argument structure, and does not have to be expressed. However, it is always implied and may be expressed obliquely (in English, typically using the preposition "by").

Middle: The original object is made more topical and becomes the subject. The original subject is deleted from the verb's argument structure and may not be expressed at all even though it is implied.

Anti-passive: The subject is made considerably more salient than the object. The original object is no longer part of the verb's argument structure, and does not have to be expressed. However, it is always implied and may be expressed obliquely.

Inverse: The arguments of the active verb are simply reversed. The original object becomes the subject, gaining in importance; and the original subject becomes the object, losing importance. Unlike passive, the original subject is not oblique and MUST appear.

Keep in mind that the above are generalizations. Individual languages vary both in the ways that the various voices are implemented as well as in their semantics. Also, keep in mind that the list contains just the most common voice systems. Many other combinations are possible, especially those involving normally oblique case roles.

As we saw above, a language like English, which lacks a regular system for marking voice, must resort to complex and idiosyncratic constructions to achieve the same effect. Always keep in mind, though, that a voice change simply re-arranges the topicality of some of the participants in a sentence. Our goal should be to achieve the same results in a consistent and easy-to-understand manner.

Also, English rarely uses the same strategies to handle these needs. For example, an effect similar to the passive and anti-passive can be achieved by using impersonal constructions: "Johnson punched someone" (anti-passive) or "Someone is at the door" (passive). An effect similar to the inverse can often be accomplished by fronting or left dislocation, as in "(As for) the car, John wrecked it". However, true inverse effects can sometimes be obtained by periphrasis, as in:

Active:   The cup is full of water.
Inverse:  Water fills the cup.

Finally, inverse and middle effects are sometimes achieved in English by using completely different root morphemes, as in "I enjoyed the show" vs. "The show pleased me" (inverse), or by use of metaphor or idiom, as in "He remembered the answer" vs. "The answer came to mind" (middle).

[Incidentally, the inverse voice comes in two varieties. In the first, which is sometimes called a semantic inverse, an inverse operation may be required in order to properly assign case roles to the arguments of a verb. Semantic inverse constructions are especially common in the native languages of North America. For example, in Plains Cree (Algonquian), a more animate argument is inherently more topical than a less animate argument, and neither word order nor case marking of nouns can change the interpretation. Thus, if "man" and "dog" appear as the main arguments of the verb "bite", then it will always be interpreted as "man bites dog", regardless of word order. An inverse marking on the verb simply reverses the relative topicality, making "dog" more topical than "man", and is required to obtain the sense "dog bites man". I do not consider this usage a true voice alteration. It is simply an uncommon way of marking semantic case roles in a sentence. Similarly, some Sino-Tibetan languages have an inverse voice based on the relative topicality of 1st, 2nd, and 3rd person, rather than animacy. Note though, that although an inverse operation may at times be required, it can also be used when it is not required in order to achieve the changes in topicality that we are describing here. In these cases, such an operation is called a pragmatic inverse.

True pragmatic inverses can be found in languages such as Maasai (Nilo-Saharan), Sahaptian languages (Penutian, western North America; e.g. Nez Perce), Caucasian languages (e.g. Georgian), and Chamorro (Austronesian, Guam). (In fact, Maasai and Sahaptian languages have both semantic and pragmatic inverses.) Finally, a combination of word order changes and direct case marking of nouns can sometimes be used to achieve an inverse effect (e.g. Korean). However, other languages which have this ability (e.g. Russian) frequently use it for quite different purposes. As for true inverse systems, recent research indicates that such systems are actually much more common among the world's languages than had been previously supposed.]
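The semantic inverse described in the aside above amounts to a small decision rule, which can be sketched in Python as a toy model. This is not a description of Plains Cree grammar: the animacy ranks and the function name `assign_roles` are assumptions made only to illustrate that word order is ignored and that the inverse marking alone reverses the interpretation.

```python
# Illustrative animacy ranks only (higher = more animate); not taken from any grammar.
ANIMACY = {"man": 2, "dog": 1}


def assign_roles(noun_a: str, noun_b: str, inverse_marked: bool = False) -> dict:
    """Assign agent and patient from animacy, ignoring the order of the nouns."""
    if ANIMACY[noun_a] == ANIMACY[noun_b]:
        raise ValueError("this toy model needs two different animacy ranks")
    higher, lower = sorted((noun_a, noun_b), key=ANIMACY.__getitem__, reverse=True)
    if inverse_marked:
        # The inverse marking reverses the default interpretation.
        return {"agent": lower, "patient": higher}
    return {"agent": higher, "patient": lower}


print(assign_roles("dog", "man"))                       # word order is irrelevant
print(assign_roles("dog", "man", inverse_marked=True))  # the marking reverses it
```

Without the inverse marking, both word orders yield 'man bites dog'; with it, both yield 'dog bites man'.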

2.7.1 Implementation of a Grammatical Voice System

Most European languages (including English) use cumbersome rules involving auxiliaries, participles, reflexives, context, word-order, and even complete lexical changes to indicate voice. More heavily inflected languages (Arabic, Latin, Japanese, Ainu, etc.) use the very simple expedient of inflecting the verb for most indications of voice. Many South American lowland languages and some isolating (i.e. uninflected) languages such as Chinese and Vietnamese do not have a formal morphology or syntax to cover voice, although they can achieve similar effects via explicit topicalization and/or periphrasis.

Finally, other languages such as the Bantu languages of Africa (e.g. Swahili) and Austronesian languages (e.g. Indonesian) use derivational morphemes (which is essentially what we are doing here) to achieve most voice effects. In other words, they create a completely different verb from the same root as the active verb, but the new verb has a different topicalization and argument structure.

So, how should an MT interlingua implement grammatical voice? Ideally, we would like to create a system that can handle any voicing needs, while being both simple and consistent.

I do not feel that grammatical voice change should be implemented in syntax - syntax is not nearly as flexible as morphology. Instead, grammatical voice changes can be best implemented using derivational morphology. In other words, we will allocate a single suffix for each voice. The resulting verbs will, of course, have a different argument structure.

For the interlingua, we will allocate the following suffixes for these voice morphemes:

Middle voice:        -em
Passive voice:       -es
Anti-passive voice:  -os
Inverse voice:       -ang

Voice suffixes do not change an existing part-of-speech.

For example, if the state root meaning 'open/unshut/unblocked' is "doykav" (default = P-s), then the word for the A/P-d verb 'to open/unshut' is simply "doykavapa". We can implement the other voices as follows:

middle:        doykavapema
               e.g. The window doykavapema easily
                  = The window opened easily.

passive:       doykavapesa
               e.g. The window doykavapesa (by the thief)
                  = The window was opened (by the thief).

anti-passive:  doykavaposa
               e.g. The thief doykavaposa (of the window)
                  = The thief did the opening (of the window)
               or = The thief was the opener (of the window)
               or = The thief opened something.
               [The third gloss applies only if the argument is not expressed obliquely.]

inverse:       doykavapanga
               e.g. The window doykavapanga the thief
                  = The window - the thief opened it.

where optional oblique arguments are shown in parentheses. [We'll discuss how to implement these oblique arguments later.]
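The derivations above follow a simple pattern: the voice suffix is placed immediately before the final part-of-speech marker. As a minimal sketch in Python (the function name `apply_voice` is hypothetical, and the sketch assumes that the final marker is always the last letter of the word):

```python
# Voice suffixes as allocated in the text.
VOICE_SUFFIX = {
    "middle": "em",
    "passive": "es",
    "anti-passive": "os",
    "inverse": "ang",
}


def apply_voice(word: str, voice: str) -> str:
    """Insert the voice suffix before the final part-of-speech marker.

    The part of speech is not changed, since the marker is kept as is.
    """
    stem, final_marker = word[:-1], word[-1]
    return stem + VOICE_SUFFIX[voice] + final_marker


for voice in VOICE_SUFFIX:
    print(voice, apply_voice("doykavapa", voice))
```

Applied to "doykavapa", this reproduces the four forms listed above.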

In the above examples, the inverse paraphrase is only approximate, and actually increases the topicality of the fronted item more than it should. Here are some better examples of true inverse effects in English:

Active:   John owns the book.
Inverse:  The book belongs to John.

Active:   This bolt is part of the device.
Inverse:  The device includes this bolt.

Active:   We experienced many strange things.
Inverse:  Many strange things happened to us.

Active:   This alliance will result in much misery.
Inverse:  Much misery will come of this alliance.

A useful notational scheme will be to put an implied case role in square brackets, with a plus "+" or minus "-" sign to indicate whether it can be expressed obliquely. Thus,

middle:        changes  A/P-x   to  P-x [-A]
                        AP/F-x  to  F-x [-AP]
                        P/F-x   to  F-x [-P]

passive:       changes  A/P-x   to  P-x [+A]
                        AP/F-x  to  F-x [+AP]
                        P/F-x   to  F-x [+P]

anti-passive:  changes  A/P-x   to  A-x [+P]
                        AP/F-x  to  AP-x [+F]
                        P/F-x   to  P-x [+F]

inverse:       changes  A/P-x   to  P/A-x
                        AP/F-x  to  F/AP-x
                        P/F-x   to  F/P-x

where "-x" represents either "-s" or "-d".

For verbs that take three arguments, we will do the following:

middle:        changes A/P/F-x to P/F-x [-A]
               e.g. *The students taught French easily.
               [This is ungrammatical in English with the intended meaning, but grammatical in the interlingua.]

passive:       changes A/P/F-x to P/F-x [+A]
               e.g. The students were taught French (by Mr. Johnson).

anti-passive:  changes A/P/F-x to A/F-x [+P]
               e.g. He shouted obscenities (at the crowd).
               [Note that the English verb "to shout" is inherently anti-passive. Thus, we must start by creating an A/P/F-d version of this verb, and then perform an anti-passive operation to derive an exact equivalent of the English verb "to shout".]

inverse:       changes A/P/F-x to P/A/F-x
               e.g. The student - John taught him geometry.

In addition, some languages, such as Latin, Shona (Bantu), Turkish, Classical Greek, and German allow impersonal passives, in which an intransitive verb is passivized becoming a zero-argument verb. For example, the AP-s activity verb "to run" could undergo a passive or middle transformation into 0-s [+AP] or 0-s [-AP], depending on the language, where "0" is used to indicate that the verb has no arguments. It is interpreted as something like 'running took place' or 'there was running'. A verb like P-d "to grow" could become 0-d [+/-P], and would mean something like 'growing took place' or 'there was growth'. The interlingua allows all of these variations.
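All of the argument structure changes listed in this section (two-argument, three-argument, and impersonal) follow one pattern: middle and passive remove the first argument, anti-passive removes the second, and inverse swaps the first two. The following Python sketch illustrates this reading. The function name `change_voice` is hypothetical, and the single output form for impersonal verbs is a simplification of the "[+/-P]" notation used above.

```python
def change_voice(structure: str, voice: str) -> str:
    """Apply a voice change to a structure written like "A/P/F-d"."""
    args, _, kind = structure.rpartition("-")
    roles = args.split("/")

    if voice == "inverse":
        if len(roles) < 2:
            raise ValueError("inverse needs at least two arguments")
        swapped = [roles[1], roles[0]] + roles[2:]
        return "/".join(swapped) + "-" + kind

    if voice in ("middle", "passive"):
        demoted, rest = roles[0], roles[1:]
        # "-" means the implied role cannot be expressed obliquely.
        sign = "-" if voice == "middle" else "+"
    elif voice == "anti-passive":
        if len(roles) < 2:
            raise ValueError("anti-passive needs at least two arguments")
        demoted, rest = roles[1], roles[:1] + roles[2:]
        sign = "+"
    else:
        raise ValueError(f"unknown voice: {voice!r}")

    core = "/".join(rest) if rest else "0"  # "0" marks a zero-argument verb
    return f"{core}-{kind} [{sign}{demoted}]"


print(change_voice("A/P-d", "passive"))
print(change_voice("AP-s", "middle"))
```

The sketch treats the impersonal passive of "to run" (AP-s) as the same operation applied to a verb with a single argument, giving a zero-argument result.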

Another useful derivation would be to take an A/P/F verb and reduce the topicality of the third argument. (Remember, the anti-passive discussed above reduces the topicality of the second argument.) We will refer to this as an anti-anti-passive operation. However, I know of no natural language that has a distinct way of marking such an operation, so we will not do so in the interlingua. Instead, we can achieve the same effect by simply changing the argument structure of the word using an A/P suffix. We'll see examples of how to do this later.

As we saw with the verb meaning 'to shout (at)', grammatical voice alterations are useful for creating speech act verbs which never take a patient as a direct object, such as the A/F-s [+P] verb "to dictate", as in "He dictated the letter (to his aide)". For verbs like these, we can create a verb that does allow a direct object patient, and promote a focus to first object by means of the anti-passive alteration.

2.7.2 More on Middle Voice

The passive, anti-passive, and inverse voices are easy to understand, and I'll say no more about them. Middle voice, however, is so frequently confused with basic intransitivity that I'd like to say a little more about it.

English does not have a formal morphosyntax for middle constructions, unlike many other languages (Persian, Swahili, Basque, Somali, Hausa, Turkish, and many, many others - middle forms in these languages often go by other names, such as statives or agentless passives, but they often function semantically as middles). English does not even have a reflexive clitic construction, as do several other European languages, which often performs additional duty for middle voice. This is unfortunate, since, as we will see, it can be extremely useful and productive.

English sometimes allows an active verb to be used without modification in a middle construction, as long as the context forbids an active interpretation. Thus, we can say "The joke did not translate well", or "The plane landed ten minutes ago", or "The library closed early". But even when the meaning is clear, English can be quite idiosyncratic as in "*The mountains see in the distance" or "*The boxes are covering in the storeroom". Sometimes, if the verb has an agent, an indefinite construction can be used, as in "They don't make cars like they used to". And in cases where context and semantics do not make it clear, English is often forced to use periphrastic or passive constructions, completely different words, metaphors, or even idioms. Consider the following examples:

ACTIVE                               MIDDLE

I see the mountains.                 *The mountains see.
                                     The mountains are in view.
   Thus, from the verb "to see", P/F-s, we can derive:
      "to be in view", F-s [-P]

The gang terrorized the              *The neighborhood terrorized for three
neighborhood for three years.        years.
                                     The neighborhood lived in a state of
                                     terror for three years.
   Thus, from the verb "to terrorize", A/P-s, we can derive:
      "to live in a state of terror", P-s [-A]

That woman buys caviar only          *Caviar buys only when it's on sale.
when it's on sale.                   Caviar sells only when it's on sale.
   Thus, from the verb "to buy", AP/F-d, we can derive:
      "to sell (intransitive sense only)", F-d [-AP]

He threw the rock at the window.     *The rock threw at the window.
                                     The rock went flying at the window.
   Thus, from the verb "to throw", A/P-d, we can derive:
      "to go flying (metaphorically)", P-d [-A]

I remembered her face.               *Her face remembered.
                                     Her face came to mind.
   Thus, from the verb "to remember", P/F-d, we can derive:
      "to come to mind", F-d [-P]

He swallowed the pills               *The pills swallowed with difficulty.
with difficulty.                     The pills went down with difficulty.
   Thus, from the verb "to swallow", A/P-d, we can derive:
      "to go down", P-d [-A]

And so forth. The number of possible examples is almost unlimited. Thus, English can deal with middle concepts, although the forms are usually highly irregular, unpredictable, periphrastic, and often either metaphoric or idiomatic.

Some English verbs that can be used both transitively and intransitively, such as "open", "cook", and "fill", have present participle forms that refer to the state of the object rather than the subject. For example, "the opening door" means 'the door that is being opened', not 'the door that is doing the opening'. In these cases, the English participle is equivalent to the interlingua's middle form. For example, the adjective "doykavapo" means 'doing the opening' while "doykavapemo" means simply 'opening' as in "the opening door". Also, "doykavapemo" implies that someone or something is causing the door to open; i.e., an agent. If no agent is implied, then the P-d form "doykavupo" should be used instead. [Note that the final marker "-o" is needed in all three cases to convert the result to an adjective.]

Middle verbs are often confused with basic P-s or P-d state verbs. The reason is that the patient is the subject of an intransitive verb, and it is often uncertain whether or not a transitive subject is implied. In languages which have a formal middle voice, however, there is never any doubt. Unfortunately, speakers of languages like English will have to be a little more careful. When in doubt, the basic P-s or P-d form should always be used instead of the middle form unless an agent is clearly implied. Middle verbs are also often confused with reciprocals and reflexives because some languages (especially European languages) use the same forms for more than one voice. In the semantic system used by the interlingua being discussed here, middles, reflexives, and reciprocals are completely different. [We will discuss reflexives and reciprocals later.]

It's important to keep in mind the difference in semantics between middle and passive derivations (including anti-middle and anti-passive). A middle derivation is used when the demoted argument cannot be specified, which is always the case in generic situations and when the demoted argument is known from general knowledge (e.g. "Mice kill easily"), as well as when the demoted argument is generic in the current context (e.g. "The mountains finally came into view"). However, a passive derivation without an oblique argument implies that the speaker is intentionally omitting some information that is not known to the listener, most likely because the speaker does not consider the actual argument to be very important, or perhaps because the speaker does not know it himself. A passive derivation with an oblique argument implies that the speaker considers the argument to be less important than the non-oblique arguments.

For example, compare "The library closed at 6 o'clock" (middle) with "The library was closed at 6 o'clock" (passive). The middle construction gives the impression that the closure was normal, while the passive construction implies that the closure was unusual and that unknown information was omitted, as in "The library was closed at 6 o'clock by the mayor because of the emergency".

Thus, the middle and passive derivations represent three distinct degrees of relevance:

Middle: Argument cannot be specified because it is too general, is common knowledge, or is generic in the current context. Specifying it would be redundant or excessively verbose.

Passive without oblique: Argument is not specified because the speaker does not consider it important or does not know it.

Passive with oblique: Argument is provided but is less important than would otherwise be implied if it were not oblique.

As usual, though, language is rarely so precise and there will be some overlap. In other words, a speaker can at times use a middle derivation when a passive one would be more technically correct, or vice-versa.

Finally, since the middle voice makes the subject generic, the noun version of a middle voice alteration has the meaning of a prototypical, generic object of the unmodified verb. This allows us to create many new and useful words. Here is an example using a root we already know:

"kop" = 'to know' -> "kopemi" = 'datum', 'fact', 'item of knowledge'

Compare the above with the passive form "kopesi", which would have the meaning 'something which is known'. With the passive form, the original subject (i.e. the "knower") still has a strong presence. In the middle form, however, the original subject is almost completely eliminated.


2.7.3 Incorporating Oblique Case Roles

Some natural languages can make almost any case role a subject or object of the verb (e.g. Malagasy, some Mayan languages, and most Philippine languages). In fact, among the Philippine languages, verbs almost always have an explicit morpheme that indicates the case role of the subject, and almost any case role can be promoted to subject. Many Bantu languages of Africa (e.g. Swahili) and some Australian languages (e.g. Dyirbal) allow an instrumental case role to be promoted to object. Many Bantu languages also allow a locative case role to be promoted to subject. Indonesian allows a beneficiary case role to be promoted to object. And so on.

Obviously, the above system could be easily extended to add normally oblique case roles to the argument structure of a verb. However, we will not be doing this in the interlingua for the following reasons:

1. It is extremely rare among natural languages.

2. The number of possible combinations of argument position and case role is very large, and would require a large number of special morphemes that would rarely be used.

3. Most (all?) languages that allow promotion of normally oblique case roles have special reasons for doing so. For example, many languages allow relativization of only certain core arguments, and thus a voice change is required before other arguments can be relativized.

4. If the syntax of the interlingua is designed properly, then any argument can be promoted or demoted by simply changing its position relative to the other arguments. For example, consider the following, greatly simplified VSO syntax:

    sentence ::= verb { argument }
    argument ::= core_argument | oblique_argument
    core_argument ::= noun_phrase
    oblique_argument ::= case_tag ( noun_phrase )

[A case tag that is not followed by a noun phrase is an adverb.]

The above syntax allows oblique arguments to be placed after, between, or even before the core arguments, which can have the same effect as explicit, morphological promotion or demotion. For example, if we need to promote an instrumental case role, we can do something like this: "Broke with a hammer John the window" or "Broke John with a hammer the window". Note, though, that we must modify the verb itself if we want to promote or demote a core argument.
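To make the simplified VSO syntax concrete, here is a minimal Python sketch of a parser for it. It is only an illustration under stated assumptions: the input is assumed to be already split into (kind, text) tokens, the kind labels "verb", "np", and "tag" and the function name parse_sentence are hypothetical, and embedded clauses are not handled. It is not the interlingua's actual parser.

```python
from typing import List, Tuple

# Hypothetical token format: (kind, text), where kind is "verb", "np", or "tag".
Token = Tuple[str, str]


def parse_sentence(tokens: List[Token]) -> dict:
    """Parse 'sentence ::= verb { argument }' from pre-classified tokens."""
    if not tokens or tokens[0][0] != "verb":
        raise ValueError("a sentence must start with a verb")
    result = {"verb": tokens[0][1], "arguments": []}
    i = 1
    while i < len(tokens):
        kind, text = tokens[i]
        if kind == "np":
            # A bare noun phrase is a core argument.
            result["arguments"].append(("core", text))
            i += 1
        elif kind == "tag":
            # A case tag followed by a noun phrase is an oblique argument;
            # a case tag that is not followed by a noun phrase is an adverb.
            if i + 1 < len(tokens) and tokens[i + 1][0] == "np":
                result["arguments"].append(("oblique", text, tokens[i + 1][1]))
                i += 2
            else:
                result["arguments"].append(("adverb", text))
                i += 1
        else:
            raise ValueError(f"unexpected token kind: {kind!r}")
    return result


# "Broke with a hammer John the window"
promoted = parse_sentence([
    ("verb", "broke"), ("tag", "with"), ("np", "a hammer"),
    ("np", "John"), ("np", "the window"),
])
# "Broke John with a hammer the window"
between = parse_sentence([
    ("verb", "broke"), ("np", "John"), ("tag", "with"),
    ("np", "a hammer"), ("np", "the window"),
])
```

In both parses the core arguments keep their relative order (John, then the window); only the position of the oblique argument changes, which is the effect the text describes.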

For all of the above reasons, there is no need to implement grammatical voice changes that would promote normally oblique case roles to core positions. Thus, while there must be a way to modify the relative topicalities of core arguments, there is simply no need to create special morphemes to promote normally oblique arguments.

Incidentally, core arguments are not limited to noun phrases. They can also be embedded clauses. Here are some examples:

John wanted the book vs. John wanted Bill to leave.
I saw the soldiers vs. I saw the soldiers marching.
They liked her vs. They liked her portrayal of Juliet.
We know the answer vs. We know that he likes her.

A clause which appears as the argument of a verb is called a complement.

Note that the English embedded clauses are idiosyncratic in that they require either infinitives, participles, nominalizations, or complete finite clauses, depending on the particular verb. By using an embedded clause with the same form as a normal sentence (i.e. a complete finite sentence), you can achieve the same effect with a simpler morphology and syntax. Here is how the above examples would look (the complete embedded clause is in parentheses):

John wanted (Bill leave).
I saw (the soldiers were marching).
They liked (she portrayed Juliet).
We know (he wants (she buy the car)).

They seem awkward in English, but they're linguistically sound, syntactically simpler, and totally lacking in idiosyncrasy. Also, this approach is used in many natural languages.

[Incidentally, a complete description of the syntax of the interlingua has been provided in Appendix E of this manual.]


2.7.4 Summary of Grammatical Voice Nomenclature

There are two voice changing operations that demote an argument: passive and middle. A passive voice change demotes an argument but allows it to be expressed obliquely. A middle voice change demotes an argument but does not allow it to be expressed obliquely. If the prefix "anti-" is not used, the first argument (i.e., the subject) is demoted. If the prefix "anti-" is used, then the second argument (i.e., the first object) is demoted.

For example, a passive demotes the first argument, and allows it to be expressed obliquely. An anti-middle demotes the second argument and suppresses its salience so much that it cannot be expressed obliquely.

Here is a complete list of middle and passive suffixes:

-es    passive
-os    anti-passive
-em    middle
-om    anti-middle

As stated earlier, if it is necessary to demote the third argument of a ditransitive verb, an appropriate argument structure suffix should be used. For example, if we wish to perform a middle operation on the third argument of an A/P/F verb (i.e., an "anti-anti-middle" operation), then we will use either the A/P-s suffix "-as" or the A/P-d suffix "-ap", whichever is appropriate.

Obviously, this implies that the focus of these verbs can never be expressed obliquely and that we cannot make a semantic distinction between anti-anti-passive and anti-anti-middle. However, I do not consider this a disadvantage because I know of no natural language that can do these things.


2.7.5 Disjuncts

When using verbs, we must be careful not to confuse case roles. It is sometimes easy to mistake a focal event for a patient. Consider the following example:

It's sad that John died.

It is tempting to treat the embedded sentence "John died" as if it were the patient in a P-s state verb formed from the root meaning 'sad'. However, an event cannot be "sad" in the sense that it can experience sadness. What we are really describing are the feelings of the speaker (and perhaps others) towards the situation. Thus, when we say "it's sad that ...", we are really describing our feelings or beliefs about the situation. In effect, the speaker and those he may be speaking to are the real patients.

Thus, in a sentence like the above, the real patient is implied, and the mental state of the patient is 'focused' on the event indicated by the embedded sentence. Thus, the embedded sentence is the focus of the main state verb meaning 'to be sad about'.

We can easily create a basic P/F-s verb meaning 'to be sad about', as in the sentence "Bill is sad about his parents' divorce". Using this basic verb, we can perform a middle voice alteration to create the F-s [-P] form meaning 'it is sad that'.

It is also possible for an event to be the agent or cause of the sadness. For this, we would need an A/P-s verb, since the event itself causes the patient to be sad. Thus, we really have several possible forms, as illustrated below:

A/P-s       John's death makes (i.e. keeps) me sad.
A/P-d       John's death saddened me.
P/F-s       I am sad that John died.
F-s [+P]    It's sad (for everyone) that John died.
F-s [-P]    It's sad that John died. OR Sadly, John died.

A similar analysis can be done using the state concept 'hoping':

P/F-s       I hope that I'll win.
F-s [-P]    Hopefully, I'll win.

where both "sadly" and "hopefully" are actually verbs that take a complete embedded sentence as an argument - they are not adverbs as in English.

Words and expressions like these are called disjuncts, and many other examples can be derived in the same way: "to presume" -> "presumably", "to be interesting" -> "interestingly", "to be possible" -> "possibly", "to be incidental" -> "incidentally, by the way", "to be necessary" -> "necessarily", "to be fortunate" -> "fortunately", etc.

Finally, the unspecified arguments to many disjuncts are often provided by the speech situation, such as who is speaking, who is listening, where the speech is occurring, and so on. These are called deictic disjuncts, and I'll have more to say about them later.


2.7.6 Voice Derivations

Here are some examples of derivations using voice suffixes and the root "bus", which we introduced earlier:

Middle P-s [-A]: "busasema" - 'to be under control/in hand'
    e.g. The runaway budget is now UNDER CONTROL.

Inverse P/A-s: "busasanga" - 'to be under the control of'
    e.g. The project IS now UNDER THE CONTROL OF the engineering department.

Anti-passive A-s [+P]: "busasosa" - 'to be in control/charge'
    e.g. John IS IN CHARGE here.

Middle F-s [-AP] noun: "businzemi" = 'deed', 'act', 'action'

Anti-passive AP-s [+F]: "businzosa" - 'to be doing something'
    e.g. He IS DOING SOMETHING right now.
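The passive and middle forms above can be assembled mechanically from a stem, a voice suffix, and a final part-of-speech vowel. The following Python sketch illustrates this. The four voice suffixes are the ones listed in section 2.7.4. The part-of-speech endings are inferred from examples in this manual ("kopa" verb, "kopemi" noun, "kavo" adjective), and the function name derive is hypothetical. The inverse form "busasanga" is deliberately left out, since its suffix is not part of that list.

```python
# Voice suffixes from the summary of grammatical voice nomenclature.
VOICE_SUFFIX = {
    "passive": "es",
    "anti-passive": "os",
    "middle": "em",
    "anti-middle": "om",
}

# Final vowels inferred from the manual's examples (an assumption of this sketch).
POS_ENDING = {
    "verb": "a",
    "noun": "i",
    "adjective": "o",
}


def derive(stem: str, voice: str = "", pos: str = "verb") -> str:
    """Join a stem (root plus any argument structure suffix), an optional
    voice suffix, and the part-of-speech ending into one word."""
    suffix = VOICE_SUFFIX[voice] if voice else ""
    return stem + suffix + POS_ENDING[pos]


under_control = derive("busas", "middle")         # "busasema"
in_charge = derive("busas", "anti-passive")       # "busasosa"
deed = derive("businz", "middle", "noun")         # "businzemi"
```

The same function reproduces forms from earlier sections, such as "kopemi" (middle noun of "kop") and "kopesi" (passive noun of "kop").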


2.7.7 Voice Combinations

It is important to emphasize that the basic voice operations (middle, passive, and inverse) are not sequential. They act independently, as if each operation were the only one operating on the original argument structure. For example, if we apply middle and anti-passive to an A/P/F verb, the middle operation converts A to [-A], the anti-passive operation converts P to [+P], and the result is F [-A] [+P]. This combination is legal, and the order of application of the voice suffixes is irrelevant.

The net effect of this rule is that a core argument can only be affected once. For example, it is illegal to apply a passive and an inverse, since, independently, the passive would convert A/P/F to P/F [+A], while the inverse would convert it to P/A/F, and the two results are not compatible. In other words, the agent argument would have been affected twice, and the result would be ambiguous. If our goal is A/F [+P] (i.e., inverse followed by passive), then we should use a simple anti-passive (suffix "-os"). If our goal is F/P [+A] (i.e., passive followed by inverse), then we're out of luck - there is no way of accomplishing this in the interlingua. Fortunately, I have not been able to find any use for it, and I doubt that any natural language has such a capability.
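The rule that each core argument can be affected only once can be checked mechanically. The Python sketch below applies each basic voice operation independently to the original argument structure and rejects combinations that touch the same argument twice. The function name apply_voices and the list representation of argument structures are hypothetical. Treating the inverse as touching both of the arguments it swaps is an assumption of this sketch; the text only states that passive plus inverse is illegal.

```python
# Position demoted by each operation, and whether the demoted argument may
# still be expressed obliquely ("+") or not at all ("-").
DEMOTIONS = {
    "passive": (0, "+"),
    "anti-passive": (1, "+"),
    "middle": (0, "-"),
    "anti-middle": (1, "-"),
}


def apply_voices(args, voices):
    """Apply basic voice operations independently to the original structure."""
    touched = {}   # argument position -> name of the operation that used it
    demoted = {}   # argument position -> "+" or "-"
    swap = False
    for voice in voices:
        if voice == "inverse":
            # Assumption: the inverse affects both arguments it exchanges.
            positions = (0, 1)
            swap = True
        elif voice in DEMOTIONS:
            position, mark = DEMOTIONS[voice]
            positions = (position,)
            demoted[position] = mark
        else:
            raise ValueError(f"unknown voice operation: {voice}")
        for p in positions:
            if p >= len(args):
                raise ValueError(f"{voice} needs an argument at position {p + 1}")
            if p in touched:
                raise ValueError(
                    f"{voice} and {touched[p]} both affect argument {args[p]}"
                )
            touched[p] = voice
    core = [a for i, a in enumerate(args) if i not in demoted]
    if swap:
        core[0], core[1] = core[1], core[0]
    tags = [f"[{demoted[i]}{args[i]}]" for i in sorted(demoted)]
    return " ".join(["/".join(core)] + tags)


combined = apply_voices(["A", "P", "F"], ["middle", "anti-passive"])
reordered = apply_voices(["A", "P", "F"], ["anti-passive", "middle"])
```

Both calls give "F [-A] [+P]", matching the example in the text and showing that the order of the suffixes does not matter. Calling apply_voices(["A", "P", "F"], ["passive", "inverse"]) raises a ValueError, because both operations affect the agent.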

Later, we will learn of other voice operations (reflexive and reciprocal) that actually combine two separate core arguments into a single core argument. These voice operations are not considered basic and are not affected by the above rule. For example, it is possible to apply a passive after a reciprocal. In effect, a non-basic voice operation creates a new verb that can undergo normal basic voice operations.


2.8 More on Causation

In many of our verb derivations, we used the word "cause" in our paraphrases of the semantics of verbs which have an agent in their argument structure. Unfortunately, these paraphrases are approximate and often imply some distance between the agent and the event. However, I must emphasize that the agent argument of a verb is the entity that is directly responsible for the event indicated by the verb. Thus, there is a definite semantic difference between 'kill' and 'cause to die', even though our paraphrases may imply otherwise.

If we wish to intentionally put distance between an agent and an event, we must design words that are equivalent to English "cause", "make", etc. Consider the following sentences:

He MADE his son wash the dishes.
I HAD Bill deliver the package.
He CAUSED his wife to have a miscarriage.

In the above examples, the patient (if that is what it really is) cannot be expressed directly:

*He made his son.
*I had Bill.
*He caused his wife.

However, the English verb "to cause" can be used without this quasi patient:

John caused the accident.

Thus, these verbs indicate that an indirect agent is responsible for an event which itself may have a direct agent - the quasi patient is not at all a true patient of the verb "cause/make/have" (although it may be the true patient of the embedded sentence). Also, the English distinction between "cause", "have", and "make" is somewhat idiosyncratic. Semantically, there is no significant difference between them. [Actually, "to have" is a more polite version of "to make", but this distinction is not important to us here. We will discuss how to derive more polite forms of words in the section on register variations.]

The most neutral paraphrase of indirect causation is simply the static 'to keep in existence' or the dynamic 'to cause to become real/actual/existent'. In the interlingua, I will use the state root "kav" to represent this concept (default = P-s adjective). Here are some useful derivations:

A/P-d: "kavapa" - 'to cause/make/create/produce', 'to bring into being', 'to cause to come into existence', 'to bring about/on', 'to cause to become real/actual', 'to make a reality'
    e.g. John CAUSED the accident.
         John MADE Billy wash the dishes.
         John MADE some apple cider.

A/F-d [-P]: "kavamboma" - 'to implement/execute/carry out/bring about/put into effect or practice/accomplish/etc.'
    e.g. They CARRIED OUT your orders.
         We have to IMPLEMENT the new plan by Monday.
    [The focus provides additional information about the unspecified patient without itself being affected. Cf. "We made the boat according to these plans" vs. "We implemented these plans".]

A/P-s: "kavasa" - 'to ensure/insure/guarantee', 'to make sure that ...', 'keep/maintain a reality'
    e.g. Skilled teamwork ENSURES high quality results.
         John will MAKE SURE that there's enough food.
    [Incidentally, the opposite of "kavasa" is "juvasa" and means 'to prevent or preclude'; i.e. to ensure that something remains non-existent.]

P-s: "kavo" - 'real', 'actual', 'existent'
    e.g. John said he saw a REAL unicorn.

P-s verb: "kava" - 'to be real/actual', 'there be', 'the reality is that', 'In reality ...', 'Actually, ...'
    e.g. THERE ARE ten people at the party.
         THE REALITY IS THAT they're all gone.

P-d: "kavupa" - 'to come into existence', 'there came to be', 'it came to be that', 'to become a reality', 'to become real', 'to come about', 'to happen', 'it happened that', etc.
    e.g. The new policy CAME INTO BEING after he resigned.
         The accident HAPPENED because of poor visibility.
         THERE CAME TO BE fewer people willing to help.
         IT CAME TO BE THAT fewer people were willing to help.


2.9 Focused versus Unfocused

In the interlingua, an unfocused derivation will have exactly the same semantics as the corresponding anti-middle derivation if the root is focused by default. If it is not focused by default, then the semantics will be different, as we will discuss later. Thus:

AP-x is equivalent to AP-x [-F]
P-x is equivalent to P-x [-F]

For example, P-s "kopusa", meaning 'know (intransitive)', is equivalent to P-s [-F] "kopoma".

Note that, since we have not implemented an anti-anti-middle or an anti-anti-passive, A/P-d "kopapa", meaning 'to inform' (transitive - not ditransitive!), is equivalent to either the anti-anti-middle or the anti-anti-passive of "kopamba".

This approach has an important implication that may not be immediately obvious. Since middle derivations indicate that the demoted argument is generic, the lack of a middle voice change indicates that the argument must either be explicitly specified or is intentionally being withheld by the speaker. And if it is being withheld, then it is equivalent to an appropriate passive operation. Here are some examples that should help illustrate this point:

kopa = P/F-s verb meaning 'to know'
kopuso = kopomo = P-s [-F] adjective meaning 'knowing', 'cognizant', 'in the know', etc.
kopo = ???

Since "kopo" is focused but does not have an explicit focal argument, the argument is being explicitly withheld. In other words, it is equivalent to an anti-passive:

kopo = koposo = 'knowing something that the speaker doesn't know or isn't telling'

Note that even though the form "kopo" is effectively anti-passive, it is still more general than the unfocused form "kopuso/kopomo", and will be applicable in all situations. The unfocused adjective should only be used to emphasize that the focus is generic. And since English rarely (if ever) makes this distinction, both forms of such derivations will generally have the same English translation.


3.0 Nouns

By now, it should be obvious that word design can be extremely productive in a language possessing a rich classificational morphology. This kind of morphology allows the language designer to create a large vocabulary with semantic precision, while minimizing the number of root morphemes needed. However, so far we've only used this approach to design basic verbs. We now need to see if a similar approach can be used to design basic nouns.

I began my discussion of verbs by providing a large number of examples that I placed into groups based on their argument structures. I felt that this was necessary because my approach to classifying verbs is unusual (and probably unique).

For nouns, though, I don't think that large numbers of examples will be needed, simply because the classes and their semantics are fairly obvious.

[Incidentally, I am not aware of any other work that classifies verbs as I have done here. Initially, I was tempted to adopt the more widely accepted Vendlerian analysis which classifies all verbs into the four major categories: state (e.g. "to know", "to love"), activity (e.g. "to run", "to sing"), accomplishment (e.g. "to sing a song", "to write a book") and achievement (e.g. "to die", "to find"). However, although I experimented with these four categories, I was very unhappy with the results. The standard categories seemed too vague, and I often had difficulty deciding which category a verb belonged to. An even greater disadvantage is that they provide almost no information about the semantics of the words. In any case, I felt that I needed a more productive system, and eventually ended up with the approach that I am using here.]


3.1 Basic Noun Classes

Before starting, let's precisely define what we mean by the expression "basic noun". Here is the definition that I will use:

A basic noun will represent an entity that has an actual physical existence (including extinct entities as well as entities from fantasy, mythology, etc.). Thus, such an entity must be composed of matter, energy, a combination of both, or time. Furthermore, characteristics which distinguish it from other entities must be verifiably physical (as opposed to functional, social, cognitive, etc).

I will classify most basic nouns as follows:

1. An entity represented by a basic noun must consist of matter, energy, a combination of both, or time.

2. An entity of matter and/or energy represented by a basic noun must be either living or non-living.

3. A non-living entity represented by a basic noun must be either natural or artificial.

So, using this approach, we can create the following basic noun classes:

matter & energy:
    living, species         -> man, lizard, clam, tree, bacteria
    living, organs          -> hand, leaf, branch, liver, acorn
    living, diseases        -> arthritis, pneumonia, claustrophobia
    non-living, natural     -> storm, tide, geyser, rainbow
    non-living, artificial  -> computer, airplane, oven, fountain
matter:
    natural                 -> salt, rock, cliff, river, island
    artificial              -> key, statue, ax, book, wharf, house
energy:
    living                  -> ghost, angel, genie, demon, banshee
    non-living              -> heat, thunder, sunshine, photon
time:                       -> winter, midnight, equinox, childhood

I am not making a distinction between natural and artificial, non-living energy because we would be forced to make useless distinctions. For example, "light" from the sun would require a different classifier than "light" from a light-bulb.

The 'living, organs' category includes all parts of living organisms that themselves contain life. Thus, "acorn" is considered an organ, while "shell" (e.g. clam shell) and "hair" are considered 'matter, natural'.

The 'living energy' category includes anything related to the supernatural, including mythological creatures that are primarily spirit-like (such as banshees and fairies). Mythological creatures that are primarily physical will be placed in an appropriate physical class. For example, the word meaning 'dragon' will be in the lizard class, 'minotaur' will be in the mammal class, and so on.

I believe that the above classes are fundamental, and that any useful system should contain at least these ten classes. However, we will also provide additional sub-classes for classes that have a large number of members. For example, in the 'matter & energy, living, species' class, it will be useful to distinguish between plants and animals. In fact, we will create even finer distinctions, such as between 'mammal', 'bird', 'fish', 'insect', etc. In the 'matter, artificial' class, it will be useful to distinguish between substances (e.g. "plastic"), locatives (e.g. "wharf") and others (e.g. "hammer"). The same substance/locative/other distinction will also be applied to the 'matter, natural' class to allow us to distinguish between words such as "water" (substance), "cliff" (locative), and "boulder" (other).

If we make these additions, our chart will look like this:

matter & energy:
    living, species
        vertebrates:
            mammals                          -> man, tiger, mouse, deer, dolphin
            birds                            -> hawk, ostrich, canary, penguin
            reptiles                         -> lizard, snake, turtle, crocodile
            other vertebrates (i.e., fish)   -> trout, halibut, perch, lamprey
        arthropods                           -> ant, bee, crab, mosquito, grasshopper
        other animals                        -> clam, jellyfish, snail, worm
        plants (including kingdoms Monera, Protoctista, and Fungus):
            trees                            -> tree, oak, shrub, apple, juniper bush
            other plants                     -> grape, morning glory, horsetail, moss
    living, organs                           -> hand, leaf, branch, liver, ear
    living, illnesses                        -> smallpox, rheumatism, cancer, flu
    non-living, natural                      -> tornado, geyser, rainbow, earthquake
    non-living, artificial                   -> lathe, telephone, pump, robot, clock
matter:
    natural,
        substance                            -> water, sand, bauxite, ivory, urine, air
        locative                             -> planet, river, island, mountain, bay
        other                                -> boulder, fang, stalagmite, shell
    artificial,
        substance                            -> plastic, benzene, steel, cloth, glue
        locative                             -> wharf, city, road, school, stadium
        other                                -> window, statue, desk, book, nail
energy:
    living                                   -> ghost, jinni, god, devil, banshee
    non-living                               -> heat, thunder, photon, noise, light
time:                                        -> winter, sunset, equinox, infancy

The non-living, artificial matter & energy class will represent powered items that typically do not run on only human or animal power; e.g., an electric drill, but not a hand-powered drill.

Note that I use the word "locative" in the following sense: a locative noun represents an entity which typically is built in place or evolves naturally in a single location, which is extremely difficult (if not impossible) to move to a different location, which is relatively permanent, and which is typically considered a place where humans can go to, remain at, or depart from. Again, the choice may seem subjective. For example, "wharf", "staircase", "bleachers", and "gallows" will be artificial locatives, but "beehive", "den/burrow", and "nest" will not be locatives. Instead, they will belong to the 'natural other' class.


3.2 Noun Design Algorithm and Examples

In the interlingua, we will create classifiers for all of the above classes and sub-classes. In addition, since there are many more possible classifiers than will be needed, we will sub-categorize the classes even further. For example, the 'natural substance' sub-class will have the following classifiers and associated sub-categories:

civ    elements and compounds (hydrogen, oxygen, sodium, chlorine, uranium, sodium chloride, potassium sulfate, biochemicals (including drugs), insulin, DNA, nucleotide, amino acid, methane, butanol, polybutadiene, benzoic acid, chlorobenzene, dimethylamine)

zop    plant/animal substances and mixtures (blubber, frankincense, beeswax, beef, honey, blood, wood, marrow, milk, feces, coral, tears, spit/spittle, urine)

jav    other natural substances (air, coal, soil, clay, bauxite, dust, sand, ore, ruby, snow, gypsum, poison)

A complete list of all the classifiers is provided in Appendix C.

Each classifier has a default class; i.e., a default semantics and syntax. For example, as we saw earlier, the classifier "kop" (meaning 'know') is a P/F-s mental state by default. The class of a root that has more than one morpheme is determined by the rightmost morpheme, and this morpheme is referred to as the classifier.

A stand-alone classifier (i.e., one that is not preceded by a modifying morpheme) will represent a specific, prototypical member of the class, rather than the entire class. For example, when the 'bird' classifier is used alone, it will actually represent the particular category of birds called 'pigeon/dove' rather than the more general meaning 'bird'. This classifier can then be modified by other modifiers to represent other birds, such as 'eagle', 'gull', 'ostrich', and so on. If we need to create a root representing the entire class, we will modify the classifier with the modifier "bye", meaning 'member'. For example, the 'member' modifier plus the 'bird' classifier means simply 'bird', and can refer to any bird.

The member modifier "bye" will not be applied to a classifier unless the result is useful and has a counterpart in many natural languages. For example, there is a classifier for 'abstract attributes and qualities'. Since I doubt that any natural language has a single word to represent this concept, we will not create a word using "bye" plus this classifier.

Note that a specific member of a class does not have to represent a single species or a single kind or type of entity. For example, there are several species of pigeon.

The classifier morpheme of a root is semantically and syntactically precise. However, the modifiers to the left of a classifier will provide no syntactic information at all and may not necessarily be semantically precise, but will provide semantic clues that will help the student remember the meaning of the complete root. In other words, the modifiers to the left of the classifier will be used for their mnemonic value to modify the classifier. The classifier, however, will always be semantically precise. For example, the root meaning "bicycle" consists of the numeric modifier meaning 'two' plus the 'vehicle' classifier.

Also, some modifiers can have completely different meanings in different contexts. For example, the modifier with the meaning 'six' would be useless with most classifiers except the numeric classifier and certain shapes (such as the hexagon). In cases like this, the modifier will have one or more completely different meanings that will be more useful in other contexts. Even so, however, we will always try to assign multiple meanings that are at least somewhat reminiscent of or related to each other. For example, the modifier meaning 'two' will have the alternate meanings 'divided/opposition'.

In summary, a classifier is used in three ways:

1. as a stand-alone root which represents a specific member or sub-group of its class (e.g. 'pigeon')

2. as a classifier that can be modified by modifying morphemes to represent other specific members of its class (e.g. 'ostrich' of the 'bird' class)

3. as a classifier modified by the 'member' morpheme "bye" to represent any member of the class (e.g. a single root meaning 'bird')

Thus, the approach used here will allow an entire, easily learned vocabulary of roots to be flexibly designed using a relatively small number of root morphemes.

Now, let's design some words. We'll start with the modifying root morpheme "bo", which will have the vague senses 'fish/water/liquid/swim' and apply it to several classifiers (refer to Appendix C for the complete list of classifiers):

matter & energy:
    living, species
        mammal              -> bozovi - otter
        birds               -> bodami - duck
        fish                -> bobomi - puffer/blowfish
        insects             -> bokagi - mosquito
        trees               -> bojigi - tupelo/black gum/sour gum
    living, organs          -> bocesi - bladder (e.g. urinary or gall)
    non-living, natural     -> bofepi - rain(fall)
    non-living, artificial  -> botimi - boat
                            -> bobisi - washer/washing machine
matter:
    natural,
        substance           -> bocivi - water
        locative            -> botisi - oasis
        other               -> boxami - drop(let)
    artificial,
        substance           -> bofupi - drink/beverage
        locative            -> bozegi - reservoir
                            -> bodepi - bathroom
        other               -> bozipi - cup
energy:
    living                  -> bodevi - undine, water spirit
    non-living              -> boxogi - hydropower
time:                       -> bofemi - monsoon, rainy season


3.3 From Basic Noun to Other Parts-of-Speech

The simplest kind of derivation is to change the part-of-speech. In the interlingua, the verb form will have the meaning 'to be X', and the other forms will be interpreted in the usual way. Thus, for example, the word "bodama" would be a P-s verb meaning 'to be a duck'. The adjective form, "bodamo", would be used in expressions such as "Billy the duck", "duck egg", and any other modification that is inalienably 'duck'. The adverb form "bodame" would have the meanings 'being duck', 'since it is (a) duck', 'since they are duck', etc. Note that this approach is perfectly consistent with the rules we adopted for basic verbs.

We can also change the argument structure to something other than P-s. When doing so, the basic noun will represent the state, and the verb suffix will apply to the state in the usual way. For example, P-d "bodamupa" would mean 'to become a duck', A/P-d "bodamapa" would mean 'to change P into a duck', and so on.


3.4 Modifiers Plus Verb and Adjective Classifiers

In the previous section, we used the modifier "bo" to modify several noun classifiers. We should also be able to derive useful words by applying it to verb and adjective classifiers. Here are some examples (refer to Appendix C for a complete list of classifiers):

bokemo -> "-kem" = a scalar, non-relational state classifier, P-s adjective: wet
bokema -> P-s verb: to be wet
bokemapa -> A/P-d verb: to wet/make wet
bokemupa -> P-d verb: to get wet
bokasa -> "-kas" = involuntary act classifier, P-d verb: to sweat
bocala -> "-cal" = activity classifier, AP-s verb: to swim

Here are some more examples using the modifier "ko-", which is reminiscent of the root "kop", meaning 'to know' or 'to have knowledge of'. The modifier "ko-" will represent the concepts 'knowledge', 'education', 'wisdom', and so on (refer to Appendix D for a complete list of modifiers and their meanings):

kokigi = school - "-kig" = building(s) or place of business
kodepi = classroom - "-dep" = room
kojisi = desk - "-jis" = furniture
kotega = explain - "-teg" = speech act, A/P/F-d verb
kobegi = scholar - "-beg" = person
kobisi = computer - "-bis" = powered device
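
Going in the other direction, a word of this kind can be split back into its parts. The sketch below is an assumption-laden illustration: it relies on the shape seen in the examples of this section (a two-letter modifier, a three-letter classifier stem, then a suffix), it knows only the modifiers and classifiers quoted above, and the function name decompose is hypothetical. The full morphology is more general than this.

```python
# Modifiers and classifier stems quoted in this section only
# (Appendices C and D have the complete lists).
MODIFIERS = {
    "bo": "fish/water/liquid/swim",
    "ko": "knowledge/education/wisdom",
}
CLASSIFIERS = {
    "kem": "scalar, non-relational state",
    "kas": "involuntary act",
    "cal": "activity",
    "kig": "building(s) or place of business",
    "dep": "room",
    "jis": "furniture",
    "teg": "speech act",
    "beg": "person",
    "bis": "powered device",
}


def decompose(word: str) -> tuple[str, str, str]:
    """Split a word into (modifier, classifier stem, suffix)."""
    modifier, classifier, suffix = word[:2], word[2:5], word[5:]
    if modifier not in MODIFIERS:
        raise ValueError(f"unknown modifier in {word!r}")
    if classifier not in CLASSIFIERS:
        raise ValueError(f"unknown classifier in {word!r}")
    if not suffix:
        raise ValueError(f"{word!r} has no suffix")
    return modifier, classifier, suffix


print(decompose("kokigi"))    # ('ko', 'kig', 'i')
print(decompose("bokemupa"))  # ('bo', 'kem', 'upa')
```

Note that "bobisi" (washing machine) and "kobisi" (computer) share the classifier "-bis" and differ only in the modifier, which is exactly the economy the derivational scheme is meant to provide.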


3.5 Abstract Nouns

There are many nouns that are difficult to classify because of their inherent abstractness. Some of these nouns refer to concepts such as language (e.g. French), culture (e.g. Arab), race (e.g. Caucasian), nationality (e.g. Swiss), and religion or ideology (e.g. Christian). These, however, are all proper nouns, and I will postpone discussion of them until later, in the chapter on Proper Names, Borrowed Words, Abbreviations, and Vocatives.

There are also concepts that are more general in nature and which typically describe human activities, the abstract products of such activities, the components of such products, and so on.

The question, though, is: What are these words? Are they nouns? Are they verbs? Or are they something else?

To answer this question, consider the English words "mathematics", "opera", and "adjective". If they are inherently verbs, then why do we never use them as verbs? They are always used as nouns. And if they are inherently stative, then why can we never use them as adjectives? In fact, if they were inherently stative, we would not need to derive such words as "operatic", "mathematical", and "adjectival".

The only conclusion that makes any sense is that these words are inherently nouns.

So, if they are indeed nouns, then how do we classify them?

Consider the word "opera". We might be tempted to classify it as non-living, artificial matter & energy. However, this would put it into the same category as "jacuzzi", "computer", and "automobile". For some reason or other, my mind rejects the idea that "computer" and "opera" are in the same class.

And what about "mathematics", "adjective", and "poem"? Should they be placed in the non-living energy class? If so, they would be classified along with "electricity", "light", and "thunder". Again, my mind rejects this categorization.

One thing that should be fairly obvious by now is that noun classification is inherently arbitrary, and that there is no way to avoid this arbitrariness. We can see logic and structure in the design of verbs, but nouns resist any truly logical classification. The reason for this is simply that nouns represent the products of an essentially random universe. For example, if you look at a diagram that cl