An Algorithmic Approach to English Pluralization

Damian Conway

School of Computer Science and Software Engineering

Monash University

Clayton 3168, Australia

Abstract

The problem of English plurals

The use of English plurals in synthetic sentences is a case in point. In computing applications, for example, it is quite common to encounter error messages which jar because they do not correctly inflect for grammatical number:

Compilation aborted: 1 errors were detected.

print "Compilation aborted: $count ", ($count==1 ? "error was" : "errors were"), " detected.\n";

Her criterion differs from mine. The Major General met the Governor General. Analysis of this aquarium's fish failed to determine its genus. That phalanx suffered a trauma.

Coping with English plurals in synthetic text

Ignoring the problem

"There were 1 errors"

One might argue that this approach is economically rational, in that the extra cost and complexity involved in identifying and coding around that one special case outweigh the benefit of correctly handling it. This, of course, is the perennial excuse for ugly and ungainly interfaces, and quite unassailable in the estimation of the utilitarian mind.

Avoiding the problem

Number of errors: 1 Number of errors: 10

1 error(s) found. 10 error(s) found.

A "manual" scheme

sub select_pl($$) {
    my ($word, $count) = @_;
    # Replace each "(singular/plural)" group with the alternative that matches $count.
    $word =~ s#\(([^)/]*)/([^)]*)\)# $count==1 ? $1 : $2 #ge;
    return $word;
}

print select_pl("$count error(/s) (was/were) found", $count);

Pluralizing algorithms

-s

clam -> clams
storey -> storeys
bag -> bags

class -> classes
story -> stories
box -> boxes

criterion -> criteria
stigma -> stigmata
ox -> oxen

classifies -> classify
stores -> store
bobs -> bob

my -> our
her -> their
Bob's -> Bobs'

More complex algorithms that cope with specific suffixes ( -ss -> -sses , -y -> -ies , etc.) can be specified, but pure suffix-based approaches will still be prone to exceptions and meta-exceptions. For example: -y becomes -ies , except after a vowel (when it becomes -ys ), except for soliloquy (which uses -ies ).
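The layering of rule, exception, and meta-exception described above can be sketched in a few lines of Perl. This is only an illustration: the subroutine name pluralize_y and the one-word exception list are assumptions made for the example, not part of the paper's algorithm.

```perl
use strict;
use warnings;

# Hypothetical exception list: words whose -y follows a vowel but which
# nevertheless take -ies.
my %Y_EXCEPTION = map { $_ => 1 } qw(soliloquy);

# Pluralize a word ending in -y. The meta-exception is checked first,
# then the "after a vowel" exception, then the general -y -> -ies rule.
sub pluralize_y {
    my ($word) = @_;
    return $word unless $word =~ /y$/;               # not a -y word: leave untouched
    (my $stem = $word) =~ s/y$//;
    return $stem . 'ies' if $Y_EXCEPTION{lc $word};  # meta-exception
    return $word . 's'   if $word =~ /[aeiou]y$/;    # exception: vowel + y
    return $stem . 'ies';                            # general rule
}
```

The order of the tests matters: the most specific case must be tried before the rule it overrides.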

A usable pluralization algorithm must therefore cope with three categories of plural formation: universal defaults, general suffix-based rules, and specific exceptional cases. The following section examines each of these categories in more detail.

Categories of English plurals

Universal rules

The rules themselves are well-known and need no elaboration. By default:

Nouns are made plural by appending -s.

Verbs are made plural by removing any trailing -s (and otherwise do not change).

Adjectives and adverbs do not change when made plural.

Suffix categories

-ss

-sses

-y

ies

Certain types of adjectives also inflect in this way. For example, possessive adjectives that end in -'s or -' in the singular are made plural by forming the plural of the root word and appending an apostrophe (unless the root's plural does not itself end in -s , in which case -'s is appended). Hence cat's becomes cats' , axis' becomes axes' , whilst child's becomes children's .

Other suffix categories arise because words of foreign origin (most commonly Ancient Greek or Latin) have retained a non-anglicized plural inflection. Hence criterion becomes criteria , nucleus becomes nuclei , and matrix becomes matrices . Dealing with such categories is complicated by the fact that many other imports have been wholly or partially anglicized. Hence although criterion always forms its plural with -a , ganglion may take either -s or -a ( ganglions or ganglia ), whilst bastion is always inflected with -s . Occasionally the anglicized and "classical" plural forms of a word may both be in common use, but with distinct meanings. Thus a copy-editor might remove appendices , whereas a surgeon would remove appendixes .

The correct inflection of words derived from Latin can be particularly complex, since the same suffix may form different Latinate plurals depending on the declension (or sometimes the part of speech) of the original. Thus the plural of stimulus (second declension) is stimuli , and that of genus (third declension) is genera . Status (fourth declension) is traditionally unchanged in the plural, whilst ignoramus (a first person plural Latin verb) has been wholly anglicized and becomes ignoramuses .

The only practical way to deal with such complexities in an algorithm is to categorize words by both suffix and inflection, and to allow for both anglicized and classical variants. Table 1 illustrates such categories.



Singular   Anglicized   Classical   Example
suffix     plural       plural      (see Appendix A for comprehensive lists of words in each category)

-a         (none)       -ae         alga -> algae
-a         -as          -ae         nova -> novas/novae
-a         -as          -ata        dogma -> dogmas/dogmata
-an        -en          (none)      woman -> women
-ch        -ches        (none)      church -> churches
-eau       -eaus        -eaux       chateau -> chateaus/chateaux
-en        -ens         -ina        foramen -> foramens/foramina
-ex        (none)       -ices       codex -> codices
-ex        -exes        -ices       index -> indexes/indices
-f(e)      -ves         (none)      wolf -> wolves, life -> lives
-ieu       -ieus        -ieux       milieu -> milieus/milieux
-is        (none)       -es         basis -> bases
-is        -ises        -ides       iris -> irises/irides
-ix        -ixes        -ices       matrix -> matrixes/matrices
-nx        -nxes        -nges       phalanx -> phalanxes/phalanges
-o         -oes         (none)      potato -> potatoes
-o         -os          (none)      photo -> photos
-o         (none)       -i          graffito -> graffiti
-o         -os          -i          tempo -> tempos/tempi
-on        (none)       -a          aphelion -> aphelia
-on        -ons         -a          ganglion -> ganglions/ganglia
-oo-       -ee-         (none)      foot -> feet, tooth -> teeth
-oof       -oofs        -ooves      hoof -> hoofs/hooves
-s         -s           (none)      series -> series
-s         -ses         (none)      atlas -> atlases
-sh        -shes        (none)      wish -> wishes
-um        (none)       -a          bacterium -> bacteria
-um        -ums         -a          medium -> mediums/media
-us        (none)       -era        genus -> genera
-us        (none)       -i          stimulus -> stimuli
-us        -uses        -era        opus -> opuses/opera
-us        -uses        -i          radius -> radiuses/radii
-us        -uses        -ora        corpus -> corpuses/corpora
-us        -uses        -us         status -> statuses/status
-x         -xes         (none)      box -> boxes
-y         -ies         (none)      ferry -> ferries
-zoon      (none)       -zoa        protozoon -> protozoa
(none)     -s           -im         cherub -> cherubs/cherubim

Table 1: Major English suffix categories.
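One way to represent such categories in code is as a list of records, each holding a singular suffix, the two plural suffixes, and the words that belong to the category. The sketch below is an assumption about how this might be laid out, not the paper's implementation; the names @CATEGORIES and category_plural are invented for the example, and the word lists contain only a few of the examples from Table 1.

```perl
use strict;
use warnings;

# Each category pairs a singular suffix with its anglicized and classical
# plural suffixes (undef where Table 1 says "(none)"), plus member words.
my @CATEGORIES = (
    { sing => 'a',  angl => 'as',   class => 'ata',  words => [qw(dogma)] },
    { sing => 'ex', angl => 'exes', class => 'ices', words => [qw(index)] },
    { sing => 'um', angl => undef,  class => 'a',    words => [qw(bacterium)] },
    { sing => 'us', angl => 'uses', class => 'i',    words => [qw(radius)] },
    { sing => 'us', angl => undef,  class => 'era',  words => [qw(genus)] },
);

# Return the plural of $word if it belongs to a listed category;
# otherwise return undef (an empty list in list context).
sub category_plural {
    my ($word, $classical) = @_;
    for my $cat (@CATEGORIES) {
        next unless grep { $_ eq $word } @{ $cat->{words} };
        my ($angl, $class) = @{$cat}{qw(angl class)};
        # Prefer the requested variant, falling back to whichever one exists.
        my $suffix = $classical ? ($class // $angl) : ($angl // $class);
        (my $stem = $word) =~ s/\Q$cat->{sing}\E$//;
        return $stem . $suffix;
    }
    return;
}
```

Membership is decided by the word lists rather than by the suffix alone, which is what allows radius and genus to share the suffix -us yet inflect differently.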

General and user-defined exceptions

Singular form   Anglicized plural   Classical plural

beef            beefs               beeves
brother         brothers            brethren
child           (none)              children
cow             cows                kine
ephemeris       (none)              ephemerides
genie           genies              genii
money           moneys              monies
mongoose        mongooses           (none)
mythos          (none)              mythoi
octopus         octopuses           octopodes
ox              (none)              oxen
soliloquy       soliloquies         (none)
trilby          trilbys             (none)

Table 2: Irregular English plurals

This table is surprisingly comprehensive, though certainly not exhaustive. Indeed, specific dialects of English may define much larger sets of irregular plurals and may not recognize some of the entries in Table 2. Hence it is important that any algorithmic approach to pluralization be both extensible and adjustable, so that its output may be easily expanded or trimmed for a specific audience.

A pluralizing algorithm for English

The algorithms are based on the rules of English inflection described in the Oxford English Dictionary [5] (OED), Fowler's Modern English Usage [6], and A Practical English Grammar [4]. Where these sources disagree, the OED is taken to be definitive.

A note about user-defined inflections

<singular form> -> <plural form>

VAX -> VAXen

-x

-xen

oxen

boxen

suffixen

(.*)x -> $1xen

(.*)x -> $1xen
fox -> foxes

|

(.*)x -> $1xes | $1xen
fox -> foxes
ox -> oxen

Nomenclature

suffix(<suffix>) This predicate returns true if the word being inflected ends in <suffix> . Note that standard regular expression conventions are used after the " - " that introduces the suffix.

category(<singular suffix>,<plural suffix>) This predicate returns true if the word being inflected belongs to the set of English words whose suffixes inflect from <singular suffix> to <plural suffix> when pluralized.

inflection(<singular suffix>,<plural suffix>) This function returns the word being inflected, after replacing its current suffix (which must be <singular suffix> ) with the suffix <plural suffix> .

stem(<suffix>) This function removes the specified suffix ( <suffix> ) from the word being inflected and returns the remaining stem. If the word does not originally end in the specified suffix, a special "undefined" value is returned.

"the (user-)specified plural form" This phrase is used whenever a word has been found to belong to an enumerated category. The "specified plural form" is the appropriate anglicized or classical plural form of the word, as it appears in the category table.

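The suffix(), stem(), and inflection() operations of the nomenclature map quite directly onto regular expression operations. The following Perl sketch is one possible reading of those definitions, under two assumptions: the word being inflected is passed explicitly as the first argument, and suffixes are written without the leading " - ". The category() predicate is omitted because it depends on the word lists of Appendix A.

```perl
use strict;
use warnings;

# suffix(): true if $word ends in the regular-expression pattern $suffix.
sub suffix {
    my ($word, $suffix) = @_;
    return $word =~ /(?:$suffix)$/ ? 1 : 0;
}

# stem(): $word with the trailing $suffix removed, or undef if $word
# does not end in that suffix.
sub stem {
    my ($word, $suffix) = @_;
    return $word =~ /^(.*)(?:$suffix)$/s ? $1 : undef;
}

# inflection(): replace the trailing $singular with $plural.
# An empty $singular simply appends $plural to the word.
sub inflection {
    my ($word, $singular, $plural) = @_;
    my $stem = stem($word, $singular);
    return defined $stem ? $stem . $plural : undef;
}
```

With these helpers, a rule such as "if suffix(-[cs]h), return inflection(-h,-hes)" becomes a test with suffix($word, '[cs]h') followed by inflection($word, 'h', 'hes').
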
An algorithm for forming plural nouns

1. Check if the user has defined an inflection for the noun, and, if so, accept that...

       if the word matches a user-defined noun,
           return the user-specified plural form

2. Handle words that do not inflect in the plural (such as fish, travois, chassis, nationalities ending in -ese, etc. - see Tables A.2 and A.3)...

       if suffix(-fish) or suffix(-ois) or suffix(-sheep)
          or suffix(-deer) or suffix(-pox) or suffix(-[A-Z].*ese)
          or suffix(-itis) or category(-,-),
           return the original noun

3. Handle pronouns in the nominative, accusative, and dative (see Table A.5), as well as prepositional phrases...

       if the word is a pronoun,
           return the specified plural of the pronoun
       if the word is of the form: "<preposition> <pronoun>",
           return "<preposition> <specified plural of pronoun>"

4. Handle standard irregular plurals (mongooses, oxen, etc. - see Table A.1)...

       if the word has an irregular plural,
           return the specified plural

5. Handle irregular inflections for common suffixes (synopses, mice and men, etc.)...

       if suffix(-man), return inflection(-man,-men)
       if suffix(-[lm]ouse), return inflection(-ouse,-ice)
       if suffix(-tooth), return inflection(-tooth,-teeth)
       if suffix(-goose), return inflection(-goose,-geese)
       if suffix(-foot), return inflection(-foot,-feet)
       if suffix(-zoon), return inflection(-zoon,-zoa)
       if suffix(-[csx]is), return inflection(-is,-es)

6. Handle fully assimilated classical inflections (vertebrae, codices, etc. - see Tables A.10, A.14, A.19 and A.20, and Tables A.11, A.15 and A.21 if in "classical mode")...

       if category(-ex,-ices), return inflection(-ex,-ices)
       if category(-um,-a), return inflection(-um,-a)
       if category(-on,-a), return inflection(-on,-a)
       if category(-a,-ae), return inflection(-a,-ae)

7. Handle classical variants of modern inflections (stigmata, soprani, etc. - see Tables A.11 to A.13, A.15, A.16, A.18, A.21 to A.25)...

       if in classical mode,
           if suffix(-trix), return inflection(-trix,-trices)
           if suffix(-eau), return inflection(-eau,-eaux)
           if suffix(-ieu), return inflection(-ieu,-ieux)
           if suffix(-..[iay]nx), return inflection(-nx,-nges)
           if category(-en,-ina), return inflection(-en,-ina)
           if category(-a,-ata), return inflection(-a,-ata)
           if category(-is,-ides), return inflection(-is,-ides)
           if category(-us,-i), return inflection(-us,-i)
           if category(-us,-us), return the original noun
           if category(-o,-i), return inflection(-o,-i)
           if category(-,-i), return inflection(-,-i)
           if category(-,-im), return inflection(-,-im)

8. The suffixes -ch, -sh, and -ss all take -es in the plural (churches, classes, etc)...

       if suffix(-[cs]h), return inflection(-h,-hes)
       if suffix(-ss), return inflection(-ss,-sses)

9. Certain words ending in -f or -fe take -ves in the plural (lives, wolves, etc)...

       if suffix(-[aeo]lf) or suffix(-[^d]eaf) or suffix(-arf),
           return inflection(-f,-ves)
       if suffix(-[nlw]ife),
           return inflection(-fe,-ves)

10. Words ending in -y take -ys if preceded by a vowel (storeys, stays, etc.) or when a proper noun (Marys, Tonys, etc.), but -ies if preceded by a consonant (stories, skies, etc.)...

       if suffix(-[aeiou]y), return inflection(-y,-ys)
       if suffix(-[A-Z].*y), return inflection(-y,-ys)
       if suffix(-y), return inflection(-y,-ies)

11. Some words ending in -o take -os (lassos, solos, etc. - see Tables A.17 and A.18); the rest take -oes (potatoes, dominoes, etc.). However, words in which the -o is preceded by a vowel always take -os (folios, bamboos)...

       if category(-o,-os) or suffix(-[aeiou]o),
           return inflection(-o,-os)
       if suffix(-o),
           return inflection(-o,-oes)

12. Handle plurals of compound words (Postmasters General, Major Generals, mothers-in-law, etc) by recursively applying the entire algorithm to the underlying noun. See Table A.26 for the military suffix -general, which inflects to -generals...

       if category(-general,-generals),
           return inflection(-l,-ls)
       if the word is of the form: "<word> general",
           return "<plural of word> general"
       if the word is of the form: "<word> <preposition> <words>",
           return "<plural of word> <preposition> <words>"

13. Otherwise, assume that the plural just adds -s (cats, programmes, trees, etc.)...

       otherwise, return inflection(-,-s)

Algorithm 1: Plural inflection of nouns

An algorithm for forming plural verbs

1. Check if the user has defined an inflection for the verb, and, if so, accept that...

       if the word matches a user-defined verb,
           return the user-specified plural form

2. Check if the verb is being used as an auxiliary and has a known irregular inflection (has seen, was going, etc. See Table A.8 for irregular verbs)...

       if the word has the form "<auxiliary> <words>"
          and <auxiliary> belongs to the category of irregular verbs,
           return "<specified plural of auxiliary> <words>"

3. Handle simple irregular verbs (has, is, etc. - see Table A.8)...

       if the word belongs to the category of irregular verbs,
           return the specified plural form

4. Verbs in the regular 3rd person singular lose their -es, -ies, or -oes suffix (she catches -> they catch, he tries -> they try, it does -> they do, etc.)...

       if suffix(-[cs]hes), return inflection(-hes,-h)
       if suffix(-[sx]es), return inflection(-es,-)
       if suffix(-zzes), return inflection(-es,-)
       if suffix(-ies), return inflection(-ies,-y)
       if suffix(-oes), return inflection(-oes,-o)

5. Other 3rd person singular verbs ending in -s (but not -ss) also lose their suffix...

       if suffix(-[^s]s), return inflection(-s,-)

6. Handle ambiguous simple verbs that might also be nouns (thought, sink, fly, etc. - see Table A.4)...

       if the word is in the ambiguous category,
           return the specified plural form

7. All other cases are regular 1st or 2nd person verbs, which don't inflect...

       otherwise, return the verb uninflected

Algorithm 2: Plural inflection of verbs
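The purely suffix-driven steps of Algorithm 2 translate almost line for line into Perl substitutions. The sketch below covers steps 3, 4, 5, and 7 only. The name plural_verb and the three-entry irregular table are assumptions for illustration (the real list is Table A.8, which is not reproduced here), and the user-defined, auxiliary, and ambiguous cases are left out.

```perl
use strict;
use warnings;

# A few irregular verbs for illustration only (stand-in for Table A.8).
my %IRREGULAR_VERB = (is => 'are', was => 'were', has => 'have');

sub plural_verb {
    my ($verb) = @_;

    # Step 3: simple irregular verbs.
    return $IRREGULAR_VERB{$verb} if exists $IRREGULAR_VERB{$verb};

    # Step 4: regular 3rd person singular verbs in -es, -ies, or -oes.
    my $plural = $verb;
    return $plural if $plural =~ s/([cs]h)es$/$1/;
    return $plural if $plural =~ s/([sx])es$/$1/;
    return $plural if $plural =~ s/zzes$/zz/;
    return $plural if $plural =~ s/ies$/y/;
    return $plural if $plural =~ s/oes$/o/;

    # Step 5: other 3rd person singular verbs ending in -s (but not -ss).
    return $plural if $plural =~ s/([^s])s$/$1/;

    # Step 7: 1st and 2nd person verbs do not inflect.
    return $verb;
}
```

Each substitution either succeeds and returns immediately or leaves $plural unchanged, so the rules are tried strictly in the order the algorithm lists them.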

An algorithm for forming plural adjectives

1. Check if the user has defined an inflection for the adjective, and, if so, accept that...

       if the word matches a user-defined adjective,
           return the user-specified plural form

2. Handle indefinite articles and demonstratives...

       if the word is "a" or "an", return "some"
       if the word is "this", return "these"
       if the word is "that", return "those"

3. Handle possessive pronouns (my -> our, its -> their, etc - see Table A.7)...

       if the word is a personal possessive,
           return the specified plural form

4. Handle genitives (dog's -> dogs', child's -> children's, Mary's -> Marys', etc). The general rule is: remove the apostrophe and any trailing -s, form the plural of the resultant noun, and then append an apostrophe (or -'s if the pluralized noun doesn't end in -s)...

       if suffix(-'s) or suffix(-'),
           if suffix(-'),
               let the noun <owner> be inflection(-',-)
           otherwise,
               let the noun <owner> be inflection(-'s,-)
           let the noun <owners> be the noun plural of <owner>
           if <owners> ends in -s,
               return "<owners>'"
           otherwise,
               return "<owners>'s"

5. In all other cases no inflection is required...

       otherwise, return the adjective uninflected

Algorithm 3: Plural inflection of adjectives
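Steps 2, 4, and 5 of Algorithm 3 can be sketched as follows. The noun pluralizer used for genitives is a deliberately tiny stand-in (a two-word irregular table plus the default -s); a real implementation would call Algorithm 1 at that point. The names plural_noun and plural_adjective are assumptions made for this example.

```perl
use strict;
use warnings;

# Stand-in for the noun algorithm: a tiny irregular table plus the default -s.
my %IRREGULAR_NOUN = (child => 'children', axis => 'axes');

sub plural_noun {
    my ($noun) = @_;
    return $IRREGULAR_NOUN{$noun} // $noun . 's';
}

my %DETERMINER = (a => 'some', an => 'some', this => 'these', that => 'those');

sub plural_adjective {
    my ($adj) = @_;

    # Step 2: indefinite articles and demonstratives.
    return $DETERMINER{$adj} if exists $DETERMINER{$adj};

    # Step 4: genitives ending in -'s or -'. Strip the ending, pluralize
    # the owner, then add an apostrophe (or -'s if the plural lacks -s).
    if ($adj =~ /^(.*?)'s?$/) {
        my $owners = plural_noun($1);
        return $owners =~ /s$/ ? "$owners'" : "$owners's";
    }

    # Step 5: everything else is left uninflected.
    return $adj;
}
```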

A unified algorithm

1. Handle user-defined cases...

       try step 1 of Algorithm 3
       try step 1 of Algorithm 2
       try step 1 of Algorithm 1

2. Handle known adjectives...

       try steps 2 through 4 of Algorithm 3

3. Handle known verbs...

       try steps 2 through 5 of Algorithm 2

4. Handle singular nouns ending in -s (ethos, axis, etc. - see Tables A.2, A.3, A.16, A.22, and A.23)...

       if word is a noun ending in -s,
           try steps 2 through 13 of Algorithm 1

5. Handle 3rd person singular verbs (that is, any other words ending in -s)...

       try steps 4 and 5 of Algorithm 2

6. Treat the word as a noun...

       try steps 2 through 13 of Algorithm 1

Algorithm 4: Unified plural inflection of nouns, verbs, and adjectives

Note that this sequence represents a particular compromise in the face of inherently ambiguous input. Other compromises (which might perhaps more heavily favour the verb sense of a word) may also be defined, by selecting different subsets of the three algorithms or by changing the order in which the various subsets are used.

Issues and limitations

Homographs of heterogeneous case

it

It ate it -> They ate them

it

it -> they

Of course, where the necessary context is already provided (for example, when forming the plural of a dative or ablative: to it, from it, with it, etc.), the noun algorithm detects this (in step 3) and correctly returns the accusative plural form: to them, from them, with them, etc.

Homographs of heterogeneous person

I eat

you eat

I see

you see

we eat

you eat

we see

you see

However, if a verb were to take common singular forms but different plurals (for example, the atrophying British usage: I will -> you shall , you will -> you will ), then the algorithms presented above would be unable to determine the correct inflection without additional context (such as an extra "person" parameter).

The author is not currently aware of any other verbs in English which present this problem, but is not willing to assume ipso facto that none exist.

Other homographs with heterogeneous plurals

I put the mice next to the cheese.
I put the mouses next to the keyboards.
Three basses were stolen from the band's trailer.
Three bass were stolen from the band's fishpond.
Several thoughts about leaving crossed my mind.
Several thought about leaving across my lawn.

If both meanings of the word are the same part of speech (for example, bass is a noun in both sentences above), then one meaning is chosen as the "usual" meaning, and only that meaning's plural is ever returned by any of the inflection subroutines.

If each meaning of the word is a different part of speech (for example, thought is used as both a noun and a verb), then the noun's plural is returned by the noun and unified algorithms, and the verb's plural is returned only by the verb algorithm.

Finally, if the choice of a particular "usual inflection" is considered inappropriate for a particular application, it can always be changed by specifying an overriding user-defined inflection.

"Number-insensitive" comparisons

The need for "number-insensitive" comparisons

Child: An accident to the occurrence of which all the forces and arrangements of nature are specially devised and accurately adapted.

Genius: Any degree of mental superiority that enables its possessor to live acceptably upon his admirers, and without blame be unbrokenly drunk.

Self: The most important person in the universe.

aborigines

footprints

kine

relations

kine

cow

cows)

An algorithm

the two words are identical, or

one word is a plural form of the other, or

the two words are distinct plural forms of some other word.

base

basis

bases

opus

operas

opus

opera

opera

operas

1. Check for simple equality...

       if <word1> equals <word2>, return true

2. Check for number disparity using standard inflection...

       using anglicized plurals...
           if the appropriate plural of <word1> equals <word2>, return true
           if the appropriate plural of <word2> equals <word1>, return true

3. Check for number disparity using "classical" inflection...

       using classical plurals...
           if the appropriate plural of <word1> equals <word2>, return true
           if the appropriate plural of <word2> equals <word1>, return true

4. Handle two variant plurals for the same noun (brothers and brethren, for example) by checking if there exists a category <c> and a word <w>, such that <word1> and <word2> end in the distinct plural suffixes of category <c>, and word <w> can inflect to both <word1> and <word2>...

       if the words are nouns,
           for each noun category <c>...
               let <ss> be the singular suffix for category <c>
               let <sa> be the anglicized plural suffix for <c>
               let <sc> be the classical plural suffix for <c>
               if <sa> differs from <sc>,
                   let <stem1> be stem(<sa>) of <word1>
                   if <word2> equals inflect(-,<sc>) of <stem1>, return true
                   let <stem2> be stem(<sa>) of <word2>
                   if <word1> equals inflect(-,<sc>) of <stem2>, return true

5. Handle distinct plural genitives (cows' and kine's, for example) by removing any -'s, -s', or -' inflection and comparing the underlying nouns...

       if the words are adjectives,
           let <word1a> be stem(-'s) or stem(-') of <word1>
           let <word2a> be stem(-'s) or stem(-') of <word2>
           let <word1b> be stem(-s') of <word1>
           let <word2b> be stem(-s') of <word2>
           for each defined <w1> in (<word1a>, <word1b>)...
               for each defined <w2> in (<word2a>, <word2b>)...
                   apply step 4 to <w1> and <w2>
                   if step 4 returns true, return true

6. All other cases correspond to an inequality...

       otherwise, return false

Algorithm 5: "Number-insensitive" comparison

Note that, because steps 2 and 3 do not specify which pluralizing algorithm is used, Algorithm 5 is generic and may be readily adapted to deal with only nouns, verbs, or adjectives, or with all three at once. Such adaptations merely involve selecting the appropriate algorithm (Algorithms 1 through 4 respectively) with which to generate the "appropriate plural" forms. Where the algorithm is adapted to a particular part of speech, one or both of steps 4 and 5 may be omitted entirely, if inappropriate.
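The genericity described above can be illustrated by passing the pluralizing routine in as a parameter. The sketch below implements only steps 1 to 3 of Algorithm 5. The names number_insensitive_eq and toy_plural, and the calling convention $plural->($word, $classical_flag), are assumptions for the example, and the toy pluralizer exists only to exercise the comparison.

```perl
use strict;
use warnings;

# Steps 1 to 3 of the comparison, parameterized by a pluralizing routine.
# $plural is a code reference called as $plural->($word, $classical_flag).
sub number_insensitive_eq {
    my ($word1, $word2, $plural) = @_;
    return 1 if $word1 eq $word2;                          # step 1
    for my $classical (0, 1) {                             # steps 2 and 3
        return 1 if $plural->($word1, $classical) eq $word2;
        return 1 if $plural->($word2, $classical) eq $word1;
    }
    return 0;
}

# Toy pluralizer used only to exercise the routine above.
my %TOY = (cow => ['cows', 'kine'], index => ['indexes', 'indices']);

sub toy_plural {
    my ($word, $classical) = @_;
    return $TOY{$word} ? $TOY{$word}[ $classical ? 1 : 0 ] : $word . 's';
}
```

Because steps 4 and 5 are omitted, this sketch reports cows and kine as unequal; recognizing two distinct plurals of the same word requires the category tables used in step 4.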

A Perl implementation

Lingua::EN::Inflect

http://www.perl.com

The exportable subroutines of Lingua::EN::Inflect provide plural inflections for English words. Plural forms of most nouns, many verbs, and some adjectives are provided. Where appropriate, "classical" variants are also provided. The module also offers pronunciation-based selection of indefinite articles ( a and an ), but discussion of those facilities is beyond the scope of this paper.

Inflecting plurals - the PL_...() subroutines

Lingua::EN::Inflect

PL_

PL_

()

The PL_ ... () subroutines also take an optional second argument, which indicates the desired grammatical number of the word. If the "number" argument is supplied and is not 1 (or "one" or "a" ), the plural form of the word is returned. If the "number" argument does indicate singularity, the (uninflected) word itself is returned. If the number argument is omitted, the plural form is returned unconditionally.
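The treatment of the optional "number" argument amounts to a small decision rule, sketched below. The subroutine names are hypothetical, and matching "one" and "a" case-insensitively is an assumption of this example rather than something the description above states.

```perl
use strict;
use warnings;

# Decide whether the optional "number" argument calls for the plural form.
# An omitted (undefined) count means "pluralize unconditionally".
sub wants_plural {
    my ($count) = @_;
    return 1 unless defined $count;
    return $count =~ /^(?:1|one|a)$/i ? 0 : 1;
}

# Hypothetical wrapper: $pluralize is any routine mapping a word to its plural.
sub conditional_plural {
    my ($word, $count, $pluralize) = @_;
    return wants_plural($count) ? $pluralize->($word) : $word;
}
```

Note that a count of zero selects the plural form, as in "0 errors".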

The various subroutines are:

PL_N($;$) PL_N() takes a singular English noun or pronoun and returns its plural.

PL_V($;$) PL_V() takes the singular form of a conjugated verb (one which is already in the correct grammatical person and mood) and returns the corresponding plural conjugation.

PL_ADJ($;$) PL_ADJ() takes the singular form of certain types of adjectives and returns the corresponding plural form.

PL($;$) PL() takes a singular English noun, pronoun, verb, or adjective and returns its plural form. Where a word has more than one inflection depending on its sense, the (singular) noun sense is generally preferred to the (singular) verb sense. Of course, the inherent ambiguity of such cases suggests that, where the part of speech is known, PL_N() , PL_V() , and PL_ADJ() should be used in preference to PL() .

PL(" cat ")

" cats "

Modern vs classical inflections

Lingua::EN::Inflect

classical()

classical()

In classical mode, the non-anglicized plural form of a word (if one exists) is preferred.

Hence, whereas dogma is normally inflected to dogmas , if classical mode is active it becomes dogmata .

User-defined inflections - the def_...() subroutines

Lingua::EN::Inflect

def_noun($$) The def_noun() subroutine takes a pair of string arguments: the singular and plural forms of the noun being specified. The singular form specifies a pattern to be interpolated (as m/^(?:$first_arg)$/i). Any noun matching this pattern is then replaced by the string in the second argument. The second argument specifies a string which is interpolated after the match succeeds, and is then used as the plural form. The second argument string may also specify a second variant of the plural form, to be used when "classical" plurals have been requested. The beginning of the second variant is marked by a '|' character:

def_noun 'cow' => 'cows|kine';
def_noun '(.+i)o' => '$1os|$1i';

If no classical variant is given, the same plural form is used in both normal and "classical" modes. If the second argument is undef instead of a string, then the current user definition for the first argument is removed, and the standard (algorithmic) plural inflection is reinstated.
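The behaviour described for def_noun() could be realized roughly as follows. This is a sketch of the mechanism, not the module's actual code: define_noun and user_plural are hypothetical names, the rule store is a plain array, and only the first three capture groups of a pattern are supported.

```perl
use strict;
use warnings;

my @USER_NOUNS;   # list of [pattern, plural spec], most recent definition first

# Hypothetical analogue of def_noun(): record a pattern and its plural spec.
sub define_noun {
    my ($pattern, $spec) = @_;
    @USER_NOUNS = grep { $_->[0] ne $pattern } @USER_NOUNS;    # drop any earlier definition
    unshift @USER_NOUNS, [$pattern, $spec] if defined $spec;   # undef means "remove only"
    return;
}

# Return the user-defined plural of $word, or undef if no definition matches.
sub user_plural {
    my ($word, $classical) = @_;
    for my $rule (@USER_NOUNS) {
        my ($pattern, $spec) = @$rule;
        next unless $word =~ m/^(?:$pattern)$/i;
        my @group = (undef, $1, $2, $3);   # sketch: at most three capture groups
        my ($modern, $classic) = split /\|/, $spec, 2;
        my $plural = ($classical && defined $classic) ? $classic : $modern;
        # Interpolate $1, $2, ... from the match into the plural string.
        $plural =~ s{\$(\d)}{ $group[$1] // '' }ge;
        return $plural;
    }
    return;
}
```

Substituting the captured groups by hand avoids evaluating the user's string as Perl code, which seems a safer choice for a sketch than a string eval.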

def_verb($$$$$$) The def_verb() subroutine takes three pairs of string arguments (that is, six arguments in total), specifying the singular and plural forms of the three grammatical persons of the verb. As with def_noun(), the singular forms are specifications of run-time-interpolated patterns, while the plural forms are specifications of (up to two) run-time-interpolated strings:

def_verb 'am' => 'are', 'ar(e|t)' => 'are', 'is' => 'are';

def_adj($$) The def_adj() subroutine takes a pair of string arguments, which specify the singular and plural forms of the adjective being defined. As with def_noun() and def_verb(), the singular forms are specifications of run-time-interpolated patterns, whilst the plural forms are specifications of (up to two) run-time-interpolated strings:

def_adj 'dat' => 'dose';
def_adj 'red' => 'red|gules';

Numbered plurals - the NO() subroutine

PL_

()

"I saw 3 ducks"

print "I saw $N ", PL_N($animal,$N), "\n";

Lingua::EN::Inflect

NO($;$)

print "I saw ", NO($animal,$N), "\n";

"zero"

"nil"

"no"

$N

I saw no ducks

I saw 0 ducks

"no"

Reducing the number of counts required - the NUM() subroutine

PL_

()

print PL_ADJ("This",$errors), PL_N(" error",$errors), PL_V(" was",$errors), " fatal.\n";

Lingua::EN::Inflect

NUM($;$)

PL_

()

NUM()

NUM($errors); # SET DEFAULT NUMBER
print PL_ADJ("This"), PL_N(" error"), PL_V(" was"), " fatal.\n";
NUM(); # CLEAR DEFAULT NUMBER

NUM()

print NUM($errors), PL_N(" error"), PL_V(" was"), " detected.\n";
print PL_ADJ("This"), PL_N(" error"), PL_V(" was"), " fatal.\n" if $severity > 1;

Interpolating inflections in strings - The inflect() subroutine

PL_...()

To ameliorate this problem, Lingua::EN::Inflect provides an exportable string-interpolating subroutine ( inflect($) ), that recognizes calls to the various inflection subroutines within a string and interpolates them appropriately. Using inflect() plurals can be interpolated directly into a string as follows:

NUM($errors);
print inflect "NO(error) PL_V(was) detected.\n";
print inflect "PL_ADJ(This) PL_N(error) PL_V(was) fatal.\n" if $errors && $severity > 1;

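The interpolation performed by inflect() can be approximated with substitutions that evaluate each embedded call. The sketch below is an illustration under several assumptions: the names plural_if and interpolate_inflections are invented, the count is passed explicitly rather than set through NUM(), the lookup table covers only the words in the example above, and unknown words are returned unchanged.

```perl
use strict;
use warnings;

# Toy plural table covering just the words used in the example.
my %PLURAL = (error => 'errors', was => 'were', This => 'These');

# Return $word unchanged for a count of 1, otherwise its (toy) plural.
sub plural_if {
    my ($word, $count) = @_;
    return $word if $count == 1;
    return $PLURAL{$word} // $word;
}

sub interpolate_inflections {
    my ($text, $count) = @_;
    # Replace each PL_N(...), PL_V(...), or PL_ADJ(...) call with its result.
    $text =~ s/\bPL_(?:N|V|ADJ)\(([^)]*)\)/plural_if($1, $count)/ge;
    # NO(word) also prefixes the count, writing "no" for zero.
    $text =~ s/\bNO\(([^)]*)\)/($count || 'no') . ' ' . plural_if($1, $count)/ge;
    return $text;
}
```

For example, interpolate_inflections("NO(error) PL_V(was) detected.", 0) yields "no errors were detected." with this toy table.
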
Comparing "number-insensitively" - The PL_..._eq() subroutines

Lingua::EN::Inflect

PL_eq($$)

PL_N_eq($$)

PL_V_eq($$)

PL_ADJ_eq($$)

PL()

PL_N()

PL_V()

PL_ADJ()

The actual value returned by the various PL_..._eq() subroutines encodes which of the three equality rules succeeded: "eq" is returned if the strings were identical, "s:p" if the strings were singular and plural respectively, "p:s" for plural and singular, and "p:p" for two distinct plurals. Inequality is indicated by returning an empty string.

Conclusion

It is possible to cater for differences in major usage patterns (for example, modern and classical inflections) and for local differences in dialect (via user-defined inflections). It is also possible to make use of the pluralization algorithms to efficiently detect pairs of words which differ only in grammatical number.

A free implementation of these algorithms is available, and provides additional features such as conditional pluralization (depending on a numerical parameter), setting of default number values, and interpolation of the various subroutines into strings.

References

[1] Wall, L., Christiansen, T., & Schwartz, R.L., Programming Perl, 2nd Edition, O'Reilly & Associates, 1996.

[2] McCrum, R., Cran, W., & MacNeil, R., The Story of English, Penguin Books, New York, 1986.

[3] Bryson, B., The Mother Tongue: English and how it got that way, William Morrow, New York, 1990.

[4] Thomson, A.J., & Martinet, A.V., A Practical English Grammar, Fourth Edition, Oxford University Press, Oxford, 1986.

[5] The Oxford English Dictionary, Second Edition, Oxford University Press, Oxford, 1989.

[6] Fowler, H.W., Modern English Usage, Second Edition, Oxford University Press, Oxford, 1965.

[7] Bierce, A., The Devil's Dictionary, Doubleday, New York, 1911.



Appendix A - Plural categories