Conventions for interlinear morpheme-by-morpheme glosses pdf version About the rules The Leipzig Glossing Rules have been developed jointly by the Department of

Linguistics of the Max Planck Institute for Evolutionary Anthropology

(Bernard Comrie, Martin Haspelmath) and by the Department of Linguistics

of the University of Leipzig (Balthasar Bickel). They consist of ten rules for the

"syntax" and "semantics" of interlinear glosses, and an appendix with a

proposed "lexicon" of abbreviated category labels. The rules cover a large part

of linguists' needs in glossing texts, but most authors will feel the need to add

(or modify) certain conventions (especially category labels). Still, it will be

useful to have a standard set of conventions that linguists can refer to, and the

Leipzig Rules are proposed as such to the community of linguists. The Rules

are intended to reflect common usage, and only very few (mostly optional)

innovations are proposed. We intend to update the Leipzig Glossing Rules occasionally, so feedback is

highly welcome. Leipzig, last change: May 31, 2015

Further updates will be managed by the Committee of Editors of Linguistics Journals. Important references:

Lehmann, Christian. 1982. "Directions for interlinear morphemic translations".

Folia Linguistica 16: 199-224.

Croft, William. 2003. Typology and universals. 2nd ed. Cambridge: Cambridge

University Press, pp. xix-xxv. The rules (revised version of February 2008) Preamble Interlinear morpheme-by-morpheme glosses give information about the

meanings and grammatical properties of individual words and parts of

words. Linguists by and large conform to certain notational conventions in

glossing, and the main purpose of this document is to make the most widely

used conventions explicit.

Depending on the author's purposes and the readers' assumed background

knowledge, different degrees of detail will be chosen. The current rules

therefore allow some flexibility in various respects, and sometimes alternative

options are mentioned.

The main purpose that is assumed here is the presentation of an example in a

research paper or book. When an entire corpus is tagged, somewhat different

considerations may apply (e.g. one may want to add information about larger

units such as words or phrases; the rules here only allow for information

about morphemes). It should also be noted that there are often multiple ways of analyzing the

morphological patterns of a language. The glossing conventions do not help

linguists in deciding between them, but merely provide standard ways of

abbreviating possible descriptions. Moreover, glossing is rarely a complete

morphological description, and it should be kept in mind that its purpose is

not to state an analysis, but to give some further possibly relevant information

on the structure of a text or an example, beyond the idiomatic translation. A remark on the treatment of glosses in data cited from other sources: Glosses

are part of the analysis, not part of the data. When citing an example from a

published source, the gloss may be changed by the author if they prefer

different terminology, a different style or a different analysis. Rule 1: Word-by-word alignment Interlinear glosses are left-aligned vertically, word by word, with the example. E.g. (1) Indonesian (Sneddon 1996:237) Mereka

di

Jakarta

sekarang. They

in

Jakarta

now 'They are in Jakarta now.' Top Rule 2: Morpheme-by-morpheme correspondence Segmentable morphemes are separated by hyphens, both in the example and

in the gloss. There must be exactly the same number of hyphens in the

example and in the gloss. E.g. (2) Lezgian (Haspelmath 1993:207) Gila

abur-u-n

ferma

hamišaluǧ

güǧüna

amuq’-da-č. now

they-OBL-GEN

farm

forever

behind

stay-FUT-NEG ‘Now their farm will not stay behind forever.’ Since hyphens and vertical alignment make the text look unusual, authors

may want to add another line at the beginning, containing the unmodified

text, or resort to the option described in Rule 4 (and especially 4C).

Clitic boundaries are marked by an equals sign, both in the object

language and in the gloss. (3) West Greenlandic (Fortescue 1984:127) palasi=lu

niuirtur=lu priest=and

shopkeeper=and 'both the priest and the shopkeeper' Rule 2A. (Optional)

If morphologically bound elements constitute distinct prosodic or

phonological words, a hyphen and a single space may be used together in the

object language (but not in the gloss).

(4) Hakha Lai a-nii -láay 3SG-laugh-FUT 's/he will laugh' Rule 3: Grammatical category labels Grammatical morphemes are generally rendered by abbreviated grammatical

category labels, printed in upper case letters (usually small capitals). A list of

standard abbreviations (which are widely known among linguists) is given at

the end of this document. Deviations from these standard abbreviations may of course be necessary

in particular cases, e.g. if a category is highly frequent in a language, so that a

shorter abbreviation is more convenient, e.g. CPL (instead of COMPL) for

"completive", PF (instead of PRF) for "perfect", etc. If a category is very rare, it

may be simplest not to abbreviate its label at all. In many cases, either a category label or a word from the metalanguage is

acceptable. Thus, both of (5) Russian My

s

Marko

poexa-l-i

avtobus-om

v

Peredelkino. 1PL

COM

Marko

go-PST-PL

bus-INS

All

Peredelkino. we

with

Marko

go-PST-PL

bus-by

to

Peredelkino. 'Marko and I went to Perdelkino by bus.' Rule 4: One-to-many correspondences When a single object-language element is rendered by several metalanguage

elements (words or abbreviations), these are separated by periods. E.g. (6) Turkish çık-mak come.out-INF 'to come out' (7) Latin insul-arum island-GEN.PL 'of the islands' (8) French aux

chevaux to.ART.PL

horse.PL 'to the horses' (9) German unser-n

Väter-n our-DAT.PL

father.PL-DAT.PL 'to our fathers' (10) Hittite (Lehmann 1982:211) n=an

apedani

mehuni

essandu. CONN=him

that.DAT.SG

time.DAT.SG

eat.they.shall 'They shall celebrate him on that date.' (CONN = connective) (11) Jaminjung (Schultze-Berndt 2000:92) nanggayan

guny-bi-yarluga? who

2DU.A.3SG.P-FUT-poke 'Who do you two want to spear?' The ordering of the two metalanguage elements may be determined by

various principles that are not easy to generalize over, so no rule will be

provided for this.

There are various reasons for a one-to-many correspondence between

object-language elements and gloss elements. These are conflated by the

uniform use of the period. If one wants to distinguish between them, one may

follow Rules 4A-E. Top Rule 4A. (Optional) If an object-language element is neither formally nor semantically segmentable and only the metalanguage happens to lack a single-word equivalent, the underscore may be used instead of the period. (12) Turkish (cf. 6) çık-mak come_out-INF 'to come out' Rule 4B. (Optional) If an object-language element is formally unsegmentable but has two clearly distinguishable meanings or grammatical properties, the semi-colon may be used. E.g. (13) Latin (cf. 7) insul-arum island-GEN;PL 'of the islands' (14) French aux

chevaux to;ART;PL

horse;PL 'to the horses' Rule 4C. (Optional) If an object-language element is formally and semantically segmentable, but the author does not want to show the formal segmentation (because it is irrelevant and/or to keep the text intact), the colon may be used. E.g. (15) Hittite (Lehmann 1982:211) (cf. 10) n=an

apedani

mehuni

essandu. CONN=him

that:DAT;SG

time:DAT;SG

eat:they:shall 'They shall celebrate him on that date.' Rule 4D. (Optional) If a grammatical property in the object-language is signaled by a

morphophonological change (ablaut, mutation, tone alternation, etc.), the

backslash is used to separate the category label and the rest of the gloss. (16) German (cf. 9) unser-n

Väter-n our-DAT.PL

father.PL-DAT.PL 'to our fathers' (cf. singular Vater) (17) Irish bhris-is PST\break-2SG 'you broke' (cf. nonpast bris-) (18) Kinyarwanda mú-kòrà SBJV\1PL-work 'that we work' (cf. indicative mù-kòrà)

Rule 4E. (Optional) If a language has person-number affixes that express the agent-like and the

patient-like argument of a transitive verb simultaneously, the symbol ">" may

be used in the gloss to indicate that the first is the agent-like argument and the

second is the patient-like argument. (19) Jaminjung (Schultze-Berndt 2000:92) (cf. 11) nanggayan

guny-bi-yarluga? who

2DU>3SG-FUT-poke 'Who do you two want to spear?' Top Rule 5: Person and number labels Person and number are not separated by a period when they occur in this order. E.g. (20) Italian and-iamo go-PRS.1PL (not: go-PRS.1.PL) 'we go' Rule 5A. (Optional) Number and gender markers are very frequent in some languages, especially

when combined with person. Several authors therefore use non-capitalized

shortened abbreviations without a period. If this option is adopted, then the

second gloss is used in (21). (21) Belhare ne-e

a-khim-chi

n-yuNNa DEM-LOC

that.DAT.SG

3NSG-be.NPST DEM-LOC

1sPOSS-house-PL

3ns-be.NPST 'Here are my houses.'' Rule 6: Non-overt elements If the morpheme-by-morpheme gloss contains an element that does not

correspond to an overt element in the example, it can be enclosed in square

brackets. An obvious alternative is to include an overt "Ø" in the objectlanguage

text, which is separated by a hyphen like an overt element. (22) Latin puer

or:

puer-Ø boy[NOM.SG]

boy-NOM.SG ‘boy’

‘boy’ Top Rule 7: Inherent categories Inherent, non-overt categories such as gender may be indicated in the gloss, but a

special boundary symbol, the round parenthesis, is used. E.g. (23) Hunzib (van den Berg 1995:46) oz#-di-g

xõxe

m-uq'e-r boy-OBL-AD

tree(G4)

G4-bend-PRET 'Because of the boy the tree bent.' (G4 = 4th gender, AD = adessive, PRET = preterite) Rule 8: Bipartite elements Grammatical or lexical elements that consist of two parts which are treated as

distinct morphological entities (e.g. bipartite stems such as Lakhota na-xʔu̧ 'hear') may be treated in two different ways: (i) The gloss may simply be repeated: (24) Lakhota na-wíčha-wa-xʔu̧ hear-3PL.UND-1SG.ACT-hear 'I hear them' (UND = undergoer, ACT = actor) (i) The gloss may simply be repeated: (25) Lakhota na-wíčha-wa-xʔu̧ hear-3PL.UND-1SG.ACT- STEM 'I hear them' Circumfixes are "bipartite affixes" and can be treated in the same way, e.g. (26) German ge-seh-en

or:

ge-seh-en PTCP-see-PTCP

PTCP-see-CIRC 'seen'

'seen' Rule 9: Infixes Infixes are enclosed by angle brackets, and so is the object-language

counterpart in the gloss. (27) Tagalog b<um>ili (stem: bili) <ACTFOC>buy 'buy' (28) Latin reli<n>qu-ere (stem: reliqu-) leave<PRS>-INF 'to leave' Infixes are generally easily identifiable as left-peripheral (as in 27) or as rightperipheral (as in 28), and this determines the position of the gloss

corresponding to the infix with respect to the gloss of the stem. If the infix is

not clearly peripheral, some other basis for linearizing the gloss has to be

found. Top Rule 10: Reduplication Reduplication is treated similarly to affixation, but with a tilde (instead of an

ordinary hyphen) connecting the copied element to the stem. (29) Hebrew yerak~rak-im green~ATT-M.PL 'greenish ones' (ATT= attenuative) (30) Tagalog bi~bili IPFV~buy 'is buying' (31) Tagalog b<um>i~bili <ACTFOC>IPFV~buy 'is buying' (ACTFOC = Actor focus) Appendix: List of Standard Abbreviations 1 first person 2 second person 3 third person A agent-like argument of canonical transitive verb ABL ablative ABS absolutive ACC accusative ADJ adjective ADV adverb(ial) AGR agreement ALL allative ANTIP antipassive APPL applicative ART article AUX auxiliary BEN benefactive CAUS causative CLF classifier COM comitative COMP complementizer COMPL completive COND conditional COP copula CVB converb DAT dative DECL declarative DEF definite DEM demonstrative DET determiner DIST distal DISTR distributive DU dual DUR durative ERG ergative EXCL exclusive F feminine FOC focus FUT future GEN genitive IMP imperative INCL inclusive IND indicative INDF indefinite INF infinitive INS instrumental INTR intransitive IPFV imperfective IRR irrealis LOC locative M masculine N neuter N- non- (e.g. NSG nonsingular, NPST nonpast) NEG negation, negative NMLZ nominalizer/nominalization NOM nominative OBJ object OBL oblique P patient-like argument of canonical transitive verb PASS passive PFV perfective PL plural POSS possessive PRED predicative PRF perfect PRS present PROG progressive PROH prohibitive PROX proximal/proximate PST past PTCP participle PURP purposive Q question particle/marker QUOT quotative RECP reciprocal REFL reflexive REL relative RES resultative S single argument of canonical intransitive verb SBJ subject SBJV subjunctive SG singular TOP topic TR transitive VOC vocative Top References Fortescue, Michael. 1984. West Greenlandic. (Croom Helm descriptive grammars) London: Croom Helm.



Haspelmath, Martin. 1993. A grammar of Lezgian. (Mouton Grammar Library, 9). Berlin - New York: Mouton de Gruyter.



Lehmann, Christian. 1983. "Directions for interlinear morphemic translations". Folia Linguistica 16: 193-224.



Schultze-Berndt, Eva. 2000. Simple and complex verbs in Jaminjung: A study of event categorization in an Australian language. Katholieke Universiteit Nijmegen Ph.D. Dissertation.



Sneddon, James Neil. 1996. Indonesian: A comprehensive grammar. London: Routledge.



van den Berg, Helma. 1995. A Grammar of Hunzib. (Lincom Studies in Caucasian Linguistics, 1.) München: Lincom Europa. Top