Missionary Dictionaries

Summary and Keywords

Missionary dictionaries are printed books or manuscripts compiled by missionaries in which words are listed systematically, followed by words which have the same meaning in another language. These dictionaries were mainly written as tools for language teaching and learning in a missionary-colonial setting, although quite a few dictionaries also have a more encyclopedic character, containing invaluable information on non-Western cultures from all continents. In this article, several types of dictionaries are analyzed: bilingual-monodirectional, bilingual and bidirectional, and multilingual. Most examples are taken from an illustrative selected corpus of missionary dictionaries describing non-Western languages during the colonial period, with particular focus on the function of these dictionaries in a missionary context, the users, macrostructure, organizational principles, and the typology of the microstructure and markedness in lemmatization.

Keywords: history of linguistics, missionary linguistics, descriptive linguistics, lexicography, colonialism

1. Defining the Genre

In Greece during antiquity, several dialects coexisted, and opaque words of some texts had to be explained to readers who were not familiar with the dialect in question. Such explanations were called γλῶσσαι (‘glosses’). Γλῶσσα (Attic γλῶττα) means ‘language,’ ‘tongue’ (as organ of speech), and this concept also has the specific meaning of ‘obsolete or foreign words which need explanation’ (Liddell & Scott, 1953). Greek lexicography starts with the so-called ‘writers of glosses’ (γλωσσογράφοι), who arranged their glosses according to local order (as they occur in the text) or alphabetically. Only fragments of these works survived; such dictionaries were called léxeis (λέξεις) (Zgusta & Demetrius, 1990, pp. 1696–1697), and a single word or phrase was called léxis (λέξις). From this term was derived the word lexikon (λεξικὸν), the neuter form of the adjective lexikos λεξικός (‘of/for words’), which did not yet have the meaning of ‘lexicon’ in antiquity but seems to have been introduced as a translation of the Latin Lexicon graeco-latinum/Λεξικὸν κατὰ στοιχείων (Giovanni Crastone, 1497 [1578]; cf. Considine, 2008, p. 27), one of the earliest printed books in Greek, containing 18,000 entries. One of the earliest and most important dictionaries of Greek written in premodern Europe has the title Thesaurus Graecae Linguae (Stephanus, 1572). The thesaurus had the aim to be as comprehensive as possible and was more learned and more expensive; shorter and cheaper versions, which appeared mostly in one volume, were called Lexicon, Glossarium, Vocabularius (also Vocabularium), or Dictionarium. Although Latin has several concepts which can be translated into English as ‘word’ (verbum, vox, vocabulum, dictio), several titles are used for ‘dictionaries’ or ‘vocabularies.’

During the Renaissance in Europe, not only the term Lexicon (Nebrija) but also Dictionarium (Calepino) and Vocabularium were used. Similarly, in the New World “Antonio” not only meant a grammar by Antonio de Nebrija but was also the generic term for “any grammar.” The term “Calepinus,” or “Calepino,” not only referred to the Dictionarium of this Italian Augustinian, but outside Europe was used as a synonym for ‘dictionary,’ some of which indeed used Calepino as a model. In Europe, different concepts were sometimes combined in the title, such as Stephanus’ Dictionarium seu Latinae linguae Thesaurus (also known by its subtitle, 1531) or Nebrija’s Lexicon hoc est Dictionarium ex sermone latino in hispaniensem (1492). In Portugal, the same author Bento Pereira (S.J.) produced both a Vocabularium (Pereira, 1647) (trilingual Latin-Portuguese-Spanish) and a Tesouro da língua portuguesa (Pereira, 1647). In Spain, there was also the Tesoro de la lengua castellana o española (Covarrubias, 1611), a monolingual Spanish dictionary with particular attention paid to etymology. In France there were the Thresor de la langue françoyse (Nicot) and the bilingual English-French Lesclaircissement de la langue francoyse (Palsgrave, 1530), and similar works were published elsewhere in Europe (Considine, 2008, p. 46). Brief dictionaries were sometimes called Dictionariolum (Estienne, 1542; compare Lagunas’s Dictionarito). The Thesaurus had encyclopedic features where lemmas are not merely translated, but also often explained with examples.

Vocabulario is used most often in the titles of Spanish and Portuguese missionary dictionaries. In Latin the term Dictionarium seems to be the most frequent. Some dictionaries appear together with the grammar (‘gramática’ or ‘arte’) under one title: Arte y Vocabulario (Anonymous, 1586; Ruiz Montoya, 1639); Arte y dictionario (Lagunas, 1574). Combined terms appear in the titles Dictionarium siue Thesauri linguae iap. Compendium (Collado, 1632); Japanese (Cardoso, 1569–1570); Lexicon o vocabulario (Santo Thomas). In French, it seems that the term ‘vocabulaire’ is less commonly used than ‘dictionnaire’ (Sagard, 1632; de la Salle de l’Étang, 1763). In Spanish, use of the term ‘diccionario’ spread during the 18th century, probably due to the title of the dictionary of the Real Academia (Cañes, 1787; Larramendi, 1745; Neve y Molina, 1767). Tesoro is used by Ruiz Blanco (1690). Calepino is used by Vivar (ca. 1765). Few titles have totally different terms: Fabrica (Germanus, 1636 and 1639) and Intérprete (González, ca. 1707–2005).

The term Vocabulista was already in use during the Middle Ages, as in the Vocabulista ecclesiastico of Father Giovanni Bernardo da Savona, printed in Venice in 1479. There were also Latin-Greek dictionaries in circulation entitled “Vocabulista,” such as Johannes [Crastonus]’s Vocabulista latino-graecus (Vicenza, ca. 1483, an alternative title of Crastonus’s Dictionarium graecum, 1597 [1478], also entitled Lexicon graecolatinum). These ‘vocabulistas’ are word lists where the lemma is given in the left column and its translation in the right (seldom more than one equivalent), with no further comments or grammatical information.

Figure 1. Vocabulista latino-graecus (Vicenza, ca. 1483, f. 107r).

The term Vocabulista seems to disappear in the titles of dictionaries after the 16th century (only in the re-edition of Pedro de Alcalá’s dictionary of Patricio de la Torre is the same title conserved). During the 16th and 17th centuries, Vocabulario was preferred, and during the 18th century and later, Diccionario seems to gradually replace the previous terms. The only author who produced different works that used more than one of these terms was Ruiz de Montoya, who wrote both an Arte y Bocabulario and a Tesoro. The Tesoro is much “richer” than the Vocabulario (“porque procurè vestirle con algo de su riqueza, que mi corto caudal ha podido sacar de su mineral rico” [“because I did my best to dress it with some of its richness, that my short yield could have taken from its rich mineral”]: Ruiz Montoya, 1639, “Prologue,” without pagination). The Vocabulario is bilingual and mono-directional Spanish-Guarani, while the Tesoro is bilingual and mono-directional Guarani-Spanish, which means that the book where the indigenous language comes first is much larger than the one where Spanish comes first (see section 3.2). According to the author, only the translation is given of the Spanish lemma (‘hombre’ = Abá) in the Vocabulario, whereas in the Tesoro the user can search the lemma Abá and find not just the translation in the opposite direction, but anything that can be said about the man (“y alli hallare lo que dize del hombre,” Ruiz Montoya, 1640, “prologue”). The Tesoro provides first the translation ‘hombre,’ ‘persona,’ followed by adjectives such as ‘honrado’ (‘honorable’), ‘hermoso’ (‘good-looking’), ‘flojo’ (‘weak’), ‘los amujerados’ (‘womanish’), or ‘aniñado’ (‘childish’), followed by a series of phrases and expressions.

Copia is a term frequently used by Renaissance rhetoricians. According to Considine (2008, p. 47), it means a “sort of rhetorical wealth that enabled a person to expand a simple argument into a sophisticated ordering.” The term seems to be used only in the title of the grammars of Bertonio (for Aymara) and not the dictionaries. Copiousness was not only described in the dictionaries. Molina’s lexicographical work first appeared in 1555 as a bilingual and mono-directional Spanish-Nahuatl dictionary. In 1571 he published a bidirectional bilingual dictionary, and in the prologue of the second part (Nahuatl-Spanish) he tells the reader that he sometimes also adds entire phrases or manners of speaking (“maneras de hablar”), which in his view exceed the limits of a Vocabulario (“esto parezca exceder los terminos de vocabulario”) (Molina, 1571, “Prólogo,” fifth admonition).

2. Users

One of the principles of bilingual lexicography (Kromann, Riiber, & Rosbach, 1991, pp. 2713–2714) is that a distinction has to be made between mono-functional bilingual bidirectional or mono-directional dictionaries and their bifunctional counterparts. The first is conceived as a tool for only A-speakers, the second for both A + B speakers. The majority of missionary dictionaries seem to be designed for just one group, missionaries. Nevertheless, there seem to be exceptions: Pedro de Alcalá observes in his prologue (1505) that his work was written for both Old Christians (‘viejos cristianos’) and new converts (‘nuevos convertidos’). The Old Christians were also called ‘aljamiados’ by the Arabic-speaking population, a term which means that they spoke ‘foreign,’ i.e., non-Arabic, languages (in Spain, Romance/Spanish). Old Christians could find the Arabic translation of Spanish words “coming from Spanish into Arabic” (“viniendo del romance al arauia”), and at the same time, the Arabs (“arauigos,” or “New Christians”), if they learned the Castilian letters (“letra castellana”), could find the Spanish equivalent for an Arabic word. It is a remarkable observation, since the dictionary is only mono-directional Spanish-Arabic. A potential user from a Muslim background, motivated by mere curiosity, could leaf through the dictionary and might find the desired word, but it is doubtful whether there were indeed users from a Muslim background who really learned Spanish using Alcalá’s Vocabulista. It seems more an unrealistic ideology expressed by the author. It is possible that Alcalá had some plans to compile another (or a third) dictionary in the future, as he observed in the prologue of the Vocabulista (“para que pueda hazer otra segunda o tercera obra”). Alcalá finished his linguistic studies when he was already in “the third part” of a life-span. Unfortunately, his plans were never realized.

A similar observation is documented in Domingo de Santo Tomás’ dictionary of Quechua (1560). In the Prologue, he gives some details about his target groups. In the first section, Spanish comes first, followed by equivalents in the language of the Indians (Quechua). This section is mainly written for users who know Spanish and not Quechua (“por que el que sabe la de España, y no la dellos”). The second section is the reverse: first Quechua (“lengua Indiana”), followed by Spanish, and is written for (1) those who already know Spanish and (2) those who do not know Spanish yet. In San Buenaventura’s (1613, p. 619) prologue of the second part, where Tagalog comes first, there is a similar observation; one of the objectives in writing this section was to make “the natives learn the Castilian language easily” (“porque los naturales puedan deprender con mas façilidad a hablar la lengua castellana”). These examples demonstrate that at least some lexicographers did not write dictionaries exclusively for missionaries.

The Dictionarium Latino Lusitanicum, ac Iaponicum (Anonymous, 1595), was published in 1595 in Amakusa. Since the Latin entries are given first, followed by the corresponding Portuguese and finally the Japanese equivalents, it is indeed intended for learners of Japanese who already know Latin, or speakers of Japanese who want to translate Latin words. Probably some Portuguese missionaries were not fluent in Latin, which explains why Portuguese comes between. It is not probable that Portuguese was included for the Japanese natives in order to teach them Portuguese.

Most dictionaries mention that the work has been written exclusively for priests. Some prologues are dedicated explicitly to the priests of the Aymaras, as in Bertonio “A los sacerdotes, y curas de la nacion aymara” (Bertonio, 1612), or in the prologue of the second part of Molina (1571). Molina tells his readers that he wrote his dictionary not only for those who know Latin, but also for those who do not (Molina, 1571, “prólogo”: aviso duodecimo). Not all the works analyzed in this article are written for priests. In Sagard’s Dictionnaire de la langve Hvronne (1632), the author explains that his dictionary contains a “few basic sentences useful for traders, settlers, or missionaries” (Hanzeli, 1969, p. 55). Missionaries who compiled dictionaries of languages which were also studied at academic institutions did not mention missionaries as the only target group. In his prologue, Cañes emphasizes that his work is also useful for those who want to read texts in Classical Arabic (Cañes, 1787).

Considering the dichotomy between what has been called Gebrauchslexikographie (“users’ lexicography”; mainly Grundwortschatzwörterbücher, “dictionaries for elementary vocabulary”) and Dokumentationslexikographie (“documentary lexicography”) (Kühn, 1990, pp. 1354–1355), one can conclude that the works of missionaries oscillate between these two extremes; the majority of the dictionaries under study have a mixed character. On the one hand, particularly in Asia, some dictionaries are like encyclopedias, containing information about Asian cultures, religions, flora, and fauna. Anonymous (1603–1604) is arranged alphabetically starting with Japanese, followed by the Portuguese equivalents. This dictionary was not only intended as a tool for missionaries to learn the Japanese language. It is also a huge compilation of contemporary ethnographic knowledge about Japan, including several technical terms from Buddhism and Japanese literature. At the other extreme, missionaries compiled short word lists where only the most important vocabulary is included: body parts, counting, measures and weights, etc. The more comprehensive volumes were more expensive, and many users preferred to have a pocket version for their specific needs. Bertonio’s dictionary has a mixed character. On the one hand it attempts to demonstrate the “copiousness” of Aymara, and on the other, he tells the students that they do not have to learn all the words included for the teaching of the sacred mysteries of the faith (“no es necesario saberlos todos para enseñar nuestros sagarados misterios”).

3. Macrostructure

3.1 Number of Entries

Important information about the length of dictionaries is given in Smith-Stark (2009) and Fernández Rodríguez (2014).

Table 1. Meso-American Dictionaries (Selection)

The Other Language

Nebrija (1492)
Nebrija (1495)
Alcalá (1505)
Anonymous (1545)
Olmos (1547): ≈1,000 (?)
Molina (1571)
Gilberti (1559)
Ara (ca. 1571)
Lagunas (1574)
Córdova (1578)
Alvarado (1593)
Urbano (1605)
Anon. Tzotzil

Source: Smith-Stark (2009, pp. 25–26).

Fernández Rodríguez (2014, p. 6) published a table which contains the approximate number of entries in Philippine sources. Selected here are only those included in this article’s corpus:

Table 2. Philippine Dictionaries (Selection)

Dictionary                 Spanish         Indigenous Language
San Buenaventura (1613)    16,350 (53%)    14,500 (47%)
Méntrida (1637)            10,900 (38%)    18,000 (62%)
Vivar (ca. 1797)                           8,040 (100%)

Source: Fernández Rodríguez (2014, p. 6).

The Vocabulario da lingoa de Iapam (1603–1604) with 32,800 lemmas is one of the most extensive dictionaries of the corpus of the works analyzed for this article. The trilingual Dictionarium latino-lusitanicum (1595) contains 30,000, and de Rhodes’ Vietnamese-Portuguese-Latin dictionary 8,000, more or less the same as Collado’s Japanese-Latin dictionary.

3.2 Directionality and Proportionality

It has been observed that Nebrija’s two volumes are not simply reversed versions of one another. Bilingual dictionaries can have the same content when they change direction (the type of Spanish ‘montaña’ = English ‘mountain,’ and in the other direction vice versa). Nevertheless, a dictionary could also give in the second case English ‘mountain’ = Spanish ‘monte,’ ‘montaña.’ Palencia’s dictionary describes the entry word in Latin in the left column, and in the right the Spanish translation is given. Consider the entry ‘mountain.’ In the Latin column, information is given related to inflection (mons, montis [noun] of the masculine gender of the third declension), followed by the description in Latin. In the Spanish column, the grammatical information is not repeated. Other (near) synonyms are also given, such as collis, as well as derivations, such as the diminutive monticulus and the derived forms montanus and montuosus.

Palencia left column in Latin

Palencia right column in Spanish

Mons montis. masculini generis tertie declinationis: est altitudo magna que subiectam terram vallem efficit: vel planam respectu sui. cuius partes sunt tumulus: cliuus cacumen vertex: et collis: de quibus loco suo dicitur. Mons. per antiphrasin a motu. quia non mouetur. per diminutionem monticulus: paruus mons: et possessiue: Montanus. na. num. quod in montibus gignitur vel versatur: vel est: vt loca montana: vel homo montanus vel lupus. Et montuosus. sa. sum. Aridum asperum: incultum arduum preruptum

Mons. Es grande altura de monte que faze ser valle la tierra a el subiecta: o que sea llanura a su respecto. Las partes que tiene el monte son otero: cuesta cumbre: çima: e collado: de los quales nombres se dize en su logar. et por el contrario se dize el monte de mouer por que no se mueue. Su diminutiuo es montezillo. Y el possessiuo: es montano: lo que se cria o anda o hay en el monte como logares montanos: o ombre: o lobo. Et llamamos montuoso a lo seco: et aspero: et non labrado: et enhiesto: et roqueño.

Source: Palencia (1490, fol. cclxxxvii–v).

In Nebrija (1492) there is an entry “mons, montis. por el monte,” whereas the reverse edition not only translates Spanish ‘monte’ into Latin as mons, montis but also another word is given, collis (‘hill’). Words are given in the nominative and genitive. There is also an entry ‘montaña,’ which is translated as plural montes montium, together with Nemus, oris (‘wood, or mountainous pasture land’). Nebrija did not put all the entries in the opposite order.

In the Spanish-Nahuatl dictionary of Molina, in the first section, Spanish-Nahuatl, the entry ‘monte’ is translated as tepetl. In the second part, the entry tepetl is rendered as ‘sierra.’ Again, the lexicographer did not use the same material in reverse order. Sometimes this may have caused confusion among learners. In Martínez’s dictionary of Quechua (1604), the translations given in the two sections even give different meanings. Searching for ‘mountain’ in the second part, Spanish-Quechua, one finds the lemmas “Montaña,” Acha acha, and “Monte cerro,” Orco, but the first part has Hacha “Arbol” (“tree”) and Orco “Animal, macho” (not translated as ‘mountain’).

When dictionaries are bilingual and bidirectional, the two sections are seldom equal in length. In Molina’s dictionary, the second part, Nahuatl-Spanish, contains 162 folios and the first part 121. On the other hand, in Basalenque’s bidirectional dictionary the Matlalzinca-Spanish part is less extensive than the Spanish-Matlalzinca. San Buenaventura’s Vocabulario (1613) contains 707 pages, 618 in the first section, Spanish-Tagalog, and 89 in the second, Tagalog-Spanish, in three columns. The first monodirectional Tagalog-Spanish dictionary, composed before 1620 by the Franciscan Francisco de San Antonio, was never printed during the colonial period, but circulated as manuscript (Postma, 2000). Méntrida’s Bocabulario (1637) is also divided in two parts. The first, Spanish-Visayan, contains a hundred pages in two columns; the second part, Visayan-Spanish, is six times bigger. Vivar’s Calepino ylocano o vocabulario de yloco en romance . . . (ca. 1797) and San Antonio’s Vocabulario tagalo are among the few monodirectional vocabularies written in the Philippines. In New Spain, particularly in dictionaries of Mayan languages, bilingual mono-directional dictionaries which start with the indigenous language are common (Ara’s dictionary of Tzeldal).

De Rhodes compiled a trilingual dictionary which is not tridirectional but bidirectional. The first and main section contains the dictionary proper, Vietnamese (called lingua annamitica)–Portuguese–Latin. The book has no page or folio numbers, but each page is divided in two columns, which are numbered separately. This section contains 900 columns, which corresponds to 450 pages or 225 folios. The second section is an index, called Index latini sermonis, where Latin entries are given, followed by a column number referring to the first section. The number of the column is often accompanied by the abbreviations p, m, or f., which mean ‘beginning’ (principium), ‘middle’ (medium, in medio columne), or end (finis); these are used to find the corresponding lemmas more easily. The Index contains 352 columns on 176 pages (not numbered), but on top of each page headers are added with the two initial capitals of the entries. In the index for the entry mons there are the following references: mons, montis (f. 574), mons aliquis terrae in ipso flumine, (m. 693) (‘a certain mountain of the earth in the river’). Mons altus (‘high mountain’) (p. 575) montes, & tesqua (f. 517, p. 570, p. 575) (‘wild regions’), Montes in quibus habitant Barbari, (‘mountains where the Barbarians live’) (f. 541), Monticulus (‘little mountain,’ diminutive of ‘mountain’) (p. 481), and monticulus orizae (‘little mountain where rice is cultivated’) (f. 481). The learner could find the Vietnamese translations on the corresponding pages, and it is obvious that the European models are extended here, since particular mountains (such as ‘mountains of rice’) are included as well: (món gao: ‘montinho de arròs, monticulus orizae’) (f. 481). The trilingual dictionary of de Rhodes does not have a section starting with Portuguese lemmas.
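The reference system of the Index described above can be illustrated with a short sketch in Python. It is only an illustration of the arithmetic, not a reconstruction of de Rhodes’s own practice: the function names are hypothetical, and the sketch assumes two columns per page, two pages per folio, and that the first page of the section is a recto.

```python
import math

# Abbreviations used in the Index latini sermonis, as described in the text.
POSITION = {"p": "beginning", "m": "middle", "f": "end"}


def parse_reference(ref: str):
    """Split an index reference such as 'f. 574' into (position, column)."""
    abbrev, number = ref.replace(".", " ").split()
    return POSITION[abbrev], int(number)


def locate(column: int):
    """Map a column number to (page, folio, side).

    Assumption: two columns per page, two pages per folio,
    and page 1 is a recto (the book itself has no page numbers).
    """
    page = math.ceil(column / 2)
    folio = math.ceil(page / 2)
    side = "recto" if page % 2 == 1 else "verso"
    return page, folio, side


position, column = parse_reference("f. 574")   # the entry 'mons, montis'
print(position, column, locate(column))
print(locate(900))                              # last column of the main section
```

Under these assumptions the last column, 900, falls on page 450 and folio 225, which agrees with the figures given above for the main section.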

Some bilingual dictionaries in Asia are in fact not strictly bilingual. As has been demonstrated, the manuscript Marsh 696 (Vocabularium Hispanico-Sinense, anonymous, but attributed to the Dominican Francisco Díaz) contains a great number of translations from Chinese to Portuguese, and not to Spanish. The Portuguese version of Varo’s Spanish-Chinese dictionary (Borg. cin 420) includes translations into French. The Portuguese-Chinese dictionary of Ricci & Ruggieri contains not only Portuguese but also Italian entries. Such manuscripts circulated among missionaries from different nations, and manuscripts often contain sections written by different scribes (Zwartjes, 2011, p. 286).

For other parts of the world, an example is Leem’s monumental Lexicon Lapponicum bipartitum Danico-Latin-Lapponicum cum Indice latino (1768). As the title demonstrates, the book has two sections; the first starts with Sami (lingua laponica), followed by Danish and Latin translations (1,610 pages), and the second part (1781), Danish-Latin-Sami, contains 512 pages in two columns, to which an index is appended in Latin with corresponding references to the other sections.

Sagard’s dictionary of Huron has only a section starting with French, entitled Les mots françois tournez en Huron, and is in fact a collection of useful phrases, not a dictionary in the strict sense. Larramendi’s Diccionario trilingüe del castellano, bascuence, y latín (1745) puts Spanish first, followed by Basque, and then the Latin is given. In another section, Basque comes first, followed by Spanish and Latin. Similarly, in the Dictionarium of de Rhodes (1651) Vietnamese came first, followed by Portuguese and finally Latin. The Spanish lemma “Monte” is translated into Basque mendia and then as Lat. Mons. The pages are arranged in two columns (1745, p. 97).

Egede’s trilingual dictionary has entries arranged alphabetically in Greenlandic, followed by translations into Norwegian/Danish and then Latin (207 pages in two columns) and another index of Latin entries, with references to page numbers followed by the abbreviation “a” for the left and “b” for the right columns. As in Rhodes, the dictionary also has a Register of Norwegian words, with corresponding page numbers in the main section.

3.3 Organizational Principles

3.3.1 Alphabetical Order

Most dictionaries are organized in alphabetical order. In Nebrija’s Spanish-Latin dictionary, the model of the majority of Hispanic missionary lexicographers, the alphabet is given in the prologue: a, b, c, ç, ch, d, e, f, g, h, i, j, l, ll, m, n, ñ, o, p, r, s, t, u, v. This means that the digraphs <ch> and <ll> are considered as different letters, and the consonants with diacritics <ç> and <ñ> are also arranged separately. More or less, this is the main tendency in most Meso-American dictionaries. For reasons of space, only a few examples will be given here, cited from Smith-Stark (2009, p. 53): “Gilberti distinguishes between the graphs t, tz, ts, and their aspirated counterparts th, thz, ths. However, they are ordered first according to the following vowel and then according to the three points of articulation, but without taking into account aspiration.” The Tarascan-Spanish section is arranged alphabetically as follows: a ca ch c/ç co cu e h y m n o p q s t(h)a t(h)za t(h)e t(h)ze t(h)i t(h)zi t(h)si t(h)o t(h)zo t(h)u t(h)zu v/u x (Gilberti, 1559; for more details see Smith-Stark, 2009). The anonymous Diccionario Grande de la lengua de Michoacan (Anonymous, 1991) decided to consider <tz> a separate letter, which appears after the letter <x>. Basalenque (1642) did exactly the same in his Matlazinca-Spanish dictionary. In Michoacán, apparently a different tradition was developed, different from the Náhuatl-Spanish lexicographical tradition, starting with Molina (1555, 1571), where <tz> appears alphabetically in the chapter of the letter <t>. In the Philippines, some authors created a special chapter for the digraph <ng> and others did not (Fernández Rodríguez, 2014, p. 22).
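What it means to treat <ch>, <ll>, <ç>, and <ñ> as separate letters can be made concrete with a small sketch in Python. The alphabet is the one quoted above from Nebrija’s prologue; everything else (the function names, the sample words, and the decision to place unlisted letters after all listed ones) is an assumption made for the illustration and is not taken from any of the dictionaries discussed.

```python
import unicodedata

# Alphabet as listed in Nebrija's prologue (quoted in the text above).
NEBRIJA_ALPHABET = [
    unicodedata.normalize("NFC", letter)
    for letter in ["a", "b", "c", "ç", "ch", "d", "e", "f", "g", "h", "i", "j",
                   "l", "ll", "m", "n", "ñ", "o", "p", "r", "s", "t", "u", "v"]
]
RANK = {letter: i for i, letter in enumerate(NEBRIJA_ALPHABET)}
DIGRAPHS = {letter for letter in NEBRIJA_ALPHABET if len(letter) == 2}


def letters(word):
    """Split a word into letters, reading <ch> and <ll> as single units."""
    word = unicodedata.normalize("NFC", word.lower())
    result, i = [], 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            result.append(word[i:i + 2])
            i += 2
        else:
            result.append(word[i])
            i += 1
    return result


def sort_key(word):
    # Assumption: letters not in the list sort after all listed letters.
    return [(RANK.get(letter, len(RANK)), letter) for letter in letters(word)]


def sort_words(words):
    return sorted(words, key=sort_key)


# Hypothetical sample words, chosen only to show the effect.
print(sort_words(["chico", "cuna", "çapato"]))
print(sort_words(["llave", "luna", "lodo"]))
```

With this key, every word beginning with <c> precedes every word beginning with <ç>, which in turn precedes every word beginning with <ch>; likewise all words in <l> precede those in <ll>. An ordinary character-by-character sort would instead place ‘chico’ before ‘cuna’ and ‘llave’ before ‘lodo’.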

Alphabetic sequences in tonal languages such as Chinese presented a great challenge for missionaries. Díaz arranged his dictionary alphabetically according to a system called ‘cabecillas’ (‘little heads’). This specific type of vocabulary was also used in other sources, such as the 17th-century dictionary of Ventallol (Klöter, 2007, p. 197), which has the title Cabecillas, o léxico del dialecto de Emuy, o del mandarin. Unfortunately, this work has been lost.

The design of a ‘cabecilla’ is as follows. The lemmas are arranged alphabetically according to the romanization of Chinese, which was quite detailed. Next to the romanized form, the Chinese character is given. The same romanized form in transcription is often accompanied by several different homophonous Chinese characters, each of them translated into Spanish. One example: when <kuei> is written without diacritics, the four letters could represent 58 different unrelated meanings, corresponding to 58 different characters. Written with a macron, Díaz distinguishes 7 different characters, with a grave accent 11, with an acute accent 16, with macron + superscript c 9, with circumflex and superscript c next to the circumflex 6, with a grave accent combined with superscript c 3, and finally, with an acute accent combined with the superscript 8 characters.

Figure 2. Díaz, Ms. Jagiellońska Library of Kraków (ca. 1642, f. 352 ff.).

Not all the words have all the possible combinations of diacritics, but they are always arranged in the same sequence. Rhodes’s dictionary of Vietnamese uses a similar procedure. Vietnamese words written with the same letters with different diacritics corresponding to different tones are always arranged in the same order, as in the entries gai (‘espinhos,’ spina), gài (‘amarrar,’ ligo as [=first and second person sing. indic. present of the verb ligare]), gái (‘femea,’ foemina), gải (‘cocarse,’ frico, as), gây gêm (‘fazer vinagre,’ acerum conficere), gầy (‘magro, desfeito,’ macilentus), gấy, gà gấy (‘o cantar do gallo perto da menhãa,’ galli cantus), and gậy (‘bordão,’ baculus) (de Rhodes, 1651, columns 255–257). In Meso-America, there is no comparably rich system of diacritics for the description of tones in tonal languages, although some grammars were quite advanced in establishing the phonemic value of unknown consonants.
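The principle of ordering first by the letters and then by a fixed sequence of tone marks can be sketched as a two-part sort key. The following Python sketch works on modern Unicode spellings of single syllables, not on de Rhodes’s typography. The sequence unmarked, grave, acute, hook is the one attested in the gai series quoted above, and the dot below follows the acute in the gây series; the relative position of the hook, the tilde, and the dot below is an assumption of the sketch, as are all function names.

```python
import unicodedata

# Rank of each tone mark (Unicode combining characters).
TONE_RANK = {
    "": 0,        # no tone mark, as in gai
    "\u0300": 1,  # grave, as in gài
    "\u0301": 2,  # acute, as in gái
    "\u0309": 3,  # hook above, as in gải
    "\u0303": 4,  # tilde: position assumed, not attested in the quoted entries
    "\u0323": 5,  # dot below, as in gậy: position after hook and tilde assumed
}


def split_tone(syllable):
    """Return (letters without tone mark, tone mark) for a single syllable.

    Vowel-quality marks such as the circumflex stay with the letters.
    """
    tone, base = "", []
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_RANK:
            tone = ch
        else:
            base.append(ch)
    return "".join(base), tone


def tone_key(syllable):
    base, tone = split_tone(syllable)
    return (base, TONE_RANK[tone])


def sort_by_tone(syllables):
    return sorted(syllables, key=tone_key)


words = ["gậy", "gái", "gầy", "gai", "gấy", "gài", "gải"]
print(sort_by_tone(words))
```

Because the letters are compared before the tone rank, all forms of gai are grouped together before all forms of gây, and within each group the tone marks always follow the same sequence, which is the behaviour described for the dictionary.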

3.3.2 Ordering by Parts of Speech

Pedro de Alcalá’s dictionary is based on Nebrija’s Spanish-Latin dictionary, as is stated in the prologue. There is an important difference: Alcalá gives first all the verbs starting with the letter A, then the nouns, then the adverbs, and so on. Olmos’ glossary only contains verbs, and a dictionary of the Mayan language po3om contains a “Vocabulario de nombres que comienzan en romance” and another section entitled “vocabulario de adverbios preposiciones y conjunciones” (Morán & Zúñiga, 1991). There are also grammars called Arte which contain chapters devoted to the eight traditional parts of speech. In particular, the four indeclinable parts, the preposition, adverb, interjection, and conjunction, are often in fact brief alphabetically arranged glossaries. Sometimes the author makes a compilation starting with Spanish (or Latin) prepositions translated into the indigenous language, and in other cases vice versa, as occurs in Basalenque’s Arte of Matlazinca (1640). In this case, some lexicographical material is documented and described outside the dictionary proper.

3.3.3 Ordering by Word Endings

Olmos’ vocabulary only includes verbs, and these are grouped according to their terminations (Smith-Stark, 2009, p. 55). This practice was an exception in Meso-American lexicography.

3.3.4 Semantic Ordering

Although Arenas’ vocabulary was not a missionary dictionary, it is arranged thematically. In the Franciscan tradition of grammars and dictionaries describing Arabic (mainly from Damascus), Bernardino González’s Intérprete follows alphabetical order, the first section Spanish-Arabic according to the Spanish alphabet and the second following the Arabic alphabet, but one would expect the book to start on the opposite side, with all the material in reverse order (seen from a Western perspective). This is not the case. The author opens his work with the alif of the Arabic alphabet in the same place as the Western alphabet would put the A. The book is also arranged according to the root and its derivations, which means, for instance, that the word for ‘key’ (Spanish ‘llave’), in Arabic miftāḥ, is not included under the letter m- of the instrumental prefix (a ‘servile’ letter) but under the first radical, the f- (root f-t-ḥ). One of his pupils, Lucas Cavallero, copied more or less the grammatical treatise of his teacher, but his dictionary is arranged differently; it gathers highly colloquial lemmas of the Syro-Lebanese sedentary dialect and is arranged thematically, starting with the Creator and religious terms, followed by human beings, animals, inanimate objects, etc. This order is also applied in other parts of the world. The anonymous Hindi grammar discussed in Zwartjes (2011) occupies 51 pages of a larger work, in the rest of which (pp. 52–136) word lists are included, arranged thematically. In fact, the anonymous Portuguese work resembles the word lists of Joan Josua Ketelaar (1659–1718), which are arranged according to the order of the Latin reader Ianua linguarum reserata of 1631 by Jan Amos Komensky, known as Comenius (1592–1670) (Zwartjes, 2014, pp. 275–276).

4. Microstructure: Typology of the Lemmas

4.1 Nebrija and Calepino

Smith-Stark (2009) and Fernández Rodríguez (2014) distinguish two entry styles: the first is mainly based on Calepino, the second on Nebrija. The first has a more descriptive, encyclopedic or anecdotal character, the second is more “functional,” with entries with simple equivalences. In the 18th century, when the Real Academia was founded and its dictionary published, lexicographers preferred to take this text as their model. It is obvious that such a dichotomy is a simplification of the facts. For instance, James (in Zwartjes et al., 2009) distinguished ten different entry styles in Tamil lexicographical works.

Molina, Gilberti, Córdova, Alvarado, and Urbano were used as models for some dictionaries in the Philippines (Méntrida; cf. García-Medall, 2009, p. 155). In other parts of the world, particularly outside the Spanish territories, sources other than Nebrija were probably used, but more or less the same dichotomy applies: sources which have mainly simple equivalences, and those which are more descriptive or encyclopedic (“Gebrauchslexikographie” and “Dokumentationslexikographie”; Kühn, 1990). Some examples follow:

Calepino (1502)


Mons terrae tumor altissim. Ab eminendo quasi eminens dictus: propter quod omne quod eminet mons dici potest. Virgili. Praeruptus aquae mons. Sumitur quandoque pro saxo. Idem. . . . Sic aliquando saxum pro monte ut apud eundem Saxi de uertice pastor. Huic Septimontium locus: qui habet septem montes.

Monticulus diminutiuum: paruus mons. Dicitur & Monticellus.

Nebrija 1492

Mons montis. por el monte

Nebrija c. 1495

Monte. mons montis. collis collis

Montaña. montes montium, nemus. Oris

Some examples from Spanish missionary dictionaries according to the style of Nebrija are Molina or González Holguín, where simple equivalents are given without any explanation or contextualization. Ruiz Montoya in his Vocabulario follows the style of Nebrija, and gives an enormous amount of extra information in the Tesoro. Others follow the entry-style of Nebrija and expand the number of entries, often including culturally specific information in their translation, as Varo did in his grammar of Mandarin Chinese:

Santo Tomás (1560) (Quechua)

Monte çacha çapa

Molina (1571) (Nahuatl)

Monte tepetl

Montaña o montañas tepetla. quauhtla

Martínez (1604)

Primera parte:

Hacha Arbol (not translated as ‘mountain’)

Orco Animal, macho (not translated as ‘mountain’)

Montaña. Acha acha.

Monte cerro. Orco

González Holguín (1608) (Quechua)

Montaña hachha hachha

Monte o cerro orcco

Montes juntos orcco orcco

Ruiz Montoya (1639, 1640)

1640 (vocabulario)

Monte. Caá

Monte alto Caa ĭbaté

Monte espeso. Caa anã

Monte grande. Caa guaçû

Monte ralo descombrado. Caá catuobá. Caa guĭpe ĭ

1639 (Tesoro)

(three columns with a large amount of examples, expressions, near-synonyms, collocations, etc.)

Varo (ca. 1680)

Montes de minas de plata [mountains containing silver mines]

Monte [mountain] xān/lîng numeral de montes [numerical counter for mountains]. Têu’

Montes altos

Con cuestas dificiles

Espesos con arboles

Montes continuos

Cordillera de montes

Montes que diuiden la Jurisdiccion [mountains which mark the boundary of a district or jurisdiction]

Monte tres los mas altos dela China [the three highest mountains of China] kuēn; lûn

Montes sinco de fama [the famous five mountains]. gù iŏ

Monte que todo es piedra sin yerba [a mountain which is entirely of stone with no plant]

Monte el mas afamado en la Provincia de xān tūng

One example of an entry which follows Calepino’s style:

San Buenaventura (1613)

Monte) Bondoc (pc) alto y espeso, nasabondoc, esta en el monte, bondoc nang bondoc yaring daã, todo es montes y mas montes este camino, na doon sa paa nang bondoc, alla esta al pie del monte.

Monte) Golor (pp) bajo que bondoc, es alto, indi bondoc natotoca, at golor na lamang, no es monte muy alto sino medianoi, yari ñga ang totoong golor, este es monte mediano propiamnete.

Monte) Gubat (pp) de arboles y espesura, nasa gubat at nagtataga nang bangca, esta en el monte cortando de que haçer vn nauio, nasagubat at hungmaniia nãg calap, esta en el monte arrastrando madera.

Monte) Damu (pc) o espesura del, nasa damut cqingmicqita nang cacalapin sa bahay, esta en el monte buscando madera para haçer vna casa, o para adreçar la hecha.

Monteçillo) Bondochondocan (oo) monte peuqueño, diminutivo de bondoc.

Examples of functional approaches, with simple equivalents in other parts of the world:

Xavier (?) (Portuguese-Persian-Hindi)

Monte grande—pahâr, paârhât—qou, qûh

Monte pequeno—du’gar, ttihâ

Monte de terra ma’timento—đđer—toudâ, ga’ġ


Maunoir (1659) (Breton)

First section Français-Breton

Mont menez

Second section:

Menez montagne

González (ca. 1707) (Arabic)

First section:

Montaña جبال جبل‎ (singular and plural are given)

Second section

Monte جبال جبل‎ (sing + plur.)

Da Lecce (1702) (Albanian)

Montagna—małł., i it.

Monte erto—małłi in alt

Cima del monte—maicua e małłit

Costa del monte, Piede del monte, etc.

An example of an entry with a more encyclopedic character is Egede’s dictionary of Greenlandic. As one would expect, the word ‘iceberg’ is included:

Egede (1750) (Greenlandic)

Second section (Latin index):

Mons, 59a, glacialis, 41b. mari imminens, 64b, montem conscendit speculandi causa, 126a montes tumidi, editi, 187a, montium cacumina, 45b.

First section (Greenlandic–Norwegian/Danish–Latin)

Kakkak, Bierg, mons

Kakkársoak, et stort Bierg, mons magnus

Kakkáliak, et gjort Bierg, Taarn, mons artificialis, turris

Kakkîpok S.s.s. 1. Gaaer op paa et Bierg, montem conscendit, 2. Lagt paa Land in terram expositus, . . .

Illuliak et Iis=Bierg, mons glacialis, glacies in montis altitudinem acervata

Sagard’s approach is different. There is no entry for ‘montagne,’ but the reader can find it under the lemma Monter, descender, where it is included together with examples which he considered useful for conversation. The result is a coalescence of two genres, in which the Renaissance colloquia are merged into the dictionary:

Sagard, 1632 (Huron)

Monter descender


Quieinontou te



Le monte, il monte la montagne


Ie monte en haut. 3 per.

Aratan achahouy

N. Sçais-tu bien monter? Y monteras-tu bien?

N. Chieinhouy daaratan.

Les ames de Husrons ne sçauroient monter.

Téhouaton atiskein déhouandate haraten.

[more entries follow related to the antonym ‘descendre.’]

4.2 Use of Non-Western Writing Systems

Missionaries often used foreign scripts when available, but not always. There seems to have been no general policy among missionaries of the different orders and national traditions. Pedro de Alcalá worked with transliterations of Arabic, whereas others also used Arabic script. Xavier transcribed Persian and Hindustani, as in the anonymous grammar of Hindustani published by the Propaganda Fide. González used Arabic script in his dictionary of Arabic arranged according to the letter order of the Arabic alphabet. Antão de Proença used Tamil script, and the Tamil entries are arranged according to the Portuguese letter order, just as if they were Portuguese entries (“como se estiuesse nas lusitanas”). The Chinese dictionaries of the Dominican Francisco Díaz use Chinese characters, but they are arranged alphabetically according to their romanizations. Francisco Varo, also a Dominican, used only romanizations for Chinese, as in the Japanese-Latin-Portuguese dictionaries of the Jesuits. Local scripts were sometimes used in the Philippines (baybayin), but others were against their use. For reasons of space it will not be possible to give a comprehensive overview, but it has to be mentioned that missionary sources often contain invaluable information about phonology when the Roman alphabet is used, in cases when other alphabets or writing systems do not supply the necessary details in order to be able to reconstruct the status quo of the languages or varieties of that time. For this reason, missionary documentation of spoken Chinese, Japanese, Arabic, Persian, and the like deserves to be studied.

4.3 Markedness in Lemmatization

4.3.1 Grammatical Information, Morphological, Derivational Information, Parts of Speech, Citation Forms


Figure 3. Nebrija (1492, Prologue).

As has been seen in previous examples, grammatical information can be added, and the degree to which such information is included is far from consistent. It is often sporadic, as in the case of Nebrija (Smith-Stark, 2009, p. 43), but in other cases it is more systematic. The citation forms in Nebrija are the first person singular of the Latin verbs of the present tense indicative; in the Latin-Spanish section, the verb dare (‘to give’) has the form (do) followed by the second person, with only its ending (as), and the first person singular of the perfect (dedi), as in sum, es, fui, which is frequently used as the title of chapters in the grammars where equivalents of the verb ‘to be’ are discussed. For (substantive) nouns, Nebrija gives the nominative and the genitive singular (Claritudo, inis), and for the (adjective) noun the nominative singular of the masculine gender, followed by feminine and neuter. In addition to this information related to the inflectional pattern of the entries in question, an elaborate series of abbreviations is used (see Figure 3).

For learners of Arabic, for instance, the plural of the nouns is difficult to predict, and for that reason, the plural form of the nouns is systematically given in Arabic grammars:

Alcalá (1505) Arabic (colloquial)

Monte gébbel [sic] gibĭl

Montaña gébel gibĭ

González (ca. 1707) (Arabic)

Montaña جبال جبل‎ (singular and plural are given)

As has been seen, Alcalá’s dictionary is arranged alphabetically, according to parts of speech. This also occurs in Europe, where a comparable approach is found in the bilingual English-French work Lesclaircissement de la langue francoyse (Palsgrave, 1530). Alcalá always gives three forms of the verb: the first person of the present tense, the “preterito perfecto,” and finally the imperative, as in:

Alcalá (1505) Arabic (colloquial)

Abaxar nahbát habátt ahbát

Abaxar algo nihappát happátt happát

In Spanish grammars of Amerindian languages, information is also given regarding the citation forms of the entries. Santo Thomás (1560) gives the first person singular of the present tense of the indicative. All the other forms are derived from this base form. Santo Thomás also explains that he does the same in Quechua as Nebrija did for Latin: after the first person, the second person is given, but only the last syllable is given, not the complete word with its inflectional ending. In the 16th century, the isolation of the smallest grammatical or functional unit—particularly since the concept of the morpheme did not yet exist—was far from advanced. In Latin grammars written in Europe, there are entries such as do, as (first person singular indicative present of the Latin verb dare ‘to give,’ and secondly, only the ending -as for the second person, omitting the d- from the stem). Latin dictionaries in Europe also give nouns together with the genitive singular, but the exact morphemes are not always consistently divided (the types ars, tis, actor, oris), as occurs in grammars of Greek, where usually the infinitive is not given. Many Amerindian languages did not have infinitives in the Greco-Latin sense, but only use certain nominalizations of verbal roots. In Molina, unsystematic details about parts of speech are given, as in Nebrija:

Presto, aduerbio, yciuhca, çaniciuhca

Poder nombre, uelitiliztli, ueliyotl, uelitilizçotl

Poder verbo. niueliti (examples cited from Smith-Stark 2009, p. 43).

Molina’s dictionary contains attempts to separate the root from its prefixes, and after the entries, the prefixes are given, followed by the Spanish translation and the Nahuatl form in the preterite tense:

Pia. nino. ‘guardarse de algo’ pret. oninopix.

Pia. Nite. ‘guardar a otro,’ prete. onitepix

Pia. Nitla. ‘guardar alguna cosa’ preter. onitlapix.

Molina’s second part, Nahuatl-Spanish, does not generally include separate bound morphs as independent entries (such as tla-, te- ni-). As is the case for the learner of Greek, who must know Greek morphology in order to understand the structure of the form elelukè, the learner of Nahuatl had to learn morphology first from the Arte in order to be able to have a good insight into the internal structure of Nahuatl words. There is no doubt that the study of Semitic languages in Europe gave a new stimulus to the study and analysis of the internal structure of words (Reuchlin, 1506). Priscian’s concept of dictio (the dictionarium actually means that it is a compilation of dictiones) is defined as pars minima orationis constructae. The root in Semitic languages is something different (radix). The root is formed by ‘radical letters’ (littera radicalis, or truncalis) containing a semanteme, to which ‘servile letters’ can be added. In one of the earliest grammars of Hebrew published in Europe, there are attempts to analyze the internal structure of words such as inhonorificabilitudines; another step forward in Hebrew grammars was the distinction between pronomina separata and pronomina affixa. Some missionaries had a certain knowledge of Hebrew and could advance in the study of the internal structure of the word. Lagunas (1574) is one example, the author of a grammar and dictionaries of Tarascan. His dictionary is arranged according to the roots, as in dictionaries of Hebrew or Arabic.

On the other hand, some dictionaries exhibit admirable and fascinating approaches of missionary linguists who attempted to solve the problem of the best way to handle roots, affixes, and entry words. Basalenque’s (1642) Matlazinca-Spanish dictionary is arranged according to the roots, preceded by columns on the left side of the roots indicating which prefixes can be combined with them. Lagunas explains in his prologue that he marks the root with the symbol of a cross (‘cruz’). After the cross, the lemma starts with the root in Tarascan, followed by its translation. Next, more complicated words are given, and Lagunas gives in the margin, in italics, the several affixes which he discusses separately (he called them ‘interposiciones,’ but there are also ‘particles’ or ‘prepositions’). One example (Lagunas, 1574, p. 10) is the word andambezcanitomines. The word has the root Andà, which can be found in alphabetical order, marked by a cross. It means, according to Lagunas, ‘llegar, vencer, o cumplir algo de veras, o de burla.’ Then the word andambezcanitomines is analyzed, followed by other constructions, such as Andacazcani, andahcazingani, and so on.


Figure 4. Lagunas (1574, p. 10).

The forms given in the margin correspond to a section of the grammar entitled “De las interposiciones que ya comiençan por su orden Alphabetico,” which is in fact a glossary of interpositions. In that section, the user can search for Bèz and on fol. 144 he will find that it means ‘hazer alguna cosa de burla.’

The second, no less admirable source is the Tesoro of Ruiz de Montoya, where the several particles and affixes are included in the main body of the dictionary. When the same lemma has different meanings, these are numbered in the margins. The first entry of the Tesoro is A-, which has 11 different meanings and functions: as bound morpheme (in the terminology of Ruiz de Montoya, ‘en composición’), as pronominal prefix for the first person in verbal paradigms (‘nota de la primera persona con los verbos’), etc.

Lexicography was not yet particularly advanced in Europe. In Dutch dictionaries, we can find such lemmas as gesteente (‘stone’ as a generic noun) and steen, but the bound circumfixes ge- -te are not included as separate lemmas; the case is similar for German. Spanish dictionaries contain, for instance, asemejar, but never the prefix a- as a separate lemma. Nebrija does not include the clitic personal pronouns me, te, etc., as separate lemmas. Bound morphemes are not used as headwords (which is in fact a misnomer in this context) in Spanish dictionaries of this period. Some missionaries reached important milestones in the history of linguistics in their attempt to divide the traditional concept of dictio into smaller units, and even created names for them, such as the pronombres conjugativos (in Meso-American grammars, used for pronominal prefixes) and interposiciones. The latter were so important, according to Lagunas, that it would make sense to recognize them as the ninth part of speech. Another problem is that the word ‘dictionary’ is in fact also a misnomer when polysynthetic languages are concerned. Words can be used with the absolutive ending, as in dictionaries, but a word in Nahuatl is in fact always a phrase. As Andrews (2003) demonstrates, Molina’s translation of Spanish monacordio (‘clavichord’) into Nahuatl is petlacalmecaueuetl (1571, I: 86r and also in II: 81r). This “word” is in fact a complete phrase, meaning literally “it is an upright drum with strings that has the form of a wickerwork coffer.”

The root was also a basic concept for Philippine grammars and dictionaries. In the first printed dictionary of Tagalog, San Buenaventura observes that he will explain both the isolate Tagalog roots (which can stand by themselves) and the “bound roots.” San Buenaventura also includes abbreviations in the entries, which contain grammatical information, parts of speech, and other phenomena (R. Reçiprocos, A. Adjetivos, 1 ac., Ad. adjetivo, Abs. abstracto, Comp. comparativo, Sup superlativo, Adver. Adversivo, etc.). A novelty is the Latin abbreviation ff meaning facere facere (in his text façere façere) for causatives. Another novelty in the Philippines is the system of numbers for verbal derivations, developed, according to San Buenaventura, by Juan Oliver, whose grammar and dictionaries have been lost. The numbering of these verbal derivations was used in many other sources from the Philippines.

In the entries of San Buenaventura, not only grammatical categories, parts of speech, and derivation patterns are marked, but also pronunciation. He uses the same abbreviations as Calepino in order to mark stress on the final syllable or the penultimate. The abbreviations used are (pp) (‘acento breve’ or penultima producta) and (pc) (‘acento largo,’ penultima correpta or corrupta); they are not indicated sporadically but in all the 16,350 entries, as occurs later in most other Philippine dictionaries. San Buenaventura also marks compound forms with the abbreviations duo dic. (‘two dictions’) and tres dic. (‘three dictions’) (for more details, see García-Medall, 2009, pp. 101–102). However, the internal structure of the word, distinguishing roots and affixes, was not made visible in his dictionary, and the reader has to consult the grammar of Blancas de San José, as the author explains in his prologue, for a better understanding of these morphological processes, such as the root ganda (‘beauty’) from which gumanda is derived (‘to become beautiful’). The perfective non-realis form bumilí is derived from the root bilí (‘to buy’ or ‘to sell,’ depending on the perspective of the buyer/seller). With the infix -um- it means ‘to buy,’ but with the prefix mag- (magbilí) the derived form means ‘to sell.’ In Tagalog the so-called “Actor voice,” “Patient voice,” “Locative voice,” “Conveyance voice,” and “Undergoer voice” are distinguished morphologically, and all the necessary affixes are described in the Arte. It may be concluded that the two works were complementary. Together with the texts (such as the catechism), the dictionary and grammar formed the central written tools of language learning, which went hand in hand with oral explanations, whenever possible with the participation of native speakers. Shortcomings in dictionaries or grammars are often compensated for and taught in other works or at other moments during language instruction.

Derivation in antiquity was seen as a movement from one part of speech to another (nouns from verbs, verbs from nouns, etc.), but in the Philippines, missionary grammarians and lexicographers regarded all parts of speech as derivations from “roots.” Although they were not yet as refined as modern analysis, where distinctions are made between simple bases, derived bases, compound bases, and phrasal bases (Schachter & Otanes, 1972, pp. 355–356), the efforts of the Philippine priests represent a great step forward in the history of lexicography. San Buenaventura marks “voices” following the method of Oliver, and after the root the numbers are marked. There are several lemmas ‘Matar’ (patay: ‘to kill something’; matay ‘to kill someone (else)’) and several derivations, such as nanatay 3. ac (the abbreviation indicates the third derived form of the active, ‘tercera activa,’ which is construed by prefixation of nana-/nama-/namo-/nanging). Nagpapacamatay is also given in the same entry, which is the fifth active, prefixing nagpapa. The system of Oliver, described also in other sources such as Blancas de San José, has nine active forms, nine passive forms, ten passive forms of facere facere, etc. (for more details, see García-Medall, 2009, pp. 74–80). Usually, the reverse Tagalog-Spanish dictionaries do not contain a “morph to morph” analysis of such derived forms, as occurs in Molina’s dictionary of Nahuatl.

Outside the Spanish territories grammatical marking is also often found in entries. For example, from Sagard (1632, p. 8): “Pour le temps present iáy mis un pnt, pour le preterit un pt. & pour le future un fu. Pour les personnes, il y a pour la première un 1. Pour la seconde un 2. & pour la troisiesme un 3. & per significe personne, & le singulier & plurier par S.P. & les genres masculine & femenin par M. & F.”

4.3.2 Other Cases of Markedness

Markedness in lexicography means labeled information. Several types are distinguished, including diachronic, diatopic, diaintegrative, diaevaluative, dianormative, diaphasic, diastratic, diatechnic, and diatextual (see Hartmann & James, 1998). In premodern missionary lexicography, there is generally not a systematic approach, but some sources use one or several of these types. Nebrija explains in his prologue what he prefers to mark in the entries. In Figure 3, apart from the abbreviations related to grammar and morphology, Nebrija also uses other abbreviations, such as os., pr., no., b., po., and ra: os. is used for words labeled as ‘oscos y opicos’ (Oscan, an Italian tribe in antiquity), which is a case of marking language contact phenomena (diachronic); pr. stands for prisca verba (‘belonging to former times’); no. (nova) indicates neologisms; b. ‘barbarous’ (diaevaluative markedness); po. stands for ‘poetic’ (diatextual); and ra. ‘quo raranter utendum est’ (‘seldom used,’ i.e., diafrequential markedness). Finally, lemmas can be marked as probata (‘approved,’ which means words which were used during the two centuries after the birth of Cicero). According to Nebrija, such words belong to a prestigious register, ‘appreciated and approved, for reasons; even when they are seldom used, they can be used, showing respect to the authors of antiquity’ (“por onra del antiguedad”) (diachronic, diafrequential, and diaevaluative markedness). Missionary linguists-lexicographers usually do not mark all these features using abbreviations. Nevertheless, there is often information about frequency of use, style, etymology, dialect, sociolect, etc. Diaevaluative information is sometimes included (for instance, when insults are translated). Such comments on the discussed lemmas can allude to appreciative, derogatory, or offensive meaning, among other things. In the following paragraphs, some illustrative examples are given.

San Buenaventura marks loanwords with the letter C., which means “corrupt Castilian, pronounced by the Tagalans in a horrible way” (“significa que el vocablo es castellano y que lean ya corrompido los Tagalos a su modo horruno”). In other dictionaries, loans are included, but not marked as such. Particularly in the section Spanish–indigenous language, a Castilian word is often translated as “idem,” which means that the same Castilian word is used in the indigenous language. Guerra’s dictionary, which is included in his Arte with the title “Copia de los verbos, nombres, adverbios y de los significados que cada qual de ellos tiene,” is a compilation of frequently used words from a register he calls ‘mexicano adulterado’ (‘adulterated Nahuatl’), opposed to ‘mero mexicano’ (‘real Nahuatl’) as it is spoken in the central valley, far away from Guadalajara, where he preached and learned a regional variety of the language. Molina does not mark “corrupt Nahuatl,” but sporadically he marks metaphors (diaconnotative markedness), as in the case of the entry (1571, II, p. 96) “Teixtenacaz. Embaxador, mensagero Metaph.” San Buenaventura also marks metaphors, as he explains in his prologue. This practice is much more developed in Spanish and Portuguese grammars written in Asia than in those written in the New World. When Molina gives several translations or near synonyms, one often has to guess whether the alternative word is a local equivalent (regional variety) or a word which belongs to a different style or register.

Diastratical and diaphasical markedness occur, but less frequently than diatopical varieties, which occur in almost every dictionary, although they are usually not marked with abbreviations. San Buenaventura (1613) is an exception: in his dictionary of Tagalog, abbreviations are used marking diatopical varieties: M. (Manila), L. (Laguna), T. (Tinguian, and even more precisely, “between Nacarlan and the coast”), and S. (Silanga).

Anonymous (1603–1604) is a huge compilation of contemporary ethnographic knowledge in Japan, and it includes several technical terms from Buddhism and Japanese literature. Like Rodrigues’ grammar, it provides the reader with detailed information about regional varieties, particularly Ximo (Kyushu) and Cami (Kyoto) speech, but also varieties from other kingdoms, such as the speech of the kingdom of Vouari or Owari (Zwartjes, 2014). Women’s or children’s words are also labeled as such, and words related to Japanese religious practices are labeled as B (Buppo). In order to accommodate differences in language usage, multiple equivalents are sometimes given.

Diatechnical varieties are usually not marked by abbreviations. It may be surprising that technical terms related to religion, the church, and its institutions are not always included, even when in Europe they were sometimes included. Cardoso’s dictionary has the subtitle Ecclesiasticarum etiam vocabulorum interpretatione, and even specialized dictionaries appeared, such as the Vocabularium ecclesiasticum of Rodrigo Fernández de Santaella, which was frequently reprinted due to its great success. Some lexicographers followed Nebrija too closely, and did not expand the lemmas, adapting the secular model to the specific needs related to the profession of priests. As was demonstrated in Zwartjes (2016), many important Christian concepts are not yet included in the first section of Molina’s dictionary. Words are still demonstrated, but on the other hand the prologue of González Holguín’s dictionary of Quechua states that the Indians do not have words for spirituality, virtues and vices, or the next world, and he decided to include them in his dictionary, although he does not mark these ‘technical’ terms using abbreviations.

5. The Content of the Dictionaries: Eurocentrist or “Intercultural Lexicography”?

As has been seen, missionary dictionaries are far from uniform. In the more encyclopedia-like dictionaries, there are not only many Western concepts which are translated into other languages, but also the other way around.

5.1 Dominance of Western Models: Eurocentric Approaches

As Hamann (2015, pp. 86–91) demonstrates, Westerners often compared Amerindian gods and goddesses with Greek and Roman analogies; for example, Huitzilopochtli was “another Mars.”1 Hamann (2015) shows that lexicographers in New Spain used as their model Nebrija’s Castilian-Latin dictionary, which contains a great number of entries for “diuino” (‘divine’).2 Juan de Córdova’s (1578) Spanish-Zapotec dictionary contains the same entries, and follows Nebrija strictly. All translations start with the word colanij, which is combined with words for stars, fire, water, etc. The consequence is that learners from the West would find equivalents in Zapotec which were in fact divinatory techniques of the ancient Romans, not those from the indigenous cultures themselves.

It is easy to find other examples of irrelevant concepts that were not removed, or of terms that one would expect to be included but are not recorded at all, since lexicographers often followed their models strictly. A clear example is the entry for “mezquita” (‘mosque’) in Nebrija, also included in Molina’s dictionary (1571), which even contains Nahuatl translations: Mahomacalli (lit. “it is the house of Mohammed”) and Mahomatlatlatlauhtilizcalli (polysynthetic construct combining the root “Mahoma” with tlatlatlauhtiliztli [‘prayer’] and calli [‘house’]).3

5.2 Missionary Dictionaries as Documentation of Foreign Cultures

While the Eurocentric approach of Western missionaries has often been criticized, less attention has traditionally been paid to the missionaries’ attempts to immerse themselves in the cultures of “the other.” Historians, cultural anthropologists, and scholars of literature study indigenous oral narrative texts, native people’s integration into national society, or post/colonial literature in terms of the creation of “the other.”

In most missionary dictionaries of Asian languages, for instance, there are huge compilations and studies of concepts from non-Western cultures, philosophy, religion, Buddhism, local habits, administration, history, geography, medicine, flora, fauna, etc. Comprehensive studies bringing together recent research on semantics are still among the most salient desiderata of the historiography of missionary lexicography.

Many lexicographers documented African, Amerindian, and Asian religions, lifestyles, hunting or fishing manners, clothing, etc., which makes some of them historians, sociologists, and anthropologists avant la lettre. Examples are special lemmas such as those for icebergs and for mountains where rice is cultivated.

6. Final Remarks and Future Research Opportunities

It is obvious that many topics are not discussed in this overview, including the importance of the design of the entries and the dictionaries as a whole. Together with the content, the presentation and “learnability” of these works and their entries were important, and the corpus shows many differences between them. There are many compilations, word-lists, etc., less advanced than those featured here. So some dictionaries may contribute much less to the history of missionary lexicography than others. Nevertheless, such texts often contain much invaluable information regarding the history of the languages, many of them extinct or endangered, and in other cases, these dictionaries contain crucial information related to varieties of global languages, such as Arabic or Chinese.

(1.) De Rhodes’ dictionary includes comparisons with “demons” in Antiquity, such as in “Tres doemones [sic] quos Ethnici superstitiosè colunt putantes primum cælo, secundum terræ, & tertium mari dominari vt antiquitus Ethnici nostrates vocabant Iouem Plutonem & Neptunum” (606).

(2.) “Por estrellas, por la tierra, por el agua, por el ayre, por el fuego, por bacines, por las aves, por los sacrificios, por cuerpo muerto, por las asaduras, por la cara, por las manos, etc.” is translated as “mathematicus, geomanticus, hydromanticus, aeromanticus, pyromanticus, engastromantes, augur, auspex and haruspex, necromantia. Extipex. Metoposcopus, chiromanticus, etc.” (Nebrija, 1492, f. xliii–v).

(3.) Gilberti’s Spanish-P’urhépecha (Tarascan) dictionary contains an entry for “mezquita” (1559, f. 120v) translated as diabloeueri quahtaqueri (“house of the devil”; quahta = “casa”, f. 36v). Urbano’s trilingual dictionary (1605) has the same entry, since his dictionary is in fact another version of Molina’s bilingual dictionary complemented by translations into Otomi. The entry for “mosque” (“mezquita”) is maintained with the translation into Otomi: naxæcãmbemcangṹ. magṹnquexæcambenimahoma (f. 292v), which is a calque of the complex Nahuatl translation by Molina. In Juan de Córdova’s Spanish-Zapotec dictionary (1578, f. 267v), there is also an entry for “mezquita” translated as lìchi pezè láo (the word for “house” [“casa”] is lìchi [“casa donde moramos”] [1578, f. 74r], and the word for “devil” [“diablo”] is pezèlàotào [“diabolica cosa” is xitèni pezèlào] [1578, f. 139r]).