date: 22 March 2018

Tupian Languages

Summary and Keywords

“Tupian” is a common term applied by linguists to a linguistic stock of seven families spread across great parts of South America. Tupian languages share a large number of structural and morphological similarities which make a genetic relationship very probable. Four families (Arikém, Mondé, Tuparí, and Ramarama-Poruborá) are still limited to the Madeira-Guaporé region in Brazil, considered by some scholars to be the Tupí homeland. Other families and branches would have migrated, in ancient times, down the Amazon (Mundurukú, Mawé) and up the Xingú River (Juruna, Awetí). Only the Tupí-Guaraní branch, which makes up about 40 living languages, mainly spread to the south.

Two Tupí-Guaraní languages played an important part in the Portuguese and Spanish colonization of South America: Tupinambá on the Brazilian coast and Guaraní in colonial Paraguay. In the early 21st century, Guaraní is spoken by more than six million non-Indian people in Paraguay and in adjacent parts of Argentina and Brazil.

Tupí-Guaraní (TG) is an artificial term used by linguists to denominate the family composed of eight subgroups of languages, one of them being the Guaraní subgroup and another the extinct Tupinambá and its varieties.

Important phonological characteristics of Tupian languages are nasality and the occurrence of a high central vowel /ɨ/, a glottal stop /ʔ/, and final consonants, especially plosives in coda position. Nasality seems to be a common characteristic of all branches of the family. Most of them show phenomena such as nasal harmony, also called nasal assimilation or regressive nasalization by some scholars.

Tupian languages have a rich morphology expressed mainly by suffixes and prefixes, though particles are also important to express grammatical categories. Verbal morphology is characterized by generally rich devices of valence-changing formations. Relational inflection is one of the most striking phenomena of TG nominal phrases. It allows marking the determination of a noun by a preceding adjunct, its syntactical transformation into a nominal predicate, or the absence of any relation. Relational inflection also occurs, in part, in branches and families other than Tupí-Guaraní. Verbal person marking is realized by prefixing in most languages; some languages of the Tuparí and Juruna families, however, use only free pronouns.

Tupian syntax is based on the predication of both verbs and nouns. Subordinate clauses, such as relative clauses, are produced by nominalization, while adverbial clauses are formed by specific particles or postpositions on the predicate. Traditional word order is SOV.

Keywords: classification, historical linguistics, reconstruction, colonial period, nasality, syntax, voice, nominalization

1. Classification of Tupian Languages

Figure 1. Gijn, Galucio, and Nogueira (2015, p. 301, by Eriksen). Fig. 1 does not include the presence of Paraguayan Guaraní in Paraguay and adjacent areas.

Permission has been given by Jimena Beltrão, editor of the Boletim do Museu Paraense Emilio Goeldi – Ciências Humanas (Belém, Brazil) (February 16, 2017).

“Tupian” is a common term applied by linguists to a stock of ten language families (Rodrigues, 1986, 1999), spread across great parts of South America, from Rondonia in northwestern Brazil to French Guiana, from the Rio Negro in Northern Amazonia to the Iquitos region of Eastern Peru, to the whole of Eastern Bolivia, Northern Argentina, Paraguay, and Southern Brazil. More recent classifications (Galucio et al., 2015) reduce Rodrigues’ list of ten families to seven, making a single family from Ramarama and Poruborá as well as from Mawé, Awetí, and Tupí-Guaraní. Tupian languages share a large number of structural and morphological similarities which make a genetic relationship very probable. Four families (Arikém, Mondé, Tuparí, and Ramarama-Poruborá) are still limited to the Madeira-Guaporé region in Brazil, considered by some scholars to be the Tupí homeland. Others would have migrated, in ancient times, down the Amazon (Mundurukú, Mawé) and up the Xingú River (Juruna, Awetí). Only the Tupí-Guaraní branch, which makes up about 40 living languages, spread to the wide regions mentioned above.

The first reliable classification of the whole Tupian family is Rodrigues (1964), improved by Rodrigues (1986) and Rodrigues and Cabral (2012, pp. 496–499). For this traditional classification, see Figure 2.

Figure 2. By courtesy of Galucio et al. (2015, p. 230).

Permission has been given by Jimena Beltrão, editor of the Boletim do Museu Paraense Emilio Goeldi – Ciências Humanas (Belém, Brazil) (February 16, 2017).

Figure 2 does not respect the geographical and typological distribution of the whole stock into two branches, a Western branch with the Ramarama-Poruborá, Mondé, Tuparí, and Arikém families, all located in Rondonia, Brazil, and an Eastern branch with the Juruna, Mundurukú, and Mawetí-Guaraní families (some scholars prefer the terminology of one big family and ten or only seven subfamilies). The existence of Western and Eastern branches is not confirmed by a subsequent classification.

More recent is the classification by Galucio et al. (2015). Applying computational methods, the authors produce a classification of the stock based on lexical distance. The classification confirms the classifications based on the traditional comparative method, relying on phonological criteria. However, the recent classification shows a highly interesting hierarchy of families and branches. The most clearly identified families are Mondé, Tuparí, and Mawetí-Awetí-Tupí-Guaraní or Mawetí-Guaraní (see section 1.2). It also evidences the rather short distance between Karo, Poruborá, and Karitiana (Arikém). Tupí-Guaraní languages form a unit which is separated from Awetí by nearly the same distance as Awetí from Mawé, but the distance of the whole family from Mondé on the one hand and Juruna on the other is much greater (Galucio et al., 2015, pp. 239–241).

1.1 Tupian Families

Several families and subfamilies of the stock only have one still extant member. In the cases of Poruborá, Mawé, and Awetí, the existence of other, possibly extinct, members is unknown. In some cases, the languages from which the family names were derived are extinct: in the Ramarama branch only Karo survived, while Ramarama (Itogapúk or Ntogapíd) and Urumí disappeared; in the Arikém family only Karitiana survived, while Arikém and Kabixiana became extinct. Within the Mundurukú family, Kuruaya is moribund, and only Mundurukú is still spoken. The classification of Mondé is complex, all the more because its details differ from one classification to the other. In all of them, however, Salamay on the one hand and Suruí and Cinta-Larga on the other are the most divergent languages from the rest of the family. The Tuparí family consists of Makurap, Tuparí, Wayoró, Akuntsú, and Mekéns; the former members Kepkiriwát and Waratégaya are extinct.

All the members of the Eastern branch have moved down the Madeira and the Amazon by historical migrations (Rodrigues, 2000; Jolkesky, 2016, p. 197). The Juruna family entered the valley of the Xingú River and occupied the Middle and Upper Xingú. Only Juruna survived, while Manitsawá and Xipaya died out.

1.2 Mawetí-Guaraní Family

Mawé or Sateré-Mawé, near the Southern banks of the middle Amazon, and Awetí, on the upper Xingú, are two single members of separate branches which are the closest to the large Tupí-Guaraní subfamily. In recent years, several attempts have been made to show the common features of the three branches and even to suggest a Mawé-Awetí-Tupí-Guaraní or Mawetí-Guaraní family (Meira & Drude, 2015; Corrêa-da-Silva, 2010). Similarities are found not only in phonology but also in morphosyntax (person marking, relational inflection, etc.; Rodrigues & Dietrich, 1997; Drude, 2006).

1.3 Tupí-Guaraní Branch

Tupí-Guaraní (TG) is the largest and most widespread branch within the Tupian stock. Members of this branch first came into contact with Europeans in the 16th and 17th centuries (see section 2). TG is the most studied and best known of the Tupian branches, though many of its members were thoroughly analyzed by linguists only in the last three decades, not much earlier than languages of the other families. There is not yet a fully satisfactory classification of TG languages. The standard classification, based on historical phonology, is that of Rodrigues (1964) and Rodrigues and Cabral (2002). Dietrich’s attempt to combine phonological and morphosyntactic criteria (Dietrich, 1990) led to a more differentiated but also more complicated picture. More recent studies including lexical criteria (Michael et al., 2015) give interesting results that broadly agree with Rodrigues and Cabral (2002).

Figure 3. Michael et al. (2015, p. 204). The editor of LIAMES (Campinas, Brazil) gave his permission in an e-mail (February 14, 2017).

Fig. 3 shows a clear Guaranian group with Paraguayan Guaraní, Mbyá, and Aché on the one hand and Kaiowá, Xetá, and Avá-Ñandeva on the other, while Bolivian Chiriguano (today Western Guaraní) and Tapiete form another subgroup within the Guaranian group. This perfectly corresponds with subgroup 1 of Rodrigues and Cabral (2012, pp. 498–499). What is called “Southern” in Fig. 3 corresponds with Rodrigues’s subgroup 2, Guarayu and Guarasu (Warázu or Pauserna) versus Siriono, Yuki, and the extinct Jorá. The “Diasporic” group, which includes Tupinambá, Omagua, and Kokama in Fig. 3, partially corresponds with subgroup 3 of Rodrigues and Cabral (2012), which also includes historical Língua Geral Amazônica and its modern follower Nheengatú (see section 2.1). On the other hand, Rodrigues and Cabral (2012) exclude Omagua and Kokama, because these two languages are considered not to be of Tupí-Guaraní origin but created by language contact. “Nuclear TG” of Fig. 3 is justified by the central importance of languages like Tocantins Asuriní, Parakanã, Tapirapé, and Suruí Aikewára (not considered in Fig. 3), but also Tembé, its dialect Guajajara, and Avá-Canoeiro, set a little apart in Fig. 3. They form subgroup 4 of Rodrigues and Cabral (2012). However, “Nuclear TG” also comprises Xingú Asuriní, Araweté, and Anambé, which form subgroup 5 in Rodrigues and Cabral (2012), though it also includes Kayabí and Parintintin. The latter in Fig. 3 stands for the whole subgroup 6 of Rodrigues and Cabral (2012): Parintintin, Uru-eu-wau-wau, Amondawa, and the now extinct Apiaká form the Kawahib cluster. Subgroup 7 of Rodrigues and Cabral consists of the rather independent Kamaiurá (Upper Xingú region), a TG language par excellence in Fig. 3. Last but not least, there are Guajá and Ka’apor as “Nuclear TG” languages in Fig. 3, but also Wayampi (Wayãpi) and Emerillon, two languages of French Guiana. They are marked as “Peripheral” in Fig. 3, but are associated in subgroup 8 of Rodrigues and Cabral (2012), together with Zo’é, a language discovered only in the 1990s, not considered in Fig. 3.

1.4 Language Maintenance, Loss, and Revitalization

In order to round out the list of Tupian languages of sections 1.1–1.3, it seems useful to give examples of languages that have been mentioned in former times and have been classified in our models, or have become extinct more recently. In this context, the study of Tupian languages shows once more that a language is not an established unit, but may arise from the split of a group, “live” for a certain time, and then disappear. In recent decades, there have been repeated reports of small groups of indigenous individuals, often first contacted by prospectors, speaking a hitherto unknown language. Investigations made by anthropologists and linguists generally lead to the identification of a new language that proceeded from the effect of violent contacts between natives and national farmers or prospectors. A small group of youngsters had succeeded in escaping, hiding in the bush, and surviving by living in the forest, avoiding contact with white people. One example is Akuntsú, a Tuparí language discovered in 1984, spoken by five individuals without children and therefore highly endangered. It is difficult to say that a language is actually extinct. Politically and socially, a language may be extinct when there is no longer a speaking community. However, there have been cases of languages that had been believed extinct for many years, but, unexpectedly, some speakers or semi-speakers were detected and interviewed by linguists. The most spectacular case is that of Guarasu (formerly Pauserna, TG, Southern in Fig. 3), which had been thought to be extinct since the 1970s according to some specialists (Riester, 1977), since the 1990s according to others (Adelaar, 2004, p. 622). In 2016 a couple of full speakers were found in Southern Rondonia, Brazil (Ramirez, Vegini, & França, 2017). Of course, the language is socially extinct in any case.

“At the time of European arrival in the New World, Omagua and Kokama were spoken across a wide region, from the Aguarico River in Ecuador to the Içá River in Brazil” (O’Hagan, 2011, p. 11). Omagua was documented in the 18th century, but then was said to have disappeared. Some rememberers living near Iquitos (Peru) make one suspect that the language was still spoken two generations ago. Kokama is spoken by about 1,000 speakers in the Loreto Department of Peru. Although both languages appear in Fig. 3, today they are no longer genetically classified as TG languages (O’Hagan, 2011, pp. 9–10; Vallejos Yopán, 2016, 1.4.5).

In spite of the impressive extension of so many Tupian languages over the whole South American continent, most of the languages are in a critical state. Many languages have fewer than 100 speakers, and only a few have more than 10,000. Even these languages are endangered, because in many cases the language is no longer passed on from parents to children, but spoken mainly or only by the grandparental generation. Social attitudes in favor of Spanish or Portuguese are increasingly observed.

Revitalization initiatives have been started in many indigenous communities in recent decades. The effect is not an actual revitalization, but it certainly leads to the revalorization of the language as a cultural heritage; in the best case, its use as a secondary language or as a language for special (e.g., ceremonial and/or narrative) purposes can be achieved.

2. Importance of Tupí-Guaraní Languages in Colonial Times

2.1 Tupinambá, Língua Geral, and Nheengatú

When the Portuguese occupied the Atlantic coast of Brazil during the first half of the 16th century, Tupinambá was a widespread language between the São Paulo region in the south and the Maranhão region in the north. It was not only the language of several tribes who spoke varieties of Tupinambá on the coast, but also became the language of the descendants of Portuguese settlers and their indigenous wives. Portuguese colonists generally arrived without women. Historical documents tell us that Portuguese was not spoken or understood by all settlers in regions far from the political centers Salvador da Bahia and Rio. Therefore interpreters were usual in trials. This situation only changed after the middle of the 18th century, when the use of Portuguese was reinforced by Portuguese politics.

From the middle of the 16th century, Tupinambá was used as a language for Jesuit missions on the whole of the Brazilian coast. The common use of the language resulted in the new denomination of “língua geral” (‘general language’) from the 17th century on. It was disseminated by Portuguese settlers and missionaries from Maranhão to the whole Amazon basin. The Língua Geral Amazônica was used as a lingua franca by settlers and missionaries in their contacts with surrounding Indians. Many of the indigenous people spoke other languages, but used Língua Geral as a second or third language.

Tupinambá has been documented by Jesuit missionaries since the 16th century. The first was Joseph (José) de Anchieta, who wrote a grammar (Anchieta, 1595), a catechism, and even religious poems and miracle plays in Tupinambá. Língua Geral Amazônica, which differs from Tupinambá only by a few simplifications, has been documented since the end of the 17th century through new catechisms and several dictionaries.

During the 19th century, Língua Geral was still used as a lingua franca in Amazonia, even in large cities such as Belém and Manaus. Since the end of the 19th century, Língua Geral has been called Nheengatú ‘good language.’ Today it survives in the Upper Rio Negro region, spoken by local settlers and indigenous communities which gave up their own Arawak or Tukano languages in order to adopt Nheengatú. The term Tupí, created by shortening Tupinambá, has been used in Brazil since the middle of the 19th century in order to characterize the glorious indigenous past of Brazil, in opposition to the Guaraní tradition of antagonistic Paraguay.

2.2 Guaraní

Guaraní was a Tupí-Guaraní language spoken by widespread groups of indigenous people in Paraguay. Some of them had migrated to the north, then to the west during the 15th century, and had finally settled near the foothills of the Andes of what is today Eastern Bolivia. Among them were the Chiriguano or Western Guaraní.

Since the beginning of Spanish colonization, Paraguay had been isolated from the centers of the early Spanish world, such as Spain, Peru, and Chile. As the British closed the mouth of the River Plate to all Spanish ships, it was hard for settlers from Spain to reach Paraguay by ship. When Jesuits started their Christian mission in Paraguay at the beginning of the 17th century, there were few Spanish settlers among the many Guaraní people. The social situation of the settlers was nearly the same as on the Brazilian coast: the settlers, who generally had come without women, took indigenous wives. Guaraní was the general language between the different tribes, mostly speaking different varieties of Guaraní, and it was the mother tongue of the descendants of the first Spanish settlers and their Guaraní wives. Guaraní was the only language allowed within the Jesuit Reductions, large closed settlements of Indians under Jesuit administration, during the 17th and 18th centuries.

The Guaraní language was normalized and described by Antonio Ruiz de Montoya, Paraguay’s first great missionary and founder of the reductions in southern Paraguay. Colonial Paraguay included parts of what today is Argentina (parts of Formosa, Chaco, Santa Fe, Corrientes, and Misiones) and Brazil (south of Mato Grosso do Sul and western Paraná). This is the reason why bilingual speakers of Spanish and Guaraní even today are still to be found in these regions (see also section 3). Montoya wrote a catechism, a grammar, and an important dictionary (Montoya, 1993–2011). He adopted not only Anchieta’s spelling of Tupinambá, thus establishing the principles of Guaraní orthography, but also many of Anchieta’s insights into the structure of Tupinambá and the closely related Guaraní.

During the wars of Independence (1810–1811), Paraguay remained a separate nation between Spanish-speaking Argentina and Portuguese-speaking Brazil because of its fierce defense of Guaraní as its proper language. Paraguay became independent in 1811.

3. Paraguayan Guaraní in the 20th and 21st Centuries

Paraguay is the only South American country where the national population speaks, besides Spanish, an indigenous language, Guaraní or Paraguayan Guaraní (so called in order to distinguish it from other forms of indigenous Guaraní: Kaiowá, Avá-Ñandeva, or Mbyá). Paraguay’s bilingualism is a heritage from colonial times. Indeed, Guaraní is spoken by the large majority, perhaps 85%, of the 6.8 million inhabitants. Probably more than 5.7 million people are bilingual. In the countryside, Guaraní is more present than in towns, especially the capital Asunción. In the countryside, 30% are still monolingual speakers of Guaraní, with little knowledge of Spanish. However, Spanish is the dominant language everywhere; it is the official language, spoken, as far as possible, with strangers and with everybody in public places. Guaraní is the language of intimacy, spoken with relatives and friends. It became an official language in 1992, together with Spanish. None of the other indigenous languages has achieved this status. Since then, Guaraní has been taught in schools at different levels. More details about how diglossia functions in Paraguay can be found in Zajícová (2009).

In Argentina and Brazil, there may be about 100,000 bilingual speakers of Paraguayan Guaraní and Spanish, perhaps more if Paraguayan migrants in Buenos Aires and São Paulo are included. The diglossic situation is quite different from Paraguay, because speaking Guaraní does not have any prestige in these countries and does not get public support. The Argentine province of Corrientes has more speakers of Guaraní than the other mentioned provinces, and there are even monolingual speakers living in the swampy regions of northern Corrientes. Guaraní received co-official status there in 2004, without any visible consequences. Corrientes Guaraní (Guaraní Correntino) is slightly different from Paraguayan Guaraní (Cerno, 2013). It may have developed into a rather independent dialect of Guaraní since colonial times.

The Guaraní spoken by Paraguayans and speakers of the adjacent areas of Argentina and Brazil is characterized by systematic code-switching between Guaraní and Spanish. This social behavior is based on the generally unconscious attitude of citizens who do not wish to be mistaken for indigenous people. This type of spoken Guaraní is called “jopará” ‘mixture’ (Kallfell, 2016). Jopará is not to be understood as a mixed language, as has been suggested by some scholars, but as a linguistic behavior characterized by the alternative use of two languages (Dietrich, 2010).

4. Reconstructing Tupí-Guaraní and Proto-Tupí

A short history of early comparative studies and classifications of TG since the end of the 18th century is given by Rodrigues and Cabral (2012, pp. 495–496). After Lorenzo Hervás (1735–1809), Carl F. P. von Martius, and Lucien Adam, one of the outstanding personalities was Čestmír Loukotka. Rodrigues had been developing his concept of a Tupian linguistic stock since 1955 (Rodrigues, 1964). The idea of a stock of ten genetically related families led to the reconstruction of at least parts of the assumed proto-language. Rodrigues and his school initiated the reconstruction of the phonological system of Proto-Tupí-Guaraní (PTG), first published in Rodrigues and Dietrich (1997, p. 268). This was accompanied by the partial reconstruction of PTG pronouns and morphosyntactic devices (Rodrigues, 2001). At the same time, important work was done by Cheryl Jensen, who reconstructed PTG phonology (Jensen, 1998, Appendix II, pp. 604–606) and morphosyntax. The University of Brasilia, the University of São Paulo (USP), the University of Campinas (UNICAMP), the Federal University of Pará, and the Museu Paraense Emilio Goeldi (the latter two at Belém, Brazil), among others, continue to be centers of Tupian linguistic studies.

New data gained from fieldwork published in new doctoral dissertations by various authors were extremely helpful. At the same time, reconstruction was extended from PTG to Proto-Tupí (PT). While PTG has a time depth of about 1,500 years, that of PT extends to perhaps 5,000 years (Jolkesky, 2016, pp. 636–644). This means that the first migrations of PT families may have started in the third millennium BCE and the PTG branch may have separated from Proto-Mawetí-Guaraní 2,000 years ago. The spreading of TG groups began during the first millennium CE.

Proto-Tupí phonology is presented in Rodrigues (2005, 2007) and Rodrigues and Cabral (2012, pp. 502–509). It must be remembered, however, that reconstructions are hypotheses, not indisputable facts. According to the authors cited, PT already had the same vowel inventory of six vowels as found in PTG (Rodrigues & Dietrich, 1997, p. 268). PT phonemes are marked by two asterisks, PTG phonemes by one:

[Table not reproduced: vowel inventories of PT and PTG]

The system of proto-consonants of PT is the following:

[Table not reproduced: PT consonant inventory]

In PTG the glottalized series was lost. Among the obstruents, note the emergence of /β/. The series of prenasalized stops is no longer found in PTG, even though prenasalized stops can be found in some Southern Guaranian languages as allophonic variants that apparently developed later. There are affricates, but no fricatives:

[Table not reproduced: PTG consonant inventory]

Rodrigues (1964) based his classification of TG mainly on the evolution of *ʦ, *ʧ, and *pj. Subgroups were differentiated according to the preservation of *ʦ and *ʧ or their merger with a new phoneme /s/, which could be preserved as in the Guaraní subgroup or change to /h/ or ø, typical for Nuclear TG languages (see Fig. 3); *pj was preserved, as in Tupinambá, or merged with *kw to /kw/, as in the majority of the TG languages. Lists of TG lexical cognates may be found in Dietrich (2015), lists of Tupí lexical cognates in Rodrigues and Cabral (2012, pp. 502–509) and Galucio et al. (2015, pp. 249–274).

5. Phonology

As observable in the reconstructed phoneme inventory, the main phonological features of Tupian languages are nasality and the occurrence of a high central vowel /ɨ/ and a glottal stop /ʔ/. One further feature is the presence of final consonants, especially stops in coda position.

5.1 Segmental Phonology

The existence of a high central vowel /ɨ/ is a noteworthy fact only from the point of view of European languages like English, Spanish, or German, where it is unknown, but the phoneme is frequent in American languages and also in Indo-European languages such as Russian, Romanian, and Kashmiri, as well as in Turkish.

A glottal stop /Ɂ/ occurs in many Tupian languages, first of all in intervocalic position. Generally, it is not phonemic, but occurs in syllable initial position after a preceding vocalic coda. In non-Mawetí-Guaraní Amazonian languages, glottal stops may also occur after a consonantal coda (Mundurukú ǝkɁá ‘house’; Karitiana patĩn [pat’Ɂĩn] ‘sister’).

In some cases, however, /Ɂ/ may be phonemic (e.g., Emerillon opɨta ‘s/he stops’ versus oɁɨta ‘s/he swims,’ where the opposition depends on /p/ - /Ɂ/).

Syllable structure is CV or VC; CCV does not occur, with the exception of vowel dropping in unstressed syllables, for example in Aché and Guarayu (TG). Final consonants are preserved in all families, except for the Southern Guaranian group of Tupí-Guaraní.

5.2 Suprasegmental Phonology, Nasal Harmony

5.2.1 Tone and Stress

Only some non-TG Amazonian languages have phonemic tone; these include all languages and dialects of the Mondé family (Moore & Meyer, 2014, p. 614), Juruna, and Mundurukú, together with the nearly extinct Kuruaya. These languages distinguish two phonological tones, low (unmarked) and high (generally marked by an orthographic accent). Some examples:

Gavião (Mondé) opposes tone and length (marked by ^), see the following double minimal pair: magap ‘fat’ – magáp ‘egg’ – mâgap ‘1s + fat’ – mâgâp ‘1s + egg’

Juruna: aɁá ‘bat’ – aɁa ‘penis’ – áɁá ‘yawning’

Kuruaya iʤi ‘hind’ – iʤí ‘his mother’

Some other languages seem to have pitch accent, for instance Karo (Ramarama) and Karitiana (Arikém). All the other languages have stress. In most languages, isolated words are stressed on the final syllable: Araweté pane [pa’ne] ‘almost,’ Kamayurá jaɨtata’i [jaɨtata’Ɂi] ‘star,’ Tocantins Asurini tato [ta’tɔ] ‘armadillo,’ Guajá kaɁa [ka’Ɂa] ‘jungle.’

Only in some languages does stress fall on the penultimate syllable, namely in Western Guaraní (Chiriguano) and Xetá (ete [‘ete] ‘body’), and “Peripheral” Bolivian languages (Fig. 3) like Guarayu (at least in some dialects), Guarasu (kaɁa [‘kaɁa] ‘forest’), Yuki (yiba [‘jiba] ‘arm’), and Siriono, where it may be an areal feature, but also in Wayampi/Wayãpi (kaɁa [‘kaɁa] ‘herb, plant’), Emerillon, Avá-Canoeiro (napukaj [na’pukai] ‘I don’t scream’), and Xingú Asurini (kaɁa [‘kaɁa] ‘leaf’).

Some Tupian languages, for example Karo (Ramarama) and Akuntsú (Tuparí), are characterized by stress on the final syllable as well as on the penultimate, according to the phonological context.

5.2.2 Nasal Harmony

Nasality is a basic feature of nearly all Tupian languages (see, among others, Drude, 2009; Rodrigues & Cabral, 2011; Rose, 2008; Singerman, 2016). Only Guarasu, Tocantins and Xingu Asurini, Tembe, and Guajajara have lost their nasal vowels. Many languages, however, are characterized by synchronic phonotactic nasalization phenomena, known as nasal harmony. Regressive nasalization starting from a stressed nasal vowel can be observed frequently:


[Examples of regressive nasalization from a stressed nasal vowel not reproduced]

A second kind of nasalization or nasal harmony starts from the nasal consonants, /m, n, ŋ, ŋw/, causing regressive nasalization. This occurs systematically in the Guaranian subgroup of Tupí-Guaraní. It is based on the interrelation between the unvoiced stops /p t k/, their voiced prenasalized counterparts [mb], [nd], [ŋg], and the nasal consonants /m n ŋ/, corresponding to the points of articulation bilabial, alveolar, velar.

In Paraguayan as well as in Correntino Guaraní, Kaiowá, Ñandeva, and Mbyá, the segment o-mondo / 3-send away / ‘s/he sent (it) away’ is phonetically oral because its final stressed vowel [o] is oral. The distribution of nasal consonants and prenasalized stops depends on the nasal or oral quality of the following vowel: prenasalized stops always occur before oral vowels; nasal consonants occur with intrinsically nasal vowels or with nasalized vowels. In the given example, o-mondo, regressive nasalization starts from [nd], nasalizing preceding [o] > [õ]. Going further to the left, nasal harmony selects [m], instead of [mb], because following [õ] requires a nasal consonant [m]. Finally, resulting [mõ] nasalizes the preceding [o]: [õmõ’ndo].
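The right-to-left derivation just described can be traced in a few lines of Python. This is only a minimal sketch under stated assumptions, not a full phonological model: the function name, the segment lists, and the convention of writing each nasal-class consonant as a plain m, n, or ŋ in the input are all choices made for this illustration. Treating every other consonant as transparent is likewise an assumption; the text does not say which segments block harmony.

```python
# Toy model of regressive nasal harmony (illustrative assumptions throughout).
NASAL_TO_PRENASALIZED = {"m": "mb", "n": "nd", "ŋ": "ŋg"}
ORAL_TO_NASAL_VOWEL = {"a": "ã", "e": "ẽ", "i": "ĩ", "o": "õ", "u": "ũ", "ɨ": "ɨ̃"}
NASAL_VOWELS = set(ORAL_TO_NASAL_VOWEL.values())


def regressive_nasal_harmony(segments):
    """Realize a list of underlying segments, scanning from right to left."""
    output = []
    nasal_domain = False          # True once a nasal trigger lies to the right
    following_vowel_nasal = False # nasality of the nearest vowel to the right
    for seg in reversed(segments):
        if seg in NASAL_VOWELS:
            # An intrinsically nasal vowel is itself a trigger.
            nasal_domain = True
            following_vowel_nasal = True
            output.append(seg)
        elif seg in ORAL_TO_NASAL_VOWEL:
            if nasal_domain:
                output.append(ORAL_TO_NASAL_VOWEL[seg])
                following_vowel_nasal = True
            else:
                output.append(seg)
                following_vowel_nasal = False
        elif seg in NASAL_TO_PRENASALIZED:
            # Prenasalized stop before an oral vowel, plain nasal otherwise.
            if following_vowel_nasal:
                output.append(seg)
            else:
                output.append(NASAL_TO_PRENASALIZED[seg])
            nasal_domain = True   # either variant nasalizes what precedes it
        else:
            output.append(seg)    # assumption: other consonants are transparent
    return list(reversed(output))


print("".join(regressive_nasal_harmony(["o", "m", "o", "n", "o"])))  # õmõndo
```

For the input o-m-o-n-o the sketch yields the segments of [õmõ’ndo] as described above (stress is not modeled): the final vowel stays oral, the consonant before it surfaces as [nd], and everything further left is nasal.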

But nasal harmony has even wider effects. It is important in the morphophonemics of the Guaranian subgroup, Guarayu, Siriono, Kamayurá, dialects of Wayãpi, and the Kawahib cluster, especially in inflectional processes involving prefixes. All prefixes with the onset /j/ have oral and nasal variants [j/ɲ] depending on the following phonetic context: one of the allomorphs -je-/-ñe- ‘reflexive voice,’ -jo-/-ño- ‘reciprocal voice’ is selected according to the orality or nasality of the following transitive verb (regressive nasalization). Examples from Correntino Guaraní (Cerno, 2013, pp. 179–181):


[Correntino Guaraní examples not reproduced]

The same mechanism of selection functions in nominal and verbal person marking of the Guaraní dialects. The nominal second person prefix presents the variants nde-/ne- for singular and pende-/pene- for plural; jande-/jane- is the first person plural inclusive. The verbal series has the corresponding form ja-/ña- ‘1PL inclusive.’
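The selection between oral and nasal prefix variants can be summarized as a simple lookup. The following Python sketch is illustrative only: the gloss labels used as dictionary keys and the function name are assumptions of this sketch, and the nasality of the following stem is passed in as a flag rather than computed, since the text gives no rule for deriving it from the stem.

```python
# Oral/nasal allomorph pairs quoted in the text. The key labels are this
# sketch's own shorthand, not notation taken from the sources cited.
PREFIX_ALLOMORPHS = {
    "REFL": ("je", "ñe"),              # reflexive voice
    "RECP": ("jo", "ño"),              # reciprocal voice
    "2SG": ("nde", "ne"),              # nominal second person singular
    "2PL": ("pende", "pene"),          # nominal second person plural
    "1PL.INCL": ("jande", "jane"),     # nominal first person plural inclusive
    "1PL.INCL.VERBAL": ("ja", "ña"),   # verbal first person plural inclusive
}


def select_prefix(gloss: str, following_is_nasal: bool) -> str:
    """Return the nasal variant before a nasal stem, the oral one otherwise."""
    oral, nasal = PREFIX_ALLOMORPHS[gloss]
    return nasal if following_is_nasal else oral


print(select_prefix("2SG", following_is_nasal=False))  # nde
print(select_prefix("2SG", following_is_nasal=True))   # ne
```

Keeping the pairs in one table makes the point of the passage visible: a single condition, the orality or nasality of what follows, decides between the two variants of every prefix listed.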

Progressive nasalization is observed in composition, in word formation processes with the causative prefix mbo-/mo-, and in the selection of allomorphic suffixes. When a nasal segment is followed by an oral segment with an unvoiced onset stop, nasal harmony changes the unvoiced stop into its prenasalized counterpart. Examples from Paraguayan Guaraní (Par.G.), the language which most systematically follows nasal harmony rules:


[Paraguayan Guaraní examples not reproduced]

These mechanisms are not found systematically in all TG languages.

6. Morphosyntax

Attempts to reconstruct parts of Tupian morphosyntax have been made by, among others, Gildea (2002), Rodrigues and Cabral (2012, pp. 509–562), and Birchall (2015).

Genderlects, in which men use different words, word forms, pronouns, or intonation from women, are not a feature of morphosyntax but of a more general kind of speech differentiation; they are nevertheless mentioned here. The phenomenon is found in different South American families (Aikhenvald, 2014, p. 375) and throughout the Tupi stock (Rose, 2015).

Male and female speech in Aweti may be due to the contact with Yawalapiti (Arawak family) in the indigenous territory of Xingu Park, Brazil.

6.1 Nouns and Verbs

Lexical word classes in Tupian languages are nouns and verbs. Other word classes are pronouns, adverbs, and grammatical words (particles and postpositions). Adjectives are not a special word class, but appear as nouns (Dietrich, 2001, for example), verbs (Seki, 2000, pp. 67–69), or something else (Couchili, Maurel, & Queixalós, 2002). Verbal predicates may be syntactically bound to nouns as subjects or objects. In Tupian languages outside the Mawetí-Guaraní family, nouns may be determined by person markers (possessive markers) different from personal pronouns; verbal subject agreement, however, is mostly realized with personal pronouns:



In the extended Mawetí-Guaraní family, the distinction between verbs and nouns is made by different sets of person markers. Neglecting details of individual languages, the prefixes of Mawetí-Guaraní verbs are basically the following, the first ones being those of the Guaraní cluster, insofar as they are different from other languages (Km = Kamayurá). For more details, see Birchall (2015):


Nominal prefixes may be identical with personal pronouns or derived from those. Since the proto-languages probably did not have a 3p pronoun, but used demonstratives instead, the 3p prefix is generally different from pronominal forms.

Reconstructed nominal person prefixes are:


Person-marked non-predicative nouns are interpreted as “possessed” nouns whenever they are arguments (14), (15), (17), (19) and the predicate is expressed by a verb (15) or a predicative noun (16)–(19):



Mawetí-Guaraní languages traditionally have neither copulas nor verbs like ‘to have, own,’ but use person-marked predicative nouns instead. In recent times, as a consequence of lasting pressure by the dominant Spanish and Portuguese, most languages have introduced the use of verbs meaning ‘to be in a place’ as an equivalent for the copula (in sentences like ‘he is the chief’) and ‘to hold’ or similar as an equivalent of ‘to have.’ Some specialists in Tupian languages classify nominal predicates as stative (Vallejos Yopán, 2016, §8.3.1) or descriptive verbs (Seki, 2000, pp. 67–69; ex. 16–19). Most scholars, however, classify predicative nouns as nouns, considering the constructions as existential, identification, or possessive clauses (Galucio, 2001, pp. 181–185; Dietrich, 2001; Rose, 2002; Rose, 2011, pp. 180–206). In the light of this conception, example (16) would be glossed as ‘with regard to me there is/was my gladness,’ (17) as ‘with regard to my mother there are many sons/daughters,’ (18) as ‘there is my remembrance of you,’ (19) as ‘there is length of his neck.’

With regard to TG languages, there has been a debate not just about the distinction between nouns and verbs itself, but also about the question of how to classify the lexicon within the two classes (see Queixalós, 2001a). A solution may be the observation that many roots switch from one class to the other according to the syntactical context. The distinction between nouns and verbs makes it possible to determine the word class of a root in a given text, but it is more difficult to establish rules that predict the word classes in which a root may appear (see also Dietrich, 2017).

6.2 Relational Inflection

Relational inflection is one of the most striking phenomena of TG nominal phrases. It allows marking the determination of a head by a preceding adjunct, or its syntactical transformation into a nominal predicate, or marking the absence of any relation. Today it is fully grammaticalized only in TG languages, but reflexes in Mawé and Awetí as well as in languages of other Tupian families such as Mundurukú, Tuparí, Akuntsú, and Makurap make a common origin probable (Rodrigues & Cabral, 2012, pp. 511–517). In the beginning, there may have been only phonotactical variation of initial */t/- in nominal and nominalized verbal stems (Meira & Drude, 2013). Most languages, however, took advantage of this kind of variation in order to establish functional oppositions: the subclass of lexical stems subject to relational inflection is composed of vowel initial stems. Prefixed t- indicates the absence of any syntactical relation: Mbyá, Tocantins Asuriní, Kamayurá t-ape ‘path, way’ is the form of the lexical item, for instance in enumerations: Mbyá t-eʧa ‘eye,’ t-embi’u ‘food,’ t-embé ‘lip.’ Other initial forms are s-o’o, ø-o’o ‘game, meat,’ ø-oo, ø-ok ‘house.’

Syntactical determination of a head by a preceding adjunct (“genitive” relation) is expressed by position. The head follows the adjunct:

Karo yate gap / pig fat / ‘pig’s fat’

Mbyá ʧe ʧy / I mother / ‘mother of mine, my mother’

In the nominal subclass subject to relational inflection the relation between adjunct and head, nominal or pronominal, is highlighted by an r-prefix, marked REL in the glossing:



Examples (20) to (23) show that nominal first person and second person prefixes were originally personal pronouns. This is still evident in constructions with relational prefix R-.

The third kind of relational inflection occurs in third person relation, marked by s-, h-, or ø-, according to the historical evolution of PTG *ʦ-. Its syntactical function is “possessive” (Guarayu s-embe ‘his/her lips,’ Kamayurá h-emi’u ‘his/her meal’) or predicative (Guarayu s-aku, Par. G. h-aku ‘(it) is hot,’ Tocantins Asurini ø-e’e, Wayãpi ø-ẽ’ẽ ‘(it) is sweet’).



but two predicates in (25):



In languages that did not preserve functional relational inflection, one of the possible prefixes, generally ø- or r-, has been lexicalized, e.g., Aché ‘sweet,’ achy ‘pain,’ Siriono erẽmbe ‘lips,’ raku ‘heat, hot,’ Kokama tsaku ‘(be) hot.’
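
The three-way contrast described in this section (t- for the absence of a relation, r- after a preceding adjunct, and s-, h-, or ø- for third person) can be summarized in a short Python sketch. It is a simplification under stated assumptions: the function and parameter names are hypothetical, the vowel-initial check is a necessary but not sufficient condition for membership in the inflecting subclass, and lexical exceptions such as the s-o’o, ø-o’o type mentioned above are ignored.

```python
VOWELS = set("aeiouyɨãẽĩõũỹ")


def relational_form(stem, relation, third_person_prefix="s"):
    """Prefix a vowel-initial stem according to its syntactical relation.

    relation: "none" (citation form), "adjunct" (head of a preceding adjunct),
    or "third" (third person relation). The third person prefix is
    language-specific (s, h, or ø) and must be supplied by the caller.
    """
    if stem[:1] not in VOWELS:
        raise ValueError("only vowel-initial stems take relational inflection")
    prefixes = {"none": "t", "adjunct": "r", "third": third_person_prefix}
    return f"{prefixes[relation]}-{stem}"


# Forms cited in the text above:
print(relational_form("ape", "none"))         # t-ape
print(relational_form("embe", "third", "s"))  # s-embe (Guarayu)
print(relational_form("aku", "third", "h"))   # h-aku (Par. G.)
```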

6.3 Case-Marking

It seems that all Tupian languages have (monosyllabic) suffixes and (generally polysyllabic) postpositions for marking syntactical relations. Rodrigues and Cabral (2012, pp. 517–521) argue that some Tupian families (Arikém, Tuparí, Mawé, Tupí-Guaraní) “have developed inflectional morphological cases”; others speak of suffixes and postpositions (e.g., Rose, 2011, pp. 233–234). The reconstructed cases would have been **pe ‘punctual locative/dative,’ **ko ‘allative,’ **wo ~ **mo ‘diffuse locative,’ **eʦe ‘relative/associative,’ **eri and **wi ‘ablative,’ **erjo ~ **erje ‘associative,’ **ʦoče ‘superessive,’ and **na ‘translative’ (Rodrigues & Cabral, 2012, p. 517). In Tupinambá, as a classical example of TG languages, Rodrigues (2001, pp. 107–109) establishes five cases marked by unstressed suffixes: -pe ‘punctual locative,’ -βo ‘diffuse locative,’ -i ‘situational locative,’ -a ‘argumentative,’ and -amo ‘translative’ (English ‘at’). Apart from the argumentative, the other four cases have a locative meaning and are used in adverbial phrases. The argumentative case distinguishes syntactical arguments, both core and non-core (subjects, objects, adverbials other than locatives), from locatives. This kind of case marking is still used in modern TG languages (subgroups 4–8; see the discussion by Queixalós, 2001a). The punctual locative case (PTG *-pe) functions as locative in several Tupian languages and as both locative and dative in many TG languages. The diffuse locative (PTG *-βo) is used to form gerunds in Tupinambá and Guaraní, but has locative meaning in Tuparí, Makurap, and Aweti, and instrumental meaning in Mundurukú.

The other reconstructed cases are found as locative suffixes or postpositions in modern Tupí languages. Some of them are identifiable in TG languages: **eʦe ‘relative/associative’ is preserved in Mawé (-eté ‘at, against’); it is locative ‘on,’ ‘onto’ in, for example, Kamayurá -r-ehe and Mbyá -re, and instrumental in Wayãpi -le ‘with.’ PTG *ʦuwi ‘from’ contains a reflex of PT **wi ‘ablative’; it corresponds with Kamayurá -wi, Guarayu -sui, and Tocantins Asurini -hi ‘from.’ Reflexes of **ko ‘allative’ may be identified in dative forms of Paiter (-kaj) and Karo (-kəj) and in -kaj ‘allative’ of Mundurukú.

6.4 Voice: Valence-Changing Devices

Voice is the most important verbal category in all Tupian languages. Valence-changing devices are prefixes in all studied languages. Rodrigues and Cabral (2012, pp. 527–533) distinguish four derivational valence-changing prefixes in PT. Causative **mo- changes intransitive bases into transitive verbs and also derives transitive verbs from nouns. Causative-comitative **erjo- ~ **erje- changes intransitive into transitive verbs, the causee being simultaneously the co-agent in the verbal process.



Transitive verbs are made intransitive by means of reflexive **we- and reciprocal **wo-.



In modern Paraguayan Guaraní, more than in other TG languages, the reflexive is also used as a middle voice or even as a passive. This is probably due to influence from Spanish or Portuguese, where reflexive forms frequently are used to avoid the expression of the agent (middle voice); cf. Spanish la puerta se abre, Portuguese a porta abre-se ‘the door opens’:



A fifth valence-changing device has been developed in TG languages. The factitive suffix *-ukar derives ditransitive verbs from transitive ones. The human/animate object agent is coded as an indirect object (dative), as in the following example:



6.5 Nominalization and “Subordination”

6.5.1 Nominalization

Especially in TG languages, stems denoting an action may occur as predicates or referents according to whether they have verbal or nominal prefixes, for example Tocantins Asurini a-hém / 1SG.V come.out / ‘I came out’ – sé hem-a / 1SG.NOM come.out-CAS / ‘my coming out.’ In sé hem-a, hem is a verb that is nominalized, showing the nominal suffix -a.

Nominalization of verb forms and verb phrases is a common strategy in Tupian word formation and syntax. According to Rodrigues and Cabral (2012, p. 533), four nominalizing affixes can be reconstructed for PT:


Agentive (AG) nouns are derived from transitive verbs: Surui-Paiter aka-t ‘killer,’ Mundurukú yaoka-at ‘killer,’ Mawé henoi-hat ‘teacher.’ Examples of reflexes of the reconstructed PTG form *-ʦar/ *-ar/ *-tar are Correntino Guaraní monda-ha ‘robber,’ juka-ha ‘killer’ (Cerno, 2013, p. 261); Guarayu poro-mboe-sar / people-teach-AG / ‘teacher’; Tapirapé miãr-a kotok-ar-a / game-CAS kill-AG-CAS / ‘killer of game,’ ‘hunter’; Kayabí pinaetyk-at / angle.for-AG / ‘fisher.’

Patient nominalizations by means of PT **pɨt, corresponding to past participles (PP) in English (bound, lost, sharpened), are found in Tuparí and Tupí-Guaraní:



Object nominalizations (ON) are nominalizations of a transitive verb including its generic object. They are observed in Arikém, Tuparí, Mundurukú, and the larger Mawetí-Guaraní family. In TG languages, reflexes of PT **-mi- occur (Tupinambá, Zo’é), but the more widespread form is PTG *-emi-, used with relational inflection prefixes. Reflexes of this morpheme correspond with present passive participles in Ancient Greek or Russian. Examples from Kaiowá are:



Many of these formations are more or less lexicalized: Kaiowá/Par. G./Mbyá t-embi-’u ‘meal, food,’ literally ‘the object of eating,’ ‘what is eaten’; t-embi-reko ‘wife,’ literally ‘what is owned/held.’

Circumstance nominalizations (CSN) of verbs are derived by means of PT **-ap. They are used in many Tupian families to refer to the process itself, its circumstances, or places, holders, or receptacles where the process takes place or its results are preserved:



6.5.2 “Subordination”

As is common in South America (cf. Gijn et al., 2011; Gijn et al., 2015), in Tupian languages subordination is mainly achieved by nominalization, which turns the predicate clauses into syntactic arguments (cf. Drude, 2011). Three kinds of subordination will be regarded here: reported speech and other complement clauses, relative clauses, and adverbial clauses.

See (38) as an example of a complement clause in Paraguayan Guaraní. It is expressed by means of agentive -ha (reflex of PTG *-ʦar):



Examples of relative clause nominalizers in Mundurukú, Mekéns, Karo, Gavião, and Tupí-Guaraní:

Mundurukú: -(i)at (see also 6.5.1 above)

Mekéns (Tuparí): see Galucio (2006, p. 57)

Karo (Ramarama): -(a)p or -(a)m (van Gijn et al., 2015, p. 313)

Gavião (Mondé): -mát and -méne (Moore, 2012)

Tupí-Guaraní: *-βaɁe ‘relat. clause’ (RCL) (Rose, 2011, pp. 343–350 [-mãɁẽ])



Adverbial clauses do not show definite syntactic devices. As van Gijn et al. (2015) show, different, generally clause-like strategies are used. Temporal particles are important, but also locative (separative) expressions, which may denote causality. In Karitiana, for example, adverbial embedded clauses are marked by head movement and aspectual subordinators; relative clauses, too, instead of nominalization, show fronting (Storto, 2011).

6.6 Other Syntactic Issues

Basic word order in most Tupian families is SOV, but Karitiana presents SOV and OSV in matrix and in embedded clauses (Storto, 2011, p. 220). Under the pressure of dominant Spanish and Portuguese, a change to SVO has been observed in Western Guaraní and Mbyá, but especially in Paraguayan Guaraní. Other syntactic issues such as negation and interrogation, as well as other kinds of word formation (diminutive/attenuative, augmentative, compounding, noun incorporation, reduplication), would be interesting subjects for more detailed descriptions.

