The Eskimo-Aleut language family consists of two quite different branches, Aleut and Eskimo. The latter consists of Yupik and Inuit languages. The languages of the family are spoken from the eastern coast of Russia to Greenland. The family is thought to have developed and diverged in Alaska between 4,000 and 6,000 years ago, although recent findings in a variety of fields suggest a more complex prehistory than previously assumed. The language family shares certain characteristics, including polysynthetic word formation, an originally ergative-absolutive case system (now substantially modified in Aleut), SOV word order, and more or less similar phonological systems across the language family, involving voiceless stop and voiced fricative consonant series often in alternation, and an originally four-vowel system frequently reduced to three. The languages in the family have undergone substantial postcolonial contact effects, especially evident in (although not restricted to) loanwords from the respective colonial languages. There is extensive language documentation for all languages, although not necessarily all dialects. Most languages and dialects are severely endangered today, with the exception of Eastern Canadian Inuit and Greenlandic (Kalaallisut). There are also theoretical studies of the languages in many linguistic fields, although the languages are unevenly covered, and there are still many more studies of the phonologies and syntaxes of the respective languages than other aspects of grammar.
Hokan is a linguistic stock or phylum based on a series of hypotheses about deeper genetic relationships among languages that extend geographically from Northern California to Nicaragua. Following the general effort to genetically link the vast number of Native American languages and to reduce them to a few superstocks, Dixon and Kroeber first proposed the Hokan stock in 1913, to include several California indigenous languages: Karuk, Chimariko, Shastan, Palaihnihan (Atsugewi and Achumawi), Pomoan, Yana, and later Esselen and Yuman. The name Hokan stems from the Atsugewi word for “two”: hoqi. While the first proposals by Dixon and Kroeber rested on very limited cognate sets comprising only five words, later assessments by Sapir included hundreds of putative cognate sets and analyses of Hokan morphosyntax. By 1925, Sapir had further incorporated Washo, Salinan, Seri, Chumashan, Tequistlatecan, and Subtiaba-Tlapanec into the stock as the Southern Hokan branch.
Throughout the 20th century, scholars sought additional evidence for the stock as more and refined data on the languages became available. A number of languages were added, and earlier proposals were abandoned. A new surge in work on individual California indigenous languages in the 1950s and 1960s prompted a string of studies conducting binary comparisons. This renewed interest inspired a series of Hokan conferences held until the 1990s. A more recent comprehensive assessment of the entire stock was undertaken by Kaufman in 1988. Applying rigorous analysis and only implicating those languages for which he encountered substantial evidence, Kaufman proposes sixteen classificatory units for Hokan clustered geographically. Kaufman’s Hokan stock also includes Coahuilteco and Comecrudan in Mexico and Jicaque in Nicaragua.
Although Hokan was widely studied in the 20th century, and many scholars presented what they thought to be supporting evidence, it is far from being an established genetic unit. In fact, many scholars today treat it with considerable skepticism. One major challenge, as with any phylum-level affiliation, is its time depth. Proto-Hokan is thought to be at least as old as Proto-Indo-European. Moreover, many of the languages were spoken in geographically contiguous areas, with speakers being multilingual and in close contact for an extended period of time, as is the case in Northern California. This suggests considerable language contact effects and complicates the distinction between true cognates and ancient borrowings. Many of the languages involved further show similarities in grammatical structure as a result of language contact.
Hokan languages stretch across California, Nevada, South Texas, various parts of Mexico, Honduras, and Nicaragua and display notable structural differences. Phonologically, the languages show great variation including small and large phoneme inventories and different phonological processes. Typologically, they are equally diverse, but many are considered polysynthetic to varying degrees. Morphosyntactic and grammatical similarities are evident especially among languages spoken in Northern California. These resemblances include sets of lexical affixes with similar meanings and affinities in core argument patterns.
Languages from at least five genetically unrelated families are spoken in the Caucasus, but there are only three endemic linguistic families belonging to the region: Kartvelian, West Caucasian, and Northeast Caucasian. These families are rather heterogeneous in terms of the number of languages and the distribution of the speakers across them. The Caucasus represents a situation where languages with millions of speakers have coexisted with one-village languages for hundreds of years, and where multilingualism has always been the norm. The richness of Caucasian languages on every linguistic stratum is dazzling: here we find some of the largest consonant inventories, inflectional systems where the mere number of word forms strains credibility (one of the Caucasian languages, Archi, is claimed to have over a million and a half word forms), and challenging syntactic structures. The typological interest of the Caucasian languages and the challenges they present to linguistic theory lie in different areas. Thus, for Kartvelian languages, the number of factors at play in the verbal system makes the task of producing a correct verbal form far from trivial. West Caucasian languages represent an instance of polysynthetic polypersonal verb inflection, which is unusual not only for the Caucasus but for Eurasia in general. East Caucasian languages have large systems of non-finite forms which, unusually, retain the ability to realize agreement in gender and number while their non-finite nature is determined by the inability to head an independent clause and to express certain morpho-syntactic categories such as illocutionary force and evidentiality. Finally, all Caucasian languages are ergative to some extent.
Eva Buchi and Steven N. Dworkin
This is an advance summary of a forthcoming article in the Oxford Research Encyclopedia of Linguistics.
Within the field of linguistics, etymology is the only subdiscipline that is uniquely historical in its study of the relevant linguistic data. It is one of the oldest fields in Romance linguistics. The scholar credited with establishing Romance linguistics as a scholarly discipline, Friedrich Diez (1794–1876), authored both the first comparative Romance historical grammar (his three-volume Grammatik der Romanischen Sprachen [1836–1844]) and the first pan-Romance etymological dictionary (his Etymologisches Wörterbuch der Romanischen Sprachen). A similar combination, illustrating the indissoluble link between etymology and historical grammar (especially the study of sound change), can be seen in the work of Wilhelm Meyer-Lübke (1861–1936), author of a four-volume Grammatik der Romanischen Sprachen (1890–1902) and of the last complete pan-Romance etymological dictionary, the Romanisches Etymologisches Wörterbuch (3rd definitive edition, 1935).
The concept of etymology as practiced by Romanists has changed over the last 100 years. At the outset, Romance etymologists took as their brief the search for and identification of individual word origins. Starting in the early 20th century, various specialists began to view etymology as the preparation of the complete history of all facets of the evolution over time and space of the words or lexical families under study. Identification of the underlying base was only the first step in the process. From this perspective, etymology constitutes an essential element of diachronic lexicology, which covers all formal, semantic, and syntactic facets of a word’s evolution, including, if appropriate, the circumstances leading to its demise and replacement.
Practitioners of Romance etymology tend to study the history of individual words or word families in specific Romance languages rather than across the entire family. Almost every Romance language and many of their regional varieties have at least one etymological dictionary devoted to the history of its vocabulary (or at least to the identification of relevant word origins), the most notable being such multivolume works as the Französisches Etymologisches Wörterbuch (1922–2002), the Lessico Etimologico Italiano (1979–), the Diccionario crítico etimológico castellano e hispánico (1980–1991), and the Diccionari etimològic i complementari de la llengua catalana (1980–2001). The last complete pan-Romance dictionary remains the aforementioned third edition of Meyer-Lübke’s Romanisches Etymologisches Wörterbuch.
Although originally coined as a riposte to the Neogrammarian view of sound change, Jules Gilliéron’s (1854–1926) dictum, “each word has its own history,” applies equally well to etymology. Yakov Malkiel (1914–1998), one of the leading writers on questions of method and practice in Romance etymology, has discussed the unique and complex nature of etymological solutions. As a result of the emphasis on individual problems and solutions, Romance etymology has not lent itself to the formulation of theories on the nature of lexical change, although there was in the past no shortage of literature on questions of methodology.
Although specialists continue to work on language-specific etymological questions, etymology is not currently at the forefront of work in Romance historical linguistics, a situation that may result, in part, from its lack of engagement with broad theoretical issues. Most studies still appear in the form of journal articles or Festschrift contributions. A new pan-Romance project is currently underway, the Dictionnaire étymologique Roman (DéRom), with a new (and controversial) methodological underpinning, namely the rigorous application of comparative reconstruction to the Romance data in order to capture more accurately the phonological and morphological reality of proto-Romance (in essence a register of spoken Latin) and the semantic scope of the etymological base. This project has reawakened an interest in Romance etymology among a new generation of Romanists. Indeed, to remain vital and relevant within the framework of Romance linguistics, etymology must go beyond the details of individual lexical histories and make an effort to link its findings to our understanding of the nature and processes of language change.
The Iroquoian languages are spoken today in New York State, Ontario, Quebec, Wisconsin, North Carolina, and Oklahoma. The languages share a relatively small segment inventory, a challenging accentual system, polysynthetic morphology, a complex system of pronominal affixes, an unusual kinship terminology, and a syntax that functions almost exclusively to combine the meaning of two expressions. Some of the languages have been documented since contact with Europeans in the 16th century. There exists substantial scholarly linguistic work on most of the languages, and solid teaching materials continue to be developed.
Gregory D. S. Anderson
The Munda language family constitutes the westernmost branch of the widespread Austroasiatic language family. Munda formerly was considered sister to the rest of the phylum, then known as Mon-Khmer, but this has been revised, and Munda is considered as Austroasiatic as any other branch. The internal classification of the Munda languages is still disputed, but a clear North Munda group exists and is uncontroversial. Other higher-order internal divisions remain disputed, although low-level groups like Sora-Gorum or Gutob-Remo are clear and accepted by almost all researchers today.
Phonologically speaking, Munda languages make extensive use of glottal stop and pre-glottalized stops, nasal vowels, and retroflexion. Word-level prosody shows Austroasiatic features with an overlay of South Asian areal features on the phrase level. Register and tone have been reported for individual languages, such as creaky voice in Gorum and a low tone in Korku.
Nouns in Munda languages may encode a range of grammatical and local cases, person and number of possessors, and covert distinctions of animacy in agreement and other morphosyntactic features. Verbs in Munda languages can be quite complex, with subject and object as well as TAM encoding, transitivity, finiteness, etc. Kherwarian languages stand out in this regard as well as for the distributional facts of the subject clitics, where the preferred locus is enclitic to the word immediately preceding the verb. Systems of negation can be very complicated and show unexpected interactions with TAM marking in languages like Gutob.
Syntactically, Munda languages show many typical South Asian features such as verb-final structure, as well as non-finite structures, and in some cases switch reference systems or noun incorporation.
The current sociolinguistic and demographic contexts of the different Munda languages range from expanding and healthy with official status in the case of Santali to seriously endangered in the case of Gorum.
The word accent system of Tokyo Japanese might look quite complex with a number of accent patterns and rules. However, recent research has shown that it is not as complex as has been assumed if one incorporates the notion of markedness into the analysis: nouns have only two productive accent patterns, the antepenultimate and the unaccented pattern, and different accent rules can be generalized if one focuses on these two productive accent patterns.
The word accent system raises some new interesting issues. One of them concerns the fact that a majority of nouns are ‘unaccented,’ that is, they are pronounced with a rather flat pitch pattern, apparently violating the principle of obligatoriness. A careful analysis of noun accentuation reveals that this strange accent pattern occurs in some linguistically predictable structures. In morphologically simplex nouns, it typically tends to emerge in four-mora nouns ending in a sequence of light syllables. In compound nouns, on the other hand, it emerges due to multiple factors, such as compound-final deaccenting morphemes, deaccenting pseudo-morphemes, and some types of prosodic configurations.
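The generalizations above can be read as a small decision rule. The following Python sketch is only a toy illustration under stated assumptions, not the analysis itself: the function name `predict_noun_accent`, the encoding of a noun as a list of syllable weights (1 for a light syllable, 2 for a heavy one), and the fallback for nouns shorter than three moras are all hypothetical choices made here. It covers only morphologically simplex nouns and ignores the compound-specific factors mentioned above.

```python
from typing import List, Optional


def predict_noun_accent(syllable_weights: List[int]) -> Optional[int]:
    """Toy predictor for the two productive noun accent patterns.

    syllable_weights: moras per syllable (1 = light, 2 = heavy).
    Returns the 1-based position, counted from the left, of the
    antepenultimate mora, or None for the unaccented pattern.
    The accent is reported as a mora position only; which syllable
    carries it is not modeled here.
    """
    if not syllable_weights or any(w not in (1, 2) for w in syllable_weights):
        raise ValueError("expected a non-empty list of weights 1 (light) or 2 (heavy)")

    total_moras = sum(syllable_weights)

    # Unaccented pattern: four-mora nouns ending in a sequence of
    # light syllables (operationalized here as two final light syllables).
    if total_moras == 4 and syllable_weights[-2:] == [1, 1]:
        return None

    # Antepenultimate pattern: third mora from the end.
    # Assumption: nouns shorter than three moras fall back to the first mora.
    return max(total_moras - 2, 1)


if __name__ == "__main__":
    for shape in ([1, 1, 1, 1], [2, 1, 1], [1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 2]):
        print(shape, "->", predict_noun_accent(shape))
```

Treating the unaccented pattern as a special case checked before the antepenultimate default mirrors the description of it as the pattern that emerges only in predictable structures; a fuller model would need lexical exceptions and the compound rules.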
Japanese pitch accent exhibits an interesting aspect in its interactions with other phonological and linguistic structures. For example, the accent of compound nouns is closely related to rendaku, or sequential voicing; the choice between the accented and unaccented patterns in certain types of compound nouns correlates with the presence or absence of the sequential voicing. Moreover, whether the compound accent rule applies to a certain compound depends on its internal morphosyntactic configuration as well as its meaning; that is, the compound accent rule is blocked in certain types of morphosyntactic and semantic structures.
Finally, careful analysis of word accent sheds new light on the syllable structure of the language, notably on two interrelated questions about diphthong-hood and super-heavy syllables. It provides crucial insight into ‘diphthongs,’ or the question of which vowel sequence constitutes a diphthong, as opposed to a vowel sequence across a syllable boundary. It also presents new evidence against trimoraic syllables in the language.
K. A. Jayaseelan
The Dravidian languages have a long-distance reflexive anaphor taan. (It is taan in Tamil and Malayalam, taanu in Kannada and tanu in Telugu.) As is the case with other long-distance anaphors, it is subject-oriented; it is also [+human] and third person. Interestingly, it is infelicitous if bound within the minimal clause when it is an argument of the verb. (That is, it seems to obey Principle B of the binding theory.) Although it is subject-oriented in the normal case, it can be bound by a non-subject if the verb is a “psych predicate,” that is, a predicate that denotes a feeling; in this case, it can be bound by the experiencer of the feeling. Again, in a discourse that depicts the thoughts, feelings, or point of view of a protagonist—the so-called “logophoric contexts”—it can be coreferential with the protagonist even if the latter is mentioned only in the preceding discourse (not within the sentence). These latter facts suggest that the anaphor is in fact coindexed with the perspective of the clause (rather than with the subject per se). In cases where this anaphor needs to be coindexed with the minimal subject (to express a meaning like ‘John loves himself’), the Dravidian languages exhibit two strategies to circumvent the Principle B effect. Malayalam adds an emphasis marker tanne to the anaphor; taan tanne can corefer with the minimal subject. This strategy parallels the strategy of European languages and East Asian languages (cf. Scandinavian seg selv). The three other major Dravidian languages—Tamil, Telugu, and Kannada—use a verbal reflexive: they add a light verb koL- (lit. ‘take’) to the verbal complex, which has the effect of reflexivizing the transitive predicate. (It either makes the verb intransitive or gives it a self-benefactive meaning.)
The Dravidian languages also have reciprocal and distributive anaphors. These have bipartite structures. An example of a Malayalam reciprocal anaphor is oral … matte aaL (‘one person … other person’). The distributive anaphor in Malayalam has the form awar-awar (‘they-they’); it is a reduplicated pronoun. The reciprocals and distributives are strict anaphors in the sense that they apparently obey Principle A; they must be bound in the domain of the minimal subject. They are not subject-oriented.
A noteworthy fact about the pronominal system of Dravidian is that the third person pronouns come in proximal-distal pairs, the proximal pronoun being used to refer to something nearby and the distal pronoun being used elsewhere.
George van Driem
Several language families and a few language isolates are represented in the Himalayas, the world’s greatest massif, running a length of over 3,600 km. The best-represented language family in this region happens to be the Trans-Himalayan language family, whose very centre of gravity and phylogenetic diversity is situated within the Eastern Himalaya. This most populous language family on our planet in terms of numbers of speakers used to be known as Tibeto-Burman but, in some circles, the family formerly also went by the names “Indo-Chinese” or “Sino-Tibetan”, the latter two labels actually designating empirically unsupported and now obsolete models of language relationship. The study of Trans-Himalayan historical grammar began with Brian Houghton Hodgson in the 1830s, who during this time served at Kathmandu as the British Resident to the Kingdom of Nepal. Periodically, minor studies devoted attention to several of the more salient morphosyntactic phenomena of Trans-Himalayan historical grammar, but Stuart Wolfenden contributed the first major monograph to the subject in the 1920s. Finally, the historical morphosyntax of the Trans-Himalayan language family came to be the focus of numerous linguistic studies from the 1970s onward, and since that time our understanding of the historical grammar of the language family has changed drastically.
As ever more languages out of the hundreds of previously undocumented Trans-Himalayan tongues came to be described and analysed in great detail, it came to be understood that languages exhibiting flamboyant verbal agreement morphology, such as the Kiranti languages of eastern Nepal and the rGyalrongic languages of southwestern China, were neither grammatically innovative nor typological flukes, but instead represented the most grammatically conservative languages within the entire language family. Subsequently, cognate inflectional systems or vestiges of cognate conjugational morphology were discovered in most other branches of the language family as well. The geographical centre, as well as the centre of phylogenetic diversity of the Trans-Himalayan language family, was identified as the highland arc of the Eastern Himalaya. Sinitic languages, although representing by far the most populous single branch of the Trans-Himalayan family, were now understood as constituting just one out of many subgroups, not more divergent from other branches than any one of the four dozen other subgroups making up the language family. The various types of epistemic marking systems observed sporadically throughout the region were shown to be secondary innovations, reflecting a great variety of semantically distinct language-specific grammatical categories. Particularly, languages showing the typology of the Loloish or Sinitic type were shown to be innovative in their grammar, having lost much of the original Trans-Himalayan morphosyntax.
The rigor and intensity of investigation on Japanese in modern linguistics has been particularly noteworthy over the past 50 years. Not only has the elucidation of the similarities to and differences from other languages properly placed Japanese on the typological map, but Japanese has served as a critical testing area for a wide variety of theoretical approaches.
Within the sub-fields of Japanese phonetics and phonology, there has been much focus on the role of the mora. The mora constitutes an important timing unit that has broad implications for analysis of the phonetic and phonological system of Japanese. Relatedly, Japanese possesses a pitch-accent system, which places Japanese in a typologically distinct group arguably different from stress languages, like English, and tone languages, like Chinese. A further area of intense investigation is that of loanword phonology, illuminating the way in which segmental and suprasegmental adaptations are processed and at the same time revealing the fundamental nature of the sound system intrinsic to Japanese.
In morphology, a major focus has been on compounds, which are ubiquitously found in Japanese. Their detailed description has spurred in-depth discussion regarding morphophonological (e.g., Rendaku—sequential voicing) and morphosyntactic (e.g., argument structure) phenomena that have crucial consequences for morphological theory. Rendaku is governed by layers of constraints that range from segmental and prosodic phonology to structural properties of compounds, and serves as a representative example in demonstrating the intricate interaction of the different grammatical aspects of the language. In syntax, the scrambling phenomenon, allowing for the relatively flexible permutation of constituents, has been argued to instantiate a movement operation and has been instrumental in arguing for a configurational approach to Japanese. Japanese passives and causatives, which are formed through agglutinative morphology, each exhibit different types: direct vs. indirect passives and lexical vs. syntactic causatives. Their syntactic and semantic properties have posed challenges to and motivations for a variety of approaches to these well-studied constructions in the world’s languages.
Taken together, the empirical analyses of Japanese and their theoretical and conceptual implications have made a tremendous contribution to linguistic research.