As might be expected from the difficulty of traversing it, the Sahara Desert has been a fairly effective barrier to direct contact between its two edges; trans-Saharan language contact is limited to the borrowing of non-core vocabulary, minimal from south to north and mostly mediated by education from north to south. Its own inhabitants, however, are necessarily accustomed to travelling desert spaces, and contact between languages within the Sahara has often accordingly had a much greater impact. Several peripheral Arabic varieties of the Sahara retain morphology as well as vocabulary from the languages spoken by their speakers’ ancestors, in particular Berber in the southwest and Beja in the southeast; the same is true of at least one Saharan Hausa variety. The Berber languages of the northern Sahara have in turn been deeply affected by centuries of bilingualism in Arabic, borrowing core vocabulary and some aspects of morphology and syntax. The Northern Songhay languages of the central Sahara have been even more profoundly affected by a history of multilingualism and language shift involving Tuareg, Songhay, Arabic, and other Berber languages, much of which remains to be unraveled. These languages have borrowed so extensively that they retain barely a few hundred core words of Songhay vocabulary; those loans have not only introduced new morphology but in some cases replaced old morphology entirely. In the southeast, the spread of Arabic westward from the Nile Valley has created a spectrum of varieties with varying degrees of local influence; the Saharan ones remain almost entirely undescribed. Much work remains to be done throughout the region, not only on identifying and analyzing contact effects but even simply on describing the languages its inhabitants speak.
The central goal of the Lexical Semantic Framework (LSF) is to characterize the meaning of simple lexemes and affixes and to show how these meanings can be integrated in the creation of complex words. LSF offers a systematic treatment of issues that figure prominently in the study of word formation, such as the polysemy question, the multiple-affix question, the zero-derivation question, and the form and meaning mismatches question.
LSF has its source in a confluence of research approaches that follow a decompositional approach to meaning and, thus, defines simple lexemes and affixes by way of a systematic representation that is achieved via a constrained formal language that enforces consistency of annotation. Lexical-semantic representations in LSF consist of two parts: the Semantic/Grammatical Skeleton and the Semantic/Pragmatic Body (henceforth ‘skeleton’ and ‘body’ respectively). The skeleton is composed of features that are of relevance to the syntax. These features act as functions and may take arguments. Functions and arguments of a skeleton are hierarchically arranged. The body encodes all those aspects of meaning that are perceptual, cultural, and encyclopedic.
Features in LSF are used in (a) a cross-categorial, (b) an equipollent, and (c) a privative way. This means that they are used to account for the distinction between the major ontological categories, may have a binary (i.e., positive or negative) value, and may or may not form part of the skeleton of a given lexeme. In order to account for the fact that several distinct parts integrate into a single referential unit that projects its arguments to the syntax, LSF makes use of the Principle of Co-indexation. Co-indexation is a device needed in order to tie together the arguments that come with different parts of a complex word to yield only those arguments that are syntactically active.
LSF has an important impact on the study of the morphology-lexical semantics interface and provides a unitary theory of meaning in word formation.
Nora C. England
Mayan languages are spoken by over 5 million people in Guatemala, Mexico, Belize, and Honduras. There are around 30 different languages today, ranging in size from fairly large (about a million speakers) to very small (fewer than 30 speakers). All Mayan languages are endangered given that at least some children in some communities are not learning the language, and two languages have disappeared since European contact. Mayas developed the most elaborated and most widely attested writing system in the Americas (starting about 300 BC).
The sounds of Mayan languages consist of a voiceless stop and affricate series with corresponding glottalized stops (either implosive or ejective) and affricates, glottal stop, voiceless fricatives (including h in some of them inherited from Proto-Maya), two to three nasals, three to four approximants, and a five-vowel system with contrasting vowel length (or tense/lax distinctions) in most languages. Several languages have developed contrastive tone.
The major word classes in Mayan languages include nouns, verbs, adjectives, positionals, and affect words. The difference between transitive verbs and intransitive verbs is rigidly maintained in most languages. They usually use the same aspect markers (but not always). Intransitive verbs only indicate their subjects while transitive verbs indicate both subjects and objects. Some languages have a set of status suffixes which is different for the two classes. Positionals are a root class whose most characteristic word form is a non-verbal predicate. Affect words indicate impressions of sounds, movements, and activities. Nouns have a number of different subclasses defined on the basis of characteristics when possessed, or the structure of compounds. Adjectives are formed from a small class of roots (under 50) and many derived forms from verbs and positionals.
Predicate types are transitive, intransitive, and non-verbal. Non-verbal predicates are based on nouns, adjectives, positionals, numbers, demonstratives, and existential and locative particles. They are distinct from verbs in that they do not take the usual verbal aspect markers. Mayan languages are head marking and verb initial; most have VOA flexible order but some have VAO rigid order. They are morphologically ergative and also have at least some rules that show syntactic ergativity. The most common of these is a constraint on the extraction of subjects of transitive verbs (ergative) for focus and/or interrogation, negation, or relativization. In addition, some languages make a distinction between agentive and non-agentive intransitive verbs. Some also can be shown to use obviation and inverse as important organizing principles. Voice categories include passive, antipassive and agent focus, and an applicative with several different functions.
The Dravidian languages, spoken mainly in southern India and south Asia, were identified as a separate language family between 1816 and 1856. Four of the 26 Dravidian languages, namely Tamil, Telugu, Kannada, and Malayalam, have long literary traditions, the earliest dating back to the 1st century
A typical characteristic of Dravidian, which is also an areal characteristic of south Asian languages, is that experiencers and inalienable possessors are case-marked dative. Another is the serialization of verbs by the use of participles, and the use of light verbs to indicate aspectual meaning such as completion, self- or nonself-benefaction, and reflexivization. Subjects, and arguments in general (e.g., direct and indirect objects), may be nonovert. So is the copula, except in Malayalam.
A number of properties of Dravidian are of interest from a universalist perspective, beginning with the observation that not all syntactic categories N, V, A, and P are primitive. Dravidian postpositions are nominal or verbal in origin. A mere 30 Proto-Dravidian roots have been identified as adjectival; the adjectival function is performed by inflected verbs (participles) and nouns. The nominal encoding of experiences (e.g., as ‘fear’ rather than ‘afraid/afeared’) and the absence of the verb ‘have’ arguably correlate with the appearance of dative case on experiencers. “Possessed” or genitive-marked N may fulfill the adjectival function, as noticed for languages like Ulwa (a less exotic parallel is the English of-possessive construction: circles of light, cloth of gold). More uniquely perhaps, Kannada instantiates dative-marked N as predicative adjectives. A recent argument that Malayalam verbs originate as dative-marked N suggests both that N is the only primitive syntactic category and that the dative case plays a seminal role.
Other important aspects of Dravidian morphosyntax to receive attention are anaphors and pronouns (not discussed here; see separate article, anaphora in Dravidian), in particular the long-distance anaphor taan and the verbal reflexive morpheme; question (wh-) words and the question/disjunction morphemes, which combine in a semantically transparent way to form quantifier words like someone; the use of reduplication for distributive quantification; and the occurrence of ‘monstrous agreement’ (first-person agreement in clauses embedded under a speech predicate, triggered by matrix third-person antecedents).
Traditionally, agreement has been considered the finiteness marker in Dravidian. Modals, and a finite form of negation, also serve to mark finiteness. The nonfinite verbal complement to the finite negative may give the negative clause a tense interpretation. Dravidian thus attests matrix nonfinite verbs in finite clauses, challenging the equation of finiteness with tense.
The Dravidian languages are considered wh-in situ languages. However, wh-words in Malayalam appear in a pre-verbal position in the unmarked word order. The apparently rightward movement of some wh-arguments could be explained by assuming a universal VO order, and wh-movement to a preverbal focus phrase. An alternative analysis is that the verb undergoes V-to-C movement.
Number is the category through which languages express information about the individuality, numerosity, and part structure of what we speak about. As a linguistic category it has a morphological, a morphosyntactic, and a semantic dimension, which are variously interrelated across language systems. Number marking can apply to a more or less restricted part of the lexicon of a language, being most likely on personal pronouns and human/animate nouns, and least on inanimate nouns. In the core contrast, number allows languages to refer to ‘many’ through the description of ‘one’; the sets referred to consist of tokens of the same type, but also of similar types, or of elements pragmatically associated with one named individual. In other cases, number opposes a reading of ‘one’ to a reading as ‘not one,’ which includes masses; when the ‘one’ reading is morphologically derived from the ‘not one,’ it is called a singulative. It is rare for a language to have no linguistic number at all, since a ‘one–many’ opposition is typically implied at least in pronouns, where the category of person discriminates the speaker as ‘one.’ Beyond pronouns, number is typically a property of nouns and/or determiners, although it can appear on other word classes by agreement. Verbs can also express part-structural properties of events, but this ‘verbal number’ is not isomorphic to nominal number marking. Many languages allow a variable proportion of their nominals to appear in a ‘general’ form, which expresses no number information. The main values of number-marked elements are singular and plural; dual and a much rarer trial also exist. Many languages also distinguish forms interpreted as paucals or as greater plurals, respectively, for small and usually cohesive groups and for generically large ones. 
A broad range of exponence patterns can express these contrasts, depending on the morphological profile of a language, from word inflections to freestanding or clitic forms; certain choices of classifiers also express readings that can be described as ‘plural,’ at least in certain interpretations. Classifiers can co-occur with other plurality markers, but not when these are obligatory as expressions of an inflectional paradigm, although this is debated, partly because the notion of classifier itself subsumes distinct phenomena. Many languages, especially those with classifiers, encode number not as an inflectional category, but through word-formation operations that express readings associated with plurality, including large size. Current research on number concerns all its morphological, morphosyntactic, and semantic dimensions, in particular their interrelations as part of the study of natural language typology and of the formal analysis of nominal phrases. The grammatical and semantic functions of number and plurality are particularly prominent in formal semantics and in syntactic theory.
Old and Middle Japanese are the pre-modern periods of the attested history of the Japanese language. Old Japanese (OJ) is largely the language of the 8th century, with a modest but still significant number of written sources, most of which are poetry. Middle Japanese is divided into two distinct periods, Early Middle Japanese (EMJ, 800–1200) and Late Middle Japanese (LMJ, 1200–1600). EMJ saw most of the significant sound changes that took place in the language, as well as profound influence from Chinese, whereas most grammatical changes took place between the end of EMJ and the end of LMJ. By the end of LMJ, the Japanese language had reached a form that is not significantly different from present-day Japanese.
OJ phonology was simple, both in terms of phoneme inventory and syllable structure, with a total of only 88 different syllables. In EMJ, the language became quantity sensitive, with the introduction of a distinction between long and short syllables. OJ and EMJ had obligatory verb inflection for a number of modal and syntactic categories (including an important distinction between a conclusive and an (ad)nominalizing form), whereas the expression of aspect and tense was optional. Through late EMJ and LMJ this system changed completely to one without nominalizing inflection, but with obligatory inflection for tense.
The morphological pronominal system of OJ was lost in EMJ, which developed a range of lexical and lexically based terms of speaker and hearer reference. OJ had a two-way (speaker–nonspeaker) demonstrative system, which in EMJ was replaced by a three-way (proximal–mesial–distal) system.
OJ had a system of differential object marking, based on specificity, as well as a word order rule that placed accusative marked objects before most subjects; both of these features were lost in EMJ. OJ and EMJ had genitive subject marking in subordinate clauses and in focused, interrogative and exclamative main clauses, but no case marking of subjects in declarative, optative, or imperative main clauses and no nominative marker. Through LMJ genitive subject marking was gradually circumscribed and a nominative case particle was acquired which could mark subjects in all types of clauses.
OJ had a well-developed system of complex predicates, in which two verbs jointly formed the predicate of a single clause, which is the source of the LMJ and NJ (Modern Japanese) verb–verb compound complex predicates. OJ and EMJ also had mono-clausal focus constructions that functionally were similar to clefts in English; these constructions were lost in LMJ.
Reduplication is a word-formation process in which all or part of a word is repeated to convey some form of meaning. A wide range of patterns are found in terms of both the form and meaning expressed by reduplication, making it one of the most studied phenomena in phonology and morphology. Because the form always varies, depending on the base to which it is attached, it raises many issues such as the nature of the repetition mechanism, how to represent reduplicative morphemes, and whether or not a unified approach can be proposed to account for the full range of patterns.
Polysynthesis is informally understood as the packing of a large number of morphemes into single words, as in (1) from Bininj Gun-wok (Evans, in press).
'I cooked the wrong meat for them again.'
Its status as a distinct typological category into which some of the world’s languages fall, on a par with isolating, agglutinating, or fusional languages, has been controversial from the start. Nevertheless, researchers working with these languages are seldom in doubt as to their status as distinct from these other morphological types. This has been complicated by the fact that the speakers of such languages are largely limited to hunter-gatherers—or were so in the not too distant past—so the temptation is to link the phenomenon directly to way of life. This proves to be oversimplified, although it is certainly true that languages qualifying as polysynthetic are almost everywhere spoken in peripheral regions and are on the decline in the modern world—few children are learning them today.
Perhaps the most pervasive of the traits that give these languages the impression of a “special” status is that of holophrasis, which can be defined as the (possible) expression of what in less synthetic languages would be whole sentences in single complex (usually verbal) words. It turns out, however, that there is much greater variety among polysynthetic languages than is generally thought: there are few other traits that they all share, although distinct subtypes can in fact be distinguished, notably the affixing as opposed to the incorporating type.
These languages have considerable importance for the investigation of the diachronic complexification of languages in general and of language acquisition by children, as well as for theories of language universals. The sociolinguistic factors behind their development have only recently begun to be studied in depth. All polysynthetic languages today are to some degree endangered (they are dying off at an alarming rate), and many have been poorly studied if at all, which makes their investigation before it is too late a prime goal for linguistics.
Christina L. Gagné
Psycholinguistics is the study of how language is acquired, represented, and used by the human mind; it draws on knowledge about both language and cognitive processes. A central topic of debate in psycholinguistics concerns the balance between storage and processing. This debate is especially evident in research concerning morphology, which is the study of word structure, and several theoretical issues have arisen concerning the question of how (or whether) morphology is represented and what function morphology serves in the processing of complex words. Five theoretical approaches have emerged that differ substantially in the emphasis placed on the role of morphemic representations during the processing of morphologically complex words. The first approach minimizes processing by positing that all words, even morphologically complex ones, are stored and recognized as whole units, without the use of morphemic representations. The second approach posits that words are represented and processed in terms of morphemic units. The third approach is a mixture of the first two approaches and posits that a whole-access route and decomposition route operate in parallel. A fourth approach posits that both whole word representations and morphemic representations are used, and that these two types of information interact. A fifth approach proposes that morphology is not explicitly represented, but rather, emerges from the co-activation of orthographic/phonological representations and semantic representations. These competing approaches have been evaluated using a wide variety of empirical methods examining, for example, morphological priming, the role of constituent and word frequency, and the role of morphemic position. For the most part, the evidence points to the involvement of morphological representations during the processing of complex words. However, the specific way in which these representations are used is not yet fully known.