John E. Joseph
Ferdinand de Saussure (1857–1913), the founding figure of modern linguistics, made his mark on the field with a book he published a month after his 21st birthday, in which he proposed a radical rethinking of the original system of vowels in Proto-Indo-European. A year later, he submitted his doctoral thesis on a morpho-syntactic topic, the genitive absolute in Sanskrit, to the University of Leipzig. He went to Paris intending to do a second, French doctorate, but instead he was given responsibility for courses on Gothic and Old High German at the École Pratique des Hautes Études, and for managing the publications of the Société de Linguistique de Paris. He abandoned more than one large publication project of his own during the decade he spent in Paris. In 1891 he returned to his native Geneva, where the University created a chair in Sanskrit and the history and comparison of languages for him. He produced some significant work on Lithuanian during this period, connected to his early book on the Indo-European vowel system, and yielding Saussure’s Law, concerning the placement of stress in Lithuanian. He undertook writing projects about the general nature of language, but again abandoned them. In 1907, 1908–1909, and 1910–1911, he gave three courses in general linguistics at the University of Geneva, in which he developed an approach to languages as systems of signs, each sign consisting of a signifier (sound pattern) and a signified (concept), both of them mental rather than physical in nature, and conjoined arbitrarily and inseparably. The socially shared language system, or langue, makes possible the production and comprehension of parole, utterances, by individual speakers and hearers. Each signifier and signified is a value generated by its difference from all the other signifiers or signifieds with which it coexists on an associative (or paradigmatic) axis, and affected as well by its syntagmatic axis.
Shortly after Saussure’s death at 55, two of his colleagues, Bally and Sechehaye, gathered together students’ notes from the three courses, as well as manuscript notes by Saussure, and from them constructed the Cours de linguistique générale, published in 1916. Over the course of the next several decades, this book became the basis for the structuralist approach, initially within linguistics, and later adapted to other fields. Saussure left behind a large quantity of manuscript material that has gradually been published over the last few decades, and continues to be published, shedding new light on his thought.
David R. Mortensen
Hmong-Mien (also known as Miao-Yao) is a bipartite family of minority languages spoken primarily in China and mainland Southeast Asia. The two branches, called Hmongic and Mienic by most Western linguists and Miao and Yao by Chinese linguists, are both compact groups (phylogenetically if not geographically). Although they are uncontroversially distinct from one another, they bear a strong mutual affinity. But while their internal relationships are reasonably well established, there is no unanimity regarding their wider genetic affiliations, with many Chinese scholars insisting on Hmong-Mien membership in the Sino-Tibetan superfamily, some Western scholars suggesting a relationship to Austronesian and/or Tai-Kradai, and still others suggesting a relationship to Mon-Khmer. A plurality view appears to be that Hmong-Mien bears no special relationship to any surviving language family.
Hmong-Mien languages are typical—in many respects—of the non-Sino-Tibetan languages of Southern China and mainland Southeast Asia. However, they possess a number of properties that make them stand out. Many neighboring languages are tonal, but Hmong-Mien languages are, on average, more so (in terms of the number of tones). While some other languages in the area have small-to-medium consonant inventories, Hmong-Mien languages (and especially Hmongic languages) often have very large consonant inventories with rare classes of sounds like uvulars and voiceless sonorants. Furthermore, while many of their neighbors are morphologically isolating, few language groups display as little affixation as Hmong-Mien languages. They are largely head-initial, but they deviate from this generalization in their genitive-noun constructions and their relative clauses (which vary in position and structure, sometimes even within the same language).
Inflection is the systematic relation between words’ morphosyntactic content and their morphological form; as such, the phenomenon of inflection raises fundamental questions about the nature of morphology itself and about its interfaces. Within the domain of morphology proper, it is essential to establish how (or whether) inflection differs from other kinds of morphology and to identify the ways in which morphosyntactic content can be encoded morphologically. A number of different approaches to modeling inflectional morphology have been proposed; these tend to cluster into two main groups, those that are morpheme-based and those that are lexeme-based. Morpheme-based theories tend to treat inflectional morphology as fundamentally concatenative; they tend to represent an inflected word’s morphosyntactic content as a compositional summing of its morphemes’ content; they tend to attribute an inflected word’s internal structure to syntactic principles; and they tend to minimize the theoretical significance of inflectional paradigms. Lexeme-based theories, by contrast, tend to accord concatenative and nonconcatenative morphology essentially equal status as marks of inflection; they tend to represent an inflected word’s morphosyntactic content as a property set intrinsically associated with that word’s paradigm cell; they tend to assume that an inflected word’s internal morphology is neither accessible to nor defined by syntactic principles; and they tend to treat inflection as the morphological realization of a paradigm’s cells. Four important issues for approaches of either sort are the nature of nonconcatenative morphology, the incidence of extended exponence, the underdetermination of a word’s morphosyntactic content by its inflectional form, and the nature of word forms’ internal structure. The structure of a word’s inventory of inflected forms—its paradigm—is the locus of considerable cross-linguistic variation. 
In particular, the canonical relation of content to form in an inflectional paradigm is subject to a wide array of deviations, including inflection-class distinctions, morphomic properties, defectiveness, deponency, metaconjugation, and syncretism; these deviations pose important challenges for understanding the interfaces of inflectional morphology, and a theory’s resolution of these challenges depends squarely on whether that theory is morpheme-based or lexeme-based.
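The lexeme-based view described above, in which inflection realizes the cells of a paradigm, can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the function names `realize` and `paradigm`, the cell labels, and the two-lexeme English fragment are hypothetical conveniences, not part of any theory discussed here. The sketch treats a stem alternation (sing/sang) and a suffix (walk/walked) as equally legitimate exponents of a cell, and it shows syncretism where two cells of WALK share one form.

```python
# Hypothetical stem alternants for lexemes that inflect nonconcatenatively.
STEM_OVERRIDES = {
    ("SING", "past"): "sang",
    ("SING", "past_participle"): "sung",
}

# Hypothetical basic stems for the two lexemes in this toy fragment.
BASIC_STEMS = {"SING": "sing", "WALK": "walk"}


def realize(lexeme, cell):
    """Return the word form that realizes one paradigm cell of a lexeme."""
    if (lexeme, cell) in STEM_OVERRIDES:       # nonconcatenative exponence
        return STEM_OVERRIDES[(lexeme, cell)]
    stem = BASIC_STEMS[lexeme]
    if cell in ("past", "past_participle"):    # concatenative exponence
        return stem + "ed"
    if cell == "3sg_present":
        return stem + "s"
    return stem                                # default: the bare stem


def paradigm(lexeme, cells):
    """Build the full paradigm as a mapping from cells to word forms."""
    return {cell: realize(lexeme, cell) for cell in cells}


CELLS = ["base", "3sg_present", "past", "past_participle"]
walk = paradigm("WALK", CELLS)
sing = paradigm("SING", CELLS)
print(walk)
print(sing)
```

In this toy fragment the past and past participle cells of WALK receive the same form, while SING distinguishes them. The content of each cell is fixed by the paradigm and is not computed by summing the content of the word's parts, which is the contrast with morpheme-based approaches drawn above.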
This is an advance summary of a forthcoming article in the Oxford Research Encyclopedia of Linguistics. Please check back later for the full article.
The concept of innateness (innate is first recorded in the period 1375–1425; from Latin innātus “inborn”) relates to types of behavior and knowledge that are present in the organism since birth (in fact, since fertilization), prior to any sensory experience with the environment. The term has been applied to two general types of qualities. The first consists of instinctive and inflexible reflexes and behaviors, which are apparent in survival, mating, and rearing activities. The other relates to cognition, with certain concepts, ideas, propositions, and particular ways of mental computation suggested to be part of one’s biological makeup. While both types of innatism have a long history in human philosophy and science (e.g., Plato and Descartes), some bias appears to exist in favor of claims for inherent behavioral traits, which are typically accepted when satisfactory empirical evidence is provided. One famous example is Lorenz’s demonstration of imprinting, a natural phenomenon that obeys a predetermined mechanism and schedule (Lorenz’s incubator-hatched goslings imprinted on his boots, the first moving object they encountered). Likewise, there seems to be little controversy in regard to predetermined ways of organizing sensory information, as is the case with the detection and classification of shapes and colors by the mind. In contrast, the idea that certain types of abstract knowledge may be part of an organism’s biological endowment (i.e., not learned) is typically faced with a greater sense of skepticism, and touches on a fundamental question in epistemological philosophy: Can reason be based (to a certain extent) on a priori knowledge—that is, knowledge that precedes and is independent of experience? 
The most influential and controversial claim for such innate knowledge in modern science is Chomsky’s breakthrough nativist theory of Universal Grammar in language and the famous “Argument from the Poverty of the Stimulus.” The main Chomskyan hypothesis is that all human beings share a preprogrammed linguistic infrastructure consisting of a finite collection of rules that, in principle, may generate (through combination or transformation) an infinite number of (only) grammatical sentences. Thus, the innate grammatical system constrains and structures the acquisition and use of all natural languages.
Japanese psycholinguistics is moving rapidly in many different directions, as it spans various subfields of linguistics (e.g., phonetics/phonology, syntax, semantics, pragmatics, discourse studies). Naturally, diverse studies have reported intriguing findings that shed light on our language mechanism. This article presents a brief overview of some of the notable early 21st-century studies, mainly from the perspectives of language acquisition and processing. The topics are divided into sections on the sound system, the script forms, reading and writing, morpho-syntactic studies, word and sentential meanings, and pragmatics and discourse studies. Studies on special populations are also mentioned.
Studies on the Japanese sound system have advanced our understanding of L1 and L2 (first and second language) acquisition and processing. For instance, more evidence is provided that infants form an adult-like phonological grammar by 14 months in L1, and a dissociation of prosody from comprehension is reported in L2. Various cognitive factors, as well as the L1, influence the L2 acquisition process. Because users of Japanese employ three script forms (hiragana, katakana, and kanji) in a single sentence, orthographic processing research reveals multiple pathways for processing information and the influence of memory. Adult script decoding and lexical processing have been well studied, and research data from special populations further help us to understand our vision-to-language mapping mechanism. Morpho-syntactic and semantic studies include a long debate on the nativist (generative) and statistical learning approaches in L1 acquisition. In particular, inflectional morphology and quantificational scope interaction in L1 acquisition bring out the pros and cons of relying on either approach alone. Investigating processing mechanisms means studying cognitive/perceptual devices. Relative clause processing has been much discussed in Japanese because Japanese has a different word order (SOV) from English (SVO), allows unpronounced pronouns and pre-verbal word permutations, and has no relative clause marking at the verbal ending (i.e., the ending is morphologically the same as the matrix ending). Behavioral and neurolinguistic data increasingly support incremental processing, as in SVO languages, and an expectancy-driven processor in our L1 brain. L2 processing, however, requires more study to uncover its mechanism, as the literature is scarce on both L2 English by Japanese speakers and L2 Japanese by non-Japanese speakers. Pragmatic and discourse processing is also an area that needs to be explored further.
Despite the typological difference between English and Japanese, the studies cited here indicate that our acquisition and processing devices seem to adjust locally while maintaining the universal mechanism.
This is an advance summary of a forthcoming article in the Oxford Research Encyclopedia of Linguistics. Please check back later for the full article.
Kra–Dai, also known as Tai–Kadai, Daic, and Kadai, is a family of highly diverse languages found in southern China, northeast India, and Southeast Asia. The number of these languages is estimated to be close to a hundred, with approximately 100 million speakers all over the world. As the name itself suggests, Kra-Dai is made up of two major groups, Kra and Dai. The former refers to a number of lesser-known languages, some of which have only a few hundred fluent speakers or even fewer. The latter (also known as Tai, or Kam-Tai) is well established, and comprises the best-known members of the family: Thai and Lao, the national languages of Thailand and Laos, whose speakers account for over half of the Kra-Dai population.
The ultimate genetic affiliation of Kra-Dai remains controversial, although a consensus among Western scholars holds that it belongs under Austronesian. The majority of Kra-Dai languages, particularly the Kra languages, have no writing systems of their own. Languages with writing systems include Thai, Lao, Sipsongpanna Dai, and Tai Lue, which use Indic-based scripts. Others, such as Zhuang and Kam-Sui in southern China and surrounding regions, use Chinese character-based scripts. Romanized scripts were also introduced by the government in the 1950s for the Zhuang and Kam-Sui languages. Almost every group within Kra-Dai has a rich oral history tradition.
The languages are typically tonal, isolating, and analytic, lacking in inflectional morphology, with no distinction for number and gender. A significant number of basic vocabulary items are mono-syllabic, but bi-syllabic and multi-syllabic compounds also abound. There are morphological processes in which etymologically related words manifest themselves in groups through tonal, initial, or vowel alternations. Reduplication is a salient word formation mechanism. In syntax, the Kra-Dai languages can be said to have basic SVO word order. They possess a rich system of noun classifiers. Other features include verb serialization without overt marking to indicate grammatical relations. A number of lexical items (mostly verbs) may function as grammatical morphemes in syntactic operations. Temporal and aspectual meanings are expressed through tense-aspect markers typically derived from verbs, while mood and modality are conveyed via a rich array of discourse particles.
Phenomena involving the displacement of syntactic units are widespread in human languages. The term displacement refers here to a dependency relation whereby a given syntactic constituent is interpreted simultaneously in two different positions. Only one position is pronounced, in general the hierarchically higher one in the syntactic structure. Consider a wh-question like (1) in English:
(1) Whom did you give the book to <whom>
The phrase containing the interrogative wh-word is located at the beginning of the clause, and this guarantees that the clause is interpreted as a question about this phrase; at the same time, whom is interpreted as part of the argument structure of the verb give (the copy, in <> brackets). In current terms, inspired by minimalist developments in generative syntax, the phrase whom is first merged as (one of) the complement(s) of give (External Merge) and then re-merged (Internal Merge, i.e., movement) in the appropriate position in the left periphery of the clause. This peripheral area of the clause hosts operator-type constituents, interrogative ones among them (yielding the relevant interpretation: for which x, you gave the book to x, for sentence 1). Scope-discourse phenomena, such as the raising of a question as in (1) or the focalization of one constituent as in TO JOHN I gave the book (not to Mary), have the effect that an argument of the verb is fronted in the left periphery of the clause rather than filling its clause-internal complement position, whence the term displacement. Displacement can be to a position relatively close to the one of first merge (the copy), or else it can be to a position farther away. In the latter case, the relevant dependency becomes more long-distance than in (1), as in (2)a and even more so (2)b:
(2) a. Whom did Mary expect [that you would give the book to <whom>]
    b. Whom do you think [that Mary expected [that you would give the book to <whom>]]
Some fifty years of investigation into locality in formal generative syntax have shown that, despite its potentially very distant realization, syntactic displacement is in fact a local process. The audible position in which a moved constituent is pronounced and the position of its copy inside the clause can be far from each other. However, the long-distance dependency is split into steps through iterated applications of short movements, so that any dependency holding between two occurrences of the same constituent is in fact very local. Furthermore, there are syntactic domains that resist movement out of them, traditionally referred to as islands. Locality is a core concept of syntactic computations. Syntactic locality requires that syntactic computations apply within small domains (cyclic domains), possibly in the iterated way just described (successive cyclicity), currently reconceived in terms of Phase theory. Furthermore, in the Relativized Minimality tradition, syntactic locality requires that, given X . . . Z . . . Y, the dependency between the relevant constituent in its target position X and its first merge position Y should not be interrupted by any constituent Z which is similar to X in relevant formal features and thus intervenes, blocking the relation between X and Y. Intervention locality has also been shown to allow for an explicit characterization of aspects of children’s linguistic development in their capacity to compute complex object dependencies (also relevant in different impaired populations).
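The intervention configuration X . . . Z . . . Y can be made concrete with a short Python sketch. It is a deliberately simplified toy, not a formalization from the literature: it assumes that constituents are flat sets of feature labels, that "similar in relevant formal features" means sharing at least one feature from a chosen set, and that the material between X and Y is given as a plain list rather than computed from hierarchical structure. The names `Constituent` and `is_blocked` and the feature labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constituent:
    label: str
    features: frozenset  # hypothetical feature labels, e.g. {"wh", "N"}


def is_blocked(target, interveners, relevant):
    """Return True if some Z between X and Y matches X in a relevant feature.

    target:      the moved constituent in its target position (X)
    interveners: constituents assumed to lie between X and its copy (Z)
    relevant:    the features assumed to matter for this dependency
    """
    x_feats = target.features & relevant
    # A non-empty intersection means Z shares a relevant feature with X.
    return any(z.features & x_feats for z in interveners)


whom = Constituent("whom", frozenset({"wh", "N"}))
mary = Constituent("Mary", frozenset({"N"}))
who = Constituent("who", frozenset({"wh", "N"}))

# Assumption: only the wh feature is relevant for a question dependency.
relevant = frozenset({"wh"})

no_intervention = is_blocked(whom, [mary], relevant)  # Mary lacks wh
wh_intervention = is_blocked(whom, [who], relevant)   # another wh-phrase
print(no_intervention, wh_intervention)
```

Under these assumptions, a non-interrogative subject such as Mary does not interrupt the dependency, whereas another wh-phrase does. Restricting the comparison to `relevant` features is what makes the minimality condition relativized rather than absolute.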
Laura A. Michaelis
Meanings are assembled in various ways in a construction-based grammar, and this array can be represented as a continuum of idiomaticity, a gradient of lexical fixity. Constructional meanings are the meanings to be discovered at every point along the idiomaticity continuum. At the leftmost, or ‘fixed,’ extreme of this continuum are frozen idioms, like the salt of the earth and in the know. The set of frozen idioms includes those with idiosyncratic syntactic properties, like the fixed expression by and large (an exceptional pattern of coordination in which a preposition and adjective are conjoined). Other frozen idioms, like the unexceptionable modified noun red herring, feature syntax found elsewhere. At the rightmost, or ‘open’ end of this continuum are fully productive patterns, including the rule that licenses the string Kim blinked, known as the Subject-Predicate construction. Between these two poles are (a) lexically fixed idiomatic expressions, verb-headed and otherwise, with regular inflection, such as chew/chews/chewed the fat; (b) flexible expressions with invariant lexical fillers, including phrasal idioms like spill the beans and the Correlative Conditional, such as the more, the merrier; and (c) specialized syntactic patterns without lexical fillers, like the Conjunctive Conditional (e.g., One more remark like that and you’re out of here). Construction Grammar represents this range of expressions in a uniform way: whether phrasal or lexical, all are modeled as feature structures that specify phonological and morphological structure, meaning, use conditions, and relevant syntactic information (including syntactic category and combinatoric potential).
Computational models of human sentence comprehension help researchers reason about how grammar might actually be used in the understanding process. Taking a cognitivist approach, this article relates computational psycholinguistics to neighboring fields (such as linguistics), surveys important precedents, and catalogs open problems.
Phonotactics is the study of restrictions on possible sound sequences in a language. In any language, some phonotactic constraints can be stated without reference to morphology, but many of the more nuanced phonotactic generalizations do make use of morphosyntactic and lexical information. At the most basic level, many languages mark edges of words in some phonological way. Different phonotactic constraints hold of sounds that belong to the same morpheme as opposed to sounds that are separated by a morpheme boundary. Different phonotactic constraints may apply to morphemes of different types (such as roots versus affixes). There are also correlations between a morpheme’s phonotactic shape and whether it follows certain morphosyntactic and phonological rules; these correlations may track syntactic category, declension class, or etymological origin.
Approaches to the interaction between phonotactics and morphology address two questions: (1) how to account for rules that are sensitive to morpheme boundaries and structure and (2) how to determine the status of phonotactic constraints associated with only some morphemes. Theories differ as to how much morphological information phonology is allowed to access. In some theories of phonology, any reference to the specific identities or subclasses of morphemes would exclude a rule from the domain of phonology proper. These rules are either part of the morphology or are not given the status of a rule at all. Other theories allow the phonological grammar to refer to detailed morphological and lexical information. Depending on the theory, phonotactic differences between morphemes may receive direct explanations or be seen as the residue of historical change and not something that constitutes grammatical knowledge in the speaker’s mind.
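The idea that a phonotactic constraint can hold within a morpheme while being suspended across a morpheme boundary can be illustrated with a minimal Python sketch. Everything in it is invented for illustration: the consonant inventory, the constraint itself (no two adjacent identical consonants inside a single morpheme), and the forms "tat", "ta", and "tatta" belong to no real language, and the function names are hypothetical. Words are assumed to arrive already segmented into morphemes.

```python
CONSONANTS = set("ptkbdgmnszlr")  # hypothetical consonant inventory


def violates_within_morpheme(morpheme):
    """True if the morpheme contains two adjacent identical consonants."""
    return any(
        a == b and a in CONSONANTS
        for a, b in zip(morpheme, morpheme[1:])
    )


def is_well_formed(morphemes):
    """Check the constraint inside each morpheme, ignoring boundaries."""
    return not any(violates_within_morpheme(m) for m in morphemes)


# "tat" + "ta": the tt sequence spans a boundary, so it is permitted.
across_boundary = is_well_formed(["tat", "ta"])
# "tatta" as one morpheme: the same sequence is now morpheme-internal.
within_morpheme = is_well_formed(["tatta"])
print(across_boundary, within_morpheme)
```

The same surface string is accepted or rejected depending only on where the morpheme boundary falls. A checker of this kind needs access to morphological structure, which is the point on which the theories described above disagree.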