Paul de Lacy
Phonology has both a taxonomic/descriptive meaning and a cognitive meaning. In the taxonomic/descriptive sense, it refers to speech sound systems. As a cognitive term, it refers to a part of the brain’s ability to produce and perceive speech sounds. This article focuses on research in the cognitive domain.
The brain does not simply record speech sounds and “play them back.” It abstracts over speech sounds, and transforms the abstractions in nontrivial ways. Phonological cognition is about what those abstractions are, and how they are transformed in perception and production.
There are many theories about phonological cognition. Some theories see it as the result of domain-general mechanisms, such as analogy over a Lexicon. Other theories locate it in an encapsulated module that is genetically specified, and has innate propositional content. In production, this module takes as its input phonological material from a Lexicon, and refers to syntactic and morphological structure in producing an output, which involves nontrivial transformation. In some theories, the output is instructions for articulator movement, which result in speech sounds; in other theories, the output goes to the Phonetic module. In perception, a continuous acoustic signal is mapped onto a phonetic representation, which is then mapped onto underlying forms via the Phonological module, which are then matched to lexical entries.
Exactly which empirical phenomena phonological cognition is responsible for depends on the theory. At one extreme, it accounts for all human speech sound patterns and realization. At the other extreme, it is little more than a way of abstracting over speech sounds. In the most popular Generative conception, it explains some sound patterns, with other modules (e.g., the Lexicon and Phonetic module) accounting for others. There are many types of patterns, with names such as “assimilation,” “deletion,” and “neutralization”—a great deal of phonological research focuses on determining which patterns there are, which aspects are universal and which are language-particular, and whether/how phonological cognition is responsible for them.
Phonological computation connects with other cognitive structures. In the Generative T-model, the phonological module’s input includes morphs of Lexical items along with at least some morphological and syntactic structure; the output is sent to either a Phonetic module, or directly to the neuro-motor interface, resulting in articulator movement. However, other theories propose that these modules’ computation proceeds in parallel, and that there is bidirectional communication between them.
The study of phonological cognition is a young science, so many fundamental questions remain to be answered. There are currently many different theories, and theoretical diversity over the past few decades has increased rather than consolidated. In addition, new research methods have been developed and older ones have been refined, providing novel sources of evidence. Consequently, phonological research is both lively and challenging, and is likely to remain that way for some time to come.
Marilyn May Vihman
Child phonological templates are idiosyncratic word production patterns. They can be understood as deriving, through generalization of patterning, from the very first words of the child, which are typically close in form to their adult targets. Templates can generally be identified only some time after a child’s first 20–50 words have been produced but before the child has achieved an expressive lexicon of 200 words. The templates appear to serve as a kind of ‘holding strategy’, a way for children to produce more complex adult word forms while remaining within the limits imposed by the articulatory, planning, and memory limitations of the early word period. Templates have been identified in the early words of children acquiring a number of languages, although not all children give clear evidence of using them. Within a given language we see a range of different templatic patterns, but these are nevertheless broadly shaped by the prosodic characteristics of the adult language as well as by the idiosyncratic production preferences of a given child; it is thus possible to begin to outline a typology of child templates. However, the evidence base for most languages remains small, ranging from individual diary studies to rare longitudinal studies of as many as 30 children. Thus templates undeniably play a role in phonological development, but their extent of use or generality remains unclear, their timing for the children who show them is unpredictable, and their period of sway is typically brief—a matter of a few weeks or months at most. Finally, the formal status and relationship of child phonological templates to adult grammars has so far received relatively little attention, but the closest parallels may lie in active novel word formation and in the lexicalization of commonly occurring expressions, both of which draw, like child templates, on the mnemonic effects of repetition.
When the phonological form of a morpheme—a unit of meaning that cannot be decomposed further into smaller units of meaning—involves a particular melodic pattern as part of its sound shape, this morpheme is specified for tone. In view of this definition, phrase- and utterance-level melodies—also known as intonation—are not to be interpreted as instances of tone. That is, whereas the question “Tomorrow?” may be uttered with a rising melody, this melody is not tone, because it is not a part of the lexical specification of the morpheme tomorrow. A language that presents morphemes that are specified with specific melodies is called a tone language. It is not the case that in a tone language every morpheme, content word, or syllable would be specified for tone. Tonal specification can be highly restricted within the lexicon. Examples of such sparsely specified tone languages include Swedish, Japanese, and Ekagi (a language spoken in the Indonesian part of New Guinea); in these languages, only some syllables in some words are specified for tone. There are also tone languages where each and every syllable of each and every word has a specification. Vietnamese and Shilluk (a language spoken in South Sudan) illustrate this configuration. Tone languages also vary greatly in terms of the inventory of phonological tone forms. The smallest possible inventory contrasts one specification with the absence of specification. But there are also tone languages with eight or more distinctive tone categories. The physical (acoustic) realization of the tone categories is primarily fundamental frequency (F0), which is perceived as pitch. However, often other phonetic correlates are also involved, in particular voice quality. Tone plays a prominent role in the study of phonology because of its structural complexity. 
That is, in many languages, the way a tone surfaces is conditioned by factors such as the segmental composition of the morpheme, the tonal specifications of surrounding constituents, morphosyntax, and intonation. On top of this, tone is diachronically unstable. This means that, when a language has tone, we can expect to find considerable variation between dialects, and more of it than in relation to other parts of the sound system.
Matthew K. Gordon
Metrical structure refers to the phonological representations capturing the prominence relationships between syllables, usually manifested phonetically as differences in levels of stress. There is considerable diversity in the range of stress systems found cross-linguistically, although attested patterns represent a small subset of those that are logically possible. Stress systems may be broadly divided into two groups, based on whether they are sensitive to the internal structure, or weight, of syllables or not, with further subdivisions based on the number of stresses per word and the location of those stresses. An ongoing debate in metrical stress theory concerns the role of constituency in characterizing stress patterns. Certain approaches capture stress directly in terms of a metrical grid in which more prominent syllables are associated with a greater number of grid marks than less prominent syllables. Others assume the foot as a constituent, where theories differ in the inventory of feet they assume. Support for foot-based theories of stress comes from segmental alternations that are explicable with reference to the foot but do not readily emerge in an apodal framework. Computational tools, increasingly, are being incorporated in the evaluation of phonological theories, including metrical stress theories. Computer-generated factorial typologies provide a rigorous means for determining the fit between the empirical coverage afforded by metrical theories and the typology of attested stress systems. Computational simulations also enable assessment of the learnability of metrical representations within different theories.
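The grid-based idea described above, in which more prominent syllables carry more grid marks, can be illustrated with a minimal sketch. The stress pattern used here (stress on every odd-numbered syllable counting from the left, with primary stress on the first syllable) is an assumption chosen purely for illustration and is not taken from any particular language or theory. The function names `metrical_grid` and `render` are hypothetical.

```python
def metrical_grid(n_syllables: int) -> list[int]:
    """Return the height of each grid column for an assumed alternating pattern."""
    heights = []
    for i in range(n_syllables):
        marks = 1          # every syllable projects one mark on the bottom row
        if i % 2 == 0:
            marks += 1     # 1st, 3rd, 5th ... syllables are stressed (assumption)
        if i == 0:
            marks += 1     # the leftmost stress is the primary stress (assumption)
        heights.append(marks)
    return heights


def render(heights: list[int]) -> str:
    """Draw the grid row by row, highest level first; 'x' is a grid mark."""
    top = max(heights, default=0)
    rows = []
    for level in range(top, 0, -1):
        rows.append(" ".join("x" if h >= level else "." for h in heights))
    return "\n".join(rows)


if __name__ == "__main__":
    print(render(metrical_grid(5)))
```

In this sketch, relative prominence is read directly off column height, with no reference to feet or any other constituent, which is the property that distinguishes grid-only approaches from foot-based ones.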
Corpus Phonology is an approach to phonology that places corpora at the center of phonological research. Some practitioners of corpus phonology see corpora as the only object of investigation; others use corpora alongside other available techniques (for instance, intuitions, psycholinguistic and neurolinguistic experimentation, laboratory phonology, the study of the acquisition of phonology or of language pathology, etc.). Whatever version of corpus phonology one advocates, corpora have become part and parcel of the modern research environment, and their construction and exploitation have been modified by the multidisciplinary advances made within various fields. Indeed, for the study of spoken usage, the term ‘corpus’ should nowadays only be applied to bodies of data meeting certain technical requirements, even if corpora of spoken usage are by no means new and coincide with the birth of recording techniques. It is therefore essential to understand what criteria must be met by a modern corpus (quality of recordings, diversity of speech situations, ethical guidelines, time-alignment with transcriptions and annotations, etc.) and what tools are available to researchers. Once these requirements are met, the way is open to varying and possibly conflicting uses of spoken corpora by phonological practitioners. A traditional stance in theoretical phonology sees the data as a degenerate version of a more abstract underlying system, but more and more researchers within various frameworks (e.g., usage-based approaches, exemplar models, stochastic Optimality Theory, sociophonetics) are constructing models that tightly bind phonological competence to language use, rely heavily on quantitative information, and attempt to account for intra-speaker and inter-speaker variation. This renders corpora essential to phonological research and not a mere adjunct to the phonological description of the languages of the world.
This is an advance summary of a forthcoming article in the Oxford Research Encyclopedia of Linguistics. Please check back later for the full article.
Autosegments were introduced by John Goldsmith in his 1976 MIT dissertation to represent tone and other suprasegmental phenomena. Goldsmith’s intuition, embodied in the term he created, was that autosegments constituted an independent, conceptually equal tier of phonological representation, with both tiers realized simultaneously like the separate voices in a musical score.
The analysis of suprasegmentals came late to generative phonology, even though it had been tackled in American structuralism with the long components of Harris 1944 and despite being a particular focus of Firthian prosodic analysis. The standard version of generative phonology of the era (Chomsky & Halle’s The Sound Pattern of English) made no special provision for phenomena that had been labeled suprasegmental or prosodic by earlier traditions.
An early sign that tones required a separate tier of representation was the phenomenon of tonal stability. In many tone languages, when vowels are lost historically or synchronically, their tones remain. The behavior of contour tones in many languages also falls into place when the contours are broken down into sequences of level tones on an independent level of representation. The autosegmental framework captured this naturally, since a sequence of elements on one tier can be connected to a single element on another. But the single most compelling aspect of the early autosegmental model was a natural account of tone spreading, a very common process that was only awkwardly captured by rules of whatever sort. Goldsmith’s autosegmental solution was the well-formedness condition, requiring, among other things, that every tone on the tonal tier be associated with some segment on the segmental tier, and vice versa. Tones thus spread more or less automatically to segments lacking them. The condition of well-formedness, at the very core of the autosegmental framework, was a rare constraint, posited nearly two decades before optimality theory.
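The effect of the well-formedness condition can be sketched in a few lines of code. This is an illustrative simplification rather than Goldsmith’s own formalization. It assumes that tones and tone-bearing units are first linked one-to-one from left to right, that the last tone then spreads onto any units still lacking a tone, and that any leftover tones dock onto the last unit to form a contour. The function names `associate` and `surface` are hypothetical.

```python
def associate(tones: list[str], n_units: int) -> list[tuple[int, int]]:
    """Return association lines as (tone index, unit index) pairs."""
    if not tones or n_units == 0:
        return []
    # Assumed first step: one-to-one linking, left to right.
    links = [(i, i) for i in range(min(len(tones), n_units))]
    # Spreading: units without a tone receive the last tone.
    for unit in range(len(tones), n_units):
        links.append((len(tones) - 1, unit))
    # Contour formation: tones without a unit dock onto the last unit.
    for tone in range(n_units, len(tones)):
        links.append((tone, n_units - 1))
    return links


def surface(tones: list[str], n_units: int) -> list[str]:
    """Read off the tone or tone sequence realized on each unit."""
    links = associate(tones, n_units)
    return [
        "".join(tones[t] for t, u in links if u == unit)
        for unit in range(n_units)
    ]


if __name__ == "__main__":
    print(surface(["L", "H"], 4))   # more units than tones: the H spreads
    print(surface(["H", "L"], 1))   # more tones than units: a falling contour
```

With two tones and four units, the final tone is linked to three units (a one-to-many association); with two tones and one unit, both tones are linked to the same unit, which is how a contour is represented as a sequence of level tones.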
One-to-many associations and spreading onto adjacent elements are characteristic of tone but not confined to it. Similar behaviors are widespread in long-distance phenomena including intonation, vowel harmony, and nasal prosodies, as well as more locally with partial or full assimilation across adjacent segments. A major discovery, in Mark Liberman’s 1975 MIT dissertation, was that autosegmental tiers have hierarchical structure, with Goldsmith’s autosegments as the terminal elements of those structures.
The early autosegmental notion of tiers of representation that were distinct but conceptually equal soon gave way to a model with one basic tier—called the skeleton or CV tier—connected to tiers for particular kinds of articulation, including tone and intonation, nasality, vowel features, and others. This has led to hierarchical representations of phonological features in current models of feature geometry, replacing the unordered distinctive feature matrices of early generative phonology.
Autosegmental representations and processes also provide a means of representing nonconcatenative morphology, notably the complex interweaving of roots and patterns in Semitic languages.
Marie K. Huffman
Articulatory phonetics is concerned with the physical mechanisms involved in producing spoken language. A fundamental goal of articulatory phonetics is to relate linguistic representations to articulator movements in real time and the consequent acoustic output that makes speech a medium for information transfer. Understanding the overall process requires an appreciation of the aerodynamic conditions necessary for sound production and the way that the various parts of the chest, neck, and head are used to produce speech. One descriptive goal of articulatory phonetics is the efficient and consistent description of the key articulatory properties that distinguish sounds used contrastively in language. There is fairly strong consensus in the field about the inventory of terms needed to achieve this goal. Despite this common, segmental, perspective, speech production is essentially dynamic in nature. Much remains to be learned about how the articulators are coordinated for production of individual sounds and how they are coordinated to produce sounds in sequence. Cutting across all of these issues is the broader question of which aspects of speech production are due to properties of the physical mechanism and which are the result of the nature of linguistic representations. A diversity of approaches is used to try to tease apart the physical and the linguistic contributions to the articulatory fabric of speech sounds in the world’s languages. A variety of instrumental techniques are currently available, and improvement in safe methods of tracking articulators in real time promises to soon bring major advances in our understanding of how speech is produced.
Kodi Weatherholtz and T. Florian Jaeger
The seeming ease with which we usually understand each other belies the complexity of the processes that underlie speech perception. One of the biggest computational challenges is that different talkers realize the same speech categories (e.g., /p/) in physically different ways. We review the mixture of processes that enable robust speech understanding across talkers despite this lack of invariance. These processes range from automatic pre-speech adjustments of the distribution of energy over acoustic frequencies (normalization) to implicit statistical learning of talker-specific properties (adaptation, perceptual recalibration) to the generalization of these patterns across groups of talkers (e.g., gender differences).
Phonological learnability deals with the formal properties of phonological grammars when combined with algorithms that attempt to learn the language-specific aspects of those grammars. The classical learning task can be outlined as follows: beginning at a pre-determined initial state, the learner is exposed to positive evidence of legal strings and structures from the target language, and its goal is to reach a pre-determined end state, where the grammar will produce or accept all, and only, the target language’s strings and structures. In addition, a phonological learner must also acquire a set of language-specific representations for morphemes, words, and so on, and in many cases, the grammar and the representations must be acquired at the same time.
Phonological learnability research seeks to determine how the architecture of the grammar, and the workings of an associated learning algorithm, influence success in completing this learning task, that is, in reaching the end-state grammar. One basic question is about convergence: Is the learning algorithm guaranteed to converge on an end-state grammar, or will it never stabilize? Is there a class of initial states, or a kind of learning data (evidence), which can prevent a learner from converging?
Next is the question of success: Assuming the algorithm will reach an end state, will it match the target? In particular, will the learner ever acquire a grammar that deems grammatical a superset of the target language’s legal outputs? How can the learner avoid such superset end-state traps, whether by calibration of the initial state, biases in the learning algorithm, or other methods?
A third question considers the time-course of learning: How long does the learner take to reach the end state? How does the time to convergence and/or success increase as the grammar becomes more complex and the evidence set becomes larger? Are some grammars too complex to ever be learned?
In assessing phonological learnability, the analyst also has many differences between potential learning algorithms to consider. At the core of any algorithm is its update rule, meaning its method(s) of changing the current grammar on the basis of evidence. Other key aspects of an algorithm include how it is triggered to learn, how it processes and/or stores the errors that it makes, and how it responds to noise or variability in the learning data. Ultimately, the choice of algorithm is also tied to the type of phonological grammar being learned—whether the generalizations being learned are couched within rules, features, parameters, constraints, rankings, and/or weightings.
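As one concrete illustration of an update rule, the sketch below implements an error-driven, perceptron-style update over weighted constraints. This is only one of the many algorithm and grammar types mentioned above, chosen here for brevity. The constraint names (NoCoda, Max), the candidate set, the starting weights, and the function names are all hypothetical. The learner changes its grammar only when its own predicted output differs from the observed form.

```python
def harmony(weights: list[float], violations: list[int]) -> float:
    """Harmony is the negative weighted sum of constraint violations."""
    return -sum(w * v for w, v in zip(weights, violations))


def predict(weights: list[float], candidates: dict[str, list[int]]) -> str:
    """Pick the candidate with the highest harmony under the current weights."""
    return max(candidates, key=lambda name: harmony(weights, candidates[name]))


def update(weights, candidates, observed, rate=1.0):
    """Error-driven update: adjust weights only if the prediction is wrong."""
    guess = predict(weights, candidates)
    if guess == observed:
        return weights                      # no error, so no learning step
    wrong, right = candidates[guess], candidates[observed]
    # Raise weights of constraints the wrong guess violates more often,
    # lower weights of constraints the observed form violates more often.
    return [max(0.0, w + rate * (g - o)) for w, g, o in zip(weights, wrong, right)]


if __name__ == "__main__":
    # Violation vectors are ordered as [NoCoda, Max] (hypothetical constraints).
    candidates = {"pat": [1, 0], "pa": [0, 1]}
    weights = [2.0, 1.0]                    # assumed initial state
    print(predict(weights, candidates))     # learner's current output
    weights = update(weights, candidates, observed="pat")
    print(weights, predict(weights, candidates))
```

In this toy case a single error suffices to reach a grammar that reproduces the observed form, after which further exposure to the same datum triggers no change. Questions of convergence, success, and time-course arise once the data set and constraint set are larger.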
Child phonology refers to virtually every phonetic and phonological phenomenon observable in the speech productions of children, including babbles. This includes qualitative and quantitative aspects of babbled utterances as well as all behaviors such as the deletion or modification of the sounds and syllables contained in the adult (target) forms that the child is trying to reproduce in his or her spoken utterances. This research is also increasingly concerned with issues in speech perception, a field of investigation that has traditionally followed its own course; it is only recently that the two fields have started to converge. The recent history of research on child phonology, the theoretical approaches and debates surrounding it, as well as the research methods and resources that have been employed to address these issues empirically, parallel the evolution of phonology, phonetics, and psycholinguistics as general fields of investigation. Child phonology contributes important observations, often organized in terms of developmental time periods, which can extend from the child’s earliest babbles to the stage when he or she masters the sounds, sound combinations, and suprasegmental properties of the ambient (target) language. Central debates within the field of child phonology concern the nature and origins of phonological representations as well as the ways in which they are acquired by children. Since the mid-1900s, the most central approaches to these questions have tended to fall on each side of the general divide between generative vs. functionalist (usage-based) approaches to phonology. Traditionally, generative approaches have embraced a universal stance on phonological primitives and their organization within hierarchical phonological representations, assumed to be innately available as part of the human language faculty. 
In contrast to this, functionalist approaches have utilized flatter (non-hierarchical) representational models and rejected nativist claims about the origin of phonological constructs. Since the beginning of the 1990s, this divide has been blurred significantly, both through the elaboration of constraint-based frameworks that incorporate phonetic evidence, from both speech perception and production, as part of accounts of phonological patterning, and through the formulation of emergentist approaches to phonological representation. Within this context, while controversies remain concerning the nature of phonological representations, debates are fueled by new outlooks on factors that might affect their emergence, including the types of learning mechanisms involved, the nature of the evidence available to the learner (e.g., perceptual, articulatory, and distributional), as well as the extent to which the learner can abstract away from this evidence. In parallel, recent advances in computer-assisted research methods and data availability, especially within the context of the PhonBank project, offer researchers unprecedented support for large-scale investigations of child language corpora. This combination of theoretical and methodological advances provides new and fertile grounds for research on child phonology and related implications for phonological theory.