PRINTED FROM the OXFORD RESEARCH ENCYCLOPEDIA, LINGUISTICS ( (c) Oxford University Press USA, 2016. All Rights Reserved. Personal use only; commercial use is strictly prohibited. Please see applicable Privacy Policy and Legal Notice (for details see Privacy Policy).

date: 26 May 2017

Dispersion Theory and Phonology

Summary and Keywords

Dispersion Theory concerns the constraints that govern contrasts, the phonetic differences that can distinguish words in a language. Specifically, it posits that there are distinctiveness constraints that favor contrasts that are more perceptually distinct over less distinct contrasts. The preference for distinct contrasts is hypothesized to follow from a preference to minimize perceptual confusion: In order to recover what a speaker is saying, a listener must identify the words in the utterance. The more confusable words are, the more likely a listener is to make errors. Because contrasts are the minimal permissible differences between words in a language, banning indistinct contrasts reduces the likelihood of misperception.

The term ‘dispersion’ refers to the separation of sounds in perceptual space that results from maximizing the perceptual distinctiveness of the contrasts between those sounds, and is adopted from Lindblom’s Theory of Adaptive Dispersion, a theory of phoneme inventories according to which inventories are selected so as to maximize the perceptual differences between phonemes. These proposals follow a long tradition of explaining cross-linguistic tendencies in the phonetic and phonological form of languages in terms of a preference for perceptually distinct contrasts.

Flemming proposes that distinctiveness constraints constitute one class of constraints in an Optimality Theoretic model of phonology. In this context, distinctiveness constraints predict several basic phenomena, the first of which is the preference for maximal dispersion in inventories of contrasting sounds that first motivated the development of the Theory of Adaptive Dispersion. But distinctiveness constraints are formulated as constraints on the surface forms of possible words that interact with other phonological constraints, so they evaluate the distinctiveness of contrasts in context. As a result, Dispersion Theory predicts that contrasts can be neutralized or enhanced in particular phonological contexts. This prediction arises because the phonetic realization of sounds depends on their context, so the perceptual differences between contrasting sounds also depend on context. If the realization of a contrast in a particular context would be insufficiently distinct (i.e., it would violate a high-ranked distinctiveness constraint), there are two options: the offending contrast can be neutralized, or it can be modified (‘enhanced’) to make it more distinct.

A basic open question regarding Dispersion Theory concerns the proper formulation of distinctiveness constraints and the extent of variation in their rankings across languages, issues that are tied up with questions about the nature of perceptual distinctiveness. Another concerns the size and nature of the comparison set of contrasting word-forms required to evaluate whether a candidate output satisfies distinctiveness constraints.

Keywords: contrast, neutralization, enhancement, speech perception, perceptual similarity

1. Dispersion

Distinctiveness constraints penalize pairs of contrasting possible words that fall below some threshold of perceptual difference, but they can be given a preliminary formulation as constraints that penalize pairs of contrasting words that are differentiated by a particular specified contrast. For example, *D−T penalizes pairs of words that are distinguished only by a difference in stop voicing, such as [da] vs. [ta] or [bi] vs. [pi]. Constraints against less distinct contrasts are universally ranked above constraints against more distinct contrasts. For example, the perceptual difference between prenasalized and voiceless stops [nd] vs. [t] is greater than the perceptual difference between voiced and voiceless stops [d] vs. [t], so *D−T universally ranks above *ND−T, a constraint against contrasts between prenasalized and voiceless stops, because less distinct contrasts incur greater constraint violations. In what follows, [X−Y] is used to refer to a contrast between X and Y, where X and Y might be sounds or sound sequences, e.g., [nd−t] or [ata−ada].

A first point to note is that these constraints regulate the permissible contrasts between possible words, not actual words.1 So, if the contrast between unrounded [ɑ] and rounded [ɔ] is insufficiently distinct in a language, that does not mean that both vowels can appear as long as they do not minimally contrast in any pair of words in the lexicon. In other words, it is not the case that [θɔt] is acceptable just as long as there is no word [θɑt] in the lexicon; rather, [θɔt] is not a possible word in such a language if [θɑt] is a possible word. This is in line with the standard conception according to which a phonological grammar must characterize the set of possible words (e.g., Chomsky & Halle, 1965). More specifically, contrasts between possible words are important because new words are continually borrowed or coined, but as long as they obey the constraints on well-formed words in that language, a new word will never result in an insufficiently distinct contrast between words.

Distinctiveness constraints favor well-dispersed contrasts, but this preference is opposed by two classes of conflicting constraints that favor maximizing the number of contrasts and minimizing effort. The interaction of these constraints can be observed in the typology of systems of contrasts among voiceless, voiced, and prenasalized stops.

Many languages contrast voiced and voiceless stops: for example, French (Tranel, 1987) and Russian (Jones & Ward, 1969) contrast /b, d, g/ with /p, t, k/. But there are also a number of languages in which the voiced counterparts of voiceless stops are prenasalized (e.g., /mb, nd, ŋg/) and plain voiced stops do not occur, for example, San Juan Colorado Mixtec (Campbell, Peterson, & Lorenzo Cruz, 1986) and Fijian (Schütz, 1985). These two types of stop contrasts can be analyzed as deriving from different rankings of distinctiveness constraints with respect to effort-minimization constraints (Flemming, 2004).

Prenasalization of voiced stops results in a more distinct contrast with voiceless stops because prenasalization results in higher intensity of voicing during the stop closure (Stevens, Keyser, & Kawasaki, 1986), so a contrast such as /t−nd/ is better dispersed than /t−d/, and this is favored by distinctiveness constraints (cf. Iverson & Salmons, 1996). However, a conflicting articulatory constraint, *ND, favors plain voiced stops over prenasalized stops. This constraint plausibly encodes a dispreference for greater articulatory complexity of prenasalized stops, compared to voiced stops, because they involve an additional velum gesture that must be precisely timed with respect to the oral closure gesture.

This analysis can be formalized in terms of a fixed ranking of distinctiveness constraints, *D−T >> *ND−T, and a fixed ranking of segmental markedness constraints *ND >> *D, where *D is a constraint that penalizes voiced obstruents. If the constraint against plain voicing contrasts, *D−T, ranks above the constraint against prenasalized stops, *ND, then the more distinct contrast /t−nd/ is preferred, whereas the reverse ranking favors the less distinct, but articulatorily simpler, contrast /t−d/.

An additional constraint, Maximize Contrasts, is required to explain why these contrasts are maintained at all, given that both voiced and prenasalized stops violate effort-minimization constraints. Maximize Contrasts is a positive constraint favoring larger inventories of contrasts. Increasing the number of contrasting sounds increases the information content of sounds: if two sounds can contrast in a particular context then uttering a single sound could distinguish between two words, whereas if three sounds can contrast, then that single sound could distinguish between three words. Accordingly, the evaluation of Maximize Contrasts is based on the number of contrasting sounds in the inventory, with higher numbers being preferred. Maximize Contrasts generally conflicts with maximizing the distinctiveness of contrasts because there is a finite space of possible sounds, so fitting more contrasts into that space generally means that the sounds have to be more similar. Maximize Contrasts and the distinctiveness constraints together conflict with effort minimization because it is not possible to realize distinct contrasts without employing sounds that violate some effort-minimization constraints.

So if Maximize Contrasts is ranked below the other constraints, then voicing contrasts are avoided altogether, and the only stops are the lowest effort voiceless stops (1i), as in languages such as Hawaiian (Elbert & Pukui, 1979) and Tiwi (Osborne, 1974).2 Conversely, if Maximize Contrasts ranks above the other constraints, then all three consonant types contrast (1iv). The rankings that derive voiced-voiceless and prenasalized-voiceless contrasts are illustrated by (1ii) and (1iii), respectively. Because we are deriving inventories of sounds, the candidates in these tableaux are sets of contrasting sounds. The contrasts are exemplified with coronal stops, but the pattern of violations of these constraints is the same for other places of articulation.
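The inventory typology described above can be checked mechanically. The following is a minimal sketch, not the article's own tableaux: the violation definitions, the recasting of Maximize Contrasts as a count of missing contrasts, the restriction to four candidate inventories, and the four example rankings are all assumptions chosen to be consistent with the prose. The names `winner`, `typology`, and `RANKINGS` are hypothetical.

```python
from itertools import permutations

# Candidate inventories (coronal stops only); "nd" stands for a prenasalized stop.
CANDIDATES = [("t",), ("t", "d"), ("t", "nd"), ("t", "d", "nd")]

# Violation counts per constraint. These definitions are illustrative assumptions.
CONSTRAINTS = {
    "*D-T": lambda inv: int("d" in inv and "t" in inv),
    "*ND-T": lambda inv: int("nd" in inv and "t" in inv),
    "*ND": lambda inv: int("nd" in inv),
    "*D": lambda inv: int("d" in inv),
    # Maximize Contrasts is a positive constraint; here it is recast as one
    # violation per sound missing relative to the largest candidate inventory.
    "MaxContrasts": lambda inv: 3 - len(inv),
}

# Fixed rankings stated in the text, as (higher, lower) pairs.
FIXED = [("*D-T", "*ND-T"), ("*ND", "*D")]


def winner(ranking):
    """Return the candidate with the lexicographically smallest violation profile."""
    return min(CANDIDATES, key=lambda inv: tuple(CONSTRAINTS[c](inv) for c in ranking))


def typology():
    """Collect winners over all total rankings that respect the fixed rankings."""
    winners = set()
    for ranking in permutations(CONSTRAINTS):
        if all(ranking.index(hi) < ranking.index(lo) for hi, lo in FIXED):
            winners.add(winner(ranking))
    return winners


# One ranking per pattern (assumed; other rankings give the same winners).
RANKINGS = {
    "1i": ["*D-T", "*ND-T", "*ND", "*D", "MaxContrasts"],
    "1ii": ["*ND", "MaxContrasts", "*D-T", "*ND-T", "*D"],
    "1iii": ["*D-T", "MaxContrasts", "*ND-T", "*ND", "*D"],
    "1iv": ["MaxContrasts", "*D-T", "*ND-T", "*ND", "*D"],
}
for label, ranking in RANKINGS.items():
    print(label, winner(ranking))
```

Under these assumptions, the four rankings select the voiceless-only, voiced-voiceless, prenasalized-voiceless, and three-way inventories respectively, and `typology()` returns exactly those four inventories.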


This analysis illustrates how distinctiveness constraints interact with conflicting constraints to derive a general preference for more distinct contrasts over less distinct contrasts, and limitations on that preference. But the data considered so far are not sufficient to provide evidence for distinctiveness constraints over alternative analyses. In particular, the same typology of voicing and prenasalization contrasts could be derived without distinctiveness constraints if the relative ranking of *ND and *D is allowed to vary cross-linguistically. The /nd, t/ inventory could then be derived by the ranking *D >> Maximize Contrasts >> *ND, while the /d, t/ inventory would be derived from the ranking *ND >> Maximize Contrasts >> *D. These two analyses are distinguished by their predictions concerning the markedness of non-contrastive voicing.

A basic difference between distinctiveness constraints like *D−T and regular segmental markedness constraints like *D is that *D−T is violated only if voiced stops contrast with voiceless stops, whereas a constraint like *D is violated by any voiced stop, regardless of what it contrasts with. So the ranking of distinctiveness constraints *D−T >> *ND−T only prefers prenasalized stops over voiced stops where they yield a more distinct contrast with voiceless stops. If there is no such contrast, then the fixed ranking *ND >> *D implies that voiced stops should be preferred. On the other hand, if the ranking *D >> *ND were possible, it would derive a preference for prenasalized stops over voiced stops regardless of whether they contrast with voiceless stops.

The prediction of the distinctiveness constraint ranking turns out to be correct: prenasalized stops are only preferred over voiced stops where they contrast with voiceless stops. Voiced stops that do not contrast with voiceless stops can arise through intervocalic stop voicing—languages such as Tümpisa Shoshone (Dayley, 1989) and Bardi (Bowern, 2012) have a single series of stops that are generally voiceless, but are voiced in certain contexts, including between vowels. If the ranking *D >> *ND were possible, then we would expect that prenasalized stops could be preferred over voiced stops even under these circumstances, deriving an unattested pattern of intervocalic prenasalization (Flemming, 2004).

This is illustrated by the tableaux in (2). The derivation of intervocalic voicing as in Bardi requires that a constraint against intervocalic voiceless stops, *VTV, ranks above *D, which in turn ranks above Ident(voice), a constraint that penalizes changes in [voice] specifications between input and output. This ranking combined with *D >> *ND would derive intervocalic voicing accompanied by prenasalization, as shown in (2ii), because this ranking implies that prenasalized stops are preferred over voiced stops in all contexts; (2i) shows that the same ranking derives only voiceless stops when the stops are not between vowels—that is, even an underlyingly voiced stop is mapped onto a voiceless stop in this context. The constraint penalizing changes in [nasal] specifications, Ident(nasal), must rank below *D to allow the mapping from /t/ to [nd] in (2ii).


The fact that prenasalized stops are only preferred over plain voiced stops where they result in a more distinct contrast with voiceless stops follows from the fixed rankings *D−T >> *ND−T and *ND >> *D—given these rankings, intervocalic prenasalization is correctly predicted to be impossible because the distinctiveness constraints only apply to voicing contrasts, so an allophonically voiced stop does not violate *D−T. Consequently, in the absence of a voicing contrast, plain voiced stops are always preferred over prenasalized stops.
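The contrast between the two analyses can be made concrete with a small evaluator. This is a sketch under stated assumptions rather than a reconstruction of the tableaux in (2): candidates vary only the stop, prenasalized stops are assumed to violate *ND but not *D, the exact position of the Ident constraints is chosen to match the prose, and the names `optimal`, `FREE`, and `FIXED` are hypothetical. Distinctiveness constraints are omitted because the language has a single stop series, so no voicing contrast is at stake.

```python
STOPS = ("t", "d", "nd")  # "nd" stands for a prenasalized stop
VOWELS = {"a"}
VOICED = {"d", "nd"}
NASAL = {"nd"}


def star_vtv(inp, out):
    """*VTV: one violation per voiceless stop between vowels."""
    return sum(
        1
        for i in range(1, len(out) - 1)
        if out[i] == "t" and out[i - 1] in VOWELS and out[i + 1] in VOWELS
    )


def ident(feature_set):
    """Build an Ident constraint: one violation per segment whose value changes."""
    def constraint(inp, out):
        return sum((a in feature_set) != (b in feature_set) for a, b in zip(inp, out))
    return constraint


CONSTRAINTS = {
    "*VTV": star_vtv,
    "*D": lambda inp, out: out.count("d"),
    "*ND": lambda inp, out: out.count("nd"),
    "Ident(voice)": ident(VOICED),
    "Ident(nasal)": ident(NASAL),
}


def candidates(inp):
    """Vary only the stop; vowels stay faithful (a simplifying assumption)."""
    i = next(k for k, seg in enumerate(inp) if seg in STOPS)
    return [inp[:i] + (s,) + inp[i + 1:] for s in STOPS]


def optimal(inp, ranking):
    """Pick the output with the lexicographically smallest violation profile."""
    return min(candidates(inp), key=lambda out: tuple(CONSTRAINTS[c](inp, out) for c in ranking))


# Hypothetical ranking with *D >> *ND, which Dispersion Theory excludes.
FREE = ["*VTV", "*D", "*ND", "Ident(voice)", "Ident(nasal)"]
# Ranking that respects the fixed ranking *ND >> *D.
FIXED = ["*VTV", "*ND", "*D", "Ident(voice)", "Ident(nasal)"]

for name, ranking in (("*D >> *ND", FREE), ("*ND >> *D", FIXED)):
    for inp in (("a", "t", "a"), ("d", "a")):
        print(name, "".join(inp), "->", "".join(optimal(inp, ranking)))
```

With *D >> *ND, intervocalic /t/ surfaces as a prenasalized stop, the unattested pattern. With the fixed ranking *ND >> *D, it surfaces as a plain voiced stop. Both rankings devoice an initial /d/.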

This example illustrates a basic prediction of distinctiveness constraints: the markedness of a sound depends on the sounds that it contrasts with. Without distinctiveness constraints, markedness constraints only penalize phonological configurations in individual forms, not contrasts between forms, so there should not be any phonological generalizations that are sensitive to contrastive status in this way. A number of other generalizations of this type have been identified:

  (i) Front unrounded and back rounded vowels are preferred where vowels contrast in backness, but these preferences are suspended where backness contrasts are neutralized. These generalizations follow from a preference to maximize the distinctiveness of vowel contrasts based on second formant frequency, which is irrelevant where second formant contrasts are neutralized (Flemming, 2004).

  (ii) Pre- and/or post-oralization of nasal stops adjacent to oral vowels (/ma/ → [mba], /am/ → [abm]) only applies where oral vowels contrast with nasal vowels. This generalization follows if partial oralization serves to maintain the distinctiveness of vowel nasalization contrasts by protecting oral vowels from nasal coarticulation, which would make them perceptually similar to nasalized vowels (Herbert, 1986; Stanton, 2015). In the absence of vowel nasalization contrasts, this motivation for partial oralization of nasals does not apply.

  (iii) If a language has one non-anterior sibilant, it is usually palato-alveolar [ʃ], but if there is a contrast between two non-anterior sibilants, it is usually between alveolo-palatal [ɕ] and retroflex [ʂ], while [ʃ] is rarely, if ever, contrasted with another non-anterior sibilant. The first generalization suggests that [ʃ] is the unmarked non-anterior sibilant, but if so we would expect to find this sound included in larger sibilant inventories as well. Both patterns can be explained if [ʃ] is articulatorily less marked than [ʂ, ɕ], but [ʂ] and [ɕ] are more distinct from each other than either is from [ʃ] (Zygis & Padgett, 2010; Lee-Kim, 2014, p. 78).

2. Perceptual Similarity and the Formulation of Distinctiveness Constraints

The previous analyses formulate distinctiveness constraints as constraints against pairs of words that are differentiated by the sounds specified in the constraint. This formulation necessitates a very large number of constraints—more than one for every pair of segment types because the distinctiveness of contrasts can depend on context, as we will see shortly. These constraints are then ranked according to the principle that less distinct contrasts are more marked than more distinct contrasts—that is, constraints penalizing less distinct contrasts are ranked above those penalizing more distinct contrasts.

There have been efforts to improve on this formulation of Dispersion Theory in two related respects: (i) developing a framework in which distinctiveness constraints can be given more general formulations, and (ii) deriving universal rankings among distinctiveness constraints from independent generalizations about perceptual similarity between speech sounds.

The connection between these two goals is made apparent if we imagine having a comprehensive model of perceptual similarity that derives a scalar measure of distinctiveness for any pair of speech sounds. It would then be possible to derive all rankings among distinctiveness constraints, but it would also be possible to radically reduce the set of distinctiveness constraints by formulating them to refer directly to perceptual distance. All that would be required is a single hierarchy of constraints, each penalizing contrasts that fall below a specified threshold for perceptual distinctiveness.

In the absence of a general model of perceptual similarity, Flemming (2002, 2004) takes steps toward a more general formulation of distinctiveness constraints by adopting phonological representations that more directly reflect perceptual distinctiveness. In this approach, sounds are represented as points in a multidimensional perceptual space, where distance in the space corresponds to perceptual distinctiveness, that is, sounds that are further apart in the space are more distinct (cf. Shepard, 1957; Nosofsky, 1992).

For example, the greater distinctiveness of ND−T contrasts was attributed previously to the greater difference in intensity of closure voicing between these sounds compared to a plain voicing (D−T) contrast. We can posit a perceptual dimension corresponding to the acoustic dimension of intensity of closure voicing, on which voiceless stops have the lowest value, prenasalized stops have the highest value, and plain voiced stops have an intermediate value. For convenience we will assume that these values are 0, 2, and 1, respectively. We can then determine that ND−T contrasts are more distinct than D−T contrasts because prenasalized and voiceless stops are separated by a distance of two on this dimension while voiced and voiceless stops are separated by a distance of only one. Distinctiveness constraints can then be formulated as setting minimum acceptable perceptual distances between contrasting forms, so a constraint Mindist = voice:2 is violated by forms that differ by less than 2 on the voicing intensity dimension. The ranking *D−T >> *ND−T can then be replaced by the ranking Mindist = voice:1 >> Mindist = voice:2.3

This formulation of distinctiveness constraints yields a more compact constraint set. For example, a single constraint, Mindist = voice:2, penalizes voicing contrasts at all places of articulation ([p−b], [t−d], [k−g], etc.) and also penalizes contrasts between prenasalized and voiced stops ([mb−b], etc.).4 The first generalization was expressed previously by formulating the distinctiveness constraint *D−T to generalize over place of articulation, but Mindist constraints referring to perceptual representations can also group articulatorily heterogeneous contrasts such as prenasalization and voicing. In the same way, if the vowel contrasts [i−e], [e−ɛ], and [ɛ−a] are all separated by a distance of 2 units on the perceptual dimension corresponding to first formant frequency (F1), then the three constraints *i−e, *e−ɛ, and *ɛ−a can be replaced by Mindist = F1:3.
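A Mindist constraint on a single dimension amounts to a threshold check on pairwise distances. The sketch below uses the closure-voicing values given in the text (0, 1, 2); the F1 coordinates are assumed values spaced 2 units apart, and the function name `mindist_violations` is hypothetical.

```python
from itertools import combinations

# Coordinates on one perceptual dimension each. The voicing values follow the
# text; the F1 values are assumed, spaced 2 units apart as in the example.
VOICE = {"t": 0, "d": 1, "nd": 2}
F1 = {"i": 0, "e": 2, "ɛ": 4, "a": 6}


def mindist_violations(scale, sounds, threshold):
    """Return the contrasting pairs separated by less than `threshold` on `scale`."""
    return [
        (x, y)
        for x, y in combinations(sounds, 2)
        if abs(scale[x] - scale[y]) < threshold
    ]


# Mindist = voice:2 penalizes [t-d] and [d-nd], but not [t-nd].
print(mindist_violations(VOICE, ["t", "d", "nd"], 2))
# Mindist = F1:3 penalizes each pair of adjacent vowel heights.
print(mindist_violations(F1, ["i", "e", "ɛ", "a"], 3))
```

A single threshold thus does the work of several pairwise constraints, which is the source of the more compact constraint set.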

However, while the relative magnitude of perceptual differences along a single dimension can often be determined with reasonable confidence, it is much less clear how differences on multiple dimensions combine to yield an overall perceptual distance. For example, voiced and voiceless stops differ on a variety of dimensions in addition to closure voicing, such as the duration and intensity of the release burst (Repp, 1979; Revoile, Pickett, Holden, & Talkin, 1982), and voice onset time (Lisker & Abramson, 1970). For the case of D−T and ND−T contrasts, consideration of these additional dimensions does not change the picture much because voiced and prenasalized stops are similar in these respects (Burton, Blumstein, & Stevens, 1992), and thus equally distinct from voiceless stops along these dimensions. However, it is generally not clear how the perceptual distance between a pair of stops depends on simultaneous differences on these dimensions, so it is not yet possible to formulate distinctiveness constraints that refer directly to a general measure of perceptual distance. Flemming (2002, pp. 31–32) expresses generalizations about the relative distinctiveness of contrasts on multiple dimensions in terms of Mindist constraints that specify minimum differences on more than one dimension. For example, Mindist = voice:2 & VOT:1 requires that contrasting sounds differ in both closure voicing and VOT. The combination of these two differences is greater than a difference of voice:2 alone, so Mindist = voice:2 ranks above Mindist = voice:2 & VOT:1.

The other issue faced by the Mindist formulation of distinctiveness constraints is the problem of specifying equivalences between perceptual distances across dimensions. If a constraint like Mindist = voice:2 were translated into a constraint on generalized perceptual distance, it should be satisfied by contrasts that do not differ in voicing as long as they are separated by a perceptual distance that is equal to or greater than a distance of 2 on the voicing dimension. For example, if the difference in VOT between aspirated [tʰ] and unaspirated [t] is as distinct as [nd−t] then a [tʰ−t] contrast should satisfy the generalized equivalent of Mindist = voice:2, but it appears to violate it because [tʰ] and [t] do not differ in closure voicing.

Again, little is known about perceptual equivalences between differences on distinct perceptual dimensions. Flemming (2002, p. 30) proposes that Mindist constraints should disjunctively list all of the perceptual differences that satisfy them, for example, Mindist = voice:2 or VOT:1 would be satisfied by a difference in voice or VOT. Alternatively, equivalence in perceptual distinctiveness can be expressed in terms of constraint ranking, given a slightly different interpretation of Mindist constraints. According to this interpretation, a constraint like Mindist = voice:2 does not require every contrast to differ by voice:2; rather, it penalizes contrasts that are only differentiated by voice and fall below the specified threshold. The constraint is therefore violated by contrasts that differ by voice:1, but it has nothing to say about contrasts that differ on dimensions other than voicing, and thus does not assign a violation to [t−tʰ]. The aspiration contrast differs in VOT, so its acceptability depends on the ranking of constraints such as Mindist = VOT:1 and Mindist = VOT:2. Given this approach, perceptual equivalence of distances on different dimensions can be represented by ranking the relevant constraints at the same level in the constraint hierarchy. For example, if Mindist = voice:2 and Mindist = VOT:1 are always ranked together, then they will always both rank above or both rank below Maximize Contrasts, so contrasts of voice:2 and VOT:1 will always be equally acceptable or equally unacceptable from the point of view of perceptual distinctiveness.5
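The three readings of multi-dimensional Mindist constraints discussed above (conjunctive, disjunctive, and dimension-restricted) differ only in how per-dimension differences are combined. The sketch below makes that explicit. The coordinates in `SOUNDS` are hypothetical values chosen for illustration, not measurements, and the function names are likewise assumptions.

```python
# Hypothetical coordinates on two perceptual dimensions (illustrative values only).
SOUNDS = {
    "t": {"voice": 0, "VOT": 0},
    "tʰ": {"voice": 0, "VOT": 1},
    "d": {"voice": 1, "VOT": 0},
    "nd": {"voice": 2, "VOT": 0},
}


def diff(x, y, dim):
    """Distance between two sounds on one perceptual dimension."""
    return abs(SOUNDS[x][dim] - SOUNDS[y][dim])


def violates_conjunctive(x, y, thresholds):
    """Mindist = A:m & B:n: violated unless every listed difference is met."""
    return not all(diff(x, y, d) >= n for d, n in thresholds.items())


def violates_disjunctive(x, y, thresholds):
    """Mindist = A:m or B:n: violated unless at least one listed difference is met."""
    return not any(diff(x, y, d) >= n for d, n in thresholds.items())


def violates_restricted(x, y, dim, n):
    """Alternative reading: only contrasts carried by `dim` alone are evaluated."""
    others = [d for d in SOUNDS[x] if d != dim]
    only_dim = diff(x, y, dim) > 0 and all(diff(x, y, d) == 0 for d in others)
    return only_dim and diff(x, y, dim) < n


# Strict single-dimension reading: [t-tʰ] violates Mindist = voice:2.
print(violates_conjunctive("t", "tʰ", {"voice": 2}))
# Disjunctive reading: the VOT difference satisfies the constraint.
print(violates_disjunctive("t", "tʰ", {"voice": 2, "VOT": 1}))
# Restricted reading: Mindist = voice:2 is silent about [t-tʰ] but penalizes [t-d].
print(violates_restricted("t", "tʰ", "voice", 2), violates_restricted("t", "d", "voice", 2))
```

On the restricted reading, the acceptability of [t−tʰ] is decided by the VOT constraints alone, which is what allows equivalence across dimensions to be stated through ranking.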

Another open question about the nature of perceptual distinctiveness is the extent to which it is language-specific. This question has implications for the extent to which rankings among distinctiveness constraints are universal. If all aspects of perceptual distinctiveness are language-independent then we would expect all rankings among distinctiveness constraints to be fixed, but if perceptual discriminability can vary between languages then the ranking of distinctiveness constraints could vary also.

On the face of it, there is abundant evidence for language-specific differences in discriminability of speech sounds from studies of cross-linguistic speech perception. For example, Japanese speakers discriminate [ɹ] and [l] less well than English speakers (e.g., Miyawaki et al., 1975). But it is not clear to what extent these differences are due to the influences of phonological categories on perceptual tasks (Feldman, Griffiths, & Morgan, 2009) as opposed to true differences in the psychoacoustic distinctiveness of sounds (see Babel & Johnson, 2010 for discussion). However, there is evidence that the perceptual discriminability of non-linguistic but speech-like sounds can be affected by perceptual learning in the absence of categorization effects (Guenther, Husain, Cohen, & Shinn-Cunningham, 1999), so it is plausible that perceptual learning could give rise to language-specific differences in the distinctiveness of speech sounds.

This type of perceptual learning has been modeled as involving learning a particular distribution of attention across perceptual dimensions, with greater attention resulting in better psychophysical discrimination along that dimension (Goldstone, 1998; Nosofsky, 1986), and the same mechanism has been hypothesized to be involved in acquisition of speech perception (Jusczyk, 1993). Variation in attention to dimensions would imply the possibility of re-ranking Mindist constraints that refer to different dimensions, for example, the ranking of Mindist = voice:1 and Mindist = VOT:1 could vary across languages. However, the magnitude of these learning effects has not been established, and there is phonological evidence that some rankings between constraints on distinct dimensions are fixed cross-linguistically (e.g., Flemming, 2008a), so this question remains open.

3. Contextual Neutralization

So far we have primarily considered the role of distinctiveness constraints in selecting the inventory of contrastive segments in a language: distinctiveness constraints favor the selection of more distinct contrasts over less distinct contrasts, and thus favor inventories of segments that are well-dispersed in perceptual space. Once context is taken into account, the same mechanism derives neutralization of contrasts in contexts where they would be insufficiently distinct. In Dispersion Theory, contextual neutralization arises because the distinctiveness of a given type of contrast varies across contexts: the segmental context may make it difficult or impossible to realize certain cues to that contrast. If a contrast would be insufficiently distinct in a particular context (that is, it would violate a distinctiveness constraint that outranks Maximize Contrasts), then the contrast is neutralized in that context. Thus distinctiveness constraints derive Steriade’s (1999a) generalization that environments in which a contrast is neutralized are those in which “cues to the relevant contrast would be diminished” (Steriade, 1999a, p. 26).

For example, Stanton (2016) analyzes in these terms patterns of contextual neutralization involving prenasalization contrasts of the kind discussed in Section 1. As discussed there, prenasalized stops differ from voiced stops in the intensity of voicing during the stop closure, but an additional cue to the contrast is contextual nasalization on a vowel preceding a prenasalized stop (Beddor & Onsuwan, 2003). However, this cue is only available where there is a preceding vowel, so it is not available in word-initial position, for example. As a result, D−ND contrasts are less distinct in initial position than in post-vocalic contexts. Accordingly, the distinctiveness constraint *D−ND must be split into context-specific constraints in a fixed ranking: *D−ND/#_ >> *D−ND/V_, penalizing voiced vs. prenasalized stop contrasts word-initially and post-vocalically, respectively. (Alternatively, Mindist constraints can be formulated to refer directly to differences based on presence vs. absence of a nasal murmur and presence vs. absence of contextual nasalization, as in Stanton, 2016.) Thus we expect to find languages that allow prenasalization contrasts in post-vocalic contexts, but not in word-initial position, a pattern that is attested in languages such as Sinhala (Feinstein, 1979) and Kobon (Davies, 1980). This distribution of contrasts arises if *D−ND/#_ ranks above Maximize Contrasts, which in turn ranks above *D−ND/V_, as illustrated in (3) (adapted from Stanton, 2016).

In order to derive neutralization of contrasts in context, it is necessary to move from evaluating inventories of contrasting segments, as in Section 1, to evaluating contrasting forms. The general implications of this move will be discussed further, but here we demonstrate the operation of the constraints using forms that illustrate the two relevant contexts, word-initial and post-vocalic. Maximize Contrasts favors realizing a prenasalization contrast in all contexts, as in candidate (a), but a minimal contrast between word-initial prenasalized and voiced stops, as in [nda] vs. [da], violates higher-ranked *D−ND/#_. Candidates (b) and (c) satisfy this distinctiveness constraint because they neutralize the prenasalization contrast in initial position, but differ in whether the contrast is neutralized in favor of the plain voiced stop (b) or the prenasalized stop (c). Candidate (b) is preferred by the articulatory markedness constraint *ND. But it is important to note that these two candidates satisfy the distinctiveness constraints equally well: the direction of neutralization is determined by other constraints. Neutralizing post-vocalic prenasalization contrasts, as in candidates (d) and (e), is sub-optimal because the distinctiveness constraint against these contrasts, *D−ND/V_, ranks below Maximize Contrasts.


As summarized in (4), the ranking of distinctiveness constraints *D−ND/#_ >> *D−ND/V_ leads to the prediction that if a language allows D−ND contrasts in initial position, then it also allows them in post-vocalic position, but not vice versa. The contexts in which prenasalized stops can contrast with voiced stops depend on the ranking of Maximize Contrasts with respect to the fixed ranking of distinctiveness constraints. If Maximize Contrasts ranks above both constraints, then prenasalization contrasts are permitted both word-initially and post-vocalically. If Maximize Contrasts ranks between the two distinctiveness constraints, then prenasalization contrasts are neutralized in word-initial position only, as illustrated previously. If Maximize Contrasts ranks below both distinctiveness constraints, then prenasalization contrasts are not permitted in either context. Crucially, it is not possible to derive a language in which prenasalization contrasts are permitted in initial position but not post-vocalically, because if the constraint against post-vocalic contrasts, *D−ND/V_, ranks above Maximize Contrasts, then the constraint against word-initial prenasalization contrasts necessarily ranks above Maximize Contrasts as well. Stanton (2016) confirms this typological asymmetry, based on a survey of 50 languages that permit prenasalization contrasts.
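The predicted asymmetry can be verified by evaluating sets of contrasting forms under each admissible ranking. This is a sketch under assumptions, not the tableaux in (3) and (4): the candidate sets are all combinations of the four forms [da], [nda], [ada], [anda] with at least one form per context, Maximize Contrasts is recast as a count of missing forms, and *ND is placed at the bottom of every ranking because the text does not fix its position. All function and variable names are hypothetical.

```python
from itertools import product

INITIAL = [("da",), ("nda",), ("da", "nda")]
POSTVOCALIC = [("ada",), ("anda",), ("ada", "anda")]
# Each candidate is a set of contrasting forms: one choice per context.
CANDIDATES = [frozenset(i + v) for i, v in product(INITIAL, POSTVOCALIC)]

CONSTRAINTS = {
    "*D-ND/#_": lambda forms: int({"da", "nda"} <= forms),
    "*D-ND/V_": lambda forms: int({"ada", "anda"} <= forms),
    # One violation per form missing relative to the four-form candidate.
    "MaxContrasts": lambda forms: 4 - len(forms),
    # One violation per form containing a prenasalized stop.
    "*ND": lambda forms: sum("nd" in f for f in forms),
}


def winner(ranking):
    """Return the candidate set with the smallest violation profile."""
    return min(CANDIDATES, key=lambda forms: tuple(CONSTRAINTS[c](forms) for c in ranking))


def contrast_contexts(forms):
    """Report the contexts in which a D-ND contrast survives."""
    return {
        "initial": {"da", "nda"} <= forms,
        "post-vocalic": {"ada", "anda"} <= forms,
    }


# The three positions of Maximize Contrasts relative to the fixed ranking
# *D-ND/#_ >> *D-ND/V_; *ND is ranked last throughout (an assumption).
RANKINGS = [
    ["MaxContrasts", "*D-ND/#_", "*D-ND/V_", "*ND"],
    ["*D-ND/#_", "MaxContrasts", "*D-ND/V_", "*ND"],
    ["*D-ND/#_", "*D-ND/V_", "MaxContrasts", "*ND"],
]
for ranking in RANKINGS:
    best = winner(ranking)
    print(" >> ".join(ranking), sorted(best), contrast_contexts(best))
```

Under these assumptions, the three rankings yield contrast in both contexts, contrast post-vocalically only, and no contrast. No admissible ranking yields contrast in initial position only.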


There are many additional examples of contextual neutralization in environments where the contrast would be insufficiently distinct, often presented under the rubric of “licensing by cue” (Steriade, 1999a, 1999b). For example, obstruent voicing contrasts are commonly restricted to pre-sonorant contexts and neutralized elsewhere because crucial Voice Onset Time cues are only available in pre-sonorant contexts (Steriade, 1999b). Similarly, major place (labial vs. coronal vs. dorsal) contrasts are often restricted to pre-vocalic (or pre-approximant) contexts and neutralized elsewhere because cues from stop bursts and release formant transitions are generally only available before sounds with an open constriction (Steriade, 1999a).

All of these cases can be analyzed in terms of a distinctiveness constraint ranked above Maximize Contrasts, which is violated in the contexts where only poorer cues are available but satisfied in the better-cued contexts. For example, a Mindist constraint requiring a difference in VOT for voicing contrasts can only be satisfied in pre-sonorant contexts, not in word-final or pre-obstruent contexts, so ranking this constraint above Maximize Contrasts derives neutralization of voicing contrasts in the latter contexts (Flemming, 2004).

3.1. The Nature of Candidate Sets

As illustrated, distinctiveness constraints evaluate contrasts between forms, so the candidates that are evaluated by these constraints must consist of sets of contrasting forms, whereas conventional markedness constraints can be evaluated with respect to a single form. There are unresolved questions about the size of the candidate sets that are required for the evaluation of distinctiveness constraints.

For example, distinctiveness constraints on contrasts between voiced and prenasalized stops imply that the wellformedness of a candidate containing a prenasalized stop, for example, [nda], must be evaluated in relation to a potential minimally distinct form containing a plain voiced stop, [da]. But all contrasts are subject to distinctiveness constraints, so we must also consider minimal contrasts between prenasalized stops and plain nasals ([nda] vs. [na]), and then [da] and [na] must be adequately distinct from all the possible forms that minimally contrast with them, and so on. So it appears that the entire set of possible words must ultimately be evaluated together.

Flemming (2004) proposes that all possible words can be evaluated together if the set of words is specified in terms of a compact and computationally tractable representation, such as a Finite State Machine, and constraints evaluate that representation. An alternative line of analysis attempts to derive limits on the size of the comparison set needed to evaluate a particular candidate word (e.g., Ní Chiosáin & Padgett, 2009). Flemming (2008b) takes steps toward this goal by separating inventory selection from the selection of possible words constructed from that inventory.

4. Contextual Enhancement

The final basic class of phenomena predicted by distinctiveness constraints is contextual enhancement of contrasts. A distinctiveness constraint like *D−ND/#_ is a markedness constraint that penalizes word-initial contrasts between prenasalized and voiced stops because they are not sufficiently perceptually distinct. Given inputs that would violate this distinctiveness constraint if realized faithfully, like /nda/ and /da/, there are in principle two ways to modify those inputs to satisfy the constraint: (i) neutralize the contrast, for example, realizing both as [da], making the distinctiveness constraint moot, or (ii) modify one or both of the contrasting sounds to increase the distinctiveness of the contrast. The first possibility constitutes the case of contextual neutralization, discussed previously (Section 3), while we will refer to the second option as contextual enhancement (cf. Stevens, Keyser, & Kawasaki, 1986).

It is not clear whether there are examples of enhancement to satisfy *D−ND/#_, a point we will return to, but contextual enhancement is exemplified by the process of post-nasal aspiration. This is a pattern in which stops are aspirated after nasals but unaspirated elsewhere, as in Kongo (Meinhof, 1932, pp. 158–159) (5). In the same context, voiced stops are realized faithfully, for example, /N+biazi/ → [mbiazi] ‘ruler’, so aspiration of voiceless stops in the post-nasal context serves to enhance voicing contrasts by increasing the difference between voiceless and voiced stops (Pater, 1999; cf. Hamann & Downing, 2015).


Without enhancement, voicing contrasts are liable to be less distinct following nasals because nasals favor voicing in a following stop. Voicing of stops following nasals is a common process cross-linguistically (Pater, 1999; Hayes, 1999), and it can result in neutralization of voicing contrasts, for example, in Kikuyu, Ki-Nande, and Bukusu (Hyman, 2001; Flemming, 2001, p. 14). Hayes (1999) argues that this process has an articulatory-aerodynamic motivation: coarticulatory velum lowering during the stop slows the rise in oral pressure that normally facilitates devoicing of a stop. So, unless additional measures are taken to suppress voicing, part of the closure of a voiceless stop following a nasal will be voiced, making it perceptually similar to a voiced stop. Partial voicing of voiceless stops after nasals is documented for English by Hayes and Stivers (2000) and for Romanian by Steriade and Zhang (2001). We can posit a constraint, *NT, against fully voiceless stops after nasals, motivated by these factors. This constraint is satisfied by partially voicing post-nasal stops, but partially voicing a voiceless stop makes it less distinct from voiced stops, so *D−DT >> *D−T, where DT represents a partially voiced stop.

Aspiration of voiceless stops constitutes an enhancement of the voicing contrast because it increases the difference in Voice Onset Time between voiced and voiceless stops, making the contrast more distinct. So *D−DT ranks above the constraint against contrasting voiced and aspirated partially voiced stops, *D−DTʰ.

Given these constraints, the ranking illustrated in (6) derives post-nasal aspiration. Maximize Contrasts ranks above the general constraint against stop voicing contrasts, *D−T, so these contrasts are acceptable, but the constraint against fully voiceless stops following nasals, *NT, and the constraint against partial voicing contrasts, *D−DT, are undominated, making voicing contrasts impossible following nasals. *NT rules out candidate (a), with a fully voiceless stop in this context, but a partially voiced stop, as in candidate (b), yields an insufficiently distinct contrast with the voiced stop, violating *D−DT. So voicing contrasts after nasals must be neutralized or enhanced. Enhancement is preferred because the constraint against aspirated stops, *Tʰ, ranks below Maximize Contrasts (as well as ranking below *NT and *D−DT), so making the post-nasal contrast more distinct through aspiration, as in candidate (d), violates a lower-ranked constraint than neutralizing voicing after nasals, as in candidate (c). Post-nasal voicing neutralization would be derived if this ranking were reversed.
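The logic of this evaluation can be sketched computationally. The violation profiles below are illustrative assumptions reconstructed from the prose, not a copy of tableau (6): Maximize Contrasts is encoded as one violation for the neutralized candidate, the mutual order of the two undominated constraints is arbitrary, and ASCII labels such as `*Th` stand in for *Tʰ. The function name `winner` is likewise hypothetical.

```python
# Assumed violation profiles for the four post-nasal candidates discussed in the text.
CANDIDATES = {
    "a. NT vs ND (fully voiceless)": {"*NT": 1, "*D-T": 1},
    "b. NDT vs ND (partially voiced)": {"*D-DT": 1},
    "c. ND only (neutralized)": {"MaxContrasts": 1},
    "d. NTh vs ND (aspirated)": {"*Th": 1},
}


def winner(ranking, candidates=CANDIDATES):
    """Return the candidate whose violations are lexicographically smallest under `ranking`."""
    def profile(name):
        # Violations listed from highest-ranked to lowest-ranked constraint.
        return tuple(candidates[name].get(c, 0) for c in ranking)
    return min(candidates, key=profile)


# Ranking described in the text: *Th is dominated by Maximize Contrasts.
ENHANCE = ["*NT", "*D-DT", "MaxContrasts", "*Th", "*D-T"]
# The reversed ranking of *Th and Maximize Contrasts.
NEUTRALIZE = ["*NT", "*D-DT", "*Th", "MaxContrasts", "*D-T"]

print(winner(ENHANCE))
print(winner(NEUTRALIZE))
```

With these assumed profiles, the first ranking selects the aspirated candidate (d), and reversing *Tʰ and Maximize Contrasts selects the neutralized candidate (c), in line with the description above.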


Aspirating all voiceless stops would improve the distinctiveness of all voicing contrasts, that is, *D−T >> *D−Tʰ, but as long as *Tʰ ranks above *D−T, enhancement is restricted to the post-nasal context. In other words, incurring the cost of aspirating stops is only motivated in environments where the distinctiveness of voicing contrasts would otherwise be reduced below their usual prevocalic level.

Other examples of contextual enhancement are identified in Stanton (2015, 2016). For example, in languages like Neverver (Barbour, 2012) and Kobon (Davies, 1980), prenasalization contrasts are enhanced in word-final position by devoicing the stop portion of the prenasalized stop. Contrasts between prenasalized and nasal stops, for example, [and−an], are less distinct in final position due to the absence of oral vs. nasal transitions into a following vowel, which provide an important cue to this contrast, so this is an environment where the contrast is less distinct, and is neutralized in many languages, for example, Acehnese (Durie, 1985) and Lua (Boyeldieu, 1985). Devoicing the stop portion of the prenasalized stop enhances the contrast because the release bursts of voiceless stops are generally longer and louder than those of voiced stops (e.g., Slis & Cohen, 1969), and thus provide more salient cues to the presence of an oral release. So devoicing can partially compensate for the unavailability of cues based on transitions into a following vowel.

There are many languages in which vowel nasalization contrasts are enhanced adjacent to nasal consonants by denasalizing the portion of the nasal consonant adjacent to an oral vowel (/ma/ → [mba], /am/ → [abm]), a pattern mentioned in Section 1. Vowel nasalization contrasts are commonly neutralized adjacent to nasal consonants because oral vowels are partially nasalized in this context due to nasal coarticulation and thus are perceptually similar to nasalized vowels. An alternative to neutralization is to enhance the vowel nasalization contrast by denasalizing the portion of nasal consonants adjacent to oral vowels, ensuring that oral vowels are fully oral (Herbert, 1986; Stanton, 2015).

The analysis of contextual enhancement leads to the expectation that there should be a symmetry between contextual neutralization and enhancement processes, given that they are motivated by the same hierarchies of distinctiveness constraints. It is an open question whether this prediction is borne out. The examples of contextual enhancement described previously have parallel patterns of contextual neutralization: the context in which we find enhancement of contrasts between prenasalized and nasal stops is the same as the context in which we observe contextual neutralization of this contrast (Stanton, 2016), and the contexts in which vowel nasalization contrasts are more likely to neutralize adjacent to nasals are also contexts in which they are more likely to be enhanced by partially denasalizing the nasal (Stanton, 2015). However, there are many patterns of contextual neutralization for which no corresponding pattern of enhancement is reported. For example, there is no reported pattern of enhancement of word-initial contrasts between prenasalized and voiced stops to parallel the pattern of contextual neutralization discussed in Section 3. Similarly, there are no reported typological generalizations about contextual enhancement of obstruent voicing contrasts to parallel the implicational universals governing contextual neutralization of these contrasts described in Steriade (1999b).

It is unclear whether there is a real asymmetry between contextual neutralization and enhancement, or whether the apparent asymmetries result from the fact that neutralization of a contrast is generally reliably reported in language descriptions based on impressionistic transcription whereas enhancement can involve relatively subtle phonetic adjustments that are not consistently or reliably recorded in transcriptions. For example, enhancement of obstruent voicing contrasts can involve modifications such as sub-phonemic lengthening of vowels preceding voiced obstruents or release of word-final stops.

5. Alternative Analyses

5.1. P-map Correspondence Constraints

The licensing-by-cue phenomena discussed in Section 3 have been analyzed in terms of perceptually ranked correspondence constraints (Boersma, 1998; Jun, 1995; Steriade, 1995, 2009). For example, Steriade’s (2001, 2009) P-map hypothesis posits that the acceptability of an unfaithful mapping between phonological forms depends on the perceptual magnitude of the differences between those forms. So the fact that the contrast between prenasalized and plain voiced stops is more distinct in post-vocalic contexts than in word-initial contexts implies that mapping an underlying prenasalized stop onto a plain voiced stop in a post-vocalic context is a greater perceptual change than the same mapping applied in a word-initial context. So, by the P-map hypothesis, loss of prenasalization in a post-vocalic context is a greater violation of faithfulness than loss of prenasalization in a word-initial context. Accordingly, we can posit a fixed ranking of correspondence constraints penalizing correspondence between prenasalized and plain voiced stops:


Neutralization of prenasalization contrasts can then be derived by ranking the markedness constraint against prenasalized stops, *ND, above IdentPrenasalization constraints. The possible rankings of *ND with respect to the ranking in (7) derive the same typology as the dispersion-theoretic analysis outlined previously: If *ND ranks above IdentPrenasalization/V_, it also ranks above IdentPrenasalization/#_, so neutralization of prenasalization contrasts in post-vocalic contexts implies neutralization in word-initial contexts, but not vice versa.
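The equivalence of the two typologies can be illustrated with a small sketch under stated assumptions. It assumes that the post-vocalic Ident constraint is fixed above the word-initial one, that an underlying prenasalized stop has only two candidate outputs (faithful [nd] and unfaithful [d]), and that each candidate incurs a single violation. The labels `Ident/post-vocalic` and `Ident/initial` and the function names are hypothetical.

```python
# Assumed fixed, perceptually motivated order of the two faithfulness constraints.
IDENT_FIXED = ["Ident/post-vocalic", "Ident/initial"]
MARK = "*ND"  # markedness constraint against prenasalized stops


def surface(context, ranking):
    """Map an underlying prenasalized stop in `context` to 'nd' or 'd' under `ranking`."""
    faithful = {MARK: 1}                  # [nd] violates *ND
    unfaithful = {f"Ident/{context}": 1}  # [d] violates the Ident constraint for this context

    def key(violations):
        return tuple(violations.get(c, 0) for c in ranking)

    return "nd" if key(faithful) < key(unfaithful) else "d"


def typology():
    """Slot *ND into each position of the fixed Ident ranking and record the outcomes."""
    result = []
    for slot in range(len(IDENT_FIXED) + 1):
        ranking = IDENT_FIXED[:slot] + [MARK] + IDENT_FIXED[slot:]
        pattern = {ctx: surface(ctx, ranking) for ctx in ("initial", "post-vocalic")}
        result.append((ranking, pattern))
    return result


for ranking, pattern in typology():
    print(" >> ".join(ranking), "->", pattern)
```

Under these assumptions, no ranking preserves prenasalization word-initially while neutralizing it post-vocalically, which matches the asymmetry derived by the distinctiveness constraints.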

There are two basic differences between the P-map Correspondence-based analysis of contextual neutralization and the dispersion-theoretic analysis. The first is that distinctiveness constraints can derive both dispersion effects and licensing-by-cue patterns of contextual neutralization, whereas Correspondence constraints can derive only patterns of neutralization (Flemming, 2005, p. 172). This is because dispersion effects involve unfaithful mappings: if a language favors prenasalized stops over voiced stops, then any plain voiced stop in the input must be unfaithfully mapped onto a licit output segment, such as a prenasalized stop. According to the dispersion-theoretic analysis outlined, this is motivated by a markedness constraint against the contrast between plain voiced and voiceless stops. Given the P-map hypothesis, the generalization that contrasts between prenasalized and voiceless stops are more distinct than contrasts between plain voiced and voiceless stops is reflected in a fixed ranking of correspondence constraints such that correspondence between prenasalized and voiceless stops (e.g., /nda/ → [ta]) violates a higher-ranked constraint than correspondence between voiced and voiceless stops (e.g., /da/ → [ta]). But that ranking does not favor realizing voiced stops as prenasalized because only a markedness constraint can favor an unfaithful mapping.

To derive prenasalization of voiced stops it would be necessary to posit the ranking *D >> *ND, but we have already seen in Section 1 that this ranking is untenable because it incorrectly predicts that prenasalized stops could be favored over voiced stops even in the absence of a contrast with voiceless stops. So distinctiveness constraints are independently required in order to account for dispersion effects, even if we posit perceptually-ranked Correspondence constraints. It should be noted that distinctiveness constraints also cannot subsume the functions of perceptually ranked Correspondence constraints. The P-map hypothesis leads to the prediction that unfaithful mappings should generally involve perceptually minimal changes (Steriade, 2001, 2009), whereas distinctiveness constraints are markedness constraints and thus do not regulate alternations. So both types of constraints can be motivated independently (Flemming, 2005).

The second difference between the two analyses of contextual neutralization is that the dispersion-theoretic analysis makes the outcome of neutralization independent of the motivation for neutralization. For example, in the analysis of neutralization of prenasalization contrasts in word-initial position (3), neutralization is motivated by the distinctiveness constraint *D−ND/#_. This constraint is satisfied by neutralizing the contrast in favor of prenasalized stops (3c) or voiced stops (3b), or neutralization to a third segment type—all it requires is that voiced and prenasalized stops not appear in the same word-initial contexts. The fact that the outcome of neutralization is a voiced stop is determined by the articulatory markedness constraint *ND. This pattern is general: the outcome of neutralization is determined by markedness constraints that are independent of the distinctiveness constraints that motivate neutralization. On the other hand, in the P-map based analysis, the outcome of neutralization is determined by the same constraint that motivates neutralization, in this case *ND, so neutralization is generally expected to result in the other member of the contrast, in this case the plain voiced stop. Given the dispersion-theoretic analysis, there is no need for the outcome of neutralization to correspond precisely to either of the sounds that occurs in contexts of contrast. This makes it straightforward to account for cases in which the outcome of neutralization is intermediate between the contrasting sounds, a pattern that is exemplified by neutralization of retroflexion contrasts in Australian languages (Flemming, 2005, pp. 174–175).

Many Australian languages have contrasts between retroflex and apical alveolar consonants but neutralize them in contexts where there is no preceding vowel, including word-initial position (Steriade, 1995, 2001). As Steriade shows, this is an example of neutralization in contexts where the contrast would be insufficiently distinct: the strongest cues to the distinction between retroflexes and apical alveolars lie in the formant transitions into the consonant closure, but these transitions can only be realized in the presence of a preceding vowel, so the contrast is less distinct in word-initial and post-consonantal contexts than in post-vocalic contexts. Based on palatographic evidence, Butcher (1995) shows that the neutralized stops in several Australian languages are intermediate between the contrasting sounds: they are post-alveolar, unlike the apical alveolars, but they are apical while the contrastive retroflexes are sub-laminal. The apical alveolar and sub-laminal retroflex consonants are preferred in positions where the two sounds contrast because they yield a distinct contrast, but where the contrast is neutralized, its distinctiveness is irrelevant. The intermediate realization of neutralized apical stops can be analyzed as a compromise between effort minimization, which disfavors sub-laminal retroflexes, and distinctiveness from contrasting laminal coronals, which favors a post-alveolar realization. The more retracted apicals appear to be better differentiated from laminal dentals by stop burst quality (Tabain, 2012; Butcher, 2012), as expected given that the location of the closure at the release of a retroflex is slightly more retracted than the release of an apical alveolar, and thus further from the closure of a dental (Butcher, 1995).

Other examples of neutralization to an intermediate realization have been reported:

  (i) Ernestus (2000) and Jansen (2004, pp. 71 and following) argue that neutralization of obstruent voicing contrasts in word-final position in Dutch results in stops that are phonetically voiceless but produced without some of the active devoicing measures observed in contrastive voiceless stops. This pattern can be analyzed as minimization of effort in the absence of contrast.

  (ii) McCrary (2004, pp. 236 and following) provides evidence that where the contrast between singleton and geminate stops in Italian is neutralized, stops are realized with a duration intermediate between contrastive singleton and geminate stops. Maximizing the distinctiveness of the duration contrast requires a big difference in the duration of short and long consonants, but where there is no duration contrast, consonants can default to a less marked, intermediate duration.

Without distinctiveness constraints, the analyses of all of these patterns would require constraints against both contrasting sounds occurring in the context of neutralization and a constraint against the intermediate sound occurring in contrastive contexts. For example, the analysis of Australian languages would require constraints penalizing sublaminal retroflexes and apical alveolars in word-initial position, *#ʈ and *#t̺, and a constraint against apical post-alveolars occurring in post-vocalic contexts, *Vṭ. These constraints can derive a variety of processes that do not appear to be attested, such as allophonic alternation between sub-laminal retroflexes in post-vocalic contexts and apical post-alveolars in word-initial position, and neutralization of contrasts involving apical post-alveolars in post-vocalic contexts.

5.2. Perceptual Distinctiveness and Sound Change

An alternative line of analysis argues that the cross-linguistic preference for perceptually distinct contrasts results from a tendency for less distinct contrasts to be lost through the process of sound change, not from any synchronic constraints penalizing less distinct contrasts (e.g., Ohala, 1990; Blevins, 2004). For example, Ohala (1990) argues that neutralization of stop place contrasts through assimilation to the place of a following stop is cross-linguistically common because stop place is likely to be misperceived in this context, making these contrasts more likely to be lost over time through the process of sound change.

The predictions of this line of analysis differ from the predictions of Dispersion Theory in a number of respects. First, an analysis of neutralization based on misperception predicts that whether the result of neutralizing a contrast between A and B is A or B should follow from patterns of misperception of the A−B contrast, whereas dispersion theory predicts that the direction of neutralization should depend on additional markedness constraints, for example, effort constraints or constraints on the distinctiveness of the remaining contrasts (cf. Section 5.1). The outcome of neutralization processes does not generally correspond to experimentally observed patterns of misperception (Steriade, 2001, p. 233). For example, studies of vowel perception in French show that [i] is misperceived as [y] about as often as [y] is misperceived as [i], so we would expect [i] > [y] sound changes to be as likely as the converse, but in fact unconditioned neutralization of an [i]−[y] contrast always yields unrounded [i], as in Old English and Modern Greek. This outcome is favored by constraints on the distinctiveness of contrasts with back vowels because unrounding the front vowel increases the difference in F2 from back vowels (Flemming, 2005, p. 174).

Second, Dispersion Theory derives enhancement of indistinct contrasts as an alternative to neutralization, but enhancement processes do not follow from misperception of indistinct contrasts. In some cases the appearance of an enhancement effect can be derived via neutralizing sound changes. For example, starting from a language that contrasts voiceless, voiced, and prenasalized stops (/t, d, nd/), neutralization of the contrast between voiced and prenasalized stops in favor of the prenasalized stops results in a language that contrasts plain voiceless stops and prenasalized stops (/t, nd/), that is, a language in which voicing is enhanced by prenasalization, a pattern discussed in Section 1. However, historical evidence indicates that, in at least some languages, this pattern developed from a system with voiced vs. voiceless stops through prenasalization of the voiced stops, not through neutralization (Herbert, 1986, pp. 16 and following). Similarly, comparative evidence indicates that the pattern in which voiceless stops are aspirated after nasals, described in Section 4, developed through a sound change of post-nasal aspiration (Kerremans, 1980).

There are also patterns of contextual enhancement where an account based on neutralization of indistinct contrasts is implausible because it would have to posit antecedent languages with unattested contrasts. For example, the pattern in which vowel nasalization contrasts are enhanced by denasalizing the edges of nasals that are adjacent to oral vowels results in nasals being realized as medionasals between oral vowels, for example, [abmba], in languages like Karitiana (Storto, 1999). As Stanton (2015) points out, analyzing this pattern of distribution as the result of neutralizing indistinct contrasts involving partially oralized nasals implies that the ancestor languages had contrastive medionasals, for example, [abmba−ama], which are unattested. Medionasals only arise as allophones of plain nasals via the enhancement process under discussion here (Ladefoged & Maddieson, 1996, p. 119).

Accordingly, attempts to derive dispersion effects from properties of sound change, such as Boersma and Hamann (2008) and Wedel (2012), posit synchronically active constraints preferring more distinct contrasts, as in Dispersion Theory. However, the precise form of these constraints is rather different from the Mindist constraints previously proposed. Boersma and Hamann hypothesize a bias to produce realizations of sound categories that are least likely to be misperceived as a contrasting category. Given standard assumptions about how listeners categorize sounds in the face of variation in the realization of categories, this bias favors realizations that are more dispersed in perceptual space than is typical of the realizations that the listener has experienced. So if this bias is not counteracted by other production biases, such as effort minimization, then each generation of learners produces sound categories that are a little more dispersed than those of the previous generation. As a result, contrasting sound categories are predicted to spread further apart over time until the dispersive pressure of the preference for unambiguous realizations reaches equilibrium with the effects of a bias to minimize articulatory effort. Wedel’s model is conceptually similar and derives similar predictions, although the details of implementation differ.

It remains unclear how these evolutionary analyses account for the full range of dispersion phenomena discussed previously because neither Wedel (2012) nor Boersma and Hamann (2008) shows how neutralization or contextual enhancement arise in their model. It is also unclear how cross-linguistic variation in the extent of dispersion of contrasts is accounted for in these models. In Dispersion Theory, this variation follows from variation in the ranking of distinctiveness constraints with respect to effort constraints and Maximize Contrasts. For example, distinctiveness constraints favor contrasts between voiceless and prenasalized stops (/t−nd/) over contrasts between voiceless and plain voiced stops (/t−d/), but both kinds are attested cross-linguistically. According to the analysis in Section 1, these two systems of contrasts result from differences in the ranking of the distinctiveness constraint *D−T with respect to the articulatory constraint *ND. The evolutionary models locate the preference for distinct contrasts outside the phonological grammar, in the process of sound change, so typological variation in dispersion effects cannot be attributed to differences in the ranking of the distinctiveness bias with respect to other phonological constraints.

This issue arises particularly clearly in Boersma and Hamann’s (2008) model because the grammars in that model include explicit, rankable articulatory effort constraints analogous to *ND, but these effort constraints cannot be ranked with respect to distinctiveness constraints, so the balance between effort and distinctiveness is derived by an evolutionary process that is the same for all languages. As a result their model predicts that all languages with the same number of contrasting categories on a given dimension should move toward identical realizations of those categories (Boersma & Hamann, 2008, pp. 247–249). The structure of Wedel’s (2012) model is similar: a production bias toward the center of each perceptual dimension (cf. effort minimization) is opposed by a bias against ambiguous realizations of words.6 In Wedel’s simulations all speakers have the same biases, so presumably the biases are language-independent, in which case cross-linguistic variation cannot be derived from variation in the equilibrium between them. So these models must be supplemented with additional mechanisms in order to derive the typological variation that is analyzed in terms of variation in the ranking between effort and distinctiveness constraints in Dispersion Theory.
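The point about language-independent biases can be illustrated with a deliberately simplified toy model. This sketch is not an implementation of Boersma and Hamann’s (2008) or Wedel’s (2012) simulations: it assumes two categories placed symmetrically at −m and +m on a single perceptual dimension, a dispersive push proportional to a Gaussian-shaped confusability term, and a pull toward the center proportional to m. The function `equilibrium` and all parameter names and values are hypothetical.

```python
import math


def equilibrium(start, push=0.5, pull=0.1, spread=1.0, generations=500):
    """Iterate a toy two-category system; return each category's final distance from the center."""
    m = start
    for _ in range(generations):
        # Assumed confusability: decreases as the categories at -m and +m move apart.
        confusability = math.exp(-2 * m * m / spread ** 2)
        # Each generation: dispersion pushes the categories outward, effort pulls them inward.
        m = m + push * confusability - pull * m
    return m


# Two hypothetical "languages" with the same biases but different starting points.
close_start = equilibrium(0.1)
far_start = equilibrium(3.0)
print(close_start, far_start)
```

In this toy system the two starting points converge on the same separation, because the equilibrium depends only on the shared bias parameters. Deriving different degrees of dispersion would require varying `push` or `pull` across languages, which is the role played by constraint ranking in Dispersion Theory.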

Further Reading

Gallagher, G. (2010). Perceptual distinctness and long-distance laryngeal restrictions. Phonology, 27, 435–480.

Liljencrants, J., & Lindblom, B. (1972). Numerical simulation of vowel quality systems: The role of perceptual contrast. Language, 48, 839–862.

Ní Chiosáin, M., & Padgett, J. (2001). Markedness, segment realization, and locality in spreading. In L. Lombardi (Ed.), Segmental phonology in optimality theory (pp. 118–157). Cambridge, U.K.: Cambridge University Press.

Padgett, J. (2003). Contrast and post-velar fronting in Russian. Natural Language and Linguistic Theory, 21, 39–87.


Babel, M., & Johnson, K. (2010). Accessing psycho-acoustic perception and language-specific perception with speech sounds. Laboratory Phonology, 1, 179–205.Find this resource:

Barbour, J. (2012). A grammar of Neverver. Boston: De Gruyter Mouton.Find this resource:

Beddor, P. S., & Onsuwan, C. (2003). Perception of prenasalized stops. In M. J. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th International Congress of Phonetic Sciences (pp. 407–410). Barcelona: Universitat Autònoma de Barcelona.Find this resource:

Blevins, J. (2004). Evolutionary phonology: The emergence of sound patterns. Cambridge, U.K.: Cambridge University Press.Find this resource:

Boersma, P. (1998). Functional phonology: Formalizing the interactions between articulatory and perceptual drives (PhD diss.). University of Amsterdam. The Hague: Holland Academic Graphics.Find this resource:

Boersma, P., & Hamann, S. (2008). The evolution of auditory dispersion in bidirectional constraint grammars. Phonology, 25, 217–270.Find this resource:

Bowern, C. (2012). A grammar of Bardi. Boston: De Gruyter Mouton.Find this resource:

Boyeldieu, P. (1985). La langue Lua (‘Niellim’). Cambridge, U.K.: Cambridge University Press.Find this resource:

Bundgaard-Nielsen, R. L., Baker, B. J., Kroos, C., Harvey, M., & Best, C. T. (2012). Vowel acoustics reliably differentiate three coronal stops of Wubuy across prosodic contexts. Laboratory Phonology, 3, 133–161.Find this resource:

Burton, M., Blumstein, S., & Stevens, K. N. (1992). A phonetic analysis of prenasalized stops in Moru. Journal of Phonetics, 20, 127–142.Find this resource:

Butcher, A. (1995). The phonetics of neutralization: The case of Australian coronals. In J. Windsor Lewis (Ed.), Studies in general and English phonetics: Essays in honour of Professor J. D. O’Connor (pp. 10–38). New York: Routledge.Find this resource:

Butcher, A. R. (2012). On the phonetics of long, thin phonologies. In C. Donohue, S. Ishihara, & W. Steed (Eds.), Quantitative approaches to problems in linguistics (pp. 133–154). Munich: LINCOM Europa.Find this resource:

Campbell, S. S., Johnson Peterson, A., & Lorenzo Cruz, F. (1986). Diccionario Mixteco de San Juan Colorado. Mexico City: Instituto Lingüístico de Verano.Find this resource:

Chomsky, N., & Halle, M. (1965). Some controversial questions in phonological theory. Journal of Linguistics, 1, 97–138.Find this resource:

Davies, H. J. (1980). Kobon phonology. Canberra, Australia: Department of Linguistics, Research School of Pacific Studies, Australian National University.Find this resource:

Dayley, J. P. (1989). Tümpisa (Panamint) Shoshone grammar. Berkeley: University of California Press.Find this resource:

de Lacy, P. (2004). Markedness conflation in Optimality Theory. Phonology, 21, 145–199.Find this resource:

Durie, M. (1985). A grammar of Acehnese on the basis of a dialect of North Aceh. Dordrecht, The Netherlands: Foris Publications.Find this resource:

Elbert, S. H., & Pukui, M. K. (1979). Hawaiian grammar. Honolulu: University of Hawai’i Press.Find this resource:

Ernestus, M. (2000). Voice assimilation and segment reduction in casual Dutch: A corpus-based study of the phonology-phonetics interface. Utrecht, The Netherlands: LOT.Find this resource:

Feinstein, M. (1979). Prenasalization and syllable structure. Linguistic Inquiry, 10, 245–278.

Feldman, N. H., Griffiths, T. L., & Morgan, J. L. (2009). The influences of categories on perception: Explaining the perceptual magnet effect as optimal statistical inference. Psychological Review, 116, 752–782.

Flemming, E. (2001). Scalar and categorical phenomena in a unified model of phonetics and phonology. Phonology, 18, 7–44.

Flemming, E. (2002). Auditory representations in phonology. New York: Garland Press.

Flemming, E. (2004). Contrast and perceptual distinctiveness. In B. Hayes, R. Kirchner, & D. Steriade (Eds.), Phonetically-based phonology (pp. 232–276). Cambridge, U.K.: Cambridge University Press.

Flemming, E. (2005). Speech perception in phonology. In D. Pisoni & R. Remez (Eds.), The handbook of speech perception (pp. 156–181). Oxford: Blackwell.

Flemming, E. (2008a). Asymmetries between assimilation and epenthesis (Unpublished manuscript). Cambridge, MA: Massachusetts Institute of Technology.

Flemming, E. (2008b). The realized input (Unpublished manuscript). Cambridge, MA: Massachusetts Institute of Technology.

Goldstone, R. L. (1998). Perceptual learning. Annual Review of Psychology, 49, 585–612.

Graff, P. (2012). Communicative efficiency in the lexicon (PhD diss.). MIT, Cambridge, MA.

Guenther, F. H., Husain, F. T., Cohen, M. A., & Shinn-Cunningham, B. G. (1999). Effects of categorization and discrimination training on auditory perceptual space. Journal of the Acoustical Society of America, 106, 2900–2912.

Hamann, S., & Downing, L. (2015). NT revisited again: An approach to postnasal laryngeal alternations with perceptual Cue constraints. Journal of Linguistics. doi:10.1017/S0022226715000213

Hayes, B. (1999). Phonetically-driven phonology: The role of Optimality Theory and inductive grounding. In M. Darnell, E. Moravcsik, M. Noonan, F. Newmeyer, & K. Wheatley (Eds.), Functionalism and formalism in linguistics, Volume I: General papers (pp. 243–285). Amsterdam: John Benjamins.

Hayes, B., & Stivers, T. (2000). Postnasal voicing (Unpublished manuscript). Los Angeles: University of California at Los Angeles.

Herbert, R. K. (1986). Language universals, markedness theory and natural phonetic processes. Berlin: Mouton de Gruyter.

Hyman, L. (2001). The limits of phonetic determinism in phonology: *NC revisited. In E. Hume & K. Johnson (Eds.), The role of speech perception in phonology (pp. 141–185). New York: Academic Press.

Iverson, G. K., & Salmons, J. C. (1996). Mixtec prenasalization as hypervoicing. International Journal of American Linguistics, 62, 165–175.

Jansen, W. (2004). Laryngeal contrast and phonetic voicing: A laboratory phonology approach to English, Hungarian, and Dutch (diss.). Proefschrift Rijksuniversiteit Groningen. Groningen Dissertations in Linguistics 47.

Jones, D., & Ward, D. (1969). The phonetics of Russian. Cambridge, U.K.: Cambridge University Press.

Jun, J. (1995). Perceptual and articulatory factors in place assimilation: An optimality-theoretic approach (Unpublished PhD diss.). Los Angeles: University of California at Los Angeles.

Jusczyk, P. W. (1993). From general to language-specific capacities: The WRAPSA model of how speech perception develops. Journal of Phonetics, 21, 3–28.

Kerremans, R. (1980). Nasale suivie de consonne sourde en proto-bantou. Africana Linguistica, 8, 159–198.

Ladefoged, P., & Maddieson, I. (1996). The sounds of the world’s languages. Oxford: Blackwell.

Lee-Kim, S. (2014). Contrast neutralization and enhancement in phoneme inventories: Evidence from sibilant place contrast and typology (Unpublished PhD diss.). New York University, New York.

Lindblom, B. (1990a). Phonetic content in phonology. Phonetic Experimental Research, Institute of Linguistics, University of Stockholm, 11, 101–118.

Lindblom, B. (1990b). Models of phonetic variation and selection. Phonetic Experimental Research, Institute of Linguistics, University of Stockholm, 11, 65–100.

Lisker, L., & Abramson, A. S. (1970). The voicing dimension: Some experiments in comparative phonetics. In B. Hála, M. Romportl, & P. Janota (Eds.), Proceedings of the 6th International Congress of Phonetic Sciences (pp. 563–567). Prague: Academia.

Martinet, A. (1952). Function, structure, and sound change. Word, 8, 1–32.

Martinet, A. (1955). Economie des changements phonétiques. Bern, Switzerland: Francke.

McCrary, K. (2004). Reassessing the role of the syllable in Italian phonology: An experimental study of consonant cluster syllabification, definite article allomorphy and segment duration (Unpublished PhD diss.). Los Angeles: University of California at Los Angeles.

Meinhof, C. (1932). Introduction to the phonology of the Bantu languages. Translated by N. J. Van Warmelo. Berlin: Dietrich Reimer.

Miyawaki, K., Strange, W., Verbrugge, R., Liberman, A., Jenkins, J., & Fujimura, O. (1975). An effect of linguistic experience: The discrimination of /r/ and /l/ by native speakers of Japanese and English. Perception and Psychophysics, 18, 331–340.

Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–57.

Nosofsky, R. M. (1992). Similarity scaling and cognitive process models. Annual Review of Psychology, 43, 25–53.

Ní Chiosáin, M., & Padgett, J. (2009). Contrast, comparison sets, and the perceptual space. In S. Parker (Ed.), Phonological argumentation: Essays on evidence and motivation (pp. 103–121). London: Equinox.

Ohala, J. J. (1990). The phonetics and phonology of aspects of assimilation. In J. Kingston & M. E. Beckman (Eds.), Papers in laboratory phonology I: Between the grammar and physics of speech (pp. 258–275). Cambridge, U.K.: Cambridge University Press.

Osborne, C. R. (1974). The Tiwi language. Canberra, Australia: Australian Institute of Aboriginal Studies.

Passy, P. (1891). Etude sur les changements phonétiques et leur caractères généraux. Paris: Librairie Firmin-Didot.

Pater, J. (1999). Austronesian nasal substitution and other NC effects. In R. Kager, H. van der Hulst, & W. Zonneveld (Eds.), The prosody-morphology interface (pp. 310–343). Cambridge, U.K.: Cambridge University Press.

Prince, A., & Smolensky, P. (2004). Optimality theory: Constraint interaction in generative grammar. Malden, MA: Blackwell.

Repp, B. H. (1979). Relative amplitude of aspiration noise as a voicing cue for syllable-initial stop consonants. Language and Speech, 22, 173–189.

Revoile, S., Pickett, J. M., Holden, L. D., & Talkin, D. (1982). Acoustic cues to final stop voicing for impaired- and normal-hearing listeners. Journal of the Acoustical Society of America, 72, 1145–1154.

Samarin, W. (1966). The Gbeya language: Grammar, texts, and vocabularies. Berkeley: University of California Press.

Schütz, A. J. (1985). The Fijian language. Honolulu: University of Hawai’i Press.

Shepard, R. N. (1957). Stimulus and response generalization: A stochastic model relating generalization to distance in psychological space. Psychometrika, 22, 325–345.

Slis, I. H., & Cohen, A. (1969). On the complex regulating the voiced-voiceless distinction I. Language and Speech, 12(2), 80–102.

Stanton, J. (2015). Environmental shielding is contrast preservation (Unpublished manuscript). Cambridge, MA: Massachusetts Institute of Technology.

Stanton, J. (2016). Predicting distributional restrictions on prenasalized stops. Natural Language and Linguistic Theory, 34, 1089. doi:10.1007/s11049-015-9318-4

Steriade, D. (1995). Neutralization and the expression of contrast (Unpublished manuscript). Los Angeles: University of California at Los Angeles.

Steriade, D. (1999a). Alternatives to syllable-based accounts of consonantal phonotactics. In O. Fujimura, B. D. Joseph, & B. Palek (Eds.), Proceedings of LP ’98: Item order in language and speech (pp. 205–245). Prague: Karolinum Press.

Steriade, D. (1999b). Phonetics in phonology: The case of laryngeal neutralization. UCLA Working Papers in Linguistics, 3, 25–146.

Steriade, D. (2001). Directional asymmetries in place assimilation. In E. Hume & K. Johnson (Eds.), The role of speech perception in phonology (pp. 219–250). New York: Academic Press.

Steriade, D. (2009). The phonology of perceptibility effects: The P-map and its consequences for constraint organization. In K. Hanson & S. Inkelas (Eds.), The nature of the word: Studies in honor of Paul Kiparsky (pp. 151–179). Cambridge, MA: MIT Press.

Steriade, D., & Zhang, J. (2001). Context-dependent similarity: Phonetics and phonology of Romanian semi-rhymes. Paper presented at the 37th Annual Meeting of the Chicago Linguistic Society, Chicago.

Stevens, K. N., Keyser, S. J., & Kawasaki, H. (1986). Toward a phonetic and phonological theory of redundant features. In J. S. Perkell & D. H. Klatt (Eds.), Invariance and variability in speech processes (pp. 426–449). Hillsdale, NJ: Lawrence Erlbaum.

Storto, L. (1999). Aspects of a Karitiana grammar (PhD diss.). Cambridge, MA: MIT.

Tabain, M. (2012). Jaw movement and coronal stop spectra in Central Arrernte. Journal of Phonetics, 40, 551–567.

Tranel, B. (1987). The sounds of French: An introduction. Cambridge, U.K.: Cambridge University Press.

Wedel, A. (2007). Feedback and regularity in the lexicon. Phonology, 24, 147–185.

Wedel, A. (2012). Lexical contrast and the organization of sublexical contrast systems. Language and Cognition, 4(4), 319–355.

Zygis, M., & Padgett, J. (2010). A perceptual study of Polish sibilants, and its implications for historical sound change. Journal of Phonetics, 38(2), 207–226.


(1.) But see Graff (2012) for evidence that a preference for actual words to be perceptually distinct also shapes the lexicons of languages. Wedel (2012) argues that dispersion of sound categories arises from a preference for distinct words, which creates a pressure toward dispersion of similar words (e.g., minimal pairs), which is then generalized to dispersion between sound categories by a bias toward uniform realization of sound categories across contexts. However, this second bias predicts that sound properties that are motivated in one context only—for example, nasalization of vowels before nasals—could be generalized to all contexts, if the relevant sound occurs frequently in the conditioning context (Wedel, 2007, pp. 163 and following). This mechanism is likely to generate a wide variety of unattested patterns.

(2.) A variety of rankings will give rise to the same outcome: (i) *ND−T >> *D−T >> Maximize Contrasts, *ND, *D, (ii) *ND >> *D >> Maximize Contrasts, *ND−T, *D−T, (iii) *ND−T, *D >> Maximize Contrasts, *ND, *D−T, (iv) *D−T, *ND >> Maximize Contrasts, *ND−T, *D.

(3.) Note that these Mindist constraints form a stringency hierarchy (de Lacy, 2004) because Mindist = voice:2 penalizes all of the contrasts that violate Mindist = voice:1 and more. Accordingly, these constraints will never prefer a less distinct voicing contrast over a more distinct voicing contrast, regardless of their relative ranking. So it may not be necessary to stipulate the ranking shown in the text; however, free ranking of Mindist constraints would predict the existence of conflation effects (de Lacy, 2004), where a language treats a range of distinctiveness values (e.g., voice:1 and voice:2) as equally good. This prediction has not yet been confirmed.
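
The stringency relation described in note 3 can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it assumes a hypothetical integer scale of voicing distances from 0 to 3, treats Mindist = voice:n as violated by any contrast whose distance is below n, and compares violation profiles lexicographically, as in strict-domination evaluation. The names `violates`, `violation_profile`, and `never_prefers_less_distinct` are illustrative only.

```python
from itertools import permutations

# Hypothetical thresholds standing in for Mindist = voice:1 and Mindist = voice:2.
THRESHOLDS = (1, 2)
# Hypothetical range of distances on the voicing dimension (an assumption).
DISTANCES = range(0, 4)

def violates(distance, threshold):
    """A contrast violates Mindist = voice:n if its distance is below n."""
    return distance < threshold

def violation_profile(distance, ranking):
    """Violation marks (0 or 1), ordered from highest- to lowest-ranked constraint."""
    return tuple(int(violates(distance, t)) for t in ranking)

def never_prefers_less_distinct():
    """Check every ranking: a more distinct contrast never has a worse profile."""
    for ranking in permutations(THRESHOLDS):
        for low in DISTANCES:
            for high in DISTANCES:
                # Tuples compare lexicographically, mirroring strict domination.
                if high > low and violation_profile(high, ranking) > violation_profile(low, ranking):
                    return False
    return True

# Stringency: every violator of voice:1 is also a violator of voice:2.
violators_1 = {d for d in DISTANCES if violates(d, 1)}
violators_2 = {d for d in DISTANCES if violates(d, 2)}
print(violators_1 <= violators_2)        # True
print(never_prefers_less_distinct())     # True
```

Under these assumptions, both checks succeed whichever of the two constraints is ranked higher, which is the sense in which the ranking need not be stipulated.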

(4.) This is not to imply that voicing contrasts are equally distinct at all places of articulation or that the perceptual distances between [mb−b] and [b−p] are equal, but all voicing and prenasalization contrasts are less distinct than all prenasalized vs. voiceless contrasts, so a single Mindist constraint can set a threshold between these groups. The distances between [mb−b] and [b−p] could be distinguished in a finer-grained voicing dimension, as could place-based differences in the distinctiveness of voicing contrasts.

(5.) Mindist constraints that specify differences on more than one dimension, for example, Mindist = voice:2 & VOT:1, must also be satisfied by contrasts that exceed the specified distance on one dimension, even if they fall below the specified distances on other dimensions. Thus a difference of VOT:2 alone satisfies Mindist = voice:2 & VOT:1. This is necessary because the larger difference in VOT may compensate for the absence of a voicing difference, so the evaluation of such a contrast must be left to constraints like Mindist = VOT:2.
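
The satisfaction condition in note 5 can be stated as a small predicate. This is a sketch under one reading of the note, not a definitive formalization: a contrast satisfies a multi-dimensional Mindist constraint if it meets every specified distance, or if it strictly exceeds the specified distance on at least one dimension. The function name `satisfies_mindist` and the dictionary representation of contrasts are assumptions made for illustration.

```python
def satisfies_mindist(contrast, threshold):
    """Return True if `contrast` satisfies the Mindist constraint `threshold`.

    Both arguments map dimension names to integer distances; a dimension
    missing from `contrast` counts as a distance of 0 (an assumption).
    """
    # Case 1: the contrast meets the specified distance on every dimension.
    meets_all = all(contrast.get(dim, 0) >= t for dim, t in threshold.items())
    # Case 2: the contrast exceeds the specified distance on some dimension,
    # so its evaluation is left to a stricter constraint on that dimension.
    exceeds_one = any(contrast.get(dim, 0) > t for dim, t in threshold.items())
    return meets_all or exceeds_one

mindist = {"voice": 2, "VOT": 1}                          # Mindist = voice:2 & VOT:1
print(satisfies_mindist({"VOT": 2}, mindist))             # True: VOT:2 alone
print(satisfies_mindist({"voice": 2, "VOT": 1}, mindist)) # True: meets both
print(satisfies_mindist({"voice": 1, "VOT": 1}, mindist)) # False: short on voice
```

On this reading, a contrast of VOT:2 with no voicing difference passes Mindist = voice:2 & VOT:1 and is assessed instead by a constraint such as Mindist = VOT:2.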

(6.) Note that Wedel (2012) differs from Boersma and Hamann (2008) in hypothesizing a bias against ambiguous realizations of words (rather than sound categories), together with a bias toward uniform realization of sound categories across contexts—see note 1.