Kodi Weatherholtz and T. Florian Jaeger
The seeming ease with which we usually understand each other belies the complexity of the processes that underlie speech perception. One of the biggest computational challenges is that different talkers realize the same speech categories (e.g., /p/) in physically different ways. We review the mixture of processes that enable robust speech understanding across talkers despite this lack of invariance. These processes range from automatic pre-speech adjustments of the distribution of energy over acoustic frequencies (normalization) to implicit statistical learning of talker-specific properties (adaptation, perceptual recalibration) to the generalization of these patterns across groups of talkers (e.g., gender differences).
Patrice Speeter Beddor
In their conversational interactions with speakers, listeners aim to understand what a speaker is saying, that is, they aim to arrive at the linguistic message, which is interwoven with social and other information, being conveyed by the input speech signal. Across the more than 60 years of speech perception research, a foundational issue has been to account for listeners’ ability to achieve stable linguistic percepts corresponding to the speaker’s intended message despite highly variable acoustic signals. Research has especially focused on acoustic variants attributable to the phonetic context in which a given phonological form occurs and on variants attributable to the particular speaker who produced the signal. These context- and speaker-dependent variants reveal the complex—albeit informationally rich—patterns that bombard listeners in their everyday interactions.
How do listeners deal with these variable acoustic patterns? Empirical studies that address this question provide clear evidence that perception is a malleable, dynamic, and active process. Findings show that listeners perceptually factor out, or compensate for, the variation due to context yet also use that same variation in deciding what a speaker has said. Similarly, listeners adjust, or normalize, for the variation introduced by speakers who differ in their anatomical and socio-indexical characteristics, yet listeners also use that socially structured variation to facilitate their linguistic judgments. Investigations of the time course of perception show that these perceptual accommodations occur rapidly, as the acoustic signal unfolds in real time. Thus, listeners closely attend to the phonetic details made available by different contexts and different speakers. The structured, lawful nature of this variation informs perception.
Speech perception changes over time not only in listeners’ moment-by-moment processing, but also across the life span of individuals as they acquire their native language(s), non-native languages, and new dialects and as they encounter other novel speech experiences. These listener-specific experiences contribute to individual differences in perceptual processing. However, even listeners from linguistically homogenous backgrounds differ in their attention to the various acoustic properties that simultaneously convey linguistically and socially meaningful information. The nature and source of listener-specific perceptual strategies serve as an important window on perceptual processing and on how that processing might contribute to sound change.
Theories of speech perception aim to explain how listeners interpret the input acoustic signal as linguistic forms. A theoretical account should specify the principles that underlie accurate, stable, flexible, and dynamic perception as achieved by different listeners in different contexts. Current theories differ in their conception of the nature of the information that listeners recover from the acoustic signal, with one fundamental distinction being whether the recovered information is gestural or auditory. Current approaches also differ in their conception of the nature of phonological representations in relation to speech perception, although there is increasing consensus that these representations are more detailed than the abstract, invariant representations of traditional formal phonology. Ongoing work in this area investigates how both abstract information and detailed acoustic information are stored and retrieved, and how best to integrate these types of information in a single theoretical model.
Heidi Harley and Shigeru Miyagawa
Ditransitive predicates select for two internal arguments, and hence minimally entail the participation of three entities in the event described by the verb. Canonical ditransitive verbs include give, show, and teach; in each case, the verb requires an agent (a giver, shower, or teacher, respectively), a theme (the thing given, shown, or taught), and a goal (the recipient, viewer, or student). The property of requiring two internal arguments makes ditransitive verbs syntactically unique. Selection in generative grammar is often modeled as syntactic sisterhood, so ditransitive verbs immediately raise the question of whether a verb may have two sisters, requiring a ternary-branching structure, or whether one of the two internal arguments is not in a sisterhood relation with the verb.
Another important property of English ditransitive constructions is the two syntactic structures associated with them. In the so-called “double object construction,” or DOC, the goal and theme both are simple NPs and appear following the verb in the order V-goal-theme. In the “dative construction,” the goal is a PP rather than an NP and follows the theme in the order V-theme-to goal. Many ditransitive verbs allow both structures (e.g., give John a book/give a book to John). Some verbs are restricted to appear only in one or the other (e.g., demonstrate a technique to the class/*demonstrate the class a technique; cost John $20/*cost $20 to John). For verbs that allow both structures, there can be slightly different interpretations available for each. Crosslinguistic results reveal that the underlying structural distinctions and their interpretive correlates are pervasive, even in the face of significant surface differences between languages. The detailed analysis of these questions has led to considerable progress in generative syntax. For example, the discovery of the hierarchical relationship between the first and second arguments of a ditransitive has been key in motivating the adoption of binary branching and the vP hypothesis. Many outstanding questions remain, however, and the syntactic encoding of ditransitivity continues to inform the development of grammatical theory.
Erich R. Round
The non–Pama-Nyungan, Tangkic languages were spoken until recently in the southern Gulf of Carpentaria, Australia. The most extensively documented are Lardil, Kayardild, and Yukulta. Their phonology is notable for its opaque, word-final deletion rules and extensive word-internal sandhi processes. The morphology contains complex relationships between sets of forms and sets of functions, due in part to major historical refunctionalizations, which have converted case markers into markers of tense and complementization and verbal suffixes into case markers. Syntactic constituency is often marked by inflectional concord, resulting frequently in affix stacking. Yukulta in particular possesses a rich set of inflection-marking possibilities for core arguments, including detransitivized configurations and an inverse system. These relate in interesting ways historically to argument marking in Lardil and Kayardild. Subordinate clauses are marked for tense across most constituents other than the subject, and such tense marking is also found in main clauses in Lardil and Kayardild, which have lost the agreement and tense-marking second-position clitic of Yukulta. Under specific conditions of co-reference between matrix and subordinate arguments, and under certain discourse conditions, clauses may be marked, on all or almost all words, by complementization markers, in addition to inflection for case and tense.
The concept of Africa requires reflection: what does it mean to study a social phenomenon “in Africa”? Technology use in Africa is complex and diverse, showing various degrees of access across the continent (and in the Diaspora), and digital social inequalities—which are part and parcel of the political economy of communication—shape digital engagement. The rise of mobile phones, in particular, has enabled the emergence of technologically mediated literacies, text-messaging among them. Text-messaging is defined not only by a particular mode of communication (typically written on mobile phones, visual, digital), but it also favors particular topics (intimate, relational, sociable, ludic) and ways of writing (short, non-standard texts that are creative as well as multilingual). The genre of text-messaging thus includes not only short message service (SMS) and (mobile) instant-messaging (which one might call prototypical one-to-one text messages), but also Twitter, an application that, like texting, favors brevity of expression and allows for one-to-many conversations. Access to Twitter is still limited for many Africans, but as ownership of smartphones is growing, so is Twitter use, and the African “Twittersphere” is emerging as an important pan-African space. At times, discussions are very local (as on Ghanaian Twitter), at other times regional (East African Twitter) or global (African Twitter and Black Twitter); all these are emic, folksonomic terms, assigned and discussed by users. Although former colonial languages, especially English, dominate in many prototypical text messages and on Twitter, the genre also provides important opportunities for writing in African languages. The choices made in the digital space echo the well-known debate between Chinua Achebe and Ngũgĩ wa Thiong’o: the Africanization of the former colonial languages versus writing in African languages.
In addition, digital writers engage in multilingual writing, combining diverse languages in one text, and thus offer new ways of writing locally as well as shaping a digitally-mediated pan-African voice that draws on global strategies as well as local meaning.
Hearers and readers make inferences on the basis of what they hear or read. These inferences are partly determined by the linguistic form that the writer or speaker chooses to give to her utterance. The inferences can be about the state of the world that the speaker or writer wants the hearer or reader to conclude is pertinent, or they can be about the attitude of the speaker or writer vis-à-vis this state of affairs. The attention here goes to the inferences of the first type. Research in semantics and pragmatics has isolated a number of linguistic phenomena that make specific contributions to the process of inference. Broadly, entailments of asserted material, presuppositions (e.g., factive constructions), and invited inferences (especially scalar implicatures) can be distinguished.
While we make these inferences all the time, they have been studied only piecemeal in theoretical linguistics. When attempts are made to build natural language understanding systems, the need for a more systematic and wholesale approach to the problem is felt. Some of the approaches developed in Natural Language Processing are based on linguistic insights, whereas others use methods that do not require (full) semantic analysis.
In this article, I give an overview of the main linguistic issues and of a variety of computational approaches, especially those stimulated by the Recognizing Textual Entailment (RTE) challenges first proposed in 2004.
In the linguistic literature, the term theme has several interpretations, one of which relates to discourse analysis and two others to sentence structure. In a more general (or global) sense, one may speak about the theme or topic (or topics) of a text (or discourse), that is, to analyze relations going beyond the sentence boundary and try to identify some characteristic subject(s) for the text (discourse) as a whole. This analysis is mostly a matter of the domain of information retrieval and only partially takes into account linguistically based considerations. The main linguistically based usage of the term theme concerns relations within the sentence. Theme is understood to be one of the (syntactico-) semantic relations and is used as the label of one of the arguments of the verb; the whole network of these relations is called thematic relations or roles (or, in the terminology of Chomskyan generative theory, theta roles and theta grids). Alternatively, from the point of view of the communicative function of the language reflected in the information structure of the sentence, the theme (or topic) of a sentence is distinguished from the rest of it (rheme, or focus, as the case may be) and attention is paid to the semantic consequences of the dichotomy (especially in relation to presuppositions and negation) and its realization (morphological, syntactic, prosodic) in the surface shape of the sentence. In some approaches to morphosyntactic analysis the term theme is also used referring to the part of the word to which inflections are added, especially composed of the root and an added vowel.
Paul de Lacy
Phonology has both a taxonomic/descriptive and a cognitive meaning. In the taxonomic/descriptive context, it refers to speech sound systems. As a cognitive term, it refers to a part of the brain’s ability to produce and perceive speech sounds. This article focuses on research in the cognitive domain.
The brain does not simply record speech sounds and “play them back.” It abstracts over speech sounds, and transforms the abstractions in nontrivial ways. Phonological cognition is about what those abstractions are, and how they are transformed in perception and production.
There are many theories about phonological cognition. Some theories see it as the result of domain-general mechanisms, such as analogy over a Lexicon. Other theories locate it in an encapsulated module that is genetically specified, and has innate propositional content. In production, this module takes as its input phonological material from a Lexicon, and refers to syntactic and morphological structure in producing an output, which involves nontrivial transformation. In some theories, the output is instructions for articulator movement, which result in speech sounds; in other theories, the output goes to the Phonetic module. In perception, a continuous acoustic signal is mapped onto a phonetic representation, which is then mapped onto underlying forms via the Phonological module, which are then matched to lexical entries.
Exactly which empirical phenomena phonological cognition is responsible for depends on the theory. At one extreme, it accounts for all human speech sound patterns and realization. At the other extreme, it is little more than a way of abstracting over speech sounds. In the most popular Generative conception, it explains some sound patterns, with other modules (e.g., the Lexicon and Phonetic module) accounting for others. There are many types of patterns, with names such as “assimilation,” “deletion,” and “neutralization”—a great deal of phonological research focuses on determining which patterns there are, which aspects are universal and which are language-particular, and whether/how phonological cognition is responsible for them.
Phonological computation connects with other cognitive structures. In the Generative T-model, the phonological module’s input includes morphs of Lexical items along with at least some morphological and syntactic structure; the output is sent to either a Phonetic module, or directly to the neuro-motor interface, resulting in articulator movement. However, other theories propose that these modules’ computation proceeds in parallel, and that there is bidirectional communication between them.
The study of phonological cognition is a young science, so many fundamental questions remain to be answered. There are currently many different theories, and theoretical diversity over the past few decades has increased rather than consolidated. In addition, new research methods have been developed and older ones have been refined, providing novel sources of evidence. Consequently, phonological research is both lively and challenging, and is likely to remain that way for some time to come.
When the phonological form of a morpheme—a unit of meaning that cannot be decomposed further into smaller units of meaning—involves a particular melodic pattern as part of its sound shape, this morpheme is specified for tone. In view of this definition, phrase- and utterance-level melodies—also known as intonation—are not to be interpreted as instances of tone. That is, whereas the question “Tomorrow?” may be uttered with a rising melody, this melody is not tone, because it is not a part of the lexical specification of the morpheme tomorrow. A language that presents morphemes that are specified with specific melodies is called a tone language. It is not the case that in a tone language every morpheme, content word, or syllable would be specified for tone. Tonal specification can be highly restricted within the lexicon. Examples of such sparsely specified tone languages include Swedish, Japanese, and Ekagi (a language spoken in the Indonesian part of New Guinea); in these languages, only some syllables in some words are specified for tone. There are also tone languages where each and every syllable of each and every word has a specification. Vietnamese and Shilluk (a language spoken in South Sudan) illustrate this configuration. Tone languages also vary greatly in terms of the inventory of phonological tone forms. The smallest possible inventory contrasts one specification with the absence of specification. But there are also tone languages with eight or more distinctive tone categories. The physical (acoustic) realization of the tone categories is primarily fundamental frequency (F0), which is perceived as pitch. However, often other phonetic correlates are also involved, in particular voice quality. Tone plays a prominent role in the study of phonology because of its structural complexity. 
That is, in many languages, the way a tone surfaces is conditioned by factors such as the segmental composition of the morpheme, the tonal specifications of surrounding constituents, morphosyntax, and intonation. On top of this, tone is diachronically unstable. This means that, when a language has tone, we can expect to find considerable variation between dialects, and more of it than in relation to other parts of the sound system.
Harry van der Hulst
The subject of this article is vowel harmony. In its prototypical form, this phenomenon involves agreement between all vowels in a word for some phonological property (such as palatality, labiality, height, or tongue root position). This agreement is then evidenced by agreement patterns within morphemes and by alternations in vowels when morphemes are combined into complex words, thus creating allomorphic alternations. Agreement involves one or more harmonic features for which vowels form harmonic pairs, such that each vowel has a harmonic counterpart in the other set. I will focus on vowels that fail to alternate, that are thus neutral (either inherently or in a specific context), and that will be either opaque or transparent to the process. I will compare approaches that use underspecification of binary features and approaches that use unary features. For vowel harmony, vowels are either triggers or targets, and for each, specific conditions may apply. Vowel harmony can be bidirectional or unidirectional and can display either a root control pattern or a dominant/recessive pattern.