This is an advance summary of a forthcoming article in the Oxford Research Encyclopedia of Linguistics. Please check back later for the full article.
Basque is a language isolate spoken along the Atlantic coast in an area on both sides of the French-Spanish border covering approximately 7,000 square kilometres. There are currently around 700,000 speakers, over 90% of whom live on the Spanish side. Up to the second half of the 20th century, the Basque language would have been more appropriately described as a language family, as regional variation was so extreme that it could prevent mutual intelligibility, with different regions also having distinctive literary varieties. The standardization of the 20th century has reduced, but not eliminated, this variation, so that Basque is now best described as a pluricentric language with several standards reflecting current administrative borders. The co-existence of Basque with Romance goes back about two millennia and, despite the recent standardization and transition from diglossia to co-official use of Basque, the traces of the long and intensive contact with Romance remain visible in all areas of the Basque language. Although Castilian Spanish originated in geographical proximity to the Basque-speaking areas, the impact of Basque on Romance is much more limited.
The influence of Spanish on Basque is particularly manifest in the tense-aspect system of Southern Basque, which has come to be modelled on that of Spanish, with every Spanish tense-aspect category having a Basque counterpart. This parallelism extends into aspectual periphrastic constructions involving such verbs as “bring” and “go.” Just like Spanish, the Basque varieties in contact with it distinguish two verbs “be,” one expressing a property and the other a state, and two verbs “have,” one used as an auxiliary and the other as a verb of possession. While the default constituent order of Basque is verb-final, object clauses and other subordinate clauses are often found in post-predicate position, matching the order found in Spanish. Basque has various strategies to express relative and passive constructions, some of which are again modelled on Romance. Furthermore, there are many calques in derived words and idiomatic expressions. Finally, we find some striking phonetic resemblances between Basque and Spanish, some of which may be the result of bilingualism.
Creole languages have a curious status in linguistics, and at the same time they often have very low prestige in the societies in which they are spoken. These two facts may be related, in part because they circle around notions such as “derived from” or “simplified” instead of “original.” Rather than simply taking the notion of “creole” as a given and trying to account for its properties and origin, this essay tries to explore the ways scholars have dealt with creoles. This involves, in particular, trying to see whether we can define “creoles” as a meaningful class of languages. There is a canonical list of languages that most specialists would not hesitate to call creoles, but the boundaries of the list and the criteria for being listed are vague. It also becomes difficult to distinguish sharply between pidgins and creoles, and likewise the boundaries between some languages claimed to be creoles and their lexifiers are rather vague.
Several possible criteria to distinguish creoles will be discussed. Simply defining them as languages of which we know the point of birth may be a necessary, but not sufficient, criterion. Displacement is also an important criterion, necessary but not sufficient. Mixture is often characteristic of creoles, but not crucial, it is argued. Essential in any case is substantial restructuring of some lexifier language, which may take the form of morphosyntactic simplification, but it is dangerous to assume that simplification always has the same outcome. The combination of these criteria (time of genesis, displacement, mixture, restructuring) contributes to the status of a language as creole, but “creole” is far from a unified notion. There turn out to be several types of creoles, as well as a range of creole-like languages, and they differ in the way these criteria combine in each case.
Thus the proposal is made here to stop looking at creoles as a separate class and instead to treat them as special cases of a general phenomenon: the way languages emerge and are used determines their properties to a considerable extent. This calls for a new, socially informed typology of languages, which will involve all kinds of different types of languages, including pidgins and creoles.
Diglossia refers to a situation where two linguistic varieties coexist within a given speech community. One variety, labeled the ‘high variety’, is used in formal domains including education, while the other variety, labeled the ‘low variety’, is used principally in instances of informal extemporaneous communication. The domains of use, however, are not strictly separate, especially given the increase in electronic modes of communication. This results in what has been described as diglossic code-switching and, in the case under consideration here, in the gradual encroachment of vernacular Arabic upon the domains of use of Standard Arabic.
While the genetic relationship between the two varieties is central in the definition of a classical diglossic situation, as in the case of Arabic, the concept of diglossia has often been extended in the literature to cover situations of a functional distribution between languages that are genetically distant, as with Spanish and Guaraní in Paraguay.
In North Africa, vernacular Arabic is in a classical diglossic distribution with Standard Arabic, while the Berber languages are often described as existing in a situation of extended diglossia with Arabic. However, distinguishing between diglossia as it exists between the Arabic dialects and Standard Arabic and the situation of bilingualism that involves Arabic, Berber, and European languages provides the best framework for describing the linguistic situation in North Africa. Diglossia is a key element in understanding the mechanisms of the region’s language contact and change as it plays a central role in shaping language attitude, language policy, and language planning.
Since the start of the Islamic conquest of the Maghreb in the 7th century
Linguistic influence is found on all levels: phonology, morphology, syntax, and lexicon. In those cases where only innovative patterns are shared between the two language groups, it is often difficult to make out where the innovation started; thus the great similarities in syllable structure between Maghrebian Arabic and northern Berber are the result of innovations within both language families, with no clear indication of which side they originated on. Morphological influence seems to be mediated exclusively by lexical borrowing. Especially in Berber, this has led to parallel systems in the morphology, where native words always have native morphology, while loans either have nativized morphology or retain Arabic-like patterns. In the lexicon, it is especially Berber that takes over scores of loanwords from Arabic, amounting in one case to over one-third of the basic lexicon as defined by 100-word lists.
Victor A. Friedman
The Balkan languages were the first group of languages whose similarities were explained in modern linguistic terms as a result of language contact rather than as a result of descent from a common ancestor. Nikolai Trubetzkoy coined the term Sprachbund ‘linguistic league’ (as opposed to Sprachfamilie ‘language family’) to describe this relationship. Balkan linguistics, as both a subset of and precursor to contact linguistics, is, at its base, an historical linguistic discipline. It seeks to explain similarities among the relevant languages as the result of diffusion rather than of either transmission or putative universal, typological properties of human language (the latter explanation assumes parallel developments whose causation is ahistorical, i.e., unconnected with either contact or ancestry). The relevant languages are, with the exception of Turkic, all part of the Indo-European language family, but they belong to five distinct groups that are known to have been separated for a significant length of time (presumably millennia). Moreover, for four out of five Indo-European groups as well as for Turkic, there exists documentation that goes back more than a millennium, and in some cases several millennia. The Balkan languages are thus the oldest example of a well-documented and still living Sprachbund.
The primary questions that Balkan linguistics seeks to answer are these: What are the results of language contact in the Balkan languages, and how did they come about? The Balkan languages are traditionally defined as Albanian, Modern Greek, Balkan Romance (Romanian, Aromanian, and Meglenoromanian), and Balkan Slavic (Bulgarian, Macedonian, and the southernmost dialects of the former Serbo-Croatian). In recent decades, it has been recognized that the relevant dialects of Romani, Judezmo, and Turkish and Gagauz also participate in at least some of the convergent processes that are taken as definitive of the Balkan linguistic league. While the language family is defined by regular sound correspondences, which in turn help define shared morphology and a core lexicon, the Balkan linguistic league is defined principally by shared morphosyntactic developments and a shared lexicon of borrowings often called “cultural.” In the Balkan linguistic league, phonological developments are sometimes shared among different languages at the dialectal level, but there are no such features that characterize the Balkan languages as a group. Just as in the language family not every diagnostic item is represented in every branch, so, too, in the Balkan linguistic league not every feature is equally represented in all languages and dialects.
Among the most characteristic morphosyntactic features are the following: (1) replacement of infinitives by analytic subjunctives, (2) the use of a particle derived from etymological ‘want’ to mark the future, (3) replacement of synthetic gradation of adjectives with analytic constructions, (4) replacement of conditionals by anterior futures, (5) resumptive clitic pronouns for certain direct and indirect objects, (6) various simplifications in the declensional system, (7) postposed definite articles (for Balkan Slavic, Balkan Romance, and Albanian), (8) grammaticalized evidentials (Balkan Slavic, Albanian, Turkic, and to some extent Balkan Romance and Romani). While some of these convergences began in the ancient or medieval periods, the Balkan linguistic league took its definitive modern shape during the centuries of the Ottoman Empire (14th to early 20th centuries).
Klaus Beyer and Henning Schreiber
The Social Network Analysis approach (SNA), also known as sociometrics or actor-network analysis, investigates social structure on the basis of empirically recorded social ties between actors. It thereby aims to explain processes such as the flow of information or the spread of innovations, or even of pathogens, throughout the network in terms of actor roles and their relative positions in the network, drawing on quantitative and qualitative analyses. While the approach has a strong mathematical and statistical component, the identification of pertinent social ties also requires a strong ethnographic background. With regard to social categorization, SNA is well suited as a bootstrapping technique for highly dynamic communities and under-documented contexts. Currently, SNA is widely applied in various academic fields. For sociolinguists, it offers a framework for explaining the patterning of linguistic variation and mechanisms of language change in a given speech community.
The social tie perspective developed around 1940 in the fields of sociology and social anthropology, based on the ideas of Simmel, and was later applied in fields such as innovation theory. In sociolinguistics, it is strongly connected to the seminal work of Lesley and James Milroy and their Belfast studies (1978, 1985). These authors demonstrate that synchronic speaker variation is not only governed by broad societal categories but is also a function of communicative interaction between speakers. They argue that the high level of resistance against linguistic change in the studied community is a result of strong and multiplex ties between the actors. Their approach has been followed by various authors, including Gal, Lippi-Green, and Labov, and discussed for a variety of settings; most of them, however, are located in the Western world.
The methodological advantages could make SNA the preferred framework for variation studies in Africa due to the prevailing dynamic multilingual conditions, often against the backdrop of less standardized languages. However, relatively few studies using SNA as a framework have been conducted so far. This is possibly due to the quite demanding methodological requirements, the overall effort, and the often highly complex linguistic backgrounds. A further potential obstacle is the pace of theoretical development in SNA. Since its introduction to sociolinguistics, various new measures and statistical techniques have been developed by the fast-growing SNA community. Absorbing this vast amount of recent literature and testing new concepts is likewise a challenge for the application of SNA in sociolinguistics.
Nevertheless, the overall methodological effort of SNA has been much reduced by the advancements in recording technology, data processing, and the introduction of SNA software (UCINET) and packages for network statistics in R (‘sna’). In the field of African sociolinguistics, a more recent version of SNA has been implemented in a study on contact-induced variation and change in Pana and Samo, two speech communities in the Northwest of Burkina Faso. Moreover, further enhanced applications are on the way for Senegal and Cameroon, and even more applications in the field of African languages are to be expected.
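As a rough illustration of the kind of network measures such software computes, the following minimal Python sketch derives network density and degree centrality from a list of recorded ties. It is not taken from UCINET, the R ‘sna’ package, or any of the studies mentioned; the speaker labels, the tie list, and the function names are hypothetical, and the ties are assumed to be undirected and unweighted.

```python
# Hypothetical toy data: undirected ties between five speakers.
# The labels "A".."E" and the ties themselves are invented for illustration.
ties = {
    ("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"),
}

def actors(ties):
    """Return the sorted list of all actors that appear in at least one tie."""
    return sorted({actor for tie in ties for actor in tie})

def density(ties):
    """Share of all possible undirected ties that are actually present."""
    n = len(actors(ties))
    possible = n * (n - 1) // 2
    return len(ties) / possible if possible else 0.0

def degree_centrality(ties):
    """Number of ties per actor, normalized by the maximum possible (n - 1)."""
    nodes = actors(ties)
    n = len(nodes)
    degree = {node: 0 for node in nodes}
    for a, b in ties:
        degree[a] += 1
        degree[b] += 1
    return {node: d / (n - 1) for node, d in degree.items()}

if __name__ == "__main__":
    print("density:", density(ties))
    for actor, score in degree_centrality(ties).items():
        print(actor, score)
```

In this toy network, half of the possible ties are present, and speaker C is the most central actor. A sociolinguistic application would set such scores against each speaker's use of a linguistic variable; measuring multiplexity would additionally require recording the type of each tie (kin, neighbour, co-worker), which this sketch leaves out.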
Pidgin languages sometimes form in contact situations where a means of communication is urgently needed between groups lacking a common code. They are typically less elaborate than any of the languages involved in their formation, and in comparison to those, reduction characterizes all linguistic levels.
The process is relatively uncommon, and the life span of pidgins is usually short: most disappear when the contact situation changes, or when another medium of intergroup communication becomes available. In some rare cases, however, they expand (both socially and structurally), and may even nativize, i.e., become mother tongues to their speakers (at which point they may be re-labelled “creoles”).
Pidgins are severely understudied, and while they are often mentioned as precursors to creoles, few linguists have shown a serious interest in them. As a result, many generalizations have been based on extremely limited amounts of data or even on intuition. Some frequently occurring ones are that pidginization is a case of second language acquisition, that power and prestige are important factors, and that most structures are derived from the input languages. My work with pidgins has led me to believe the opposite to be true in these cases: pidgins form through a trial-and-error process, where anything that is understood by the other party is sanctioned; this process is one of collaborative language creation (rather than one involving one group of teachers and one group of learners); and much of what finds its way into the resultant contact language does so independently of what the creators spoke prior to their encounter.
As for theoretical implications, pidgins may shed light on which features in traditional languages are necessary for communication, and which are superfluous from the point of view of pure information transmission.
As might be expected from the difficulty of traversing it, the Sahara Desert has been a fairly effective barrier to direct contact between its two edges; trans-Saharan language contact is limited to the borrowing of non-core vocabulary, minimal from south to north and mostly mediated by education from north to south. Its own inhabitants, however, are necessarily accustomed to travelling desert spaces, and contact between languages within the Sahara has often accordingly had a much greater impact. Several peripheral Arabic varieties of the Sahara retain morphology as well as vocabulary from the languages spoken by their speakers’ ancestors, in particular Berber in the southwest and Beja in the southeast; the same is true of at least one Saharan Hausa variety. The Berber languages of the northern Sahara have in turn been deeply affected by centuries of bilingualism in Arabic, borrowing core vocabulary and some aspects of morphology and syntax. The Northern Songhay languages of the central Sahara have been even more profoundly affected by a history of multilingualism and language shift involving Tuareg, Songhay, Arabic, and other Berber languages, much of which remains to be unraveled. These languages have borrowed so extensively that they retain barely a few hundred core words of Songhay vocabulary; those loans have not only introduced new morphology but in some cases replaced old morphology entirely. In the southeast, the spread of Arabic westward from the Nile Valley has created a spectrum of varieties with varying degrees of local influence; the Saharan ones remain almost entirely undescribed. Much work remains to be done throughout the region, not only on identifying and analyzing contact effects but even simply on describing the languages its inhabitants speak.
About 7,000 languages are spoken around the world today. The actual number depends on where the line is drawn between language and dialect, an arbitrary decision because languages are always in flux. But specialists applying a reasonably uniform criterion across the globe count well over 2,000 languages in Asia and Africa, while Europe has just shy of 300. In between are the Pacific region, with over 1,300 languages, and the Americas, with just over 1,000. Many of the world’s languages are spoken by small populations and are thought likely to disappear over the next few decades, as speakers of endangered languages turn to more widely spoken ones.
The languages of the world are grouped into 141 language families, based on their origin, as determined by comparing similarities among languages and deducing how they evolved from earlier ones. While the world’s language families may well go back to a smaller number of original languages, even to a single mother tongue, scholars disagree on how far back current methods permit us to trace the history of languages.
While it is normal for languages to borrow from other languages, occasionally a totally new language is created by mixing elements of two distinct languages to such a degree that we would not want to identify one of the source languages as the mother tongue. This is the situation with Media Lengua, a language of Ecuador formed through contact among speakers of Spanish and speakers of Quechua. In this language, practically all the word stems are from Spanish, while all of the endings are from Quechua. Just a handful of languages have come into being in this way, but a less extreme form of language mixture has resulted in several dozen creoles around the world. Most arose during Europe’s colonial era, when European colonists used their language to communicate with local inhabitants, who in turn blended vocabulary from the European language with grammar largely from their native language. These so-called creole languages became so well established that they were passed on to the next generation, becoming a first language to many people, and continuing in use to this day.
Also among the languages of the world are about three hundred sign languages, used mainly in communicating with the deaf. The structure of sign languages typically has little historical connection to the structure of nearby spoken languages.
Languages have also been constructed expressly, often by a single individual, to meet communication demands. The prime example is Esperanto, designed to serve as a universal language and used as a second language by some two million, according to some estimates. But there are hundreds of others falling under the rubric of constructed international auxiliary languages.
Aidan Pine and Mark Turin
The world is home to an extraordinary level of linguistic diversity, with roughly 7,000 languages currently spoken and signed. Yet this diversity is highly unstable and is being rapidly eroded through a series of complex and interrelated processes that result in or lead to language loss. The combination of monolingualism and networks of global trade languages that are increasingly technologized has led to over half of the world’s population speaking one of only 13 languages. Such linguistic homogenization leaves in its wake a linguistic landscape that is increasingly endangered.
A wide range of factors contribute to language loss and attrition. While some—such as natural disasters—are unique to particular language communities and specific geographical regions, many have similar origins and are common across endangered language communities around the globe. The harmful legacy of colonization and the enduring impact of disenfranchising policies relating to Indigenous and minority languages are at the heart of language attrition from New Zealand to Hawai’i, and from Canada to Nepal.
Language loss does not occur in isolation, nor is it inevitable or in any way “natural.” The process also has wide-ranging social and economic repercussions for the language communities in question. Language is so heavily intertwined with cultural knowledge and political identity that speech forms often serve as meaningful indicators of a community’s vitality and social well-being. More than ever before, there are vigorous and collaborative efforts underway to reverse the trend of language loss and to reclaim and revitalize endangered languages. Such approaches vary significantly, from making use of digital technologies in order to engage individual and younger learners to community-oriented language nests and immersion programs. The question of how to measure the success of language revitalization programs, which draw on diverse techniques and communities, has driven research forward in the areas of statistical assessments of linguistic diversity, endangerment, and vulnerability. Current efforts are re-evaluating the established triad of documentation-conservation-revitalization in favor of more unified, holistic, and community-led approaches.