date: 27 July 2017


Summary and Keywords

Polysemy is characterized as the phenomenon whereby a single word form is associated with two or several related senses. It is distinguished from monosemy, where one word form is associated with a single meaning, and homonymy, where a single word form is associated with two or several unrelated meanings. Although the distinctions between polysemy, monosemy, and homonymy may seem clear at an intuitive level, they have proven difficult to draw in practice.

Polysemy proliferates in natural language: Virtually every word is polysemous to some extent. Still, the phenomenon has been largely ignored in the mainstream linguistics literature and in related disciplines such as philosophy of language. However, polysemy is a topic of relevance to linguistic and philosophical debates regarding lexical meaning representation, compositional semantics, and the semantics–pragmatics divide.

Early accounts treated polysemy in terms of sense enumeration: each sense of a polysemous expression is represented individually in the lexicon, such that polysemy and homonymy were treated on a par. This approach has been strongly criticized on both theoretical and empirical grounds. Since at least the 1990s, most researchers converge on the hypothesis that the senses of at least many polysemous expressions derive from a single meaning representation, though the status of this representation is a matter of vivid debate: Are the lexical representations of polysemous expressions informationally poor and underspecified with respect to their different senses? Or do they have to be informationally rich in order to store and be able to generate all these polysemous senses?

Alternatively, senses might be computed from a literal, primary meaning via semantic or pragmatic mechanisms such as coercion, modulation or ad hoc concept construction (including metaphorical and metonymic extension), mechanisms that apparently play a role also in explaining how polysemy arises and is implicated in lexical semantic change.

Keywords: lexical semantic representation, homonymy, polysemy, monosemy, lexical rules, coercion, lexical pragmatics, psycholinguistics, sense enumeration, underspecification, overspecification, literalism, metaphor, metonymy

1. What is Polysemy?

Polysemy is characterized as the phenomenon whereby a single word form is associated with two or several related senses, as in (1) below:















The relations between the senses are often metonymic (part-for-whole), as in (1d) to (1f), or metaphorical, as in (1g). Polysemy is contrasted with monosemy, on the one hand, and with homonymy, on the other. While a monosemous word form has only one meaning, a homonymous word form is associated with two or several unrelated meanings (e.g., coach: ‘bus’, ‘sports instructor’), and is standardly viewed as involving different lexemes (e.g., COACH1, COACH2).

Polysemy is pervasive in natural languages, and affects both content and function words. While deciding which sense is intended on a given occasion of use rarely seems to cause any difficulty for speakers of a language, polysemy has proved notoriously difficult to treat both theoretically and empirically. Some of the questions that have occupied linguists, philosophers, and psychologists interested in the phenomenon concern: (i) the representation, access, and storage of polysemous senses in the mental lexicon; (ii) how to deal with polysemous words in a compositional theory of meaning; and (iii) how novel senses of a word arise and are understood in the course of communication. In psycholinguistics, the debate revolves mainly around the differences in access, storage, and representation of polysemous senses vis-à-vis homonymous meanings (the different related meanings of polysemous expressions are usually called senses). Computational and theoretical linguists (Asher, 2011; Copestake & Briscoe, 1995; Jackendoff, 2002; Pustejovsky, 1995) describe models that can integrate various forms of polysemy into a compositional theory of meaning. Pragmaticists (Carston, 2002, 2016; Falkum, 2011), psychologists (Srinivasan & Rabagliati, 2015), philosophers of language (Recanati, 2004, 2016; Vicente, 2015) and recently also cognitive linguists (Evans, 2009, 2015) propose accounts of how polysemous senses arise and are understood, with an eye to the issue of whether the generation of senses reflects our conceptual structures. Distributional semantics approaches describe and distinguish senses on the basis of words’ distributional properties, extracted by statistical analysis of the contexts in which words occur, under the assumption that words with similar distributional properties have similar semantic properties (Lenci, 2008; Baroni, Bernardi, & Zamparelli, 2014). Lexicographers (Kilgarriff, 1992; Hanks, 2013) also try to tackle the question of how many senses a polysemous expression can be said to have, mainly by looking at collocation patterns. A trend towards an increasing interaction between these fields can be observed, as the different research topics just listed are intimately related.
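The distributional assumption mentioned above can be made concrete with a small sketch. It assumes that each word is represented by a vector of co-occurrence counts and that similarity is measured by the cosine between vectors, which is one common choice among several. The counts, the context words, and the function name `cosine` are invented for illustration and are not taken from the works cited.

```python
import math

# Hypothetical co-occurrence counts over four context words,
# in the order ("read", "shelf", "eat", "cook").
VECTORS = {
    "book": [8, 5, 0, 1],
    "novel": [9, 3, 0, 0],
    "lunch": [0, 0, 7, 6],
}


def cosine(u, v):
    """Cosine similarity between two count vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Guard against all-zero vectors, for which the cosine is undefined.
    return dot / norm if norm else 0.0


book_novel = cosine(VECTORS["book"], VECTORS["novel"])
book_lunch = cosine(VECTORS["book"], VECTORS["lunch"])
```

With these toy counts, `book_novel` is much larger than `book_lunch`, reflecting the idea that words occurring in similar contexts are treated as semantically similar. Distinguishing the senses of a single word would require vectors for individual occurrences or clusters of occurrences, which this sketch does not attempt.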

2. Historical Background

The fact that a word can be associated with multiple related senses was addressed at least as early as in the writings of Aristotle, although he did not use the label polysemy (Barnes, 1984). In Categories, Aristotle makes a distinction between synonymy (or univocity) and homonymy (or multivocity, ‘being spoken of in many ways’). Two things, a and b, are “synonymous” if they are both called by the same name F and they have identical definitions (Aristotle’s notion of synonymy is thus distinct from contemporary usage, where it refers to different words with the same meaning), while a and b are “homonymous” if they are called by the same name F, but the definition of F for a only partially overlaps with the definition of F for b (Shields, 2009). Thus, Aristotle’s notion of homonymy also covers cases that would be described as polysemy in contemporary linguistic terminology, such as the uses of healthy in (2):







Aristotle observed that the meanings of healthy in (2) are not univocal, and that the uses in (2b) and (2c) are both dependent on the meaning of healthy in (2a) by being contained as part of their definitions. He referred to this as a kind of “core-dependent homonymy” (Shields, 1999), an intermediate case between “synonymy” (univocity) and full homonymy (this is sometimes also referred to as “focal meaning”; Owen, 1960).

Until relatively recently, almost all theories of linguistic semantics were based on the “classical” theory of meaning, adopted by Aristotle, according to which the meaning of a word can be stated in terms of necessary and sufficient application conditions (the major influence in the demise of this theory being Wittgenstein, 1953). This approach makes specific predictions regarding the representation of polysemy: A word will have as many meanings (or senses) as there are necessary and sufficient conditions for its application. Section 5.1 will look at an influential development of this general view (Katz, 1972; Katz & Postal, 1964). A modern version of the view is held by linguists working within the framework of Natural Semantic Metalanguage (NSM) (Goddard & Wierzbicka, 2014; Wierzbicka, 1996). Polysemy, on this account, is posited only when the meaning of a word cannot be stated in the form of a single reductive paraphrase, but requires further specification in order to capture its full range of application (Goddard, 2000).

Another early appearance of the topic of lexical meaning variation in the history of Western philosophy is Locke (1975 [1689]) and Leibniz’s (1996 [1765]) disagreement regarding the meaning of the English connective but (cf. Fieke Van der Gucht & De Cuypere, 2007). For Locke, the many different senses associated with but (e.g., opposition, coordination, etc.) suggested that they could not all be instantiations of a single more abstract meaning, but had to be distinct. Against this multiplicity of meanings view, Leibniz (1996 [1765]: III, §4) argued that all uses of a word should be reduced to “a determinate number of significations” by searching for a paraphrase that is able to cover as much of the semantic variation as possible. This brief discussion between Locke and Leibniz sums up the broad lines of the traditional debate over polysemy: For a long time, theories of how polysemy is represented in the (mental) lexicon have been divided into two camps: sense enumeration and one-representation approaches (see Section 5). Sense enumeration approaches, which take polysemous lexical items to be represented in the form of lists of possible and/or attested senses, bear a clear resemblance to Locke’s position. One-representation approaches, which may treat polysemous lexical items as being represented in terms of highly abstract core meanings that remain constant across their different uses, have a strong affinity with Leibniz’s position.

In general linguistics, Bréal ([1897] 1924) was the first to use the term polysemy (la polysémie) to describe single word forms with several related meanings. For Bréal, polysemy was primarily a diachronic phenomenon, arising as a consequence of lexical semantic change. When words acquire new meanings through use, their old meanings typically remain in the language. So polysemy involves the parallel existence of new and old meanings and is a result of new senses becoming conventionalized: It is the synchronic outcome of lexical semantic change. At the same time, Bréal ([1897]1924) observed that, at the synchronic level, polysemy is not really an issue, since the context of discourse determines the sense of a polysemous word and eliminates its other possible meanings.

Following the advent of transformational-generative grammar, with its main focus on syntax, and of truth-conditional theories in semantics, the topic of polysemy received little attention for several decades (notable exceptions are Anderson & Ortony, 1975; Apresjan, 1974; Caramazza & Grober, 1976; Weinreich, 1964, 1966). But with the emergence of cognitive linguistics during the 1980s, polysemy reappeared as a key topic on the research agenda, in particular as a result of Lakoff and Brugman’s pioneering studies of prepositional polysemy (Brugman, 1988; Brugman & Lakoff, 1988; Lakoff, 1987). The central claim of these cognitive linguists was that polysemy is not so much a linguistic phenomenon as a cognitive one, resulting from the way in which our conceptual categories are structured (see Section 5.1).

Outside cognitive linguistics, however, polysemy was a relatively neglected phenomenon in linguistic semantics and philosophy of language. Part of this neglect was due to the strong focus on sentential, truth-conditional meaning and the little attention devoted to the issue of word meaning, in particular the meaning of lexical content words. In philosophy of language, it has not been uncommon for authors to pursue their semantic theorizing without addressing the notion of word meaning at all (Davidson, 2001). In Montagovian semantics (Montague, 1970), where the main concern is how to obtain sentential meanings from individual denotations using functional application, scholars have focused on how different classes of words interact with each other in the composition of sentential meanings, and on the role that functional items play in the composition process. In this endeavor, semanticists have typically (either explicitly or tacitly) adopted one version or another of literalism, that is, the idea that, barring homonymy and indexicality, each word-type has a unique simple denotation, such as a certain individual or a certain (nonconjunctive or disjunctive) property.

Generally, within these approaches to semantics, variations in a word’s contribution to sentential meaning have been treated in one of four ways: (i) as simple cases of ambiguity (aka homonymy), (ii) as denotations of indexical expressions, (iii) as meanings resulting from the operation of coercion mechanisms (see Section 5), or (iv) as pragmatically derived meanings (see Section 5) (which may or may not affect truth conditions). In other words, polysemy is reduced to something that is assumed to be nonproblematic from the point of view of standard semantic compositionality (see Pelletier, Semantic Compositionality).

This does not mean, however, that polysemy necessarily jeopardizes compositional, truth-conditional semantics, even though scholars like Chomsky (2002) and Pietroski (2017) believe it does. But it seems clear that the variability in a word’s meaning does complicate the picture semanticists have been working with, at least if this variability is taken to be a property of the meaning of the word itself. There are examples of polysemy where this seems to be the case: The word book, for instance, can have an information sense (‘I have read a book’) or a physical object sense (‘Put the book on the top shelf’), which seems to be not in virtue of book being an indexical, or in virtue of some coercion or pragmatic mechanism, but, rather, because it has the meaning it has.

Contemporary research on polysemy can be divided into four broad camps. One is the wealth of polysemy studies conducted within the cognitive linguistic framework, inspired by Lakoff and Brugman’s early studies and Langacker’s (1987) foundational work in cognitive grammar (Cuyckens & Zawada, 1997; Dunbar, 2001; Evans, 2009; Geeraerts, 1993; Nerlich & Clarke, 2001; Taylor, 2003, 2006; Tuggy, 1993; Tyler & Evans, 2003, and many others). Another is the growing number of formal and computational accounts of polysemy, with Pustejovsky’s (1995) generative lexicon theory and Asher’s (2011) type composition logic as the most prominent representatives (see also Arapinis, 2013; Asher, 2015; Asher & Pustejovsky, 2006; Copestake & Briscoe, 1995; Pustejovsky, 1998; Spalek, 2015; Zarcone, 2014). Furthermore, recent work in pragmatics and philosophy of language focusing on the nature of word meaning and its interaction with contextual information in the derivation of speaker meanings has a direct bearing on the issue of polysemy (Blutner, 1998, 2002; Bosch, 2007; Carston, 2002, 2012, 2016; Recanati, 1995, 2004; Vicente, 2015; Wilson, 2011; Wilson & Carston, 2006, 2007). Finally, psycholinguists study how the mental lexicon represents polysemy compared with homonymy, a long-standing debate in the polysemy literature (Frazier & Rayner, 1990; Foraker & Murphy, 2012; Frisson, 2015; Klein & Murphy, 2001; Klepousniotou & Baum, 2007; Pylkkänen, Llinás, & Murphy, 2006), as well as the differences in processing different kinds of polysemy in composition (Schumacher, 2013).

3. Defining and Delimiting the Polysemy Phenomenon

The definition and delimitation of the polysemy phenomenon remains a source of theoretical discussion across disciplines: How do we tell polysemy apart from monosemy on the one hand, and from homonymy on the other? At first glance, the contrast with monosemy is clearer: While a monosemous term has only a single meaning, a polysemous term is associated with several related senses. This intuitive contrast is, however, not theory-neutral. Scholars who argue that polysemous senses derive from an abstract, core meaning (Ruhl, 1989), or that they remit to an atomic concept (Fodor, 1998; Fodor & Lepore, 2002), and explain the variations associated with word-tokens by appealing to pragmatics or world knowledge, would hold that polysemy is a spurious phenomenon, and that there is no actual monosemy/polysemy distinction. The distinction has to be drawn elsewhere, either in metaphysics (Fodor & Lepore, 2002) or in the conceptual realm.

Several linguistic tests have been devised to distinguish polysemy from monosemy. Particularly well known is Zwicky and Sadock’s (1975) identity test by conjunction reduction, where the conjunction of two different senses or meanings of a word in a single construction has an awkward effect (this is usually glossed as “giving rise to zeugma”). For instance, the verb expire has (at least) the two senses ‘cease to be valid’ and ‘die’, and so the sentence ‘?Arthur and his driving license expired yesterday’ is zeugmatic. Another type of test exploits the impossibility of anaphorically referring to different senses (Cruse, 2004). In the sentence ‘?John read a line from his new poem. It was straight’, the pronoun cannot simultaneously refer to a sense of line combinable with the modifier straight (e.g., ‘long, narrow mark or band’) and the sense of line in the previous sentence (‘row of written/printed words’).

However, such tests for identity of meaning do not give clear-cut answers (for a review, see Geeraerts, 1993). In particular, only a slight manipulation of the context can yield a different result, as shown by the following example (Norrick, 1981, p. 115):





Note also that these tests typically do not distinguish between polysemy and homonymy—that is, they do not distinguish between senses or meanings that are related and those that are unrelated—both of which come out as instances of a more general phenomenon of lexical ambiguity.

Linguistic tests have also been used to distinguish lexical ambiguity (including homonymy and “accidental” polysemy) from so-called “logical” polysemy on the assumption that the different senses of a logically polysemous expression can be felicitously conjoined and referred back to by use of an anaphoric pronoun (Asher, 2011). An example of successful conjunction is the sentence “Lunch was delicious but took forever,” where lunch refers consecutively to a type of food and to an event type. An example of a felicitous anaphora is found in the sentence “That book is boring. Put it on the top shelf,” where the pronoun it refers anaphorically to the physical object sense of the noun book, even though the sense of book activated in the previous sentence is the information sense. In contrast, lexically ambiguous terms give rise to zeugma when conjoined and do not allow for anaphoric reference. In this case, the linguistic tests distinguish between a particular kind of polysemy and lexical ambiguity but cannot distinguish the former from instances of monosemy. So it appears that some forms of polysemy have more in common with monosemy, and other forms are more similar to homonymy, and it is difficult to see what sort of linguistic test might help identify when a term is polysemous simpliciter.

Another criterion that has been suggested as a way to distinguish between polysemy and homonymy is speaker intuitions about sense relations. According to this “folk-etymological” criterion, two senses are polysemous if they are judged by native speakers to be related, and homonymous if they are judged to be unrelated. A problem with this criterion is that sense relatedness appears to be a matter of degree, and that judgments about the relatedness of the senses of a given word are likely to be subjective, so some speakers may claim to see a relation between the senses of a word form where others do not. Furthermore, it is not clear that such speaker intuitions have any bearing on the way in which individuals use and understand words (quite unlike grammaticality judgments, which are considered the basic data to be explained within generative syntax). This might be because intuitions about sense relations are largely metalinguistic, that is, arrived at by thinking about language, and not a direct reflex of the way in which word meanings are represented in the mental lexicon.

4. Types of Polysemy: Regular, Inherent, and Irregular/Idiosyncratic

According to a standard linguistic taxonomy, instances of polysemy belong to one of two classes: regular polysemy and irregular, idiosyncratic, or accidental polysemy. In a classic paper, Apresjan (1974, p. 16) defined the polysemy of a word a with the senses Ai and Aj as regular if there exists at least one other word b with the polysemous senses Bi and Bj, semantically distinguished in exactly the same way as Ai and Aj, and as irregular if the semantic distinction between Ai and Aj is not exhibited by any other word in the language. Regular polysemy is exemplified by patterns such as: author for works of author (Beethoven), container for content (bottle), animal for meat of animal/fur of animal (rabbit), tree for wood (oak), liquid for portion of liquid (beer), and so on (see Falkum, 2011; Dölling, forthcoming, for further examples). Some of these patterns also occur cross-linguistically (Srinivasan & Rabagliati, 2015). A typical case of irregular polysemy is the English noun line (draw a line, a line around eyes, a wash on a line, wait in a line, a line of bad decisions, etc.).
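Apresjan’s criterion can be read as a simple check over a lexicon: a sense alternation counts as regular when at least one other word exhibits it. The following Python sketch illustrates this reading. The toy lexicon, the sense labels, the added words chicken and pine, and the function name `is_regular` are assumptions made for illustration, not part of Apresjan’s account.

```python
# Toy lexicon: each word maps to the set of sense alternations it exhibits.
# An alternation is a pair of sense-type labels (hypothetical labels).
TOY_LEXICON = {
    "rabbit": {("animal", "meat")},
    "chicken": {("animal", "meat")},
    "oak": {("tree", "wood")},
    "pine": {("tree", "wood")},
    "line": {("mark", "queue")},
}


def is_regular(word, alternation, lexicon):
    """Return True if at least one other word shows the same alternation."""
    if alternation not in lexicon.get(word, set()):
        raise ValueError(f"{word!r} does not exhibit {alternation!r}")
    return any(
        alternation in senses
        for other, senses in lexicon.items()
        if other != word
    )
```

In this toy lexicon, `is_regular("rabbit", ("animal", "meat"), TOY_LEXICON)` returns `True`, while `is_regular("line", ("mark", "queue"), TOY_LEXICON)` returns `False`. The hard part of the criterion, deciding when two sense distinctions are "exactly the same", is hidden in the choice of labels.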

According to Apresjan (1974), regular polysemy is typically associated with senses generated by metonymical relations and irregular polysemy with senses that are derived metaphorically. Metaphor and metonymy are pragmatic processes that may target individual lexical items, and their role in the generation of polysemy has long been recognized (Bowdle & Gentner, 2005). While metaphor is taken to involve a similarity relation between two entities (e.g., ‘Jane is (like a) princess’), metonymy is seen as involving real-world contiguity relations (e.g., ‘The ham sandwich left without paying’). These associations are based on the observation that many systematic sense alternations (e.g., ‘fruit for tree’, ‘publication for publisher’, exhibited by words such as cherry, apple, New York Times, etc.) seem to involve metonymic relations between senses. They are clearly different from creative metaphors, which are usually one-off and, if conventionalized, would typically be instances of irregular polysemy in Apresjan’s sense. They are also different from “everyday” metaphors (e.g., “Bill is a bulldozer”, “The critics slaughtered his new book”), which, although often conventional, cannot be said to involve any kind of regularity in the above sense. However, there are numerous cases of creative metonymy that are not obviously regular either (e.g., “Jane is just a pretty face”, “The loudmouth is coming to the party”, “John has married a free ticket to the opera”, etc.). And there are instances of metaphor which appear to be at least partly regular, such as metaphorical uses of animal-denoting nouns to refer to some human characteristic (e.g., “Peter is a lion/chicken/pig”) (Copestake & Briscoe, 1995) or uses of body parts to refer to analogous parts of inanimate objects (e.g., foot of a mountain/tree/table, mouth of a bottle/river/cave). 
In other words, the association between regular polysemy and metonymy on the one hand, and irregular polysemy and metaphor on the other, seems far from clear-cut.

Pustejovsky (1995) distinguishes between different types of regular polysemy, introducing the notion of inherent polysemy. Inherent polysemy involves related senses of contradictory semantic types. For instance, book in (3a) is of the semantic type information, while in (3b) it is of the type physical object. The uses of lunch in (4a) and (4b) differ in the same way, one being of the type food and the other of the type event:









Typically, inherent polysemy passes co-predication, anaphoric binding and conjunction reduction tests, as exemplified below:





Asher (2011), in turn, distinguishes between logical and accidental polysemy, where logical polysemy is characterized as cases of polysemy that pass co-predication tests. Although co-predication tests are not completely reliable, they seem to reveal that we can think of objects that belong to different, complementary kinds, as coherent individual entities. At the same time, we can also conceptualize these objects as involving different entities. For instance, we can think about a book as a physical object only, as an informational entity only, or as both things at the same time. The idea is that such polysemous terms encode a new kind of type, a dot object (Asher, 2011; Pustejovsky, 1995), which is the result of merging two different types into a composite. So the noun lunch in “Lunch was delicious but took forever” is taken to be of a composite (dot) type combining food and event. The components of the composite are sometimes referred to as aspects (Cruse, 1986).
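The co-predication behaviour attributed to dot objects can be illustrated with a small type-checking sketch. It is a deliberate simplification, assuming that each predicate selects for one simple type and that a dot type makes each of its component types available. The class, the dictionaries, and the function name are hypothetical and do not reproduce the formal systems of Pustejovsky (1995) or Asher (2011).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DotType:
    """A composite type whose aspects are simple type labels."""
    aspects: frozenset


# Hypothetical noun typings, following the examples in the text.
NOUN_TYPES = {
    "lunch": DotType(frozenset({"food", "event"})),
    "book": DotType(frozenset({"information", "physical_object"})),
}

# Hypothetical selectional requirements of a few predicates.
PREDICATE_SELECTS = {
    "was delicious": "food",
    "took forever": "event",
    "is boring": "information",
    "is on the top shelf": "physical_object",
}


def copredication_ok(noun, predicates):
    """True if every predicate finds its required type among the noun's aspects."""
    aspects = NOUN_TYPES[noun].aspects
    return all(PREDICATE_SELECTS[p] in aspects for p in predicates)
```

Under these assumptions, `copredication_ok("lunch", ["was delicious", "took forever"])` succeeds because both required types are aspects of the same dot type, whereas `copredication_ok("book", ["was delicious"])` fails.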

In this way, we may distinguish between three main types of polysemy: inherent or logical (dot-object) polysemy, regular polysemy, and irregular or idiosyncratic polysemy. While the first two kinds affect mostly nouns (proper or common) and are often metonymically derived, irregular polysemy affects all types of words and has metaphor as one important source.

There are other phenomena discussed in the context of polysemy in the literature that do not fall under any of the previous kinds. One example is so-called logical metonymy, first discussed by Pustejovsky (1995). These are cases where a verb that subcategorizes for an NP or a gerundive VP syntactically (e.g., ‘Sam began reading the book’ vs. ‘Sam began the book’), semantically requires a complement with an eventive interpretation. This is taken to involve metonymy because the entity is used to “stand for” the event in question. We will not address these cases any further here, since we think that they are only indirectly related to the phenomenon of polysemy. Strictly speaking, the phenomena that fall under the label logical metonymy do not involve several related senses being associated with a single word.

From the point of view of processing, however, it is not clear that there is any deep difference between dot-object, regular, and (at least some kinds of) idiosyncratic polysemy. Many psycholinguistic studies have focused on regular polysemy, investigating whether it is processed differently from homonymy. Most studies suggest that polysemy resolution differs from homonymy resolution, where (a) readers have to select a specific meaning when the homonym is encountered, (b) there is a clear bias towards the dominant meaning of the homonym, and (c) different homonymous meanings are in competition, so that the meaning that is not selected decays fast. In polysemy resolution, however, (a) there appears not to be a strong bias for the most frequent, or dominant sense, (b) the related senses prime each other, and (c) their mutual activation sustains for some time (MacGregor, Bouwsema, & Klepousniotou, 2015). In addition, words with multiple senses are easier to recognize in lexical decision tasks than words with fewer senses and, in particular, homonyms (Azuma & van Orden, 1997; Rodd, Gaskell, & Marslen-Wilson, 2002). Together, these results suggest that (regular) polysemous senses are stored and represented differently from homonymous meanings (Frisson, 2009, 2015; Klepousniotou, Pike, Steinhauer, & Gracco, 2012; MacGregor et al., 2015).

However, what does seem to make a difference with respect to processing is whether the senses of a polysemous lexical item are closely or distantly related (Klepousniotou, Titone, & Romero, 2008). In a much-discussed paper, Klein and Murphy (2001) argued in favor of a sense enumeration approach to polysemy. In a series of sensicality judgment tasks, they did not find any of the processing differences between polysemous and homonymous senses reported in the studies above (see also Foraker & Murphy, 2012). However, Klein and Murphy’s results could be partly due to the stimuli used in their experiments. The senses of their polysemous words were quite distantly related, such as, for instance ‘shredded paper’ and ‘liberal paper.’ Distant senses such as these seem to behave more like the meanings of homonymous terms (i.e., there tends to be competition between them, cf. Klepousniotou et al., 2008).

5. Polysemy and Word Meaning

The debate regarding polysemy representation is intrinsically connected to the more general question of what word meanings are, and, specifically, what kind of mental representation a lexical form encodes. As already mentioned, theories of polysemy representation can broadly be divided into two camps: sense enumeration lexicons (SELs), where the different senses are taken to be represented separately, and one-representation approaches, where polysemous lexical items are seen as being represented either as core meanings from which their different senses are derived, or as overspecified meanings whose component parts are selected in context. Both these options are compatible with a number of different approaches to word meaning.

5.1 Sense Enumeration Lexicons

As mentioned in Section 2, an influential linguistic embodiment of the classical idea that word meanings can be spelled out in terms of necessary and sufficient conditions (i.e., definitions) is Katz’s semantic theory (Katz, 1972; Katz & Fodor, 1963). Katz aimed to provide a theory of natural language semantics that was able to explain semantic relations and contrasts between word meanings (synonymy, antonymy, contradiction, analyticity, entailment, etc.), as well as the relation between word meanings and sentence meanings. On Katz’s account, the semantic component of the grammar contains a dictionary, which lists under a single lexical entry the different senses of a word (which together constitute the meaning of that word), each of which can be broken down into a set of semantic markers (or primitives).

Katz’s theory is a prime example of a sense enumeration lexicon, where different readings (both polysemous and homonymous) of a lexical item are listed under a single dictionary entry. Katz suggested that the distinction between polysemy and homonymy could be drawn on the basis of the notion of “semantic similarity.” According to his definition (Katz, 1972, p. 48), the senses of two constituents are similar if they have a semantic marker in common. However, unlike other cases of semantic relations, such as synonymy, antonymy, analyticity, and entailment, which can be “read off” elegantly from the theory by means of sameness, overlap, or incompatibility of semantic markers contained in the semantic representations of words, it appears that polysemy (and ultimately the distinction between polysemy and homonymy) is not so easily accounted for this way. It is likely, for instance, that there would be polysemous senses that do not share any semantic markers, especially when the senses are related by metonymic relations, such as newspaper as a physical object and newspaper as an institution (or the already mentioned case of liberal paper and shredded paper). In addition to any specific problems concerning polysemy and homonymy representation, philosophers at least since Wittgenstein (1953) have pointed out a number of more general problems for definitional theories, which, taken together, have made such theories nearly impossible to maintain as accounts of lexical semantic representation (see Laurence & Margolis, 1999, for a comprehensive review of problems associated with the classical theory of concepts).
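The marker-sharing criterion, and the difficulty it runs into, can be shown in a few lines of Python. The sketch assumes that each sense is represented as a flat set of semantic markers; the marker inventories below are invented for illustration and are not Katz’s own analyses.

```python
# Hypothetical semantic-marker sets for a few senses.
SENSE_MARKERS = {
    "rabbit/animal": {"physical", "animate", "animal"},
    "rabbit/meat": {"physical", "food"},
    "newspaper/object": {"physical", "artifact", "printed"},
    "newspaper/institution": {"social", "organization"},
}


def shared_markers(sense_a, sense_b, markers):
    """Return the semantic markers that two senses have in common."""
    return markers[sense_a] & markers[sense_b]


def similar(sense_a, sense_b, markers):
    """Two senses count as similar if they share at least one marker."""
    return bool(shared_markers(sense_a, sense_b, markers))
```

With these marker sets, the two senses of rabbit come out as similar, but the two senses of newspaper do not, even though they are intuitively related. This is the kind of metonymically related pair that the criterion is likely to misclassify as homonymous.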

Another influential sense enumeration approach to word meaning and polysemy, which rejects the classical theory and builds on the assumption that categories exhibit prototypicality effects (Rosch, 1999 [1978]), is Lakoff’s (1987) theory of knowledge representation. In Lakoff’s framework, idealized cognitive models (ICMs) are relatively stable mental structures that represent theories about the world with respect to a particular domain, and which guide categorization and reasoning. On this approach, a single concept can be represented in terms of a combination of a number of individual ICMs in a “cluster concept.” Cluster concepts ground sense extensions and thus give rise to radial categories (formed by the cluster concepts and the noncentral extensions or variants). This notion of radial categories forms the basis for Lakoff’s account of polysemy (Brugman, 1988; Brugman & Lakoff, 1988; Lakoff, 1987), which has inspired a host of studies of polysemy within the strand of linguistics known as cognitive semantics. On this approach, which takes linguistic categories to be no different from other kinds of conceptual categories, most word meanings are seen as a type of radial category in which the different senses of a word are organized with respect to a prototypical sense. The paradigmatic example is the preposition over, first discussed by Brugman (1988):















The idea is that over constitutes a radial category composed of a range of distinct but related senses, organized around the prototypical, or central, sense (which Brugman and Lakoff take to be the ‘above and across’ sense in (5a)), in a lexical network structure. (A slightly different, albeit similar manifestation of the network model of polysemy representation is given by Langacker, 1988.) The different senses of over exhibit typicality effects; more typical senses are located “closer” to the prototypical sense in the network, while less typical senses are located in its periphery. Such peripheral senses are derived from more typical senses by a set of cognitive principles for meaning extension (e.g., “conceptual metaphors,” cf. Lakoff & Johnson, 1980), giving rise to meaning chains (e.g., sense a is related to sense b in virtue of some shared attribute(s), sense b is related to sense c, which is related to sense d, and so on). For instance, the “control” sense in (5g) is seen as being derived from the “above” sense in (5b) on the basis of the metaphorical schema control is up, lack of control is down (Lakoff, 1987). Sense relations, then, concern, in the first instance, adjacent members of the category, while members that are only indirectly connected in the semantic network may be very different in semantic content. Wittgenstein’s (1953) metaphor of “family resemblance” (cf. Rosch & Mervis, 1975) is often used to describe such polysemous categories within the cognitive linguistics framework (Taylor, 2003).
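The network structure described here can be pictured as a small directed graph in which senses are nodes and extension links are edges, so that a sense's distance from the prototype gives a rough measure of how peripheral it is. The following Python sketch is only an illustration under stated assumptions: the three sense labels come from the discussion above, the link from ‘above’ to ‘control’ follows the metaphorical derivation just mentioned, and the link from the central sense to ‘above’ is assumed for the sake of the example.

```python
from collections import deque

# Extension links between senses of "over". Only the link from 'above' to
# 'control' (via the metaphor CONTROL IS UP) is stated in the text; the link
# from the central sense to 'above' is an assumption made for illustration.
EXTENSIONS = {
    "above-and-across": ["above"],
    "above": ["control"],
    "control": [],
}


def distance_from_prototype(network, prototype):
    """Breadth-first search: number of extension steps from the central sense."""
    distances = {prototype: 0}
    queue = deque([prototype])
    while queue:
        sense = queue.popleft()
        for extended in network[sense]:
            if extended not in distances:
                distances[extended] = distances[sense] + 1
                queue.append(extended)
    return distances
```

In such a graph, adjacent senses are directly related, whereas senses several steps apart, like the central sense and ‘control’ here, are connected only through a meaning chain.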

A central aspect of Lakoff and Brugman’s approach is that radial categories are stored in the long-term semantic memory of speakers. In this respect, the radial category account of polysemy is a radical version of a sense enumeration lexicon, in that the full range of senses is taken to be stored as part of a semantic network (hence, it is sometimes referred to as the “full-specification approach,” cf. Evans & Green, 2006). A common criticism of the full-specification approach, and of sense enumerative accounts more generally, is that they seem to entail a (potentially) indefinite proliferation of mentally stored senses in order to cover the range of uses of lexical forms (for instance, Brugman, 1988, identifies nearly a hundred different uses of over), and thereby fail to distinguish those aspects of meaning that are part of the word meaning proper from those that result from its interaction with the context, a problem that is sometimes referred to as the “polysemy fallacy” (Sandra, 1998). It is possible that work in distributional semantics, which makes use of nondiscrete representations, might give a different shape to the notion of a radial category, one which is not committed to the storage of each individual sense that forms part of the network.

More recently, scholars working within the cognitive linguistics paradigm have refined their accounts of polysemy, acknowledging the context-dependence of word meanings (Allwood, 2003; Evans, 2005, 2009; Taylor, 2006; Tyler & Evans, 2001, 2003; Zlatev, 2003). In particular, Tyler and Evans (2001, 2003) have developed an account of polysemy that, while espousing the Lakoff-Brugman idea that polysemous senses are represented in terms of sense networks centered around a prototypical sense, proposes a set of criteria intended to (i) determine whether a particular sense of a word counts as a distinct sense, and (ii) establish the central sense of a polysemous lexical item. This “Principled Polysemy approach” is also an attempt at avoiding the polysemy fallacy by distinguishing between those senses that are stored in semantic memory and those that are pragmatically constructed during online processing.

5.2 One-Representation Approaches

Ruhl (1989) was the first to spell out in some detail what we call the one-representation approach (sometimes also referred to as the core meaning approach). Ruhl’s work was a critique of the idea that words in general have multiple meanings, which was the mainstream view in linguistic theory at the time, and which had led to the postulation of sense enumeration lexicons in various forms. According to Ruhl’s monosemy approach, words should initially be presumed to have only a single meaning, and polysemy (or homonymy) be posited only when extended attempts to describe this meaning fail. Ruhl’s methodology involved collecting a large set of data corresponding to a word’s actual uses and extracting from this set a highly abstract, unitary schema that covered all its uses. For instance, the corpus analyses presented in Ruhl (1989) included several hundred attested uses of the verbs bear, hit, kick and slap, typically taken to be highly polysemous. Ruhl claimed that each of these uses revealed an abstract unified meaning, albeit not one that could be comprehensively captured by a single word or phrase (Ruhl, 1989, p. 63). One important insight of Ruhl’s proposal was that a large part of what had traditionally been treated as part of the lexical meaning of a word, and which had led to a proliferation of senses on previous accounts, was more adequately described as senses that were generated from an underspecified meaning as a result of its interaction with the linguistic and/or extralinguistic context.

Another version of the one-representation approach is the generative lexicon theory (Pustejovsky, 1991, 1995). This theory stands out from the other accounts discussed so far by being developed with the sole purpose of explaining polysemy. Pustejovsky sought to provide a more explanatory account of polysemy than that given by sense enumeration lexicons. In his view, such accounts are inadequate, primarily because they are unable to explain how words may take on an infinite number of meanings in novel contexts. Not only is it impossible for such accounts to list all the possible meanings of a lexical item, they also miss the generalizations that can be made on the basis of what appear to be regular patterns of sense alternations, and fail to capture how polysemous senses may partially overlap and be logically related to one another. A more promising approach, he argues, which is able to meet these explanatory requirements, is a lexicon where items are represented as templates combined with a generative framework for the composition of lexical meanings. The generative lexicon theory is thus an example of a one-representation approach.

On Pustejovsky’s account, the semantics of a lexical item is viewed as a structure consisting of several components. A key component is the qualia structure, which consists of a specification of four different roles: The constitutive role captures the relation between an object and its constituents, or proper parts; the formal role specifies what distinguishes the object within a larger domain; the telic role defines the purpose and function of the object (if there is one); and the agentive role describes the factors involved in the origin or coming into existence of the object. For instance, the qualia structure of the noun novel will specify that a novel is a narrative (constitutive role), it is a book (formal role), its purpose is to be read (telic role), and it comes into being by a process of writing (agentive role). Complementing these underspecified (yet informationally rich) lexical entries is a set of generative mechanisms, which operate to yield compositional interpretations. For instance, selective binding is exemplified by the process by which an adjective takes one event expression contained in the qualia structure of the head noun as input to the interpretation process, as in the uses of good in (6):





Here, good selectively modifies the event description given by the telic roles of the nouns (novels are for reading; knives are for cutting), giving rise to the interpretations ‘good read’ and ‘a knife that cuts well.’
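The interplay between qualia structure and selective binding can be sketched informally in code. The Python fragment below is a simplified illustration, not Pustejovsky’s formalism: the `Qualia` class, the `LEXICON` table, and the `good` function are hypothetical names, the entry for novel follows the values given above, and all roles of knife other than the telic role are assumptions added for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Qualia:
    constitutive: str      # relation between the object and its parts
    formal: str            # what distinguishes it within a larger domain
    telic: Optional[str]   # purpose or function, if there is one
    agentive: str          # how the object comes into being


LEXICON = {
    # Values for "novel" follow the description in the text.
    "novel": Qualia("narrative", "book", "reading", "writing"),
    # Only the telic role of "knife" is given in the text; the other
    # values are illustrative assumptions.
    "knife": Qualia("blade and handle", "tool", "cutting", "manufacturing"),
}


def good(noun):
    """Selective binding sketch: 'good' modifies the event in the telic role."""
    telic = LEXICON[noun].telic
    if telic is None:
        # No telic event to bind; leave the modification unresolved.
        return f"good {noun}"
    return f"{noun} that is good for {telic}"
```

Calling `good("novel")` and `good("knife")` picks out the reading and cutting events respectively, so a single entry for the adjective yields different interpretations depending on the noun it combines with.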

Pustejovsky’s generative lexicon provides a considerably more explanatory account of polysemy than sense enumeration lexicons, and it has had a profound impact on research in lexical semantics. However, it also has to confront several problems related mainly to inflexibility (Asher, 2011; Blutner, 2002; de Almeida & Dwivedi, 2008; Falkum, 2007; Fodor & Lepore, 1998; Willems, 2006). Other formal approaches build on Pustejovsky’s original insights but seek to avoid some of its problems (Asher, 2011; Babonnaud, Kallmeyer, & Osswald, 2016).

5.3 Contemporary Debate

The contemporary debate has been reshaped by psycholinguistics. Experimental work on polysemy started in the 1990s and is gaining influence also within theoretical approaches. Studies have been conducted using a variety of methodologies, including lexical decision and sensicality judgment tasks, eye tracking, magnetoencephalography (MEG) and electroencephalography (EEG) recordings. The main topic in the psycholinguistics literature is how polysemous senses are represented vis-a-vis homonymous meanings in the mental lexicon. Studies conducted during the 1990s had suggested that polysemous senses were stored differently from homonymous meanings (Azuma & van Orden, 1997; Frazier & Rayner, 1990; Williams, 1992). These studies discovered facilitation effects for different senses of polysemous terms and competition between homonymous meanings. Using lexical decision tasks, Azuma and van Orden found that words with many related senses were easier to recognize than words with unrelated meanings, which suggested that, unlike the meanings of homonyms, the senses of polysemous terms do not compete for activation. More recent studies have given consistent results (Beretta, Fiorentino, & Poeppel, 2005; Frisson, 2015; Klepousniotou & Baum, 2007; Klepousniotou et al., 2008; Pickering & Frisson, 2001; Pylkkänen et al., 2006, but cf. Klein & Murphy, 2001).

Let us illustrate the kind of work done in psycholinguistics by explaining in some detail Frisson’s (2015) study of how we process so-called ‘book’ polysemy (e.g., book, manuscript, notice, journal, etc.), with results that can be plausibly extended to other kinds of inherent polysemy. The study consisted of two experiments: one sensicality judgment task and one eye-tracking study. In the sensicality judgment task, subjects were presented with a prime NP in which the adjective focused on either the physical object sense (e.g., bound book) or the information sense (e.g., scary book). Then they were asked to make a sensicality judgment about a target NP in which the adjective focused on either the consistent (e.g., [well-plotted book], scary BOOK) or the inconsistent (e.g., [bound book], scary BOOK) sense. The results showed a clear consistency effect, with increased processing time in the inconsistent condition compared with the consistent condition, but no effect of either sense dominance or direction of sense switch (physical object to information or information to physical object) in the inconsistent condition. In the eye-tracking study, there were three conditions: The neutral condition aimed at testing how quickly a specific sense is assigned to a polysemous word without prior contextual indication. The repeat condition aimed at testing the effect of sense repetition on ease of processing. Finally, the switch condition tested whether switching from one sense to another involves an extra processing cost. In the neutral condition, subjects did not have more difficulty disambiguating toward the subordinate sense than toward the dominant sense of the polysemous noun. In the repeat condition, subjects spent more time reading the polysemous noun than in the neutral condition, but the time to select a particular sense was not affected by sense frequency. In the switch condition, processing was more difficult than in the neutral context, and switching from a subordinate to a dominant sense induced a greater cost than vice versa.

Frisson’s results suggest that “book” polysemy is processed differently from both homonyms and cases of polysemy where the senses are semantically distant. They also suggest that the different senses of inherently polysemous expressions might be stored together as part of a single representation. However, similar results have been found for other, non-inherent types of polysemy. For instance, MacGregor et al. (2015) investigated the processing of the similarity-based polysemy of mouth as in “mouth of a person”, “mouth of a river”, and “mouth of a cave” and observed the same overall pattern of co-priming and facilitation effects observed for “book” polysemy.

Overall, researchers agree that this pattern of results indicates that the senses of a polysemous expression relate to a single representation. What is not clear at this stage, as MacGregor and colleagues (2015, p. 137) point out, is the nature of such a representation (see also Frisson, 2009, p. 122).

The alternatives psycholinguists consider are:

  (i) An underspecification (thin or core meaning) account: The meaning of a polysemous expression is an underspecified, abstract, and summary representation that encompasses and gives access to its different senses in context.

  (ii) An overspecification (rich) account: The meaning of a polysemous expression includes all its different senses, which are stored as part of a single representation. Senses are component parts of the meaning of the word, which are selected in context.

  (iii) Literalism: Each polysemous expression has a literal, denotational meaning. Its other senses are generated via linguistic rules, coercion mechanisms, or pragmatic inferences.

The details of the underspecified and overspecified meaning representations remain unclear. Underspecified meaning representations are usually understood as including features that are shared by all the different senses of the polyseme (this is the “core meaning” approach), while overspecified representations are supposed to include all the senses that the polysemous expression gives access to. Such a representation can amount to a sheer collection of senses, to some kind of structured representation (such as, for instance, one of Pustejovsky’s qualia structures, cf. Pustejovsky, 1995; Frisson, 2009; Vicente, 2015; Del Pinal, 2015), or to a distributional semantic representation (Lenci, 2008; Baroni, Bernardi, & Zamparelli, 2014). Literalism has been scarcely considered and tested by psycholinguists (but see Frisson & Frazier, 2005; Frisson, 2015), but has a strong position in many other fields, ranging from computational semantics to cognitive pragmatics. It is endorsed by computational linguists such as Asher (2015), who aims to explain most meaning variations in terms of coercion mechanisms, and Copestake and Briscoe (1995), who account for regular polysemy in terms of lexical rules, and some version of it is also assumed by representatives of Relevance Theory’s pragmatic approach (Carston, 2002; Falkum, 2011, 2015; Wilson & Carston, 2007). Although literalism is underexplored in the psycholinguistic literature, the available results seem, on the face of it, to speak against it, given that there is no record of a preference for alleged literal meanings except in some cases such as mass–count polysemy (Frisson & Frazier, 2005), or of the operation of processes, such as coercion, working on them. However, the psychological plausibility of literalism would be a fruitful topic to explore in future experimental studies.

The issue of how polysemous expressions are semantically represented is of course strongly relevant to the broader issue of lexical meaning representation. Given that most, if not all, words in a language are polysemous (Zipf, 1945), it seems safe to transfer the above views about the meaning of polysemous expressions to views about word meaning simpliciter (Falkum & Vicente, 2015; Vicente, 2017). Mainstream semantics has been operating with an idealized literalist position on standing word meaning, and this position has been questioned on various fronts. There are a number of authors from different schools who defend an underspecification approach (e.g., Bierwisch & Schreuder, 1992; Carston, 2013; Evans, 2009; Yalcin, 2014) or an overspecification account (e.g., Elman, 2011; Pustejovsky, 1995; Rayo, 2013; Vicente & Martínez Manrique, 2016; Zwarts, 2004). Some interesting recent work also points in the direction of a renewed interest in the sense enumeration hypothesis, though from a more cognitively plausible viewpoint in which the interaction between stable senses and contextual factors plays a key role (Carston, 2016, forthcoming; Recanati, 2016). In short, there is an increasing interest in lexical meaning, and a deep but still scattered debate across the different traditions regarding the nature of the meaning of lexical content words. Future research on polysemy is bound to have a decisive impact on this question.

6. Polysemy in Context and Communication

Beyond the complex question of what lexical items encode, the ubiquity of polysemy in natural languages raises two further questions, which have implications for accounts of language processing, comprehension and production. The first question is how lexical meanings get extended into different senses. If we grant that some, and possibly many, senses of polysemous lexical items are derived or constructed during on-line processing, what are the processes or mechanisms involved, and how do they operate? The second is the fundamental question of why polysemy exists at all. What is it about our language systems, and specifically their lexical component, that makes them so susceptible to polysemy? Why do language users prefer to use the same word to refer to different things or properties rather than have a distinct word for each sense?

Accounts of how polysemy arises in linguistic and extralinguistic context can broadly be divided into two main camps: rule- or coercion-based approaches and pragmatic inferential approaches. In formal and computational semantics it has been common to analyze regular polysemy as being generated by an inventory of lexical rules (Asher & Lascarides, 2003; Copestake & Briscoe, 1995; Gillon, 1992; Ostler & Atkins, 1992). For instance, Copestake and Briscoe (1995) suggest that the rule of universal grinding (Pelletier, 1975) and a set of conventionalized subcases of it (meat-grinding, fur-grinding, etc.) can account for regular polysemy of the kind in (7):







Proponents of this sort of approach claim that lexical rules are necessary to explain the productivity of regular polysemy and the availability of default interpretations in uninformative contexts (see Frisson & Frazier, 2005, for a psycholinguistic study testing this account).
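The idea of a general lexical rule with conventionalized subcases can be illustrated with a minimal Python sketch. It is an informal approximation only: Copestake and Briscoe (1995) state their rules over typed feature structures, whereas the `LexicalEntry` class, the type labels, and the function names below are hypothetical simplifications introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    form: str
    countable: bool
    semantic_type: str  # hypothetical labels such as "animal" or "meat"


def universal_grinding(entry):
    """General rule: a count noun yields a mass noun for the stuff it is made of."""
    if not entry.countable:
        raise ValueError("grinding applies only to count nouns")
    return LexicalEntry(entry.form, False, f"{entry.semantic_type}-stuff")


def meat_grinding(entry):
    """Conventionalized subcase: an animal count noun yields a mass noun for its meat."""
    if entry.semantic_type != "animal":
        raise ValueError("meat-grinding applies only to animal nouns")
    ground = universal_grinding(entry)
    # The subcase refines the unspecific output of the general rule.
    return LexicalEntry(ground.form, ground.countable, "meat")
```

Applied to a count entry such as `LexicalEntry("rabbit", True, "animal")`, the subcase returns a mass entry of type "meat", which is one way of picturing how a rule-based account makes a default interpretation available without listing the derived sense separately.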

Asher (2011) takes a somewhat different approach to regular polysemy in his coercion-based approach. He suggests that instances of regular polysemy—those that pass the co-predication and anaphoric binding tests (e.g., “The rabbit was cute and (it was also) delicious”)—require the postulation of dot objects, where the related senses are represented together as part of a complex representation. Other types of polysemy—those that do not pass the co-predication and anaphoric binding tests (e.g., ?? “The ham sandwich is delicious and (it is) impatient”)—are taken to be generated by the process of coercion, which takes as its input a literal meaning and, forced by a type mismatch in the composition process, delivers a different sense as output. On rule- and coercion-based approaches, polysemy could be motivated by a goal of economy of expression, representing an effort-saving strategy for the speaker and contributing to linguistic-communicative efficiency. The speaker can rely on the addressee to apply the requisite linguistic mechanism(s) to generate the contextually appropriate sense. Although rule- and coercion-based approaches do not explicitly adhere to this position, it provides them with a plausible explanation for why polysemy is such a pervasive phenomenon in natural languages (see Falkum, 2015, for discussion).

A radically different approach to polysemy generation can be found in the field of lexical pragmatics, which specifically studies the interaction between an expression’s standing meaning and aspects of the context, where context is not limited, as in rule- and coercion-based approaches, to material provided by linguistic structure (Nunberg, 1995; Blutner, 1998; Bosch, 2007; Carston, 2002; Recanati, 2004; Wilson & Carston, 2007). A central claim of lexical pragmatics is that word meanings typically undergo pragmatic modulation (in the form of conceptual specification, broadening, metaphorical or metonymic extension) in the course of utterance interpretation, and this is what gives rise to polysemy. The ubiquity of polysemy in natural languages suggests that speakers and hearers might find it easier to extend already existing words to related domains than to invent new words for each sense, and lexical pragmatic processes are thought to play a key role in enabling communicators to do this. Indeed, some “radical” pragmatic accounts tend to see polysemy as an epiphenomenon of pragmatic processes applying at the level of individual words: “In general … polysemy is the outcome of a pragmatic process whereby intended senses are inferred on the basis of encoded concepts and contextual information” (Sperber & Wilson, 1998, p. 197). The existence of polysemy has a strong motivation on this pragmatic account, where it arises to meet the communicative needs of speakers and hearers (Falkum, 2015). Such speaker–hearer interactions, which result in the formation of novel, polysemous senses, are also thought to be the main driving force of lexical semantic change (Bowdle & Gentner, 2005; Hopper & Traugott, 1993/2003; Sweetser, 1990; Traugott & Dasher, 2002). Polysemy is seen as one stage along the path to semantic change, where related senses of a word, which may have emerged at historically different periods, coexist for a certain time in a language, both in individual speakers and in language communities (see Traugott, Semantic Change), before one takes over from the other in conventional usage. While it is widely agreed that processes such as metaphor and metonymy play an important role in lexical semantic change, the question of what consequences the source of a polysemy, and the semantic change that often follows, may have for lexical representation and sense activation/construction remains largely unexplored.

Further Reading

Asher, N. (2011). Lexical meaning in context: A web of words. Cambridge, U.K.: Cambridge University Press.

Carston, R. (2002). Thoughts and utterances: The pragmatics of explicit communication. Oxford: Blackwell Publishers.

Cruse, A. (2010). Meaning in language: An introduction to semantics and pragmatics (3d ed.). Oxford: Oxford University Press.

Evans, V. (2009). How words mean: Lexical concepts, cognitive models and meaning construction. Oxford: Oxford University Press.

Falkum, I. L., & Vicente, A. (Eds.). (2015). Polysemy: Current Perspectives and Approaches. Special Issue of Lingua, 157.

Falkum, I. L. (2011). The semantics and pragmatics of polysemy: A relevance-theoretic account. PhD diss., University College London.

Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind. Chicago: The University of Chicago Press.

Pustejovsky, J. (1995). The generative lexicon. Cambridge, MA: MIT Press.

Ravin, Y., & Leacock, C. (2000). Polysemy: Theoretical and computational approaches. Oxford: Oxford University Press.

Recanati, F. (2004). Literal meaning. Cambridge, U.K.: Cambridge University Press.

Ruhl, C. (1989). On monosemy: A study in linguistic semantics. Albany: State University of New York Press.

Taylor, J. (2003). Linguistic categorization (3d ed.). Oxford: Oxford University Press.


