Type Theory for Natural Language Semantics
Summary and Keywords
Type theory is a regime for classifying objects (including events) into categories called types. It was originally designed in order to overcome problems relating to the foundations of mathematics relating to Russell’s paradox. It has made an immense contribution to the study of logic and computer science and has also played a central role in formal semantics for natural languages since the initial work of Richard Montague building on the typed λ-calculus. More recently, type theories following in the tradition created by Per Martin-Löf have presented an important alternative to Montague’s type theory for semantic analysis. These more modern type theories yield a rich collection of types which take on a role of representing semantic content rather than simply structuring the universe in order to avoid paradoxes.
At the beginning1 of the 20th century, Bertrand Russell was trying to overcome a serious problem that he had discovered in Gottlob Frege’s formalization of Georg Cantor’s set theory. The problem has subsequently come to be known as Russell’s paradox; it concerns a contradiction that arises when sets are allowed to be members of themselves and any intuitive property can be used to characterize a set. Russell pointed out that a set X, defined as the set of all sets that are not members of themselves, leads to a contradiction when one asks the question of whether X is a member of itself. If X ∈ X, then by definition holds, and if holds, then again by definition X ∈ X. Russell proposed to solve this problem by imposing a regime on set theory which did not allow sets to be members of themselves (Russell, 1992; Whitehead & Russell, 1925). He introduced different types: individuals, sets of individuals, sets of sets of individuals, and so on. An object of a given type could not be an element of an object of the same type—thus no set could be a member of itself.2
This led to what has come to be known as the simple theory of types, first formulated by Alonzo Church (1940) based on the λ-calculus which Church had introduced earlier. This formulation of the simply typed λ-calculus has had wide applications in computer science and, most importantly for our purposes, linguistic semantics, given that Montague’s Intensional Logic (IL) (Montague, 1973, 1974) is an extension of Church’s simply typed λ-calculus. The simple theory of types as embodied in Montague’s IL3 has dominated the field of formal semantics since its beginning in the early seventies (despite significant alternatives such as property theory, Chierchia & Turner, 1988; Fox & Lappin, 2005, and situation semantics, Barwise & Perry, 1983; Cooper, 1996).4
Starting with the work of Sundholm and Ranta (Sundholm, 1989; Ranta, 1991, 1994), approaches to natural language semantics have been developed relating to a kind of type theory developed by Per Martin-Löf (Martin-Löf, 1984; Nordström et al., 1990) and known as constructive type theory (CTT) or Martin-Löf type theory (MLTT). The initial goal of CTT was to provide an alternative foundation for mathematics based on constructivism (see the discussion on constructivism in Section 1). Similar ideas have been pursued using Girard’s system F, a version of the λ-calculus more expressive than the simply typed lambda calculus because of its ability to quantify over types (Girard, 1971; Girard et al., 1990).
From the perspective of linguistic semantics, there are three important differences between CTT and simple type theory:
Rich selection of types: CTT is a rich type theory in that it does not just provide a way of dividing a domain of mathematical objects into major ontological categories such as individuals, sets, sets of sets, and so on, as one finds in simple type theory. It also provides types which can correspond to the content of linguistic expressions. For example, there may be a type Dog, the type to which all dogs might belong, or, as suggested by Ranta (1994), Flight-by-Amundsen-over-the-North-Pole, a type to which any event of a flight by Amundsen over the North Pole belongs. The richness of modern type theories as compared with simple type theory significantly changes the potential contribution of type theory to natural language semantics.
Proof theoretic approach to types: Montague’s use of simple type theory was firmly rooted in the model theoretic tradition. Type theory was seen as a way of organizing set-theoretic objects which could serve as the extensions (or intensions) of phrases of natural language. In contrast, CTT comes from the proof theoretic tradition in logic, which concentrates on inference rules defined on the syntax of logical expressions. If we are interested in linguistically relevant types like Dog and flight-by-Amundsen-over-the-North-Pole, this might, for example, lead us to try to characterize what follows from something being of the type Dog (for example, also being of the type Mammal) or flight-by-Amundsen-over-the-North-Pole (for example, also being of the types Event and has-agent-Amundsen).
Constructivism (intuitionism): Constructive mathematics claims that it is necessary to construct a mathematical object in order to prove that it exists. Intuitionism (which is often equated with constructivism) is a kind of constructivism where the construction of mathematical objects (like sets) is founded on the intuitions of mathematicians. CTT, as its name might suggest, provides an intuitionistic approach to type theory founded on the idea that to show the existence of an object, you have to construct, or at least provide a method of constructing, the object. Montague’s use of simple type theory, on the other hand, is Platonist in that it countenances the existence of abstract objects and sets that we do not know how to construct. This contrast is related to issues concerning computation such as decidability and the related issue of whether semantics is mathematics or psychology (Partee, 1979, 2014). As will be seen, it is not a straightforward choice between intuitionism as a psychological view of semantics and Platonism as providing a view of semantics as mathematics.
There is now a considerable body of work which applies this kind of type theory to natural language semantics, including Ranta (1994); Luo (2010, 2011); Chatzikyriakidis and Luo (2014b); Boldini (2000); Piwek and Krahmer (2000); Retoré (2013); Bekki (2014); and Ranta (2015), as well as the Grammatical Framework (GF, Ranta, 2011), a computational grammar formalism based on type theory, showing that type-theoretic semantics generalizes to a multilingual setting where it can be used, for example, for defining translation interlinguas. Since the turn of the century, a third strand of type theoretic approaches has arisen which tries to combine the best of the model theoretic approach of Montague with the type theoretical techniques from CTT. This includes work on Type Theory with Records (TTR, Cooper, 2005a, 2005b, 2012; Cooper & Ginzburg, 2015) and work by Grudzinska and Zawadowski (2014).
2. Choices in Type Theory
This section discusses a number of basic choices made in the various versions of type theory that are relevant for linguistic semantics.
2.1 Basic Types
There is a fundamental difference between simple type theory, with a very limited number of basic types, and CTT, which makes use of a multitude of basic types (or at least leaves the type theory open to adding new basic types). As formulated by Church (1940) the simply typed lambda calculus employed two basic types: type ι for individuals and o for propositions. Montague (1973) uses e (for entities or individuals), corresponding to Church’s ι, and t (for truth values). Truth values serve as the extension of propositions which for Montague belong to a complex type (see Section 2.2).
The central idea in simple type theory is that you can build all the types you need from two basic types corresponding to individuals and propositions/truth-values and all possible functions which can be constructed from these two (see Section 2.3). Variants have been proposed in the semantics literature which have more than two basic types. Gallin (1975) proposed a type theory (TY2 ) that adds a third basic type, s (for world-time pairs).5
If you want to make subdivisions within these types—for example, if you want to talk about different kinds of individuals like dogs and mammals—you do this by using function types of some kind, for example, functions from individuals to propositions (Church) or truth-values (Montague). Alternatively, one can introduce sorts which correspond to subsets of the elements of types (Kohlhase, 1992, 1994). For example, one can structure the domain of individuals to partially ordered subsets and thus have sorts like MAN, HUMAN, ANIMAL, etc. Thomason (1980) also introduced a type theory with three basic types: e (entities), t (truth-values), and p (propositions). In this way, it is a kind of merge of Church and Montague. (See Section 2.2 for more discussion.)
In rich type theories like CTT, there is no restriction on the basic types one can use. In this way, the types can do the work that sorts might do in a simple type theory. If one wants, one can choose to have a basic type Dog to which any dog belongs, or one can choose to define a complex type Dog, constructed from other type theoretic elements. While this freedom also exists in TTR, in practice relatively few basic types have been used, such as Ind (the type of individuals) and RecType (the type of record types). (Types are regarded as first class objects and can themselves belong to types. A technique called stratification is used in order to avoid this giving rise to the paradoxes; see Turner, 2005.) Thus in TTR Ind does the work of Church’s ι or Montague’s e, whereas in CTT there is normally not such a type but rather (in type theories developed for linguistic applications) more contentful basic types like Dog and Boy. Section 2.2 will discuss what corresponds to Church’s o and Montague’s t.
While Church (1940) introduces the type o for propositions, he does not make any commitment to the nature of propositions. He says (p. 57), “We purposely refrain from making more definite the nature of the types o and ι, the formal theory admitting of a variety of interpretations in this regard.” For Montague (1973) the situation is different. For him propositions are constructed as functions from pairs of a possible world and a moment of time to truth-values (i.e., the characteristic function of a set of world-time pairs). The type of propositions for Montague is the complex type 〈s, t〉 (see the section on complex types, 2.3). Thus the type theory is being used here to encode a particular theory of what propositions are. While it is, of course, desirable for a semantic theory of natural language to include a theory of what propositions are, there is a well-known problem with the idea that propositions are sets of world-time pairs. It means that you cannot have two distinct logically equivalent propositions. If p and q are logically equivalent, then they are true at the same worlds and times, which according to this theory would mean that they are the same proposition. Thus it becomes puzzling to give an account of how somebody could learn about the logical equivalence of p and q or how it could be that somebody could believe p but not believe q. In the linguistic literature, this is often referred to as the problem of hyperintensionality, “hyper-” in the sense of going beyond the kind of intensionality introduced by considering propositions as sets of world-time pairs. For an account of the problem and a survey of some attempts to solve the puzzle, see Muskens (2007) and Fox and Lappin (2005, chapters 1–2). It was to address this problem that Thomason (1980) went back to Church’s idea of having a basic type for propositions to avoid the commitment to propositions as sets of possible worlds and times. This strategy has the consequence, of course, that if you have a theory of propositions, it has to reside elsewhere than in the type theory you use.
Rich type theories like CTT follow a dictum which derives from the intuitionistic logic of the 1930s: “propositions as types,” also known as the Curry-Howard Correspondence.6 For an introduction to this idea from the perspective of computer science, see Wadler (2015), and for a linguistic perspective, see Ranta (1994). The idea is related to the richness of these type systems. If you have a type (of events or situations) Flight-by-Amundsen-over-the-North-Pole then you can use this type as the proposition represented by the sentence Amundsen flew over the North Pole (ignoring issues of tense). The proposition is true just in case there is an event of the type; that is, the type is non-empty, or inhabited in Martin-Löf ’s terminology. If you have an extensional type theory—one where types are characterized by the sets of objects that belong to them, their witnesses or inhabitants; that is, you cannot have two distinct types with exactly the same witnesses—you will fall into a similar trap as Montague did. But if you have an intensional type theory—one where types are characterized independently of their witnesses; that is, you can have two distinct types with exactly the same witnesses—then you have an approach to hyperintensionality which is related to that proposed by Thomason (and indeed also to property theory as discussed by Fox & Lappin, 2005). If we use this strategy to treat natural language, then it is natural to treat types as first-class citizens in the sense that they can be arguments to predicates. Thus in order to treat a sentence like Sam believes that Amundsen flew over the North Pole we could use the type corresponding to Amundsen flew over the North Pole as the second argument of the predicate ‘believe.’ This means that a rich type theory plays a role in a linguistic theory that is not played by simple type theory. Both kinds of type theory provide a regime for organizing the universe of semantic objects according to the types. Rich type theories in addition have the possibility of contributing the types themselves to the universe of semantic objects.
Since all types either have or do not have a witness (or, in the case of the probabilistic type theory introduced in Cooper et al., 2015, have a certain probability of having a witness), they all have the potential to be used as a proposition (true if there is a witness and false otherwise). Martin-Löf introduced the term proof object for witnesses of types used as propositions. In some versions of type theory, only some of the types are identified as propositions. That is, they are witnesses of a type Prop, a type of types, of which only the types which count as propositions are witnesses. This distinction is not drawn in TTR, although in practice, since the contents of declarative sentences are always record types (see Section 2.3), in linguistic semantics using TTR it is record types that correspond to propositions expressed by natural language. Since record types also can be thought of as modeling discourse representation structures (DRSs), this is a bit like a semantic theory where DRSs play the role of propositions, whereas in standard discourse representation theory propositions would be Montague-style propositions which serve as the interpretation of a DRS.
2.3 Complex Types
2.3.1 Function Types
In simple type theory, the idea is to choose a small number of basic types and then close the set of types under function type formation. Thus Church’s set of types could be given by the following recursive definition (using more modern notation for function types than Church’s notation):
1. It is true that ι and o are types.
2. If α and β are types, then (α → β) is a type.
3. Nothing else is a type.
Here the type (α → β) is the type of (total) functions from objects of type α to objects of type β. An example of a type in this system would be (ι → (ι → o)), the type corresponding to two-place predicates.
Montague’s set of types is given by the following definition (again using the arrow-notation for function types rather than Montague’s original):
1. It is true that e and t are types.
2. If α and β are types, then (α → β) is a type.
3. If α is a type, then (s → α) is a type.
4. Nothing else is a type.
Here the type (s → α) is the type of (total) functions from world-time pairs to objects of type α. The status of s in Montague’s system is a little strange. He presumably did not introduce it as a basic type (as Gallin did, as discussed in Section 2.1) because he did not envision expressions denoting world-time pairs and did not want functions with world-time pairs in their range.
In characterizing a type system, there are two important things to be defined: (1) what types there are (as in the recursive definitions above) and (2) what kind of objects are witnesses for the types. In the case of basic types, there may not be much to say about the nature of the witnesses; it may even be assumed, as Church did in the quotation in Section 2.2, that the type theory will be used in different domains and that the witnesses for the basic types will be given by something like a model or an interpretation for the type theory. In the case of the complex types, however, an exact characterization of the witnesses for the types is normally required, given an assumption about what witnesses are associated with the basic types. The two tasks (1) and (2) become particularly important in rich type theories like those within the tradition of Martin-Löf or Girard’s system F, or systems like TTR, where in addition to function types there are many other kinds of complex types. The plethora of available complex types is another source of richness in these type theories. Here a few key examples of the additional complex types that have been introduced will be presented.
2.3.2 Dependent Types
A dependent type is something that requires one or more arguments to yield a type (as opposed to an object belonging to a type). It can be thought of as a function which takes object(s) of one or more types and yields a type as a result. For example, in a classical Montague semantics a verb-phrase corresponds to a function from individuals (objects of type e) to a truth-value (an object of type t). In TTR a verb-phrase instead returns a type (of situation in which the argument to the function has a certain property, that is, not a truth-value or a situation, but a type). Thus verb-phrase interpretations in TTR are dependent types. In TTR dependent types are encoded explicitly as functions. In the literature on CTT, they are often referred to as a family of types that is indexed by objects of one or more types. In CTT a common use of dependent types of this kind is as n-ary predicates, which take n arguments to form a type. For example, ‘see’ would be a dependent type which requires two arguments to form a type such as ‘see(a,b)’. If we follow Ranta’s suggestion, discussed in Section 2.2, then a witness for this type would be an event, intuitively an event that proves that a sees b.
2.3.3 Dependent Function Types (Π-Types)
The term “dependent type” is also (somewhat confusingly perhaps) used in the type theory literature to refer to complex types which are constructed using a dependent type in the sense of Section 2.3.2. Thus, for example, there are dependent function types in which the type of the range of the function depends on which particular object you choose in the domain of the function. Suppose, for example, we have a type Girl whose witnesses are girls and we want to define the type of mappings from girls, a, to an event where a runs. One kind of notation for this type corresponds to the notation for simple function types: ((x:Girl) → run(x)). These dependent function types are also called Π-types, using the notation Πx:Girl. run(x). A Π type degenerates into a normal function type in the non-dependent case, when the second type does not depend on the value of the first. Somewhat in more detail and closer to the formulations one finds in CTT, when A is a type and P is a predicate over A, Πx:A.P (x) is the dependent function type that, in the embedded logic, stands for the universally quantified proposition ∀x:A.P (x).
According to the Curry-Howard Correspondence, such Π-types correspond to universally quantified propositions like every girl runs. The intuition is this: if the type corresponds to a true proposition, then there must be a witness for the type. Such a witness will be a (total) function from girls, a, to a witness for the type ‘run(a),’ an event where a runs. If there is such a function then every girl runs, and if there is not such a function then it is not true that every girl runs.
Π types can also be useful for characterizing the compositional mechanisms of natural language semantics. Suppose that Prop is the type of types which can serve as propositions. We could characterize these as the types which can be the content of a declarative sentence. Thus Πx:Girl. run(x) would be of this type (in symbols: Πx:Girl. run(x) : Prop). What then would be the type of the content of an intransitive verb like runs? If we have a type of individuals like Church’s ι or Montague’s e, a type which we will represent as Ind, following the convention in TTR, we could say that the type of intransitive verb contents is (Ind→Prop), that is any intransitive verb content will be a function from individuals to propositions. However, in most if not all type theories following Martin-Löf (except for TTR), there is no type Ind as such, but rather a collection of types such as Girl, Boy, Dog which can serve as the content of common noun phrases. In case one wants to have the predicate over the whole universe of common nouns, a Π type can be used to do that. This article follows Luo (2012) and Chatzikyriakidis and Luo (2013) in calling this type cn7—thus Girl : cn and so on—and in general represents the content of a phrase α as ⟦α⟧. Thus ⟦girl⟧ = Girl and ⟦girl⟧ :cn. Given such a treatment of common nouns, what should the type of ⟦runs⟧ be? One option is to say that it is a function from types A:cn to a function from objects of type A to propositions, that is ⟦runs⟧ : ΠA:cn. (A → Prop).8 We can similarly treat the types for quantifiers and VP-adverb contents respectively as:9
According to the Curry-Howard Correspondence, existential quantification corresponds to products or disjoint unions. The constructor/operator Σ is a generalization of the Cartesian product of two sets that allows the second set to depend on values of the first. If A is a type and B is an A-indexed family of types (that is a function from objects of type A to types), then Σx:A. B(x), is a type, whose witnesses are pairs (a, b) such that a is of type A and b is of type B(a). So, for example, the type which serves as the content for a girl runs could be Σx:Girl. run(x). A witness for this type would be a pair consisting of a witness, a, for Girl and a witness for run(a), that is, a proof that a runs, which in turn can be seen as an event where a runs.10
As the witnesses of Σ-types are ordered pairs, they can be used not only as the contents of existentially quantified sentences but also in other cases where such an ordered pair might seem appropriate. An example which goes back at least as far as Ranta (1994) is the interpretation of modified common nouns. (For recent developments of the idea see Luo, 2011; Chatzikyriakidis & Luo, 2013.) A modified common noun like tall girl can be interpreted as the following Σ type (where the type of ⟦tall⟧ is ΠA:cn. (A → Prop), the same type as discussed for ⟦runs⟧ in Section 2.3.3):
Witnesses of this type are pairs consisting of a girl and a proof that the girl is tall for a girl. For intersective adjectives, we can think of the content of the adjective as being a constant function which always returns the same function no matter what type we give it as argument. An example might be Swedish.11 The modification process is iterable. Thus ⟦tall Swedish girl⟧ could be the following: first we define Swedish girl:
Then, we have the following:
A witness of this type would be again a pair, but this time the first projection is a Σ type. More concretely, the pair consists of a proof that y is a Swedish girl and a proof that y is tall (given that it is a Swedish girl (you might take this to mean that y is tall for a Swedish girl)).12 For further discussion and the use of subtyping for adjectival modification see Chatzikyriakidis and Luo (2013).
Π-types and Σ-types can be used together in order to obtain an analysis of donkey anaphora (Sundholm, 1989; Ranta, 1994), which is equivalent to the original analysis in Discourse Representation Theory (Kamp, 1981). In order to do this projection operations π1 and π2 on order pairs will be needed so that and . Following Ranta’s (1994) presentation, Σ-types are used both for existential quantification and for common nouns as discussed in Section 2.3.4. Thus the content of man who owns a donkey is the type:
Note that a witness, z, for this type will be an ordered pair (m, (d, s)), where m : Man, d : Donkey and s : own(m,d). Thus and . The content of every man who owns a donkey beats it is now the following Π-type:13
2.3.5 Record Types
TTR uses record types in place of Σ-types. We will make the relationship between the two clear below, after we have explained what a record type is. There is a large body of work on adding records to CTT: Tasistro (1997), Betarte (1998), and Betarte and Tasistro (1998), among many others. Records and record types are feature structure-like objects which have been used in TTR for modeling discourse representation structures (Kamp & Reyle, 1993), frames (Fillmore, 1982), situation types (Barwise & Perry, 1983), and dialogue gameboards (Ginzburg, 2012). (See Cooper, 2005b, 2015; Ginzburg, 2012 for discussion.) In this discussion the graphical notation for record types from TTR will be used, which displays their similarity to feature structures.
Consider first a record type corresponding to the Σ-type Σx:Girl. run(x):
A record type consists of a set of fields14 each of which is a pair consisting of a label (such as ‘x’ and ‘e’ in this example) and a type, which may be a dependent type like ‘run(x)’ in this example. The label and the type are displayed with a ‘:’ between them. A record will be of this type just in case it has two fields with the same labels as the type and objects of the types in the respective fields. Thus an example of a record of this type would be:
where a:Girl and s:run(a), that is, s is a proof that a runs, which can be understood as saying that s is a situation in which a runs. The fields in records are displayed with ‘=’ between the label and the object. An important aspect of the theory of record types is that a record may have more fields than required by the record type and still be a witness for the type. Thus, for example, the following record would also be of this type:
where a and s are as before and b could be of any type. Section 2.4 will return to this.
The correspondence between the Σ-type and the record type is that they both have witnesses which are a pair consisting of a girl and a proof that she runs. In the case of the Σ-type the pair is ordered, and in the case of the record type the pair is indexed by the labels ‘x’ and ‘e.’ In place of the projections π1 and π2 which were associated with ordered pairs, there are projections based on labels for records. If r is a record containing the label ℓ, then r.ℓ is the object occurring in the field with label ℓ. Thus, if r is either of the records above, r.x is a and r.e is s. This system of labeling, together with the fact that there can be any finite number of fields in both records and record types, yields the possibility of using objects with flatter structures than are provided by Σ-types. Thus a type corresponding to a man owns a donkey could be:
Using this machinery there can be the following function type, using the notation for dependent function types introduced at the beginning of Section 2.3.3. It corresponds to the Π-type for donkey anaphora presented in Section 2.3.4:
In simple type theory, it is standardly assumed that there will be no overlap between the sets of objects of two distinct types. That is, no object may be of more than one type. This assumption is normally carried over to rich type theories like CTT. However, when the kind of types needed for linguistic semantics are included, this constraint seems untenable. For example, if there are types corresponding to common nouns it seems clear intuitively that a witness for Man is also a witness for Person, and in turn a witness for Person is a witness for AnimateBeing, giving rise to the kind of type hierarchies that are familiar to linguists from feature based systems and also work on ontologies in artificial intelligence. If intuitively any witness for T1 must also be a witness for T2 then T1 is said to be a subtype of T2 (in symbols, T1 ≤ T2 or ). Of course, an object can also intuitively belong to two types which do not stand in the subtype relation. For example, a single person can be both of type Man and type Doctor, although neither is a subtype of the other.
This intuitive notion of subtyping, which is what is found in systems such as feature structure logic, is known as subsumptive subtyping. In a proof-theoretical approach, subsumptive subtyping gives rise to the following inference rule:
Subsumptive subyping is reflexive and transitive and basically allows one to use a term of type A in a context where a term of type B is required instead just in case A ≤ B.
An alternative to subsumptive subtyping is coercive subtyping (Luo, 2011, 2010, 2012). This allows one to maintain that each object has exactly one type and in particular allows for two types in the subtype relation to have different ways of constructing canonical objects.15 In coercive subtyping, when A is a subtype of B this is associated with a designated coercion function, c, from type A to type B. Thus, technically, an object a of type A need not be identical with an object b in type B, but the function c such that c(a) = b shows that b is a “counterpart” of a in type B. In this way, a can be a canonical object of type A and c(a) a canonical object of type B even though the definitions of canonicity for the two types are distinct. Similar mechanisms of subtyping are also found in Retoré (2012), Moot and Retoré (2012), Retoré (2013), Asher and Luo (2012), Asher (2011), Callaghan and Luo (2001), and Barras et al. (2000).
Record types (as in, e.g., Tasistro, 1997; Betarte, 1998; Betarte & Tasistro, 1998; Coquand et al., 2004) can be used to introduce subsumptive subtyping in a restrictive and controlled way. As pointed out in Section 2.4, a record r is judged to be of a record type T just in case there are fields in r with the same labels as the fields in T and the objects in those fields in r are of the types in the corresponding fields in T. There may be more fields in r with labels that are not mentioned in T. This means, for example, that any record of type
will also be of type
and also of type
That is, the first type is a subtype of the second, which in turn is a subtype of the third. If one construes these types as types of situations, as is done in TTR, this subtyping relationship can be glossed as saying that any situation in which a man owns a donkey is a situation in which there are a man and a donkey, which in turn is a situation in which there is a man. In TTR much of the work on subtyping is done by record types, although non-record types are also allowed to enter into the subtype relation. (For example, Man and Person might be treated as basic types, with the first a subtype of the second.) There is no requirement that objects belong to exactly one type. Ending the discussion on subtyping, it is important to add that systems of subtyping exist for simple type theory as well as other systems, like for example system F. The interested reader is encouraged to check Cardelli (1988), Cardelli et al. (1991), and Retoré (2013) for more details.
There is a notion of context in CTT which has been used to analyze various aspects of linguistic semantics such as text and discourse representation (Ranta, 1994), linguistic context (Boldini, 2000; Chatzikyriakidis & Luo, 2014c; Boldini, 2001; Piwek & Krahmer, 2000), and intensionality (Ranta, 1991, 1994; Chatzikyriakidis & Luo, 2013). In the case of intensionality, type theoretic contexts are used in place of possible worlds. They can be thought of as incomplete possible worlds or situations (Ranta, 1994; Boldini, 2000; Chatzikyriakidis & Luo, 2014c).
A context in CTT can be thought of in various ways; one way is as a list of variable declarations, where variables stand for proofs of the corresponding assumptions (Boldini, 2000), and another is as a sequence of type judgements (Ranta, 1994). The standard way to write a context is as an expression of the form:
Here we have a series of types and a series of variables which are assumed to be witnesses (which are often referred to as “proofs”) of these types. Each type may depend on the witnesses chosen for any of the previous types in the sequence which is represented in this notation by A2(x1), meaning that A2 may depend on the value of x1, and , meaning that An may depend on the values of any of the previous variables.
One way to gloss a context in CTT is as a “let”-statement: “let x1 be an object of type A1, x2 an object of type A2(x1),…, and xn be an object of type .”
In CTT such contexts can be used to express an assumption on a rule of inference. Ranta (1994) pointed out that they can be used to represent the content of previous discourse. Consider the discourse:
A farmer owns a donkey. He loves it.
At the end of the first sentence we can consider ourselves to be in the context:
In processing the second sentence, we can extend this context, picking up on the variables already declared, using the projection operators:
As Ranta notes, there is an obvious correspondence between this use of contexts and the notion of discourse representation structure as used in Kamp and Reyle (1993).
While such contexts are an important notion in type theory, they are not, in the standard formulation, objects which are themselves assigned a type. Rather they are assumptions expressed in a proof theoretic language. This means that, for example, one cannot use contexts as standardly characterized in type theory to talk about a function from a particular kind of context to a proposition (which would be construed as a type, as discussed in Section 2.2). Such a function could be used, for example, to model Kaplan’s notion of character (Kaplan, 1978). Introducing record types (as discussed in Section 2.3.5) allows contexts to be treated as objects which have a type. Thus the type of a context in which a farmer owns a donkey could be identical with the type that models the proposition for the sentence a farmer owns a donkey:
This corresponds in an obvious way to the context above (and also the corresponding discourse representation structure). Now, for example, a function can be defined which takes contexts in which a farmer owns a donkey to a type of context in which that farmer loves that donkey:
The correspondence between records and contexts is exploited a great deal in the TTR literature (see, e.g., Ginzburg, 2012; Cooper, in prep).
Ranta also proposes that contexts can be used to represent the belief state of an agent. An agent, a, believes a proposition p (that is, a type), just in case p is in a’s belief context (denoted by Γa). Thus suppose that Kim’s belief state is characterized by:
That is, Kim believes that some donkey kicks. Then Kim’s belief state, Γk could be:
Ranta further uses a generalized belief operator BΓa which is obtained by binding the variables of Γa with Π. The idea is that if A is a proposition in context Γk, then
is also a proposition. This proposition will be true just in case A follows from any instantiation of Kim’s belief context Γ:k.
Cooper (in prep) has a similar proposal using record types. Recreating this example with the kind of example used there would have Kim’s belief state as the record type corresponding to the context above:
The object of Kim’s belief would would be the record type corresponding to the Σ-type in Ranta’s analysis (see Section 2.3.5):
The basic idea for checking whether Kim stands in the belief-relation to some type T is to check whether the type representing her belief state is a subtype of T′, where T′ involves a relabeling of the paths in T. The relabeling here would be x⤳ y.x, e⤳ y.e, which would obtain in this case the exact same type as was used to represent Kim’s belief state.16 This involves checking that any record of the belief state type would also be of type T, appropriately relabeled. Thus both these analyses involve a universal quantification, either in terms of an explicit Π-type or implicitly in terms of the definition of subsumptive subtyping.
A slightly more complicated example of this kind of analysis is the case of an intensional adjective like alleged in the analysis proposed by Chatzikyriakidis and Luo (2013). Let AN : CN be the interpretation of a common noun N. Then,
where B(a, A) signifies that a believes A according to Ranta’s analysis. Thus John is an alleged criminal means that John is of the type
Thus an alleged criminal is who is believed by somebody to be a criminal.17
A good introduction to the use of simple type theory in Montague Semantics is Dowty et al. (1981). See also Girard et al. (1990b) and Krivine (1993) for discussions of simple type theory. The classic introduction to the use of CTT for natural language semantics is Ranta (1994). Luo (2010) is a good place to start reading about the use of coercive subtyping for semantics. The most recent short introduction to TTR is Cooper and Ginzburg (2015). A detailed introduction is in preparation, and drafts are available for download (Cooper, in prep). For an introduction to the application of system F to natural language semantics, see Retoré (2013). For Asher’s TCL, see Asher (2011). For a discussion of introducing probabilities within TTR, see Cooper et al. (2015). For the use of proof assistants for NL semantics, see Ranta (2004); Chatzikyriakidis and Luo (2014a); and Bernardy and Chatzikyriakidis (2017). For type theories within the tradition of Martin Löf, the standard reference is Martin-Löf (1984), but see also Luo (1994). For a recent collection of papers on the application of various kinds of type theory to natural language, see Chatzikyriakidis and Luo (2017b). There is also a forthcoming special issue of the Journal of Language Modelling devoted to the application of type theories to lexical semantics (Retoré & Cooper, 2017).
Asher, N. (2011). Lexical meaning in context: A web of words. Cambridge, MA: Cambridge University Press.Find this resource:
Asher, N., & Luo, Z. (2012). Formalisation of coercions in lexical semantics. In E. Chemla, V. Homer, & G. Winterstein (Eds.), Proceedings of Sinn und Bedeutung 17 (pp. 63–80). Paris: ENS.Find this resource:
Barras, B., et al. (2000). The Coq proof assistant reference manual (Version 6.3.1). Paris: INRIA-Rocquencourt.Find this resource:
Barwise, J., & Perry, J. (1983). Situations and attitudes. Cambridge, MA: MIT Press.Find this resource:
Bekki, D. (2014). Representing anaphora with dependent types. In Logical aspects of computational linguistics: 7th international conference, LACL 2012, Nantes, France, July 2–4, 2012, Proceedings (pp. 14–29). Heidelberg: Springer.Find this resource:
Bernardy, J.-P., & Chatzikyriakidis, S. (2017). A type-theoretical system for the FraCaS test suite: Grammatical framework meets Coq. University of Gothenburg.
Betarte, G. (1998). Dependent record types and algebraic structures in type theory (Unpublished doctoral dissertation). University of Gothenburg and Chalmers University of Technology.Find this resource:
Betarte, G., & Tasistro, A. (1998). Extension of Martin-Löf ’s type theory with record types and subtyping. In G. Sambin & J. Smith (Eds.), Twenty-five years of constructive type theory (pp. 21–40). Oxford Logic Guides 36. Oxford: Oxford University Press.Find this resource:
Boldini, P. (2000). Formalizing context in intuitionistic type theory. Fundamenta Informaticae, 42(2), 1–23.Find this resource:
Boldini, P. (2001). The reference of mass terms from a type-theoretical point of view. Paper presented at the 4th International Workshop on Computational Semantics.Find this resource:
Callaghan, P., & Luo, Z. (2001). Plastic: An implementation of typed LF with coercive subtyping and universes. Journal of Automated Reasoning, 27(1), 3–27.Find this resource:
Cardelli, L. (1988). Structural subtyping and the notion of power type. In J. Ferrante (Ed.), Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on principles of programming languages (pp. 70–79). New York: ACM.Find this resource:
Cardelli, L., Martini, S., Mitchell, J. C., & Scedrov, A. (1991). An extension of system F with subtyping. In T. Ito & A. R. Meyer (Eds.), International symposium on theoretical aspects of computer software (pp. 750–770). Berlin: Springer.Find this resource:
Chatzikyriakidis, S., & Luo, Z. (2012). An account of natural language coordination in type theory with coercive subtyping. In Y. Parmentier & D. Duchier (Eds.), 7th international workshop on constraint solving and language processing (CSLP12): Selected and revised papers (pp. 31–51). Berlin: Springer.Find this resource:
Chatzikyriakidis, S., & Luo, Z. (2013). Adjectives in a modern type-theoretical setting. In G. Morrill & J. Nederhof (Eds.), Proceedings of formal grammar 2013 (pp. 159–174). Heidelberg: Springer.Find this resource:
Chatzikyriakidis, S., & Luo, Z. (2014a). Natural language inference in Coq. Journal of Logic, Language and Information, 23(4), 441–480.Find this resource:
Chatzikyriakidis, S., & Luo, Z. (2014b). Natural language reasoning using proof assistant technology: Rich typing and beyond. In R. Cooper, S. Dobnik, S. Lappin, & S. Larsson (Eds.), Proceedings of the EACL 2014 workshop on type theory and natural language semantics (TTNLS) (pp. 37–45). Gothenburg, Sweden: Association for Computational Linguistics.Find this resource:
Chatzikyriakidis, S., & Luo, Z. (2014c). Using signatures in type theory to represent situations. In T. Murata, K. Mineshima, & D. Bekki (Eds.), New frontiers in artificial intelligence (pp. 172–183). Berlin: Springer.Find this resource:
Chatzikyriakidis, S., & Luo, Z. (2017a). Adjectival and adverbial modification: The view from modern type theories. Journal of Logic, Language and Information, 26(1), 45–88.Find this resource:
Chatzikyriakidis, S., & Luo, Z. (Eds.). (2017b). Modern perspectives in type-theoretical semantics. Berlin: Springer.Find this resource:
Chatzikyriakidis, S., & Luo, Z. (2017c). On the interpretation of common nouns: Types versus predicates. In S. Chatzikyriakidis & Z. Luo (Eds.), Modern perspectives in type-theoretical semantics (pp. 43–70). Berlin: Springer.Find this resource:
Chierchia, G., & Turner, R. (1988). Semantics and property theory. Linguistics and Philosophy, 11(3), 261–302.Find this resource:
Church, A. (1940). A formulation of the simple theory of types. Journal of Symbolic Logic, 5(1), 56–68.Find this resource:
Cooper, R. (1996). The role of situations in generalized quantifiers. In S. Lappin (Ed.), The handbook of contemporary semantic theory (pp. 65–86). Oxford: Blackwell.Find this resource:
Cooper, R. (2005a). Austinian truth, attitudes and type theory. Research on Language and Computation, 3, 333–362.Find this resource:
Cooper, R. (2005b). Records and record types in semantic theory. Journal of Logic and Computation, 15(2), 99–112.Find this resource:
Cooper, R. (2012). Type theory and semantics in flux. In R. Kempson, N. Asher, & T. Fernando (Eds.), Handbook of the philosophy of science. Vol. 14: Philosophy of linguistics (pp. 271–323). Amsterdam: Elsevier.Find this resource:
Cooper, R. (2015). Frames as records. In A. Foret, G. Morrill, R. Muskens, & R. Osswald (Eds.), Preproceedings of the 20th Conference on Formal Grammar (pp. 1–17). Berlin: Springer.Find this resource:
Cooper, R. (forthcoming). Type theory and language: From perception to linguistic communication. Draft of book chapters available from https://sites.google.com/site/typetheorywithrecords/drafts.
Cooper, R., Dobnik, S., Lappin, S., & Larsson, S. (2015). Probabilistic type theory and natural language semantics. Linguistic Issues in Language Technology, 10(4), 1–45.Find this resource:
Cooper, R., & Ginzburg, J. (2015). Type theory with records for natural language semantics. In S. Lappin & C. Fox (Eds.), The handbook of contemporary semantic theory (2d ed., pp. 375–407). Hoboken: John Wiley & Sons.Find this resource:
Coquand, T. (1986). An analysis of Girard’s paradox. In Proceedings of the first IEEE symposium on logic in computer science (pp. 227–236). Washington, DC: IEEE Computer Society Press.Find this resource:
Coquand, T. (2011, June). Introduction to dependent type theory. Presentation slides of a talk at CIRM.Find this resource:
Coquand, T. (2015, summer). Type theory. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy.Find this resource:
Coquand, T., Pollack, R., & Takeyama, M. (2004). A logical framework with dependently typed records. Fundamenta Informaticae, 20, 1–22.Find this resource:
Dowty, D., Wall, R., & Peters, P. (1981). Introduction to Montague Semantics. Dordrecht: Kluwer Academic.Find this resource:
Fillmore, C. J. (1982). Frame semantics. In Linguistic Society of Korea (Ed.), Linguistics in the morning calm (pp. 111–137). Seoul: Hanshin.Find this resource:
Fox, C., & Lappin, S. (2005). Foundations of intensional semantics. Malden, MA: Blackwell.Find this resource:
Gallin, D. (1975). Intensional and higher-order modal logic. Saint Louis: Elsevier.Find this resource:
Ginzburg, J. (2012). The interactive stance: Meaning for conversation. Oxford: Oxford University Press.Find this resource:
Girard, J.-Y. (1971). Une extension de l’interpretation fonctionelle de Gödel à l’analyse et son application à l’élimination des coupures dans et la théorie des types. In Proceedings of the Second Scandinavian Logic Symposium (pp. 65–92). Amsterdam: North-Holland.Find this resource:
Girard, J.-Y. (1972). Interprétation fonctionelle et élimination des coupures de l’arithmétique d’ordre supérieur (Unpublished doctoral dissertation). Université Paris VII.Find this resource:
Girard, J.-Y., Lafont, Y., & Taylor, P. (1990). Proofs and types. Cambridge, MA: Cambridge University Press.Find this resource:
Grudzinska, J., & Zawadowski, M. (2014). System with generalized quantifiers on dependent types for anaphora. In R. Cooper, S. Dobnik, S. Lappin, & S. Larsson (Eds.), Proceedings of the EACL 2014 workshop on type theory and natural language semantics (TTNLS) (pp. 10–18). Gothenburg, Sweden: Association for Computational Linguistics.Find this resource:
Kamp, H. (1981). A theory of truth and discourse representation. In J. Groenendijk, T. Janssen, & M. Stokhof (Eds.), Formal methods in the study of language (pp. 277–322). Mathematical Centre Tracts 135. Amsterdam: Mathematisch Centrum.Find this resource:
Kamp, H., & Reyle, U. (1993). From discourse to logic. Dordrecht: Kluwer.Find this resource:
Kaplan, D. (1978). On the logic of demonstratives. Journal of Philosophical Logic, 8, 81–98.Find this resource:
Kohlhase, M. (1992). Unification in order-sorted type theory. In A. Voronkov (Ed.), Logic programming and automated reasoning (pp. 421–432). Berlin: Springer.Find this resource:
Kohlhase, M. (1994). A mechanization of sorted higher-order logic based on the resolution principle (Unpublished doctoral dissertation). Universität des Saarlandes, Germany.Find this resource:
Krivine, J. L. (1993). Lambda-calculus, types and models. Chichester: Ellis Horwood.Find this resource:
Lappin, S., & Fox, C. (Eds.). (2015). The handbook of contemporary semantic theory (2d ed.). Hoboken: John Wiley & Sons..Find this resource:
Luo, Z. (1994). Computation and reasoning: A type theory for computer science. Oxford: Oxford University Press.Find this resource:
Luo, Z. (2010). Type-theoretical semantics with coercive subtyping. In N. Li & D. Lutz (Eds.), Proceedings of semantics and linguistic theory 20 (pp. 38–56). Ithaca, NY: Cornell University.Find this resource:
Luo, Z. (2011). Contextual analysis of word meanings in type-theoretical semantics. In S. Pogodalla & J. P. Prost (Eds), Logical aspects of computational linguistics (pp. 159–174). Heidelberg: Springer.Find this resource:
Luo, Z. (2012). Common nouns as types. In D. Béchet & A. Dikovsky(Eds.), Logical aspects of computational linguistics: 7th international conference, LACL 2012, Nantes, France, July 2–4, 2012, Proceedings (pp. 173–185). Heidelberg: Springer.Find this resource:
Martin-Löf, P. (1971). An intuitionistic theory of types. Unpublished manuscript.Find this resource:
Martin-Löf, P. (1984). Intuitionistic type theory. Naples: Bibliopolis.Find this resource:
Montague, R. (1973). The proper treatment of quantification in ordinary English. In J. Hintikka, J. Moravcsik, & P. Suppes (Eds.), Approaches to natural languages (pp. 247–270). Dordrecht: Reidel.Find this resource:
Montague, R. (1974). Formal philosophy. New Haven: Yale University Press.Find this resource:
Moot, R., & Retoré, C. (2012). The logic of categorial grammars. Berlin: Springer.Find this resource:
Muskens, R. (1996). Combining Montague Semantics and discourse representation. Linguistics and Philosophy, 19(2), 143–186.Find this resource:
Muskens, R. (2007). Intensional models for the theory of types. Journal of Symbolic Logic, 72(1), 98–118.Find this resource:
Nordström, B., Petersson, P., & Smith, J. (1990). Programming in Martin-Löf’s type theory: An introduction. Oxford: Oxford University Press.Find this resource:
Partee, B. H. (1979). Semantics—mathematics or psychology? In R. Bäuerle, U. Egli, & A. v. Stechow (Eds.), Semantics from different points of view (pp. 1–14). Berlin: Springer.Find this resource:
Partee, B. H. (2014). The history of formal semantics: Changing notions of linguistic competence. Paper presented at the 2014 Annual Joshua & Verona Whatmough Lecture, Harvard.Find this resource:
Piwek, P., & Krahmer, E. (2000). Presuppositions in context: Constructing bridges. In P. Bonzon, M. Cavalcanti, & R. T. Nossum (Eds.), Formal aspects of context (pp. 85–106). Berlin: Springer.Find this resource:
Ranta, A. (1991). Constructing possible worlds. Theoria, 57(1–2), 77–99.Find this resource:
Ranta, A. (1994). Type-theoretical grammar. Oxford: Oxford University Press.Find this resource:
Ranta, A. (2004). Dialogue systems as proof editors. Journal of Logic, Language and Information, 13(2), 225–240.Find this resource:
Ranta, A. (2011). Grammatical framework: Programming with multilingual grammars. Stanford: CSLI Publications.Find this resource:
Ranta, A. (2015). Constructive type theory. In S. Lappin & C. Fox (Eds.), The handbook of contemporary semantic theory (2d ed., pp. 345–374). Hoboken: John Wiley & Sons.Find this resource:
Retoré, C. (2012). Variable types for meaning assembly: A logical syntax for generic noun phrases introduced by “most.” Recherches linguistiques de Vincennes, 41, 83–102.Find this resource:
Retoré, C. (2013). The Montagovian generative lexicon Lambda Tyn: A type theoretical framework for natural language semantics. In R. Matthes & A. Schubert (Eds.), Proceedings of 19th international conference on types for proofs and programs (TYPES2013) (pp. 202–229). Wadern: Leibniz-Zentrum für Informatik.Find this resource:
Retoré, C., & Cooper, R. (Eds.). (2017). Type theory and lexical semantics [Special issue]. Journal of Language Modelling, 5(2).Find this resource:
Russell, B. (1992 ). The principles of mathematics. London: Routledge.Find this resource:
Sundholm, G. (1989). Constructive generalized quantifiers. Synthese, 79(1), 1–12.Find this resource:
Tanaka, R., Nakano, Y., & Bekki, D. (2013). Constructive generalized quantifiers revisited. In Y. Motomura, A. Butler, & D. Bekki (Eds.), New frontiers in artificial intelligence (pp. 115–124). Berlin: Springer.Find this resource:
Tasistro, A. (1997). Substitution, record types and subtyping in type theory, with applications to the theory of programming (Unpublished doctoral dissertation). University of Gothenburg and Chalmers University of Technology.Find this resource:
Thomason, R. H. (1980). A model theory for propositional attitudes. Linguistics and Philosophy, 4, 47–70.Find this resource:
Turner, R. (2005). Semantics and stratification. Journal of Logic and Computation, 15(2), 145–158.Find this resource:
Wadler, P. (2015). Propositions as types. Communications of the ACM, 58(12), 75–84.Find this resource:
Whitehead, A., & Russell, B. (1925). Principia mathematica (2d ed.). Cambridge, MA: Cambridge University Press.Find this resource:
(2.) A similar paradox has been shown by Girard (1972) to be operative for the original formulation of constructive type theory by Martin-Löf (1971). The interested reader is directed to Coquand (1986) for an in-depth analysis of the issue. It is worth mentioning that the idea of a stratified universe, used today to avoid Girard’s paradox, is already present in Russell’s work.
(4.) For a detailed but still short introduction to the history of type theory, the interested reader is directed to Coquand (2011). We do not give a detailed exposition of the history of type theory here.
(5.) Actually, Gallin’s proposal does more than that. It reformulates Montague’s IL in a simpler way via using classical higher-order logic and Henkin models. There are good reasons to use Gallin’s TY2 system instead of Montague’s IL. The most important one is simplicity: Montague’s IL is an extension of Church’s logic, which, however, is more complicated and also “destroys” some of the fundamental mathematical properties of Church’s logic (e.g., the Church-Rosser property; see Muskens, 1996). Gallin’s proposal does not have these unwanted side-effects and, most importantly, can express everything that can be expressed in IL.
(6.) Or the Curry-Howard Isomorphism.
(7.) In this kind of type theory, such a type of types is often called a universe; see Luo (2012) and Chatzikyriakidis and Luo (2013). For a recent detailed defense of the common nouns as types view, see Chatzikyriakidis and Luo (2017c).
(9.) A similar use is found in the work of Retoré and colleagues: Retoré (2013); Richard and Retoré (2012), though within Girard’s system F Girard (1971) and Girard et al. (1990). For constructive approaches to generalized quantifiers see Sundholm (1989) and Tanaka et al.(2013).
(10.) When A and B are types, we can also write Σ(A, B) to represent the product type whose witness are pairs (a, b), where a:A and b:B.
(11.) Although [[Swedish]] for Girl might return a function which yields true propositions for individuals of Swedish nationality, whereas for Academic it might return a function which yields true propositions for individuals working at a Swedish university.
(12.) It is important to note here that this treatment does not have anything to say on the different and context-dependent standards of measurement for every type. As such, nothing in this treatment can guarantee that the standard contextual degree parameter for type girl will be different than the standard contextual degree parameter for another type, say boy. What this treatment achieves is restricting the domain the adjective applies depending on the case. In this way, unwarranted inferences like skilful man ⇒ skilful surgeon are blocked. A welcome consequence of this is that this basic inferential property is handled via typing and not via a meaning postulate. Furthermore, nothing in this account precludes incorporating grades. One way to do that is presented in Chatzikyriakidis and Luo (2017a) using indexed types. The interested reader is directed there for more details.
(14.) In TTR a record type is an unordered set of fields. In other approaches, the fields are ordered.
(15.) For example, 11–4:Nat and 4+3:Nat compute to the same canonical object 7:Nat. The notion of canonical object is important in a proof theoretic approach where there is no model which shows that different terms have the same referent.
(16.) For detailed discussion of relabeling, see Cooper (in prep).
(17.) This is the analysis presented in Chatzikyriakidis and Luo (2012). Perhaps a better analysis of this would not involve belief contexts but contexts containing a sequence of allegations. A related extension of the type theoretic contexts is that of signatures. Signatures are a more elaborate context mechanism where subtyping entries as well as manifest entries (specifying particular objects rather than variables) can be introduced (Chatzikyriakidis & Luo, 2014c). Manifest entries roughly correspond to manifest fields in record types (Coquand et al., 2004; Cooper & Ginzburg, 2015). For more information on the exact differences of the notion of context and signature, the interested reader is directed to Chatzikyriakidis and Luo (2014c).