# Cyclicity in Syntax

## Summary and Keywords

Cyclicity in syntax constitutes a property of derivations in which syntactic operations apply bottom-up in the production of ever larger constituents. The formulation of a principle of grammar that guarantees cyclicity depends on whether structure is built top-down with phrase structure rules or bottom-up with a transformation Merge. Considerations of minimal and efficient computation motivate the latter, as well as the formulation of the cyclic principle as a No Tampering Condition on structure-building operations (Section 3.3) without any reference to special cyclic domains in which operations apply (as in the formulation of the Strict Cycle Condition (Section 2) and its predecessors (Section 1)) or any reference to extending a phrase marker (the Extension Condition (Section 3)). Ultimately, the empirical effects of a No Tampering Condition on structure building, which conform to strict cyclicity, follow from the formulation of the Merge operation as strictly binary. This leaves as open questions whether displacement (movement) must involve covert intermediate steps (successive cyclic movement) and whether derivations of the two separate interface representations (Phonetic Form and Logical Form) occur in parallel as a single cycle.

Keywords: cycle, Strict Cycle Condition, Extension Condition, No Tampering Condition, Merge, successive cyclicity, Spell-Out, single cycle

## 0. The Cycle in Syntax

Any theory of human language worthy of serious consideration must account for how a lexicon in the mind of a speaker maps onto an unbounded set of structured expressions that have an interpretation paired with a pronunciation or some other form of externalization. This mapping function can be modeled as a computational system for human language, a key component in a faculty of language that is universal across the species. A fundamental question in linguistic theory concerns the nature of computational operations: what they are, and what general principles, if any, constrain their operation. Although the formulation of grammatical operations has undergone radical changes over the history of modern generative grammar, some version of a cyclic principle has remained central to theorizing about the nature of this computational system.

For a thorough understanding of the role of cyclicity in syntax, it is essential to review the major formulations of the cyclic principle that have been proposed in generative grammar, especially the evidence cited as motivation for these proposals. This article is organized as follows. Section 1 begins with a reconsideration of the original proposal for a cyclic principle in syntax in Chomsky (1965). Section 1.1 reexamines the empirical basis for this proposal. Section 1.2 discusses the Insertion Prohibition of Chomsky (1965), proposed as a further restriction on the cyclic application of transformations. Section 2 presents the more restrictive formulation as the Strict Cycle Condition proposed in Chomsky (1973). Section 2.1 evaluates the empirical basis for this formulation, including its relation to other independently motivated constraints which, when interpreted as conditions on representations, subsume its empirical effects, raising interesting questions about the relation between derivations and representations. Section 3 considers the reformulation of the cyclic principle within a minimalist program as the Extension Condition of Chomsky (1993), a general constraint on the application of a single generalized transformation for building phrase structure bottom-up. Section 3.1 examines the empirical motivation for the Extension Condition, which involves deviant constructions, and shows how this is limited to one particular derivation where others are possible that do not violate this condition. Section 3.2 covers some alternative analyses for the empirical evidence for the Extension Condition (including the analysis of strong features (Chomsky, 1995b) and the Linear Correspondence Axiom of Kayne (1994)). Section 3.3 investigates the reformulation of the Extension Condition as a more general No Tampering Condition (Chomsky, 2005, 2008), a potential third factor principle for efficient computation that essentially derives strict cyclicity from a constraint on structure-preservation. This section concludes with a discussion of the possibility that structure-preservation and thus strict cyclicity is built into the formulation of Merge as the main structure-building operation of the computational system. Section 4 explicates the concept of successive cyclicity, how it applies in derivations and its empirical motivation. Section 5 considers the concept of a single-cycle syntax as it relates to previous models of derivation and evaluates the possibilities for realizing a highly restrictive derivational model of this sort. Section 6 sums up what is at issue in this foregoing discussion of cyclicity in syntax.

## 1. Origins of the Cycle in Syntax

From the broadest perspective, cyclicity in syntax concerns the application of operations to syntactic structure “beginning with the smallest constituents and proceeding to larger and larger constituents” to quote Chomsky, Halle, and Lukoff (1956), which predates the first proposal for the cyclic application of syntactic transformations in chapter 3 of Chomsky (1965). The Chomsky, Halle, and Lukoff proposal is formulated in their fourth rule for the assignment of prosodic accent:

Rule 4: Given a phonemic clause,

(i) assign the value 1 to all accented vowels;

(ii) then apply each rule pertaining to accented vowels no more than once to each constituent, applying a rule to a constituent of order *n* only after having applied it to all constituents of order *n* + 1; i.e. beginning with the smallest constituent and proceeding to larger and larger constituents. (p. 75)

The conclusion comments:

From the point of view from which we have been operating above, we can consider that in the process of forming utterances, the choice of words and the placing of the phonological, hierarchically ordered junctures is determined by higher level grammatical considerations, in which the phrase structure of the language plays an important role (p. 80).

Halle and Chomsky (1960) further clarifies the nature of this analysis:

The modifications are introduced in a stepwise fashion, successive steps reflecting the influence of successively higher constituents. Note also that the same modifications apply to all constituents regardless of their place in the constituent hierarchy; the same rules are reapplied to each constituent in a repeating cycle until the highest constituent is reached. The final result of such a cyclical reapplication of the same rules reflects to a certain extent the stress distribution of the morphemes as parts of lower constituents (p. 275).

This characterization identifies two features of the cyclic application of grammatical operations: that they apply first to a smallest constituent and then to successively larger constituents that contain it, and also that these operations reapply to the larger constituents.
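The ordering that Rule 4 imposes can be made concrete with a small sketch. The Python below is an illustration only and is not part of the original proposals: the `Constituent` class, the `cyclic_order` function, and the toy labels are all hypothetical. It visits each constituent only after every constituent it contains has been visited, which is the sense in which rules apply to the smallest constituents first.

```python
from dataclasses import dataclass, field


@dataclass
class Constituent:
    # Hypothetical minimal tree node: a label plus its daughter constituents.
    label: str
    children: list = field(default_factory=list)


def cyclic_order(node):
    """Return labels in the order a cyclically applied rule would reach them.

    Every daughter (and everything inside it) is listed before the
    constituent that contains it, i.e. a post-order traversal.
    """
    order = []
    for child in node.children:
        order.extend(cyclic_order(child))
    order.append(node.label)
    return order


if __name__ == "__main__":
    # Toy structure (an assumption for illustration): S4 inside S3 inside S2.
    tree = Constituent("S2", [Constituent("NP"),
                              Constituent("S3", [Constituent("S4")])])
    print(cyclic_order(tree))  # ['NP', 'S4', 'S3', 'S2']
```

The traversal fixes only the order in which constituents become available to a rule; which rules apply at each constituent is left open.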

This proposal instantiates the first bottom-up approach to the analysis of syntactic structure in generative grammar, further developed in the X-bar theory of phrase structure (Chomsky, 1970) and ultimately with the replacement of top-down phrase structure rules by the optimally simple operation Merge (Chomsky, 1995a), which unifies—in a single operation—what previously had involved two distinct operations: phrase structure rules and (movement) transformations (see Chomsky, 2004, for the original proposal (hinted at in footnote 13 of Chomsky, 1995a), and Freidin, 2012a, 2013, for discussion of its historical development).

The importance of this proposal in particular is further elaborated in Section 5 of chapter 1 in Chomsky (1965), where it is described as a “proposal that the phonological component of a grammar consists of a sequence of rules, a subset of which may apply cyclically to successively more dominant constituents of the surface structure”—referred to as “a transformational cycle” (p. 29). This cyclic application of rules is proposed as an example of a formal universal, which constrains “the character of the rules that appear in grammars and the ways in which they can be interconnected.” The following section of chapter 1 explains the motivation for the search for formal universals, which capture the possibility of “deep underlying similarities” among grammars “that are attributable to the form of languages as such” (p. 35). It continues:

Real progress in linguistics consists in the discovery that certain features of given languages can be reduced to universal properties of language, and explained in terms of these deeper aspects of linguistic form. Thus the major endeavor of the linguist must be to enrich the theory of linguistic form by formulating more specific constraints and conditions on the notion “generative grammar.” Where this can be done, particular grammars can be simplified by eliminating from them descriptive statements that are attributable to the general theory of grammar [cf. Section 5].

The specific example cited is that the transformational cycle “is a universal feature of the phonological component.” Thus there is strong conceptual motivation for postulating the cyclic application of “those phonological rules that involve syntactic structure” (p. 35), assuming of course that the empirical predictions are accurate. In this case, the postulation of a general constraint that constitutes a formal universal results in a simplification in the formulation of grammars.

In retrospect, the conceptual motivation for the cyclic application of phonological rules involving syntactic structure applies equally to singulary syntactic transformations, which operate on a single phrase marker, with the added bonus that this would constitute a generalization of a formal universal. However, when the syntactic cycle is proposed in Chapter 3 of Chomsky (1965) none of this is mentioned.^{1} Instead, the proposal is motivated as a simplification of the theory of transformations that results from the elimination of both generalized transformations (operations that create phrase markers for constructions with clausal recursion) and the construct of transformation marker (T-marker)—both central to the original formulation of transformational grammar in the mid-1950s, published two decades later as Chomsky (1975). A further argument is offered that the cycle rules out in principle orderings among transformations that seem never to be utilized in natural language grammars but are nonetheless possible in a theory that utilizes T-markers. Specifically, it excludes extrinsic ordering among generalized transformations, and also between generalized and singulary transformations, where a singulary transformation applies to a matrix clause before a subordinate clause is embedded in it by a generalized transformation—observations due to Fillmore (1963). Given this discussion, the cycle in syntax is motivated on the grounds that it yields a more restrictive theory.

## 1.1 The Empirical Basis for a Syntactic Cycle

Given that “generalized transformations” are resurrected under the minimalist program of Chomsky (1993), it is worth reviewing the details of this initial argument for a cycle in syntax. Chomsky (1965) presents the derivation of (1) as an illustration.

(1)

The derivation of (1) involves three separate phrase markers, represented as (2–4), where (2) constitutes the main clause and (3–4) are subordinate clauses, (3) underlying the relative clause and (4), the infinitival complement of *persuaded*.^{2}

(2)

(3)

(4)

Δ in (2–4) represents an empty category that results when the phrase structure rules do not expand NP. Chapter 2 of Chomsky (1965) refers to a Δ dominated by a lexical category as a “dummy symbol” (p. 122), and uses the same designation for the symbol S′ (see below for further discussion).

The derivation of (1) is spelled out in the T-marker (5), which represents (1) on a transformational level of analysis on a par with “a sequence of phonemes, morphemes, words, syntactic categories, and (in various ways) phrases” at lower levels of analysis (Chomsky, 1975, p. 72).

(5)

T_{E} is the generalized transformation that embeds (4) in (3), and (3) in (2). T_{P} designates the classical passive transformation under which a logical subject is postposed to the NP object of a PP headed by passive *by* and the logical object is preposed to a vacated subject position. T_{D} designates a deletion operation that accounts for phonetically null subjects in infinitivals, where the subject is anaphorically linked to an argument in a matrix clause—what was called Equi-NP Deletion.^{3} T_{R} designates the transformation that forms relative clauses, which given (2) must be a complex operation that replaces the subject NP *the man* in (3) with the relative pronoun *who* and permutes the embedded (3) to the right of *man* in (2). T_{AD} designated as “agent deletion” deletes the passive *by*-phrase, in this case where there is no overt object of the preposition.

(5) represents a cyclic ordering of transformations; but given that T-markers do not impose cyclicity, it is equally possible that (3) is embedded in (2) before (4) is embedded in (2+3). Depending on the formulation of these transformations, it is also possible that rules applying to (2) could precede, anti-cyclically, rules applying to (4): for example, the applications of the passive transformation to (2) and (4) are mutually independent. So it is not clear how imposing cyclic ordering eliminates the need for a transformation marker.^{4}

The 1973 introduction to Chomsky (1975) states that the cyclic application of rules “makes possible a reconstruction of syntactic theory without generalized transformations or a level of transformations in which T-markers are assigned to sentences” (p. 16). In the case of generalized transformations, the details of analysis presented in Chapter 3 of Chomsky (1965) show that cyclic application is compatible with their elimination but does not require it. Consider first the formulation of the generalized transformation T_{E} as a substitution operation that replaces the “dummy symbol” S′ with the phrase labeled #S#. The same substitution operation accounts both for the two movements that result from applying the passive transformation and for the operation that inserts lexical items in derivations. In order for T_{E} to apply in the first place as a substitution operation, S′ and #S# must be nondistinct. The simplest hypothesis is that they are categorially nondistinct, in which case the special status of #S# as only an initial symbol of a derivation evaporates. So it seems that the phrase structure component of the grammar produces generalized phrase markers even with generalized transformations. If all phrase structure rules are obligatory, as assumed in Chomsky (1965, p. 119), then T_{E} could never apply. However, if all phrase structure rules are optional (see Chomsky & Lasnik, 1977, p. 440, and Chomsky, 1980, p. 3), then T_{E} could still apply. Its elimination would require an ad hoc stipulation that the substitution operation is restricted to a single phrase marker, which would in effect exclude the substitution account of lexical insertion given in Chomsky (1965). In any case, the elementary transformational operation underlying T_{E} remains a core part of the grammar, in which case it is unclear how eliminating T_{E} actually simplifies the grammar.^{5}

Note that the elimination of generalized transformations by itself does not motivate the elimination of T-markers, which in Chomsky (1975) are also posited for kernel sentences (p. 73), i.e., simple sentences that are active (vs. passive), affirmative (vs. negative), and indicative (vs. interrogative or imperative). T-markers allow for both cyclic and anti-cyclic derivations, the latter apparently unattested in the grammars under consideration. A cyclic principle could exclude T-markers for anti-cyclic derivations, but not for cyclic derivations. The question is whether T-markers for cyclic derivations are a necessary part of syntactic representation. In Chomsky (1965), optional meaning-changing transformations (e.g., a negative transformation that introduces a negative element into a derivation) are reanalyzed as obligatory, on the assumption that the meaning-changing element (e.g., a negation marker) is present in the initial phrase marker; as a result, T-markers no longer contain crucial information for semantic interpretation (for details see Katz & Postal, 1964, and Chomsky, 1965).

In Chomsky (1965) the argument for the cycle in terms of limiting derivational options is based on the observations in Fillmore (1963) about the interaction of transformations mentioned above. The absence of extrinsic ordering between generalized transformations essentially follows given that there is a single embedding operation, a simple substitution operation as discussed above. Chomsky (1965) also assumes that there are singulary transformations that must apply to a clause before it can be embedded in a matrix clause. The derivation of the example given in (1) does not support the claim. As noted above, T_{P} could just as well apply to (4) after it is embedded in (3). Furthermore, given the simple formulation of T_{E}, there is no apparent reason why the substitution operation would be affected by the form of #S# that substitutes for S′.^{6} More specifically, any singulary transformation applying to a phrase marker #S# would have no effect on the embedding of that phrase marker in a matrix clause under T_{E}. The same observation applies equally in the case of a singulary transformation applying to a matrix clause before a subordinate clause is embedded in it. The only case where T_{E} must apply before a singulary transformation applies to a matrix clause containing the embedded clause is where the transformation involves elements of both clauses (e.g., T_{D} in (5)).

Note that substituting phrase structure rules that allow clausal recursion for T_{E} in itself does not necessitate cyclic application of singulary transformations. Consider for example the generalized phrase marker that would combine (2–4) in a single syntactic object (what would be derived by embedding (4) in (3) and the result in (2)). One possible derivation of (1) would involve the ordering in (6).

(6)

Such A>B>A ordering for the application of transformations is precisely what the syntactic cycle predicts. However, (1) can also be derived from the two additional orderings in (7), which are anti-cyclic:

(7)

In the derivation of (1) T_{P}(4) and T_{D}(3) are extrinsically ordered—T_{D}(3) cannot apply until T_{P}(4) applies; however, T_{P}(2) can apply independently of the extrinsically ordered pair and thus cyclic application is not required to generate (1).
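The difference between cyclic and anti-cyclic orderings can be stated as a simple check. The sketch below rests on several assumptions: the three orderings in the code are illustrative ones built from the rule applications named in the prose (T_{P} applying to (4) and to (2), T_{D} applying to (3)) and are not claimed to reproduce (6) and (7), and the names `PARENT`, `contains`, and `is_cyclic` are hypothetical. A derivation counts as cyclic here when no step targets a clause properly contained in a clause that an earlier step has already targeted.

```python
# Hypothetical encoding of the embedding described in the text:
# (4) is embedded in (3), and (3) is embedded in (2).
PARENT = {"(4)": "(3)", "(3)": "(2)", "(2)": None}


def contains(outer, inner):
    """True if clause `inner` is properly contained in clause `outer`."""
    node = PARENT[inner]
    while node is not None:
        if node == outer:
            return True
        node = PARENT[node]
    return False


def is_cyclic(derivation):
    """Check a list of (rule, clause) steps for cyclic ordering.

    The derivation fails if some step applies to a clause that is properly
    contained in a clause targeted by an earlier step.
    """
    for i, (_, later) in enumerate(derivation):
        for _, earlier in derivation[:i]:
            if contains(earlier, later):
                return False
    return True


# Illustrative orderings (assumptions, see the lead-in).
cyclic = [("T_P", "(4)"), ("T_D", "(3)"), ("T_P", "(2)")]
anti_cyclic_1 = [("T_P", "(2)"), ("T_P", "(4)"), ("T_D", "(3)")]
anti_cyclic_2 = [("T_P", "(4)"), ("T_P", "(2)"), ("T_D", "(3)")]
```

Under this encoding the first ordering passes the check and the other two fail it, although all three respect the requirement that T_{D}(3) follow T_{P}(4).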

In the late 1960s and early 1970s several empirical arguments for the cyclic application of transformations were proposed based on A>B>A ordering for the derivation of certain complex sentences in English.^{7} The validity of such arguments crucially depends on the existence of these transformations and also their specific formulations. In addition to the Passive/Equi-NP Deletion interaction discussed above (see also Postal, 1970, and Bach, 1974), the following interactions were also proposed as evidence of necessary A>B>A derivations.

(8)

Fast forwarding to the present, none of these transformational interactions exists because the grammatical rules on which they are predicated no longer exist as distinct operations. The existential *there* undergoes normal lexical insertion, so there is apparently no special transformation that “moves” its indefinite NP associate to the right and inserts *there* in the vacated position. Likewise, reflexive pronouns undergo normal lexical insertion and are linked to an antecedent by a rule of interpretation.^{8} Under the substitution-based analysis of Passive and Raising in the mid-1970s (cf. Chomsky, 1976), both designations fall under a single operation, Move-NP, later revised in Chomsky (1981) as Move α to include *wh*-movement and head movement.

## 1.2 The Insertion Prohibition

Chomsky (1965), at the conclusion of the third chapter, proposes another general condition on transformations, named the Insertion Prohibition (9) and characterized as “a step toward a stricter interpretation of the cycle” in Chomsky (1973, p. 243).

(9)

In Chomsky (1965) the constraint is applied to a transformational rule of reflexivization as an explanation for the contrast between the examples in (10).

(10)

Chomsky suggests that *near me* is underlyingly a clause “it is near me” and therefore *me* cannot undergo reflexivization to become *myself* as in (10b). (10c) is possible because the source of *at myself* is not a clause “it is at me”. In Chomsky (1973) the Insertion Prohibition analysis is extended to a transformational analysis that derives the reciprocal expression *each other* from the construction … *each* … *the other*. Furthermore, this constraint is incorporated into the formulations of the Tensed-S and Specified Subject Conditions, which prohibit the application of transformations involving two syntactic objects separated by a tensed sentence boundary or an intervening subject. Chomsky shows how these constraints generalize to both insertion and extraction operations, as illustrated in (11) and (12) where the a-examples involve the former and the b-examples, the latter. The brackets mark clause boundaries, which constitute cyclic domains for the application of transformations.

(11)

(12)

The Insertion Prohibition would improperly exclude *John expects himself to succeed* under the analysis where the matrix verb and reflexive pronoun are separated by a clause boundary (and therefore a cyclic boundary), in contrast to Tensed-S and Specified Subject Conditions, which do not block such constructions. With the abandonment of transformational rules for the derivation of reflexive pronouns and reciprocal expressions, the Insertion Prohibition appears to no longer have any empirical application (but see below).

## 2. The Strict Cycle Condition

Alongside the Insertion Prohibition, Chomsky (1973) proposes a further constraint that yields “a stricter interpretation of the cycle”: the Strict Cycle Condition (henceforth SCC), formulated as (51) in the 1973 paper and repeated here as (13).

(13)

Lasnik (2006, p. 200) characterizes SCC as “a further implicit requirement, finally made explicit in Chomsky (1973).” However, the SCC is actually the core of the cyclic principle: the purely cyclic application of transformations follows from it without further requirements—leaving aside the questionable status of the Insertion Prohibition.^{9} And without it, transformations would be able to apply in a derivation anti-cyclically.

While the SCC prohibits the anti-cyclic derivation of (1) given in (7), this provides no significant empirical motivation for the principle because these derivations yield exactly the same structure as the cyclic derivation. Rather, significant empirical motivation for the SCC comes from prohibiting the application of a transformation in a cyclic subdomain of a current cycle that might otherwise have applied to yield a deviant result. Chomsky (1973) alludes to such a case involving *wh*-movement (pp. 246–247), the cyclic nature of which is said to follow from the SCC because the rule must apply in subordinate clauses (e.g., indirect questions and relative clauses) (p. 243).^{10}

## 2.1 The Empirical Basis for the SCC

Consider instead the following base structure, which doesn’t involve the complications of Chomsky’s original example.

(14)

If C of the subordinate clause contains the feature [+Q], then *wh*-movement can produce two indirect questions, *John remembers which books which students borrowed* or *John remembers which students borrowed which books*.^{11} Assuming that *wh*-movement is like NP-movement, then the underlying operation is substitution and therefore the former substitutes a *wh*-phrase for a base generated empty position in CP adjacent to C (as proposed in Chomsky, 1986, p. 5). Under trace theory (as first mentioned in Chomsky, 1973, fn. 33, and developed in Chomsky, 1976), the movement of the *wh*-phrase out of the subordinate VP leaves behind a trace which constitutes an empty category linked to the moved *wh*-phrase by coindexation. If instead the main clause C is marked [+Q], then *wh*-movement can generate two direct questions, *which books does John remember which students borrowed?* or *which students does John remember borrowed which books?* Applying *wh*-movement to *which books* within the subordinate clause of (14) yields (15a) and applying the rule again in the matrix clause yields (15b).

(15)

The SCC rules out a third derivational step following (15b) where the “vacated” *wh*-phrase position adjacent to C is filled by *wh*-movement of the *wh*-subject *which students*, as shown in (16).

(16)

One difference between (15b) and (16) is that the subordinate CP in the latter is interpreted as an indirect question whereas the subordinate CP in the former is not. Thus, a possible alternative to the direct question derived from (15b) would contain an overt non-interrogative complementizer, as in (17).

(17)

While the presence of the overt complementizer seems to degrade the example, there is a clear contrast with the deviance of (18) where neither *wh*-phrase undergoes *wh*-movement, and the subordinate CP has an overt non-interrogative complementizer.

(18)

And while it is possible to construct complex sentences that are direct questions and also contain indirect questions, as in (19a), it is not possible to form a direct question by extracting a *wh*-phrase out of an indirect question, as in (19b) which would have a derived structure (20).

(19)

(20)

Whether the deviance of (16) and (19b–20) empirically motivates the SCC depends crucially on the formulation of *wh*-movement and how it applies in the derivation of these constructions.

As discussed in Freidin (1978), there are multiple derivational paths to deviant constructions like (16) and (20); so to account for their deviance all such paths must be prohibited. There are two derivations that violate the SCC: (21a) and (21b)—specifically the last step in each.

(21)

In addition there is one other derivation of the deviant (20) that does not violate the SCC, the one that switches the order of the operations in derivation (21b), given in (22).

(22)

Instead, the last step of (22) as well as the first step of derivation (21b) violates the Subjacency Condition, which prohibits a single movement out of two bounding categories, TP in this case (as proposed in Chomsky, 1977). Furthermore, all derivations would violate Subjacency if the constraint is interpreted as a condition on trace binding, hence a condition on representations, rather than a constraint on derivations.^{12} Thus, the empirical effect of the SCC with respect to the deviant (19b) could follow from an independently motivated general principle (i.e., Subjacency), an analysis which extends to super-raising (see (23a) in Section 3). Freidin (1978) demonstrates how this kind of account for *wh*-movement extends to NP-movement as well. As a result, there would be no empirical reason for postulating the SCC as an axiom of the theory because all of its empirical effects could be derived from other independently motivated principles of grammar.

Note that (21a) is the only derivation that violates the SCC alone. Even if we adopt a derivational approach to the interpretation of general principles (as suggested in Chomsky, 1995a, p. 57), the motivation for the SCC depends on a particular analysis of substitution where indexing of traces does not figure in the determination of nondistinctness and thus *which books* could legitimately substitute for the intermediate trace of *which students* in CP. In contrast, the examples cited in Chomsky (1973) (see note 11) are a harder case of nondistinctness to dismiss given that the two *wh*-phrases are categorially distinct, one being a PP *to whom* and the other an NP *what books*. Thus even on a derivational approach the empirical effects of the SCC could follow from a general nondistinctness constraint on substitution operations. However, see Freidin (1999), which considers the possibility of treating nondistinctness in terms of recoverability of deletion, in which case the trace erasures involved in these examples may not count as unrecoverable deletion simply because the deleted syntactic objects are not interpreted in these positions.

The discussion in these first two sections demonstrates the interdependence between the computational operations assumed and the formulation of general principles that constrain their application and output. It is the precise formulations of these operations and principles that determine the nature of cyclicity in syntax.

## 3. A Cyclic Principle for a Minimalist Program: The Extension Condition

In the initial discussion of a minimalist program for linguistic theory in Chomsky (1993), the concept of the strict cycle reappears in the guise of an output constraint on “overt substitution operations” (fn. 27). This account assumes the existence of “a single generalized transformation GT that takes a phrase marker K^{1} and inserts it in a designated empty position ø in a phrase marker K, forming the new phrase marker K*, which satisfies X-bar theory” (p. 189). The operation is restricted further by the requirement “that ø be external to the targeted phrase marker K” (p. 190), designated the Extension Condition. This formulation of the condition yields a simpler version of the strict cycle in that it makes no reference at all to cyclic domains and subdomains. In effect, it applies to all constituent domains (cf. Williams, 1974, where the ordering of transformational rules is determined by applying rules that affect subdomains before rules that affect larger domains).
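One way to picture the Extension Condition is as a check on where insertion occurs. The following sketch is a toy model under stated assumptions: phrase markers are nested Python tuples, labels and X-bar structure are ignored, and `merge`, `insert_at`, `satisfies_extension`, and the sample structure are hypothetical names and data chosen for illustration. Insertion at the root leaves the targeted phrase marker K intact inside the new object; insertion at an embedded position does not.

```python
def merge(k1, k):
    """Combine k1 with the whole object k, yielding a new root."""
    return (k1, k)


def insert_at(k, k1, path):
    """Insert k1 as sister to the subconstituent of k reached via `path`.

    `path` is a sequence of child indices; the empty path targets k itself,
    so the insertion position is external to k.
    """
    if not path:
        return merge(k1, k)
    if not isinstance(k, tuple):
        raise ValueError("path leads below a terminal element")
    head, *rest = path
    children = list(k)
    children[head] = insert_at(children[head], k1, rest)
    return tuple(children)


def satisfies_extension(k, result):
    """True if k survives unchanged as an immediate constituent of result."""
    return isinstance(result, tuple) and any(child == k for child in result)


# Toy structure (an assumption, not one of the article's examples).
K = ("seems", ("likely", "win"))
at_root = insert_at(K, "it", ())     # ("it", K)
embedded = insert_at(K, "it", (1,))  # ("seems", ("it", ("likely", "win")))
```

In this model `satisfies_extension(K, at_root)` holds and `satisfies_extension(K, embedded)` does not, which mirrors the contrast between extending a phrase marker and inserting material inside it.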

## 3.1 The Empirical Basis for the EC

Chomsky (1993) cites as empirical motivation for the Extension Condition three deviant examples that are claimed to be unaccounted for without it.

(23)

(23a) constitutes a case of super-raising; (23b), a violation of the Head Movement Constraint; and (23c), a *wh*-island violation. The argument assumes that there are possible derivations of (23a-c) from stages in their derivations (i.e., (24)^{13}) that would not violate their associated constraints (e.g., Relativized Minimality (Rizzi, 1990)).

(24)

(23a) is derived from (24a) by raising *John* directly to the matrix subject and then inserting *it* as the subject of the clausal complement of *seem*. Because the clausal complement subject does not exist when *John* is raised, it is assumed there is no shortest movement violation. The Extension Condition blocks the insertion of *it*. While that would block one derivation of (23a), there is another derivation that doesn’t violate the Extension Condition—namely, the one where *it* is inserted before *John* is raised. Prohibiting this derivation requires a locality constraint (shortest movement or Subjacency). Note further that (24a) could yield a different deviant construction (25), where there is no insertion of pleonastic *it*.

(25)

With or without the intermediate trace, the derivation would not, by hypothesis, violate a shortest movement constraint. So unless there is some other constraint that marks (25) as deviant in the framework of Chomsky (1993), then the Extension Condition account for (23a) cannot account for the deviance of (25). And if there is an account of the deviance of (25), then it will generalize to (23a), refuting the claim that without the Extension Condition there is no way to explain the deviance of super-raising in (23a) under a derivation that involves late insertion of *it*. Finally, there is a fairly straightforward way of characterizing the deviance of (23a) and (25) as violations of a prohibition against extracting non-*wh* phrases out of finite clauses (e.g., the “Tensed-S” condition of Chomsky (1973, 1976) and its descendants).

The argument for the Extension Condition based on the Head Movement Constraint violation in (23b) also assumes a derivation in which a lexical item is inserted into a subordinate constituent (though exactly where is not specified). The general idea is that given (24b) as a stage in the derivation, the verb *fix* raises to adjoin to C and then the modal *can* is inserted inside the clause (details aside). As with (23a), the lexical insertion prohibited by the Extension Condition is extraneous. The derivation without the insertion yields a deviant structure, in this case raising a main verb to C in English (e.g., **fixed John the car*). Furthermore, raising to C in English only occurs with finite forms (see Lasnik (1995, fn. 13)). If the modal is present in the derivation, then presumably the main verb is a bare form, not a finite form, given the selectional restriction between a modal auxiliary and the head of its verbal complement. Consider (26), a case that doesn’t involve syncretism:

(26)

Head movement will yield (27a), the interrogative counterpart of (26a), and (27b), the interrogative counterpart of (26b).

(27)

Head movement cannot yield (28), which on the analysis of Chomsky (1993) could be derived in two different ways.

(28)

On one derivation *be* moves to C over *can* violating the Head Movement Constraint (HMC); on the other, the modal, which is not present when *be* moves to C (and therefore no HMC violation occurs), is inserted after the movement occurs in violation of the Extension Condition. If the modal is not inserted after Head Movement, then the derivation yields the deviant (29), which violates neither the HMC nor the Extension Condition.

(29)

And without Head Movement the deviant (30) could also be derived.

(30)

The first step in the derivation of (29–30) would be the merger of *be* with *happy*, where the verbal element heads the syntactic object constructed. Suppose that like derivations generally, selection works bottom-up where the head of a complement selects the head that the complement is merged with. Then the bare form *be* selects a modal. (29–30) violate the selectional requirement of the bare form V. If selection is treated derivationally, then late insertion of a modal in (23b, 24b, and 28) is not possible because those derivations will always violate selection. And if selection is treated representationally, then a copy of the moved *be* must remain in its original position in the representation for (28) so that selection is satisfied—in which case the representation will show a violation of the HMC.

Unlike (23a–b), the *Wh*-Island violation in (23c) does not involve the late insertion of a lexical item. Instead, the analysis presented in Chomsky (1993) involves a derivation in which *wh*-movement applies counter-cyclically (cf. (21) above) so that GT moves *how* in (24c) directly to the matrix CP and then moves *what* to the subordinate CP in violation of the Extension Condition. Exactly how the second operation is achieved is not entirely clear, as will be discussed below. Furthermore, the cyclic derivation involving (24c) where the order of *wh*-movements is reversed and therefore does not violate the Extension Condition must also be excluded, as it is by a locality constraint (e.g., Subjacency or a Minimal Link Condition (Chomsky, 1995a, p. 96)). If these constraints are interpreted as conditions on representations (specifically on nontrivial chains created by movement operations), then the account generalizes to the counter-cyclic derivation, once again rendering the statement of the Extension Condition as an axiom superfluous.

Chapter 4 of Chomsky (1995b) cites (31) as further empirical support for the Extension Condition, characterizing it as a Constraint on Extraction Domains (CED) violation (Huang, 1982).

(31)

Under the cyclic derivation, the NP/DP *a picture of who* moves from VP to clausal subject position and then *who* in subject position moves to CP in violation of the CED. On the counter-cyclic derivation, the operations occur in reverse order, where the movement of *who* does not violate the CED (compare (31) to the legitimate (32)—i.e., extraction from the V complement position does not result in deviance) and the movement of NP/DP violates the Extension Condition.

(32)

Both derivations yield the deviant (31). The argument for the Extension Condition rests on the assumption that the CED is a constraint on the application of operations and therefore only blocks one of these derivations. If, instead, it applies to representations, then the chain {*who*, *t*_{wh}} violates the CED (or whatever accounts for this CED effect) no matter how it is derived. In this way the CED generalizes to all derivations of the deviant construction.

For additional commentary on empirical motivation for the Extension Condition, see Freidin (1999).

Chomsky (1993, p. 191) proposes that the EC is limited to substitution operations, excluding adjunctions (though no specific examples are mentioned). Chomsky (1995b, p. 327) specifically mentions “head adjunction” (i.e., head movement) as a case of adjunction that cannot fall under the Extension Condition and also proposes that the EC does not apply to operations that apply after Spell-Out on the PF side (citing a minimalist approach to Case agreement theory).^{14} Chomsky (2000, p. 137) contrasts the Extension Condition with a condition of Local Merge, both of which are claimed to “yield cyclicity.”

The EC as formulated in Chomsky (1993) is claimed to yield a second consequence: “that given a structure of the form [_{X′} X YP], we cannot insert ZP into X′ (yielding, e.g., [_{X′} X YP ZP]), where ZP is drawn from within YP (raising) or inserted from outside by GT” (p. 191). From this it is claimed to follow that raising to a complement position is impossible, “one major consequence of the Projection Principle and θ-Criterion at D-Structure, thus lending support to the belief that these notions are indeed superfluous” (p. 191). However, the superfluous nature of the Projection Principle (and the θ-Criterion to the extent that it must apply at D-Structure) follows straightforwardly and independently of the EC if GT as formulated cannot construct classical D-Structure. Whether the insertion of ZP into X′ actually constitutes raising to a complement position is not obvious, especially given GT. And furthermore, raising to complement position (or any position to which a θ-role is directly assigned) is ruled out by the uniqueness requirement of the θ-Criterion according to which an argument can be assigned only one θ-role (a requirement that could never apply at D-Structure).^{15} It appears that all the empirical effects of the EC can be accounted for without stating this cyclic principle as an axiom of the theory.

## 3.2 Alternatives to the EC

Chomsky (1995b) discusses an alternative analysis of strict cyclicity based on a notion of strong features, which according to Chomsky (1993, p. 198) are visible but illegitimate at PF. This analysis assumes that derivations “cannot tolerate” strong features, so that the insertion into a derivation of an element with a strong feature must trigger an operation that removes it; otherwise the derivation is cancelled (p. 233). This is formalized as a constraint against an element α with a strong feature where α is contained “in a category not headed by α” (p. 234). This configuration cancels a derivation. For example, take the feature Q of interrogative C in English, where Q is a strong feature. The derivation of the *wh*-island violation (19b) would involve Q features in both the subordinate and matrix CP. Only one counter-cyclic derivation, (21b), will trigger derivation cancellation, when the subordinate CP containing Q is merged with the verb *remembers* to form VP, thereby blocking the counter-cyclic operation. However, derivation cancellation by itself does not prevent the other counter-cyclic derivation (21a) where one *wh*-phrase moves through the subordinate CP thereby checking and erasing the strong Q feature followed by the other *wh*-phrase moving to the subordinate CP. But see Boskovic and Lasnik (1999), Richards (1999), and Lasnik (2006) for arguments supporting the replacement of the EC by this strong feature analysis.

In addition to the derivation cancellation by strong feature analysis, other attempts to eliminate the EC by showing its effects follow from more general considerations include the argument in Kitahara (1997) that counter-cyclic Move requires an extra derivational step and therefore runs afoul of an economy of derivation requirement, and the claim that counter-cyclic derivations violate the Linear Correspondence Axiom of Kayne (1994) as argued in Kawashima and Kitahara (1996), Collins (1995, 1997), and Epstein, Groat, Kawashima, and Kitahara (1998). See Freidin (1999) for critical commentary.

## 3.3 The No Tampering Condition

Chomsky (2000, p. 136) considers the possibility that the EC “always holds: operations preserve existing structure.” This statement is clarified as: “operations do not tamper with the basic relations involving the label that projects: the relations provided by Merge and composition, the relevant ones here being sisterhood and c-command.” In this way the EC is connected to what becomes “the ‘no-tampering’ condition of efficient computation” (NTC) (Chomsky, 2005, p. 11), formulated in Chomsky (2008) as: “Merge of X and Y leaves the two syntactic objects unchanged” (p. 138). Assuming that the output of Merge (X, Y) is the set {X, Y} (hence the designation Set-Merge), “the simplest possibility worth considering,” the NTC prohibits Merge from reconfiguring the internal structure of X and Y in any way, including the insertion of features. Derivations that utilize such reconfigurations in structure-building would be less efficient (and minimal) than derivations that prohibit them. From the NTC it follows that Merge is always to the root (or edge) of the syntactic objects involved (Chomsky, 2008, p. 138). Strict adherence to the NTC requires that in the case of “internal” Merge, where Y is a constituent of X (cf. singulary substitution above), the full structure of Y remains as a constituent of X as well as merging with X at the edge, which is what is referred to as the copy theory of movement.^{16}
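The set-theoretic character of the NTC can be illustrated with a minimal sketch. It assumes, purely for illustration, that lexical items are strings and that syntactic objects are immutable sets (Python `frozenset`), ignoring labels, heads, and features. The function names `merge`, `properly_contains`, and `internal_merge` and the toy derivation are hypothetical and are not drawn from the works cited.

```python
def merge(x, y):
    """Set-Merge: form the set {x, y}; x and y themselves are left unchanged."""
    return frozenset({x, y})


def properly_contains(x, y):
    """Return True if y is a term nested somewhere inside x (and y is not x itself)."""
    if not isinstance(x, frozenset):
        return False  # a lexical item (string) has no internal structure
    return any(m == y or properly_contains(m, y) for m in x)


def internal_merge(x, y):
    """Merge x with one of its own terms y; the lower occurrence of y is not removed."""
    if not properly_contains(x, y):
        raise ValueError("internal Merge requires y to be a term of x")
    return merge(x, y)


# Hypothetical toy derivation (an assumption for illustration only).
wh = merge("which", "books")
vp = merge("borrowed", wh)
clause = merge("Mary", vp)
moved = internal_merge(clause, wh)
```

Because the objects are immutable, `moved` is the set containing `clause` and `wh`, while `clause` still contains `wh` in its original position. On these assumptions the sketch mirrors merger at the root together with a retained lower copy.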

With Set-Merge, the main structure-building operation, constrained by the NTC, both strict cyclicity and structure-preservation follow automatically. More accurately, the NTC enforces structure-preservation and strict cyclicity follows from structure-preservation. Taking the NTC as a third factor principle of efficient computation, we now have a principled explanation of why cyclic transformations are structure-preserving, as proposed for a very different framework in Emonds (1970, 1972); see Freidin (2016) for discussion. That the NTC constitutes a principle of efficient (and minimal) computation provides a stronger argument for cyclicity than its proposed status as a formal universal (see Section 1), a second factor property (see Chomsky, 2005, for discussion of the three factors of language design).

The force of these conclusions depends crucially on the precise nature of Set-Merge. To see how this works, consider again the prohibition against raising to complement position. Chomsky (1993) gives as an example a structure [_{X′} X YP] that would be mapped onto a structure [_{X′} X YP ZP] where ZP is also a constituent of YP, and thus ZP is raised via internal Merge. The main issue is the relations between the three constituents X, YP, ZP (linear order aside). If Merge is binary, then it only generates a new relation between the two syntactic objects it targets. However, ZP in this structure would be in a sisterhood relation with both X and YP, which Merge cannot define. If binary Merge raises ZP contained in YP then it must target ZP and YP. But the proposed mapping involves the syntactic object X′ as well, a third target not available with binary Merge. Now consider another potential derivation for raising to complement position, starting with an infinitival clause *herself to succeed*. Now merge the verb *expects* and *herself* forming a VP *expects herself*. Next merge the infinitival clause with the VP yielding the binary structure [_{VP} [expects herself] [herself to succeed]]. This derivation gives a version of sideward movement (see Nunes, 2004), where a constituent that is already part of a larger structure is constructed with an independent lexical item and the structure that results from this operation remains separate from this larger structure. Such derivations require three targets: the larger structure (the infinitival clause), the constituent contained in it that is to be merged with an external lexical item (the reflexive pronoun), and the lexical item itself (the matrix verb). Although the first-mentioned operation of Merge only constructs a single sisterhood relation between two syntactic objects, it nonetheless must target three syntactic objects to accomplish this.
Strict adherence to the binary nature of Merge—including targets as well as the new relationship created between a pair of syntactic objects—by itself appears to produce the same effects as the NTC—namely, strict cyclicity and structure-preservation, where the latter entails the former. Thus it may be that Set-Merge, as an optimally simple (and hence minimal) computational operation that generates most of phrase structure (putting aside the contribution of Pair-Merge for adjunction constructions (Chomsky, 2000, 2004)), by itself results in efficient computation.^{17} If this analysis is correct, then Set-Merge exemplifies a direct connection between a third factor property and the formulation of computational operations.
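One way of reading the two-target restriction can be sketched as follows. The sketch assumes a hypothetical "workspace" holding the root syntactic objects currently available, with lexical items as strings and phrases as `frozenset` objects; the names `properly_contains` and `two_target_merge_ok` are illustrative inventions, not terminology from the literature.

```python
def properly_contains(x, y):
    """Return True if y is a term nested somewhere inside x (and y is not x itself)."""
    if not isinstance(x, frozenset):
        return False
    return any(m == y or properly_contains(m, y) for m in x)


def two_target_merge_ok(x, y, workspace):
    """Return True if Merge(x, y) needs to access only x and y.

    External Merge: both x and y are roots in the workspace.
    Internal Merge: one target is a root and the other is a term of that root.
    Any other case requires a third object, namely the root that contains
    the non-root target.
    """
    x_is_root, y_is_root = x in workspace, y in workspace
    if x_is_root and y_is_root:
        return True
    if x_is_root and properly_contains(x, y):
        return True
    if y_is_root and properly_contains(y, x):
        return True
    return False


# Sideward case (assumed toy structure for "herself to succeed").
inf_clause = frozenset({"herself", frozenset({"to", "succeed"})})
workspace = {inf_clause, "expects"}
sideward = two_target_merge_ok("expects", "herself", workspace)
external = two_target_merge_ok("expects", inf_clause, workspace)
internal = two_target_merge_ok(inf_clause, "herself", workspace)

# Raising into X': YP is not a root, so merging ZP with YP needs X' as a third object.
yp = frozenset({"Y", "ZP"})
x_bar = frozenset({"X", yp})
raising_to_complement = two_target_merge_ok(yp, "ZP", {x_bar})
```

Under these assumptions, ordinary external and internal Merge pass the check, whereas the sideward step and the merger of ZP inside X′ do not, since each must reach into a root that is not itself one of the two targets.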

4. Successive Cyclicity

The term successive cyclicity arises in the context of “long-distance” displacement, where a syntactic object is pronounced in a position that is separated from the position in which it is interpreted by one or more clause boundaries. Consider again the case of *wh*-movement as illustrated in (33).

(33)

In both (33a–b) *which books* is interpreted as the object of *borrowed*, but pronounced in the subordinate clause in (33a) as opposed to the matrix clause in (33b). Given that CP is the domain in which *wh*-movement applies, the question is whether it must apply in every CP domain and thus successive-cyclically—including those where the *wh*-phrase is neither pronounced nor interpreted.

The successive cyclic application of syntactic transformations resolves an apparent paradox: movement transformations appear to be both unbounded and bounded. Thus *wh*-displacement is apparently unbounded; for example, the syntactic distance between the verb *borrowed* and the *wh*-phrase *which books*, which is interpreted as its object, can be greatly extended, in principle indefinitely (e.g., (34) compared to (33b)).

(34)

In contrast, *wh*-displacement is definitely bounded, as illustrated by the deviance of (35).

(35)

It is assumed that the syntactic distance between *which students* and the syntactic position in which it is interpreted (i.e., subject of the subordinate clause) is too great. The deviance of such examples is accounted for by postulating a general locality constraint, formulated initially as the Subjacency Condition in Chomsky (1977),^{18} which prohibits a single movement operation (i.e., internal Merge) from connecting two positions that are separated by more than one clause boundary (TP for English^{19}). Given Subjacency, it follows that *wh*-movement must apply successive cyclically to every intermediate CP domain between the position in which the *wh*-phrase is interpreted as an argument of a predicate and the position in which it is pronounced. (33b) contains one such CP domain, whereas (34) contains three.
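The way Subjacency forces intermediate steps can be sketched as a simple count over chain links. The sketch assumes a deliberately simplified encoding: a list of category labels running from the root down to the phrase containing the base position, with each chain position given as the index of the node in which the moved phrase sits. The labels, the names `tp_crossings` and `obeys_subjacency`, and the sample structure are hypothetical.

```python
def tp_crossings(path, higher, lower, bounding="TP"):
    """Count bounding nodes exited when moving from inside path[lower]
    to the edge of path[higher] (with higher < lower)."""
    return sum(1 for label in path[higher + 1 : lower + 1] if label == bounding)


def obeys_subjacency(path, chain, bounding="TP"):
    """True if no single link of the chain crosses more than one bounding node.

    `chain` lists positions from the highest (pronounced) to the lowest
    (interpreted) position.
    """
    return all(
        tp_crossings(path, hi, lo, bounding) <= 1
        for hi, lo in zip(chain, chain[1:])
    )


# Assumed skeleton of a structure with one intermediate CP domain.
path = ["CP", "TP", "VP", "CP", "TP", "VP"]
one_step = obeys_subjacency(path, [0, 5])        # direct movement, no intermediate copy
stepwise = obeys_subjacency(path, [0, 3, 5])     # copy at the edge of the lower CP
```

With this encoding the direct chain crosses two TP nodes in a single link and fails the check, while the chain with an intermediate copy at the lower CP crosses one TP per link and passes.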

This analysis generalizes to NP/DP displacement in that it blocks super-raising by imposing successive cyclic movement in TP.

(36)

In (36) the brackets mark TP boundaries and the NPs designate covert copies of *Mary*. Without the intermediate copy of *Mary* the derivation yields a Subjacency violation.

Within work on a minimalist program of the past sixteen years, successive cyclic movement via internal Merge is imposed by the Phase Impenetrability Condition (PIC), as noted in Chomsky (2004, p. 112). For discussion of possible formulations and their application, see Chomsky (2000, 2001, 2004, 2008) and more recently Citko (2014) for a survey of work on phase theory. For some critical discussion of this work, see Freidin (2016).

An empirically based argument for successive cyclic *wh*-movement has been proposed to account for the interpretive possibilities for bound anaphors that have been displaced via *wh*-movement. Quicoli (2008) cites a construction with two subordinate clauses where the antecedent of the reflexive pronoun *himself* in the clause initial *wh*-phrase can be equally interpreted as either the main clause subject or either of the embedded clause subjects.^{20}

(37)

Given the standard analysis in which binding of the anaphor requires a c-commanding antecedent, the anaphor cannot be bound where it occurs overtly in the matrix CP. Therefore binding must occur in a covert position. If *himself* is interpreted as part of the *wh*-phrase in the VP headed by *heard*, then the reading where *Bill* is the antecedent follows straightforwardly. Furthermore, if the standard binding condition on anaphors applies, which prohibits an unbound anaphor in the c-command domain of a subject, then neither *John* nor *Max* can be interpreted as the antecedent of *himself*, as illustrated in (39) below. Successive cyclic movement would place the *wh*-phrase in the subordinate CP as illustrated in (38).

(38)

At this point in the derivation there is a copy of *himself* that because it is not in the domain of *Bill* could be interpreted as taking *John* as its antecedent. Movement to the higher CP in (37) would allow *himself* to take *Max* as its antecedent. This binding theory analysis depends on the following being deviant because they violate the standard binding condition on anaphors.

(39)

See Quicoli (2008) for the successive cyclic analysis of (37) in terms of a theory of phases.^{21}

Further empirical support for successive cyclic derivations has been proposed on the basis of overt word order effects (stylistic inversion in French (Kayne & Pollock, 1978), obligatory inversion in Spanish *wh*-constructions (Torrego, 1983, 1984)), overt morphosyntactic effects (complementizers in Irish (McCloskey, 2001, 2002)), and agreement (in Dinka (van Urk & Richards, 2015)).

5. Single Cycle Syntax

Starting with Chomsky (2000)^{22} (and essentially repeated in Chomsky (2004, 2005, 2007, 2008)), it is claimed that “there is a single cycle; all operations are cyclic” (p. 131), presumably motivated by considerations of computational complexity.

The discussion of the single cycle model in Chomsky (2000) compares it to a previous model of grammar that separates overt operations (which have phonetic effects) from covert operations (which do not) in terms of an operation Spell-Out that divides a derivation into three parts as illustrated in (40).^{23}

(40)

Each subpart of (40) constitutes a component of the derivation, indicated with a box. PF is the representation that interfaces with the sensorimotor (SM) components, and LF is the representation that interfaces with the conceptual-intentional (C‑I) components of the cognitive system. Given this organization, all operations prior to Spell-Out will be overt because their effects will be passed on to the derivation of PF. After Spell-Out all operations involved in the further derivation of LF will be covert because they will have no effect on the derivation of PF. Taking “the computation to LF” as “narrow syntax” (Chomsky, 2001, p. 3), the derivation of narrow syntax has an overt component prior to Spell-Out and a covert component after Spell-Out. Narrow syntax then explicitly excludes the derivation of PF after Spell-Out. Under this model the operations of each component apply independently and thus each component constitutes a separate cycle in a single derivation.

As noted in Chomsky (2004, p. 107), in the worst case the three cycles are independent, while in the best case there is only one cycle—best and worst determined on grounds of computational complexity. The best case results if “operations that have or lack phonetic effects are interspersed” (Chomsky, 2000, p. 131).

As discussed in Freidin and Lasnik (2011), single cycle syntax has roots in work on generative grammar that predates the Minimalist Program. The proposal to intersperse operations in this way is not new to the proposal in Chomsky (2000); see Bresnan’s (1971) analysis of the Nuclear Stress Rule (Chomsky & Halle, 1968), where this phonological rule applies at the end of the cycle of syntactic transformations, and Jackendoff’s (1972) analysis of coreference relations where rules assigning coreference relations also apply at the end of each syntactic cycle.

Single cycle syntax raises the question of whether all the operations of the three components of (40) can be integrated so that the three cycles can be reduced to just one. If the “covert” operations of the LF component can be interspersed with the overt operations that apply prior to Spell-Out, then there will be no way derivationally to distinguish overt from covert operations in narrow syntax. From this Chomsky (2000) concludes that “there is no distinct LF component within narrow syntax” (p. 131). This depends, however, on exactly what operations are being assumed to apply in the LF component. Consider for example the well-known rule of Quantifier Raising (May, 1985), which has no phonetic effects. The LF derivation for the PF (41a) will involve an operation that merges the quantified nominal expression with the clause. The null hypothesis is that this operation is simply an instance of internal Merge, yielding (41b), which is then mapped onto a quantifier-variable construction (41c), which accurately renders the interpretation.

(41)

If QR applies prior to Spell-Out^{24}, then (41a) is still derivable from (41b) simply by linearizing the construction—which means pronouncing *every candidate* in a single position (presumably the subject position of the clause). If the mapping from (41b) to (41c) applies prior to Spell-Out, then it violates the NTC and further there is no simple way to derive the PF *every candidate* from (for ∀*x*, *x* a candidate) in (41c). Thus it would appear that the construal of quantifier-variable constructions (including *wh*-interrogatives) involves an operation that must occur after Spell-Out in the derivation of LF, contra the single cycle hypothesis unless the mapping to a quantifier-variable representation is not part of syntax.

With regard to the phonological component in (40), deletion and linearization might be considered as reasonable candidates for operations that must occur after Spell-Out. The claim for linearization is easily dismissed if the C‑I interface only pays attention to hierarchical structure. If so, then whether linearization applies before or after Spell-Out would make no difference. Similarly, if deletion (e.g., in ellipsis constructions) affects only phonetic features, then whether it applies before or after Spell-Out will have the same effect on the derivation of PF and no effect on the derivation of LF because the C-I systems cannot interpret these features in any case. Furthermore, if deletion applies prior to Spell-Out, then deletion in ellipsis could have access to semantic features as well. Thus it may be possible to eliminate a separate phonological component, which is one requirement for single-cycle syntax.

The initial proposal for a single cycle in Chomsky (2000) is actually based on a theory of “cyclic Spell-Out” where Spell-Out occurs at multiple points in a derivation (see Uriagereka, 1999), not just one as in (40). The proposal is spelled out in more detail in Chomsky (2004), but in a way that appears to undermine it. The analysis assumes an operation Transfer that hands a derivation of narrow syntax NS (now limited to the part in (40) prior to Spell-Out) over to a “phonological component” Φ and a “semantic component” Σ, in the best case “at the same stage of the cycle” (p. 107). Φ maps NS to a phonetic representation PHON that interfaces with SM, and Σ maps NS to a semantic representation SEM that interfaces with C-I.^{25} “[T]he three components of the derivation of <PHON, SEM> proceed cyclically in parallel” (p. 106), but in three independent cycles nonetheless. Whether three cycles are necessary depends on whether the operations that occur in Φ and Σ could apply instead in NS, as discussed above.

Because Transfer applies cyclically to units called “phases” (e.g., CP and *v*P in Chomsky (2001, 2008)), it applies multiple times in a derivation. This produces a derivation in which (40) would constitute a subpart and would contain a set of subparts stacked together in some way as yet to be determined. One significant problem for this model is constructing a single integrated SEM and PHON from the various derivational subparts (called the reassembly problem in Freidin (2016)), a problem that doesn’t arise if Transfer applies once per derivation. The argument for derivation by phase is that it minimizes computations by limiting the computational space of operations. Prior to phase theory this was accomplished with locality conditions, and could be still if the Phase Impenetrability Condition (cf. Chomsky, 2000, 2001, 2004, 2008) is interpreted as a locality condition, rather than as the basis for a complicated derivational procedure of cyclic Transfer which by itself does not produce an integrated single pair of interface representations for the whole derivation. If this discussion is on the right track, then cyclic Transfer appears to create serious problems for single cycle syntax to the extent that this can be realized.

Whether the derivation of <PHON, SEM> can be achieved by integrating the operations involved in their generation into a single cycle remains an open question, one that is ancillary to cyclicity as discussed in Sections 1–4. Nonetheless, cyclicity remains at the core of the computational system in the formulation of binary Merge, as proposed in Section 3.3. And furthermore, the successive cyclic application of operations that target the same syntactic object in a derivation has been well established by theoretical and empirical considerations.

6. What Is at Issue

As illustrated in Sections 1–5, the topic of cyclicity in syntax covers virtually every major theoretical area in the field, starting with the form and function of the computational operations available for human language. In addition to the way form may determine function (cf. the discussion of Merge at the end of Section 3.3), function is also determined by general constraints on operations that are independent of how these operations are formulated, including the interactions between operations, their general organization in a derivation, and general assumptions about how derivations operate. In this way, cyclicity is central to the theory of the computational system for human language.

Acknowledgments

I would like to thank two anonymous reviewers and especially Howard Lasnik for comments on an earlier draft.

## Further Reading

Boskovic, Z., & Lasnik, H. (1999). How strict is the cycle? *Linguistic Inquiry*, *30*, 691–703.

Chomsky, N. (1965). *Aspects of the theory of syntax*. Cambridge, MA: MIT Press.

Chomsky, N. (1973). Conditions on transformations. In S. Anderson & P. Kiparsky (Eds.), *A festschrift for Morris Halle* (pp. 232–286). New York: Holt, Rinehart & Winston.

Chomsky, N. (2012). *Chomsky’s linguistics*. Edited by P. Graff & C. van Urk. Cambridge, MA: MIT Working Papers in Linguistics.

Fox, D., & Pesetsky, D. (2005). Cyclic linearization of syntactic structure. *Theoretical Linguistics*, *31*(1–2), 1–46.

Freidin, R. (1978). Cyclicity and the theory of grammar. *Linguistic Inquiry*, *9*, 519–549.

Freidin, R. (1999). Cyclicity and minimalism. In S. D. Epstein & N. Hornstein (Eds.), *Working minimalism* (pp. 95–126). Cambridge, MA: MIT Press.

Freidin, R., & Lasnik, H. (2011). Some roots of minimalism in generative grammar. In C. Boeckx (Ed.), *The Oxford handbook of linguistic minimalism* (pp. 1–26). New York: Oxford University Press.

Grinder, J. (1972). On the cycle in syntax. In J. Kimball (Ed.), *Syntax and semantics* (Vol. 1, pp. 81–111). New York: Seminar Press.

Kayne, R., & Pollock, J.-Y. (1978). Stylistic inversion, successive cyclicity, and Move NP in French. *Linguistic Inquiry*, *9*, 595–621.

Kimball, J. (1972). Cyclic and linear grammars. In J. Kimball (Ed.), *Syntax and semantics* (Vol. 1, pp. 63–80). New York: Seminar Press.

Lasnik, H. (2006). Conceptions of the cycle. In L. Cheng & N. Corver (Eds.), *Wh-movement: Moving on* (pp. 197–216). Cambridge, MA: MIT Press.

Lasnik, H. (2012). Single cycle syntax and a constraint on quantifier lowering. In A. M. di Sciullo (Ed.), *Towards a biolinguistic understanding of grammar: Essays on interfaces* (pp. 13–30). Philadelphia: John Benjamins.

McCloskey, J. (2002). Resumption, successive cyclicity, and the locality of operations. In S. D. Epstein & D. Seely (Eds.), *Derivation and explanation* (pp. 184–226). Oxford: Blackwell.

Ross, J. R. (1967). On the cyclic nature of English pronominalization. In *To honor Roman Jakobson* (pp. 1669–1682). The Hague: Mouton.

Torrego, E. (1984). On inversion in Spanish and some of its effects. *Linguistic Inquiry*, *15*, 103–129.

Uriagereka, J. (1999). Multiple spell-out. In S. D. Epstein & N. Hornstein (Eds.), *Working minimalism* (pp. 251–282). Cambridge, MA: MIT Press.

van Urk, C., & Richards, N. (2015). Two components of long-distance extraction: Successive cyclicity in Dinka. *Linguistic Inquiry*, *46*, 113–155.

## References

Bach, E. (1974). *Syntactic theory*. New York: Holt, Rinehart & Winston.Find this resource:

Barss, A. (1986). *Chains and anaphoric dependence*. Doctoral dissertation, MIT, Cambridge, MA.Find this resource:

Bobaljik, J. (1995). *Morphosyntax: The syntax of verbal inflection*. Doctoral dissertation, MIT, Cambridge, MA.Find this resource:

Boeckx, C., Hornstein, N., & Nunes, J. (2010). *Control as movement*. Cambridge, U.K.: Cambridge University Press.Find this resource:

Boskovic, Z., & Lasnik, H. (1999). How strict is the cycle? *Linguistic Inquiry*, *30*, 691–703.Find this resource:

Bresnan, J. (1970). An argument against pronominalization. *Linguistic Inquiry*, *1*, 122–124.

Bresnan, J. (1971). Sentence stress and syntactic transformations. *Language*, *47*, 257–281.

Chomsky, N. (1965). *Aspects of the theory of syntax*. Cambridge, MA: MIT Press.

Chomsky, N. (1966). *Topics in the theory of generative grammar*. The Hague: Mouton.

Chomsky, N. (1970). Remarks on nominalization. In R. Jacobs & P. S. Rosenbaum (Eds.), *Readings in English transformational grammar* (pp. 184–221). Waltham, MA: Ginn.

Chomsky, N. (1973). Conditions on transformations. In S. Anderson & P. Kiparsky (Eds.), *A festschrift for Morris Halle* (pp. 232–286). New York: Holt, Rinehart & Winston.

Chomsky, N. (1975). *The logical structure of linguistic theory*. New York: Plenum.

Chomsky, N. (1976). Conditions on rules of grammar. *Linguistic Analysis*, *2*, 303–351.

Chomsky, N. (1977). On *wh*-movement. In P. Culicover, T. Wasow, & A. Akmajian (Eds.), *Formal syntax* (pp. 71–132). New York: Academic Press.

Chomsky, N. (1980). On binding. *Linguistic Inquiry*, *11*, 1–46.

Chomsky, N. (1981). *Lectures on government and binding*. Dordrecht, Netherlands: Foris.

Chomsky, N. (1986). *Barriers*. Cambridge, MA: MIT Press.

Chomsky, N. (1993). A minimalist program for linguistic theory. In K. Hale & S. J. Keyser (Eds.), *The view from Building 20: Essays in linguistics in honor of Sylvain Bromberger* (pp. 1–52). Cambridge, MA: MIT Press.

Chomsky, N. (1995a). Bare phrase structure. In H. Campos & P. Kempchinsky (Eds.), *Evolution and revolution in linguistic theory: Studies in honor of Carlos P. Otero* (pp. 51–109). Washington, DC: Georgetown University Press.

Chomsky, N. (1995b). *The minimalist program*. Cambridge, MA: MIT Press.

Chomsky, N. (2000). Minimalist inquiries: The framework. In R. Martin, D. Michaels, & J. Uriagereka (Eds.), *Step by step: Essays on minimalist syntax in honor of Howard Lasnik* (pp. 89–155). Cambridge, MA: MIT Press.

Chomsky, N. (2001). Derivation by phase. In M. Kenstowicz (Ed.), *Ken Hale: A life in language* (pp. 1–52). Cambridge, MA: MIT Press.

Chomsky, N. (2004). Beyond explanatory adequacy. In A. Belletti (Ed.), *Structures and beyond: The cartography of syntactic structures* (Vol. 3, pp. 104–131). Oxford: Oxford University Press.

Chomsky, N. (2005). Three factors in language design. *Linguistic Inquiry*, *36*, 1–22.

Chomsky, N. (2007). Approaching UG from below. In U. Sauerland & H.-M. Gärtner (Eds.), *Interfaces + recursion = language? Chomsky’s minimalism and the view from syntax-semantics* (pp. 1–18). Berlin: Mouton de Gruyter.

Chomsky, N. (2008). On phases. In R. Freidin, C. Otero, & M.-L. Zubizarreta (Eds.), *Foundational issues in linguistic theory* (pp. 133–166). Cambridge, MA: MIT Press.

Chomsky, N. (2015). Problems of projection: Extensions. In E. Di Domenico, C. Hamann, & S. Matteini (Eds.), *Structures, strategies and beyond: Studies in honour of Adriana Belletti* (pp. 1–16). Amsterdam: John Benjamins.

Chomsky, N., & Halle, M. (1968). *The sound pattern of English*. New York: Harper and Row.

Chomsky, N., Halle, M., & Lukoff, F. (1956). On accent and juncture in English. In M. Halle, H. Lunt, & H. MacLean (Eds.), *For Roman Jakobson* (pp. 65–80). The Hague: Mouton.

Chomsky, N., & Lasnik, H. (1977). Filters and control. *Linguistic Inquiry*, *8*, 425–504.

Chomsky, N., & Lasnik, H. (1993). The theory of principles and parameters. In J. Jacobs, A. von Stechow, W. Sternefeld, & T. Vennemann (Eds.), *Syntax: An international handbook of contemporary research* (Vol. 1, pp. 506–569). Berlin: Walter de Gruyter.

Citko, B. (2014). *Phase theory*. Cambridge, U.K.: Cambridge University Press.

Collins, C. (1995). Toward a theory of optimal derivations. In R. Pensalfini & H. Ura (Eds.), *Papers on minimalist syntax* (Vol. 27, pp. 65–103). Cambridge, MA: MIT Working Papers in Linguistics.

Collins, C. (1997). *Local economy*. Cambridge, MA: MIT Press.

Emonds, J. (1970). *Root and structure preserving transformations*. Doctoral dissertation, MIT, Cambridge, MA.

Emonds, J. (1972). A reformulation of certain syntactic transformations. In S. Peters (Ed.), *Goals of linguistic theory* (pp. 21–62). Englewood Cliffs, NJ: Prentice-Hall.

Epstein, S. D., Groat, E., Kawashima, R., & Kitahara, H. (1998). *A derivational approach to syntactic relations*. New York: Oxford University Press.

Fillmore, C. J. (1963). The position of embedding transformations in a grammar. *Word*, *19*, 208–231.

Fox, D., & Pesetsky, D. (2005). Cyclic linearization of syntactic structure. *Theoretical Linguistics*, *31*(1–2), 1–46.

Freidin, R. (1978). Cyclicity and the theory of grammar. *Linguistic Inquiry*, *9*, 519–549.

Freidin, R. (1999). Cyclicity and minimalism. In S. D. Epstein & N. Hornstein (Eds.), *Working minimalism* (pp. 95–126). Cambridge, MA: MIT Press.

Freidin, R. (2012a). A brief history of generative grammar. In G. Russell & D. Graff-Fara (Eds.), *The Routledge companion to the philosophy of language* (pp. 895–916). New York: Routledge.

Freidin, R. (2012b). *Syntax: Basic concepts and applications*. Cambridge, U.K.: Cambridge University Press.

Freidin, R. (2013). Chomsky's contribution to linguistics: A sketch. In K. Allen (Ed.), *The Oxford handbook of the history of linguistics* (pp. 439–467). Oxford: Oxford University Press.

Freidin, R. (2016). Chomsky’s linguistics: The goals of the generative enterprise. *Language*, *92*, 671–723.

Freidin, R., & Lasnik, H. (2011). Some roots of minimalism in generative grammar. In C. Boeckx (Ed.), *The Oxford handbook of linguistic minimalism* (pp. 1–26). New York: Oxford University Press.

Grinder, J. (1972). On the cycle in syntax. In J. Kimball (Ed.), *Syntax and semantics* (Vol. 1, pp. 81–111). New York: Seminar Press.

Halle, M., & Chomsky, N. (1960). The morphophonemics of English. *MIT RLE Quarterly Progress Report*, *58*, 275–281.

Hornstein, N. (1999). Movement and control. *Linguistic Inquiry*, *30*, 69–96.

Huang, C.-T. J. (1982). *Logical relations in Chinese and the theory of grammar*. Doctoral dissertation, MIT, Cambridge, MA.

Jackendoff, R. (1972). *Semantic interpretation in generative grammar*. Cambridge, MA: MIT Press.

Katz, J. J., & Postal, P. (1964). *An integrated theory of linguistic descriptions*. Cambridge, MA: MIT Press.

Kawashima, R., & Kitahara, H. (1996). Strict cyclicity, linear ordering, and derivational c-command. In J. Camacho, L. Choueiri, & M. Watanabe (Eds.), *Proceedings of the Fourteenth West Coast Conference on Formal Linguistics* (pp. 255–269). Stanford, CA: CSLI Publications.

Kayne, R. (1994). *The antisymmetry of syntax*. Cambridge, MA: MIT Press.

Kayne, R., & Pollock, J.-Y. (1978). Stylistic inversion, successive cyclicity, and Move NP in French. *Linguistic Inquiry*, *9*, 595–621.

Kimball, J. (1972). Cyclic and linear grammars. In J. Kimball (Ed.), *Syntax and semantics* (Vol. 1, pp. 63–80). New York: Seminar Press.

Kitahara, H. (1997). *Elementary operations and optimal derivations*. Cambridge, MA: MIT Press.

Lasnik, H. (1995). Verbal morphology: *Syntactic Structures* meets the Minimalist Program. In H. Campos & P. Kempchinsky (Eds.), *Evolution and revolution in linguistic theory: Essays in honor of Carlos Otero* (pp. 251–275). Washington, DC: Georgetown University Press. Reprinted in H. Lasnik (1999), *Minimalist analysis* (pp. 97–119), Malden, MA: Blackwell.

Lasnik, H. (2006). Conceptions of the cycle. In L. Cheng & N. Corver (Eds.), *Wh-movement: Moving on* (pp. 197–216). Cambridge, MA: MIT Press.

Lasnik, H. (2012). Single cycle syntax and a constraint on quantifier lowering. In A. M. di Sciullo (Ed.), *Towards a biolinguistic understanding of grammar: Essays on interfaces* (pp. 13–30). Philadelphia: John Benjamins.

Lasnik, H. (2015). Aspects of the theory of phrase structure. In Á. J. Gallego & D. Ott (Eds.), *50 years later: Reflections on Chomsky’s Aspects* (pp. 169–174). MIT Working Papers in Linguistics.

Lasnik, H., & Saito, M. (1984). On the nature of proper government. *Linguistic Inquiry*, *15*, 235–289.

Lasnik, H., & Uriagereka, J. (2012). Structure. In R. Kempson, T. Fernando, & N. Asher (Eds.), *Philosophy of linguistics* (pp. 33–61). Amsterdam: Elsevier.

May, R. (1985). *Logical form: Its structure and derivation*. Cambridge, MA: MIT Press.

McCloskey, J. (2001). On the morphosyntax of wh-movement in Irish. *Journal of Linguistics*, *37*, 67–100.

McCloskey, J. (2002). Resumption, successive cyclicity, and the locality of operations. In S. D. Epstein & D. Seely (Eds.), *Derivation and explanation* (pp. 184–226). Oxford: Blackwell.

Nunes, J. (2004). *Linearization of chains and sideward movement*. Cambridge, MA: MIT Press.

Postal, P. M. (1970). On coreferential complement subject deletion. *Linguistic Inquiry*, *1*, 439–500.

Quicoli, A. C. (2008). Anaphora by phase. *Syntax*, *11*(3), 299–329.

Richards, N. (1999). Featural cyclicity and the ordering of multiple specifiers. In S. D. Epstein & N. Hornstein (Eds.), *Working minimalism* (pp. 127–158). Cambridge, MA: MIT Press.

Richards, N. (2001). *Movement in language*. Oxford: Oxford University Press.

Rizzi, L. (1980). Violations of the *wh*-island constraint and the subjacency condition. *Journal of Italian Linguistics*, *5*, 157–195.

Rizzi, L. (1990). *Relativized minimality*. Cambridge, MA: MIT Press.

Ross, J. R. (1967). On the cyclic nature of English pronominalization. In *To honor Roman Jakobson* (pp. 1669–1682). The Hague: Mouton.

Sportiche, D. (1981). Bounding nodes in French. *The Linguistic Review*, *1*, 219–246.

Torrego, E. (1983). More effects of successive cyclic movement. *Linguistic Inquiry*, *14*, 561–565.

Torrego, E. (1984). On inversion in Spanish and some of its effects. *Linguistic Inquiry*, *15*, 103–129.

Uriagereka, J. (1999). Multiple spell-out. In S. D. Epstein & N. Hornstein (Eds.), *Working minimalism* (pp. 251–282). Cambridge, MA: MIT Press.

van Urk, C., & Richards, N. (2015). Two components of long-distance extraction: Successive cyclicity in Dinka. *Linguistic Inquiry*, *46*, 113–155.

Watanabe, A. (1992). Subjacency and S-structure movement of *wh*-in-situ. *Journal of East Asian Linguistics*, *1*, 255–291.

Williams, E. S. (1974). *Rule ordering in syntax*. Doctoral dissertation, MIT, Cambridge, MA.

## Notes

(1.) The first mention of a syntactic cycle applies the concept to the phrase structure rules of the base. Chomsky (1965) proposes that given phrase structure rules that allow clausal recursion, the rules of the base “apply cyclically, preserving their linear order” (p. 134). Chomsky (1966) describes the construction of generalized phrase-markers as applying “the linearly ordered system of base rewriting rules in a cyclic fashion, returning to the beginning of the sequence each time we come upon a new occurrence of S introduced by a rewriting rule” (p. 62). Note however that this “cycle” applies top-down from larger constituents to smaller constituents contained in them and thus does not share the crucial bottom-up property of the transformational cycle in syntax. Instead, it shares the property of reiterated application for a linear sequence of grammatical rules.

(2.) These trees are slightly modified versions of the originals. In both (2) and (4) the NP object of passive preposition *by* is given as *passive* rather than Δ. The infinitival complement of *persuade* in (3) is given as a PP headed by *of* whose NP complement consists of an empty N (N dominating Δ) and S′, and the Aux in (4) dominates an element “nom” in the original.

(3.) The T-marker in Chomsky (1965) includes a transformation T_{*to*} that follows T_{D} and replaces a string “*of* Δ nom” with infinitival *to*. Putting aside the ad hoc character of the replacement, this could not be the general source for infinitival *to*.

(4.) Lasnik (2015) offers as a reason to eliminate T-markers that a phrase structure component allowing clausal recursion subsumes their major purpose: showing how phrase markers that are generated independently are combined into a single syntactic object. Note that this might also motivate eliminating T-markers even with generalized transformations, given that the syntactic object derived under their application also shows how the independently generated phrase markers are combined. Note also that T-markers specify the singulary transformations that apply in a derivation and the order of their application. It is far from clear that this information contributes to syntactic representation beyond the syntactic object derived from the application of these singulary transformations. For further important discussion of the interplay between phrase structure rules and generalized transformations, see Lasnik (2015).

(5.) Extending this strategy of simplifying the grammar by eliminating generalized transformations to the case of coordinate constructions is anything but straightforward—see Lasnik and Uriagereka (2012). In Chomsky (1993) a generalized transformation GT is characterized as a “binary substitution operation” involving two independent syntactic objects in contrast to “the singulary substitution operation Move α” (p. 189).

(6.) Chomsky (1966) discusses a different example (i):

((i))

where it is assumed that a relative clause transformation must apply to the phrase marker of *the man quit work*, changing it to *who quit work*, before it can be embedded in the phrase marker of *someone fired the man*. Such a derivation is clearly problematic given that the replacement of *the man* with the relative pronoun *who* is surely dependent on the existence of *the man* in the matrix clause. Moreover, if the relative pronoun is lexically inserted as part of the base phrase marker (i.e., as *someone fired who*) and relative clause formation is thus reduced to a form of *wh*-movement (Chomsky, 1977) where the relative pronoun is lexically inserted and not transformationally derived, then ordering T_{E} before or after *wh*-movement would yield the same result.

(7.) Note that Chomsky (1965) does not cite A>B>A derivations as a motivation for the cycle, nor has Chomsky cited them in any other publication. Given the evolution of the theory of grammatical operations (see Freidin (2012a, 2013) for details), such derivations cannot be formulated—see below for details.

(8.) The same analysis holds for regular pronouns, which eliminates the argument for the cycle in Ross (1967) based on the existence of a pronominalization transformation that converts a non-pronominal NP into a corresponding pronoun in the presence of a nonpronominal antecedent.

(9.) The SCC could also block the unwanted application of a reflexivization transformation as in (11a) and (12a) and therefore subsume the effects of the Insertion Prohibition, but would in addition block the legitimate *John expects himself to succeed* under the assumption that the reflexive pronoun remains in the subject position of the infinitival complement clause.

(10.) The structure Chomsky cites (see his (74)) involves a pair of *wh*-phrases in the predicate of an infinitival clause complement with an underlying phrase structure (i), which is modified from the original to reflect current analyses of clause structure.

((i))

(i) could yield two deviant direct questions: *\*what books does John know to whom to give?* and *\*to whom does John know what books to give?* (Chomsky’s (63) and (64) respectively). Because one *wh*-phrase is an NP and the other a PP, the derivation that Chomsky suggests would be prohibited by the SCC could also be excluded by the nondistinctness condition on substitutions. See below for further discussion.

(11.) Note that in the first indirect question the *wh*-phrase object moves over the *wh*-phrase subject in violation of the Superiority Condition (73) in Chomsky (1973), which is claimed to provide “a stricter interpretation of the notion of the cycle” (p. 243).

(12.) Note that this interpretation has never been endorsed by Chomsky, nor has it been explicitly rejected (but cf. Lasnik & Saito, 1984, which provides a technical argument for the derivational interpretation based on an assumption that covert *wh*-traces can be optional, either not generated or deleted).

(13.) Chomsky (1993) gives (24b) as C followed by the VP *fix the car* without the subject *John*. Given the Extension Condition, this must be an unintended omission.

(14.) This suggests that head movement is not part of narrow syntax. This possibility is discussed in Chomsky (2001, pp. 37–38). See below for further discussion and also chapter 7 of Freidin (2012b) for some justification and a detailed proposal along these lines.

(15.) The validity of the argument uniqueness part of the θ-Criterion has been denied under the movement theory of control (see Hornstein, 1999, and more recently Boeckx, Hornstein, & Nunes, 2010). However, see Freidin (2012b, Chapter 6), where the uniqueness requirement is motivated by an analysis of nominalization constructions that are not susceptible to the movement theory argument against it.

(16.) See Chomsky (2008) on “the device of inheritance,” which is claimed to be “a narrow violation of NTC” that could still be in accord with the Strong Minimalist Thesis. See Freidin (2016) for arguments against incorporating this device. The operation of “tucking in” (Richards, 2001), according to Chomsky (2008) a literal but principled violation of NTC, appears to be unnecessary in the framework developed in Chomsky (2015).

(17.) This is proposed at the conclusion of Freidin (1999), but without the supporting details of analysis.

(18.) Later versions include Relativized Minimality (Rizzi, 1990) and the Minimal Link Condition (Chomsky, 1995b).

(19.) See, however, Rizzi (1980), Sportiche (1981), and Torrego (1983, 1984) for evidence that locality in the Romance languages differs from English and therefore requires a parametric approach to Subjacency or whatever constraint replaces it.

(20.) (38) differs from Quicoli’s example in using *which rumors about himself* instead of *which pictures of himself* simply because some of the judgments about related examples that are cited below seem clearer. Similar examples are first mentioned in Barss (1986), which states “successive cyclic movement proliferates the possibilities for grammatically assigning an antecedent to the reflexive beyond what exists in the pre-movement structure” (p. 25). Unlike the analysis in Quicoli (2008), Barss’s analysis does not require the reflexive to occur in the intermediate covert Spec-CP position, and Barss also argues against the successive cyclic application of Condition A of the binding theory (see p. 122). Instead, the reflexive is bound in its overt position. Nonetheless, the binding of the reflexive depends on the occurrence of a *wh*-phrase trace in the intermediate covert Spec-CP position and therefore depends on successive cyclic *wh*-movement.

(21.) See also Fox and Pesetsky (2005) for a phase-theoretic analysis that attempts to motivate successive cyclic movement on the basis of a theory of cyclic linearization.

(22.) The introduction of Bobaljik (1995) proposes that “there are no syntactic operations after Spell-Out” so that “[t]he morphology is fed by the final output of the syntax,” what is called “Single Output Syntax” (p. 23). This is clarified in chapter VI, which claims that both overt and covert movement operations form “a single cycle,” citing the analysis of *wh*-movement in Watanabe (1992). This proposal is a precursor to the single cycle in Chomsky (2000), which generalizes to “all operations” (but cf. the discussion of head movement, especially note 16).

(23.) This is essentially the model proposed in Chomsky and Lasnik (1977, p. 431) with two crucial differences. First, instead of the operation Merge, the Chomsky and Lasnik model lists base rules (rewrite rules for phrase structure) and transformations, which apply to the output of the base rules, as the operations of the first part of the derivation. The second difference concerns the interpretation of the point at which the derivation splits. In the Chomsky and Lasnik model this point constitutes a level of syntactic representation, S-Structure, defined by specific well-formedness conditions, whereas Spell-Out (see the original formulation in Chomsky, 1993) is an operation “which switches to the PF component” at a point in the derivation that may vary from language to language (p. 189).

(24.) Lasnik (2012) constructs an intriguing account of the impossibility of quantifier lowering based on this analysis.

(25.) If LF identifies the representation that interfaces with C-I, then SEM and LF are identical.