Constraining Free Merge
Jason Ginsburg 1
[1] Graduate School of Human and Environmental Studies, Kyoto University, Kyoto, Japan.
Corresponding Author: Jason Ginsburg, Graduate School of Human and Environmental Studies, Yoshida-
nihonmatsu-cho, Sakyo-ku, Kyoto 606-8501, Japan. E-mail: ginsburg.jasonrobert.2h@kyoto-u.ac.jp
Abstract
Some recent influential work in the Minimalist Program takes the position that Merge, the core
language ability to recursively combine two elements together, is free. However, if Merge were
completely free, there would be an infinite number of possible derivations for every utterance.
Thus, Merge must be constrained in some way. In this paper, I describe a computer model of
language that implements a limited form of Merge that is free. I attempt to demonstrate that,
within the confines of the language module, Labeling is generally sufficient to constrain Free
Merge, and I discuss issues that arise regarding overgeneration of syntactic structures given Free
Merge.
Keywords
Merge, Labeling, Box Theory, FormCopy, computer modeling
1 Introduction
If recent work in linguistics is correct, then Merge, the process of combining together
linguistic objects, is a core property of language that is utilized by the language faculty
to construct syntactic objects (SOs). Chomsky (2010, p. 52) writes that “unbounded Merge
is the sole recursive operation within UG” and that it is “part of the genetic component
of the language faculty.” If this is correct, human language makes use of recursive Merge.
Berwick (2011) suggests that non-human primates have lexical items but no Merge,
whereas birds have something like Merge (used in songs) but no lexical items. Human
language, crucially, makes use of lexical items and Merge.
Chomsky (2001, 2013, 2015) takes the position that Merge is free. Chomsky (2015,
p. 14) writes that “[t]he simplest conclusion … would be that Merge applies freely” and
“[o]perations can be free, with the outcome evaluated at the phase level for transfer
and interpretation at the interfaces.” I take this to mean that both External Merge and
Internal Merge are free. Crucially, Free (Internal) Merge would result in an infinite
number of possible structures generated for every possible utterance. This is untenable.
Thus, Free Merge must be constrained by the language faculty.
This is an open access article distributed under the terms of the Creative Commons
Attribution 4.0 International License, CC BY 4.0, which permits unrestricted use,
distribution, and reproduction, provided the original work is properly cited.
The question then arises of how Free Merge is constrained. In this paper, I demonstrate
how, given a Merge-based model of language generation (based on recent work
in linguistic theory), Merge can be constrained. Crucially, I argue that arguments can
be subject to Free Merge, subject to the constraints of the language module, but that
Labeling in general is sufficient to eliminate most impossible derivations.
In the following sections, I discuss my core assumptions regarding syntactic structure,
which I implemented in a computer model that automatically generates sentences.
Notably, in this model, I attempt to remove many of the basic problematic and overly
complex assumptions in recent work in the Minimalist Program (Chomsky, 1995)
with the goal of keeping language simple, in accord with the Strong Minimalist Thesis
(Chomsky, 2000, 2001, 2010; Chomsky et al., 2023), the notion that “language keeps to
the simplest recursive operation, Merge, and is perfectly designed to satisfy interface
conditions” (Chomsky, 2010, p. 52). Then I explain how Labeling is generally sufficient to
constrain Free Merge. This is followed by discussion of issues that arise with respect to
overgeneration.
1) This computer model is simply an attempt to model syntactic theory in accord with recent work in the Minimalist
Program. This model, as well as related models presented in Ginsburg (2016), Fong and Ginsburg (2019) and Ginsburg
and Fong (2019), has no relation to the Minimalist Grammar formalisms of Stabler (1997, 2011) and related work.
Biolinguistics
2024, Vol. 18, Article e14015
https://doi.org/10.5964/bioling.14015
each Merge step, the model checks for the possibility of agreement relations and for the
possibility of Labeling (see Section 3). When a derivation is complete, it is transferred to
Spell-Out, where a particular pronunciation is determined. For any particular example,
there can be multiple successful derivations (that converge), as well as multiple crashed
derivations.
Figure 1
Main Components of the Computer Model
In this paper, I utilized this computer model to test the theory that is developed in the
following sections. The model produces complete step-by-step derivations for all target
constructions, thus making it possible to find problems with, and verify the accuracy of,
the target theory. The complete derivations for all target constructions presented in this
paper can be found in the Supplementary Appendix (see Ginsburg, 2024),2 and these can
be of use to researchers who are interested in verifying the proposals in this paper. The
main focus of this paper is on linguistic theory. I used this model to test the accuracy of
the theory that I develop in this paper.
2) More details about how the model works are given in the Appendix (see Ginsburg, 2024). An in-depth description
of how the model works is beyond the scope of this paper.
3) An anonymous reviewer writes that in Chomsky’s approach, “there is no labelling notion at all, and all there is the
necessity for Minimal Search to univocally find a feature for each SO.”
Also, given a {T, YP} structure, Labeling is not possible, assuming that T is too weak to
label. These are summarized in (1). Furthermore, consider a root that has Merged with a
functional head (e.g., a categorizer) to form what is essentially a head-head structure; for
example, the root walk Merges with a categorizer n. In this case, the root is too weak to
label by itself, but the functional head can label. This is the position that Chomsky (2013,
p. 47) takes, following Marantz (1997), Embick and Marantz (2008), and Borer (2005a,
2005b, 2013). I assume that a root can label after it Merges with a categorizer.
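The Labeling assumptions just described can be illustrated with a minimal Python sketch. This is not the code of the computer model discussed in this paper; the class SO, its fields, and the function label_of are hypothetical names introduced only for illustration. The sketch assumes that a head labels a {H, XP} structure (Chomsky, 2013), that roots and T are too weak to label on their own, and that an {XP, YP} structure is labeled only when both members bear phi-features that can be shared.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SO:
    """A syntactic object, reduced to what Labeling needs (hypothetical)."""
    label: Optional[str]      # None means "no label"
    is_head: bool = False     # True for lexical items, False for phrases
    weak: bool = False        # assumption: roots and T cannot label alone
    phi: bool = False         # assumption: bears phi-features that can be shared


def label_of(a: SO, b: SO) -> Optional[str]:
    """Return the label of {a, b}, or None if Labeling fails."""
    if a.is_head and b.is_head:
        # Head-head (e.g., root + categorizer): only a non-weak head labels.
        strong = [x for x in (a, b) if not x.weak]
        return strong[0].label if len(strong) == 1 else None
    if a.is_head or b.is_head:
        # {H, XP}: the head labels unless it is too weak (e.g., T).
        head = a if a.is_head else b
        return None if head.weak else head.label
    # {XP, YP}: labeled only by shared phi-features.
    return "<phi,phi>" if a.phi and b.phi else None


root_books = SO("books", is_head=True, weak=True)
n = SO("n", is_head=True)
T = SO("T", is_head=True, weak=True, phi=True)
vP = SO("v*")
NP = SO("n", phi=True)
TP_inner = SO(None, phi=True)   # {T, vP}, unlabeled until the subject Merges

print(label_of(root_books, n))   # n
print(label_of(T, vP))           # None
print(label_of(NP, TP_inner))    # <phi,phi>
```

In this sketch a failed Labeling attempt simply returns None; how such a failure leads to a crashed derivation is left open.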
The convergent derivation of Tom read books,4 following Chomsky (2013, 2015), proceeds
as shown in Figure 2. The lines below each terminal node represent the frontier of the
derivation—the portion of the derivation that is sent to Spell-Out to be pronounced. The
root books set-Merges with the functional categorizer n, and n labels, as the root books
is not capable of Labeling. Chomsky (2015) claims that a verbal root undergoes internal
pair-Merge (head movement) with v to form <v*, read>, resulting in v* being dephased
(also see Epstein et al., 2016). The <v*, read> pair-Merged structure is represented with
a dotted arc. Dephasing is a process in which an element that would typically function
as a phase head, thus being a point of transfer, no longer functions as a phase head.5 In
this case, Chomsky proposes that phasehood is passed onto the complement of v*. Thus,
the complement of v* will function as a phase and be transferred. A phase head passes
its uninterpretable features (uFs) to its complement, so the uPhi (uninterpretable Phi) of
v* are passed onto the verbal
root read, which, being a root, is unable to label by itself. The object a book undergoes IM
(Internal Merge) with read to form an {XP, YP} structure. In the matrix clause, the subject
is initially-Merged with vP, and then it internally-Merges with the TP. The uPhi of C are
inherited by T. Minimal Search results in phi-feature agreement in the {NP, {Tpast…}} and
{NP, {read…}} structures, and these shared phi-features are able to label, where the label is
indicated as <ɸ, ɸ>.
Figure 2
In the following subsections, I explain my assumptions about Labeling Theory. Note that,
for reasons discussed in the following sections, I do away with some of the operations
utilized in the type of derivation shown in Figure 2.
3.2 Phases
Labeling Theory follows the view that the structures of sentences are constructed
hierarchically in a bottom-up fashion, and sentences consist of phases, which are portions
of sentences that essentially become inaccessible after construction. The core phases are
generally assumed to be a transitive Verb Phrase (v*P) and a Complementizer Phrase
(CP), following Chomsky (2000, 2001). Both (2)a–b are well-formed and both crucially are
formed from the same set of lexical items. These examples differ, however, with respect
to the ordering of lexical items. The embedded CP in (2)a is a phase that is constructed
from a lexical array that does not contain there. As a result, a man raises to subject
position of the CP. The expletive there is associated with the higher phase of the matrix
clause. In (2)b, on the other hand, the expletive there is available in the embedded CP
phase. As a result, there is inserted in subject position of the CP and a man does not need
to move.
Once a phase is complete, the complement of the phase head becomes inaccessible to further
operations, which is proposed to reduce memory burden: the mind can essentially
put a completed phase to the side and compute the next phase. Note that when a phase
head is Merged, there are differing views about which portions of the phase become
inaccessible in accord with the Phase Impenetrability Condition (Chomsky, 2000, 2001;
Müller, 2004; Richards, 2011). Under one version of the Phase Impenetrability Condition,
the complement of the phase head becomes inaccessible and is transferred as soon as the
phase is complete, but in another version, the complement (if present) of the lower phase
head becomes inaccessible and is transferred only once the next higher phase head is
Merged. As noted by Boeckx and Grohmann (2007, p. 206),6 referring to Chomsky
(2000), “[c]omputation cost reduction is the prime conceptual advantage and motivation
for phases.” This means that all feature checking operations within the phase must be
complete, and any elements that need to move out of the phase must have moved to the
edge of the phase before completion. Since phases are thought to be complete (in some
sense), they ideally should be of some advantage when accounting for island effects,
although whether or not this is the case is open to debate (e.g., see Chomsky, 2008;
Gallego, 2010). I incorporate the notion of phases into this model, since they are utilized
in Labeling theory. I assume that the phases, following Chomsky (2001), are transitive
VP (v*P) and CP. Note that, essentially following Chomsky (2021), I will assume that
when v* or C is Merged, the v*P/CP is transferred. Thus, the head of the phase is
transferred together with its complement but the specifier, if present, remains outside of
the transferred phase.
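The transfer assumption adopted here can be sketched as follows. The names Phrase and transfer are hypothetical, and the flat specifier/head/complement representation is a simplification for illustration only, not the data structure of the model itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

PHASE_HEADS = {"v*", "C"}   # phases assumed here: transitive v*P and CP


@dataclass
class Phrase:
    head: str
    complement: Optional[str] = None
    specifier: Optional[str] = None


def transfer(phrase: Phrase) -> Tuple[List[str], List[str]]:
    """Split a phrase into (still accessible, transferred) parts."""
    parts = [phrase.specifier, phrase.head, phrase.complement]
    if phrase.head not in PHASE_HEADS:
        # Not a phase: nothing is transferred yet.
        return [p for p in parts if p is not None], []
    # Phase: the head and its complement are transferred together;
    # a specifier, if present, remains accessible.
    accessible = [phrase.specifier] if phrase.specifier is not None else []
    transferred = [p for p in (phrase.head, phrase.complement) if p is not None]
    return accessible, transferred


print(transfer(Phrase("v*", complement="{read, books}", specifier="Tom")))
print(transfer(Phrase("T", complement="v*P")))
```

Under this assumption, only material in the specifier of a phase remains available to operations in the next phase.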
6) See Boeckx and Grohmann (2007) for discussion of problems with the notion of phases.
Feature inheritance can be useful for accounting for Exceptional Case Marking (ECM)
constructions. In the ECM (3)a, the embedded T, pronounced as to, occurs without C.
In this case, T lacks agreement features and him Agrees with the matrix verb expect.
In (3)b, on the other hand, T, pronounced as past tense on the verb win (resulting in
won), occurs with C. T has agreement features and Agrees with the subject, resulting in
the nominative pronoun he. These types of simple examples demonstrate how T, in the
presence of C, has agreement features, which it lacks in the absence of C. While feature
inheritance is useful for accounting for the ECM data, it isn’t necessarily clear if it is
required. If non-finite T simply lacks a full set of unchecked/uninterpretable phi-features
(uPhi), and tensed T has a full set of uPhi, the same facts can be accounted for, without
recourse to feature inheritance.
Although the existence of complementizer agreement as in (4) has been given as evidence
for feature inheritance (Chomsky, 2008; Miyagawa, 2005), complementizer agreement
tends to be less common and less complete than agreement with T. Matasović
(2018, p. 9) writes “that the most common agreement pattern within the domain of the
clause is verbal agreement.” Koppen (2017, p. 7) writes “The CA [Complementizer Agreement]
paradigm is usually defective, however, in the sense that not all person/number
combinations of the subject lead to an overt agreement reflex on the complementizer.”
Koppen (2005, p. 35) points out this defectivity in a variety of Germanic/Dutch languag
Table 1
Agreement in Frisian
1 Per.Sg -0 -n
2 Per.Sg -st -st
3 Per.Sg -0 -t
1 Per.Pl -0 -e
2 Per.Pl -0 -e
3 Per.Pl -0 -e
Note. Koppen (2005, p. 35).
If it truly is the case that verbal agreement is more common than complementizer
agreement and that verbal agreement tends to be more complete than complementizer
agreement, then this may be an indication that a complementizer is not always the origin
of agreement features. If C were the locus of agreement features, then one might expect
agreement with C to be more common than it is, and for agreement with C to tend to be
more, not less, complete than agreement with T.
Feature inheritance also raises technical problems. If uPhi are inherited by T, one
possibility is that all of the uPhi of C are passed from C onto T and no longer remain
on C. This would be the case when agreement only shows up on T (usually visible on
the verb). This does not appear to be the case in the examples in (4) in which there is
agreement between a subject and both C and the verb (assuming the verbal agreement is
the result of agreement on T). Another possibility is that the uPhi of C are copied onto
T, so that they appear on both C and T. This could account for the data in (4). However,
copying of features from one element onto another seriously alters an already formed
SO, again violating the No-Tampering Condition. It would be simpler if C and T came
with their necessary agreement features.
Richards (2007) provides arguments for feature-inheritance, proposing that feature
transmission is a “conceptual necessity” in order to avoid transfer of uninterpretable
features. Uninterpretable features, by definition, cannot be processed by the semantic
component of a derivation. If uninterpretable features are checked but are not transferred
immediately, then they should stay around and cause a derivation to crash, according
to Richards, since they can’t be interpreted. Thus, uninterpretable features must be
transferred as soon as they are checked. The assumption seems to be that when checked,
uninterpretable features are transferred with the phase. They are “deleted” so that they
are no longer visible to the semantic component. Assume that T has uPhi that are
checked via Agree with a subject, before the phase head C is Merged. When these
features are checked, they cannot be transferred until after the phase head C is Merged.
Thus, these uPhi cannot be deleted as soon as they are checked. These checked uPhi,
according to Richards, then become indistinguishable from interpretable features and
interpretable phi-features on T, presumably, will cause a derivation to crash. The idea
seems to be that since T is not an argument, phi-features (which are associated with
arguments) cannot be interpreted on T. On the other hand, if uPhi are inherited from C
by T, then as soon as they are inherited, they are checked, and since the phase level has
been reached, the checked uPhi are instantly transferred, so that they no longer remain
for the semantic component. Assuming that uninterpretable features originate on a phase
head predicts phase-level operations of inheritance of features, Agree (e.g., checking of
uPhi on T by phi-features on a subject), and transfer of the relevant portion of the phase.
However, the idea that uninterpretable features need to be deleted as soon as they are
checked is not necessarily a given. Since these features are uninterpretable, by definition,
they could cause a derivation to crash if they are transferred, but as long as they are
deleted before transfer, it isn’t clear why they need to be deleted immediately – this
seems to be a stipulation. Furthermore, some recent work takes the position that the
complement of a phase head is not transferred immediately. Chomsky (2015) argues
that phasehood can be transferred to the complement of a phase head, based on ECM
constructions and the that-trace effect.7 As noted by Goto (2017), the motivation for
feature-inheritance based on the need to delete uninterpretable features as soon as they
are checked may not necessarily hold.
Another issue with feature inheritance involves probe-goal agreement. Since
Chomsky (2001), agreement has typically been assumed to involve a probe-goal relation.
For example, assume that T in (5) has uPhi that probe for and Agree with the phi-features
on a subject. Similarly, v* has uPhi that probe for and Agree with phi-features on an
object. The relations Agree(T[uPhi], Mary[iPhi]) and Agree(v*[uPhi], books[iPhi]) check the
uPhi on T and v* via probe-goal agreement.
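Probe-goal agreement of this kind can be sketched in a few lines of Python. The dictionary representation, the function agree, and the assumption that the search domain is given as a list ordered from the closest to the most distant element are all simplifications introduced here for illustration. The particular phi-feature values are likewise illustrative.

```python
from typing import Dict, List, Optional


def agree(probe: Dict, domain: List[Dict]) -> Optional[str]:
    """Value the probe's uPhi from the closest goal bearing iPhi.

    `domain` lists the elements the probe c-commands, closest first.
    Returns the name of the goal, or None if no goal is found.
    """
    for goal in domain:
        if goal.get("iPhi") is not None:
            probe["uPhi"] = dict(goal["iPhi"])   # uPhi are now checked
            return goal["name"]
    return None


mary = {"name": "Mary", "iPhi": {"person": "3", "number": "sg"}}
books = {"name": "books", "iPhi": {"person": "3", "number": "pl"}}
T = {"name": "T", "uPhi": None}
v = {"name": "v*", "uPhi": None}

print(agree(v, [books]))            # books
print(agree(T, [mary, v, books]))   # Mary
print(T["uPhi"]["number"], v["uPhi"]["number"])   # sg pl
```

Because the search stops at the first element with interpretable phi-features, T Agrees with the subject rather than with the more distant object.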
7) Note that the complex operation of phasehood transfer is also suspect from the perspective of Minimalism. I do not
adopt this operation in my model.
Now consider how probe-goal agreement of this sort works given feature-inheritance. If
T must inherit its uPhi from C, then the uPhi on T cannot probe until after C is Merged.
Thus, probing is counter-cyclic, not from a root node, which is contrary to the original
notion of probe-goal in which probing occurs from the root node (Richards, 2006).
Another problem, pointed out by Epstein, Kitahara, and Seely (2022), hereafter EKS
(2022), is that given feature inheritance, there are cases in which agreement must occur
with a goal that is no longer visible. Assuming feature inheritance, in (6), the uPhi
features on T are inherited from C. Thus, T does not obtain its uPhi features until after
C is Merged, and also after the subject has internally Merged with the TP (assuming that
a subject raises to the specifier of TP). Then, following Chomsky’s (2013) view that only
the highest copy of a syntactic object (SO) is visible to probing, the lower copy of the
subject is not visible to probing. This means that the probe cannot find the subject in
its base position. The higher copy of the subject is in the specifier of the TP, so that the
past tense T does not c-command it. See Figure 3. EKS (2022) propose a solution based
on Minimal Search (Agreement occurs between T and the subject in the TP). However,
none of this is necessary if there is no feature inheritance. If T simply comes with its
relevant set of uPhi features, then it can probe as soon as it is Merged. There is no need
for counter-cyclic Agree relations, and the problem of agreement with an invisible copy
of an SO does not arise.
Figure 3
The facts regarding feature inheritance are far from settled, but I will assume that from
the perspective of the Strong Minimalist Thesis, it is best to do without it.8 Feature
inheritance is best eliminated from the current theory from the perspective of simplicity;
8) Goto (2017) and Gallego (2017) both attempt to do away with feature inheritance. These works, however, make use
of head movement, an operation that is potentially problematic. If there is no head movement of the sort that they
propose, then these approaches potentially do not work. See Section 3.5 for discussion of head movement.
These Case facts can be accounted for if Case is primarily a Spell-Out phenomenon.
Marantz (2000, p. 20) argues that “case and agreement morphemes are inserted only
after SS [Sentence Structure] at a level we could call “MS” or morphological structure.”
Bobaljik (2008), following Marantz, writes that “the proper place of the rules of m-case
[morphological-case] assignment is thus the Morphological component, a part of the
PF interpretation of syntactic structure” (Bobaljik, 2008, p. 300). Chomsky (2021, p. 23)
suggests that “Case is part of externalization” further writing that “there seems to be no
general semantic reason” for Case systems and “[p]erhaps establishing relations among
elements facilitates perception/parsing.”
In my model, I take a Spell-Out-based approach to Case. I assume that Case appears
at Spell-Out, following Chomsky’s view that Case is a reflex of phi-feature agreement
(Chomsky, 2000, 2001). This approach can account, at least to a certain extent, for
some of the language-internal and cross-linguistic idiosyncrasies that occur with Case. I
assume that unchecked phi-features, uPhi, must be checked for a derivation to converge.
The result of phi-feature agreement can lead to an argument being pronounced with
overt Case morphology. Case, however, is a Spell-Out phenomenon. The exact form of
Case can be subject to language internal and cross-linguistic variation, but the actual
form of Case on an argument does not have an influence on syntax. Note that if an
argument is unable to be pronounced with Case, a derivation can crash at Spell-Out (see
Section 3.6).
Chomsky has proposed differing views on head movement. Chomsky (1993, p. 23) argues
that the Extension Condition does not apply to adjunction operations, which means that
it does not apply to head movement, assuming that head movement is adjunction (also
see Dékány, 2018, p. 5). However, making head-movement an exception to the Extension
Condition is not exactly ideal. From the perspective of Minimalism, the notion that all
Merge operations target the root of an SO is optimal. Thus, head-movement seems to
violate this requirement, and this alone is enough to make head-movement suspect.
In recent work in Labeling Theory, a version of which I assume here, Chomsky (2015)
makes use of head movement. Chomsky (2015, p. 12) proposes that a verbal root R
raises to v*, resulting in dephasing of v* (see Figure 2 above). This raising operation
forms “the amalgam [R-v*]”, which Epstein, Kitahara, and Seely (2016) (hereafter EKS,
2016) specifically describe as being a case of internal pair-Merge. EKS (2016, p. 90) write
“pair-Merge internally forms <R, v*> (=R with v* affixed).” Internal pair-Merge, unless
it is implemented via sidewards movement, requires R to internally pair-Merge with v*
after v* has Merged with the SO. This head-movement again violates the No-Tampering
Condition and the Extension Condition.
Although Chomsky makes use of head movement in some works, such as those discussed above,
a variety of issues, including the violation of the Extension Condition/counter-cyclicity,
led Chomsky (2001) to propose that head-movement is a phonological (PF) operation.
Chomsky (2001, p. 37) writes “[t]here are some reasons to suspect that a substantial core
of head-raising processes, excluding incorporation in the sense of Baker (1988), may fall
within the phonological component.” Some other reasons from Chomsky (2001, pp. 37–
38) why head movement is problematic are as follows (see Roberts, 2011, for a summary
of these arguments). There are no clear interpretation differences in languages such as
French and Icelandic, in which the verb appears in a position generally considered to be
T (possibly a result of head movement), compared with languages such as English, in
which the verb remains below T. If head movement were responsible for verb movement,
and if head movement influences interpretation, then the expectation is that semantic
differences would arise (Roberts, 2011, p. 99). Another issue is that if a head raises and
adjoins to a higher head within an SO, then “the raised head does not c-command its
trace” (Chomsky, 2001, p. 38). Also, a phrase can undergo successive-cyclic movement
whereby it moves from the specifier of one phrase to the specifier of another phrase,
but this doesn’t appear to occur with a head. Rather, “it always involves ‘roll-up’ (i.e.
movement of the entire derived constituent … iterated head movement always forms
a successively more complex head” (Roberts, 2011, p. 201). For example, in an English
interrogative, an auxiliary (Aux) moves to T (assuming that the auxiliary is not base
generated in T), and then Aux-T raises to C. An auxiliary cannot move to C and leave T
behind.
There are a variety of proposals related to head movement in the literature which
make use of syntactic movement and/or post-syntactic PF operations. Embick and Noyer
(2001) propose that there are postsyntactic lowering operations in which a head can
lower and combine with another head. Matushansky (2006) argues that typical cases
of head movement can result from a syntactic operation of movement of a head to a
9) See Harizanov and Gribanova (2019, p. 493) for a more extensive list of literature that discusses these two possible
types of head movement.
10) In my model, Free Merge is limited to arguments (see Section 4). Thus, the notion that a head is able to undergo
re-Merge, even in limited cases, might be problematic.
Although the issues regarding head-movement are far from settled, I adopt a model
in which there is no counter-cyclic head movement in the narrow syntax. I take an
affix-hopping approach (Chomsky, 1957) in which a set of Spell-Out rules is applied to the
output of the syntax.11 Crucially, if X (an affix) and Y are adjacent at PF, then it is possible
for Y to be linearized before X.
The basic rules that I implemented are given in (12) below. Note that only a small
set of rules is sufficient to account for the basic English constructions produced by
my model. The computer model generates a syntactic structure. If a derivation does
not crash, the nodes of the tree are sent to Spell-Out. Then the basic PF (Phonological
Form) rules apply when necessary. Examples of rule applications for particular Spell-Out
forms are shown in Table 2. Note that for these rules to apply, the PF component of the
derivation has to have access to some syntactic category information. Thus, if the model
finds T adjacent to v* which is adjacent to a root R, or an auxiliary, then T attaches
onto the root or auxiliary, and v* is eliminated from PF, since it has no pronunciation. As
shown in Table 2, for Tom saw Fred, the adjacent SOs T(Past,3rd,sg) v* see are converted
into T(Past,3rd,sg)+see which is pronounced as saw, where T ends up suffixing onto the
adjacent see; with a regular verb, past tense is generally pronounced as -ed. The verbal
head v* is not pronounced. For Will Tom read a book, the interrogative C and T combine
to form CQ+T(Pres,3rd,sg). Note that this requires T to attach onto CQ by moving over
the subject at Spell-Out. Furthermore, the auxiliary will must move over the subject
and combine with T to form T(Pres,3rd,sg)+will, which ends up being pronounced as
will. In cases in which T combines with C, but there is no overt element in T, then
the appropriate form of do (depending on Tense and agreement) is pronounced. The
appropriate forms of do as well as irregular verb forms are listed in a lexicon, which the
model consults.12
11) This type of approach might be compatible with more complex approaches such as that of Harley (2004).
12) The lexicon contains a list of irregular verbs. When the model encounters a verb at Spell-Out, it checks the
lexicon. If the verb is not listed in the lexicon, then regular tense rules apply. For example, the past tense is formed by
adding -ed. If the verb is in the lexicon, then the appropriate form of the verb is selected from the lexicon.
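The rule by which T attaches to an adjacent root or auxiliary, together with the lexicon lookup, can be sketched as follows. This is a simplified illustration rather than the rule set in (12): the function names, the tuple encoding of T, and the lexicon entries are hypothetical; agreement, do-support, and the interrogative C are omitted; and regular past tense is formed by simple concatenation of -ed.

```python
from typing import List, Tuple, Union

Item = Union[str, Tuple[str, str]]   # a word, or ("T", tense)

# Hypothetical lexicon of irregular past-tense forms.
IRREGULAR_PAST = {"see": "saw", "read": "read", "win": "won"}
AUXILIARIES = {"will"}


def inflect(host: str, tense: str) -> str:
    """Pronounce T+host; only past tense is handled in this sketch."""
    if host in AUXILIARIES or tense != "Past":
        return host
    return IRREGULAR_PAST.get(host, host + "ed")


def spell_out(frontier: List[Item]) -> str:
    out: List[str] = []
    i = 0
    while i < len(frontier):
        item = frontier[i]
        if isinstance(item, tuple):          # T needs an adjacent host
            tense = item[1]
            nxt = frontier[i + 1] if i + 1 < len(frontier) else None
            if nxt == "v*" and i + 2 < len(frontier):
                out.append(inflect(str(frontier[i + 2]), tense))
                i += 3                       # T, v*, and the root are consumed
            elif nxt in AUXILIARIES:
                out.append(inflect(str(nxt), tense))
                i += 2                       # T and the auxiliary are consumed
            else:
                raise ValueError("T has no adjacent host")
        elif item == "v*":
            i += 1                           # v* has no pronunciation
        else:
            out.append(item)
            i += 1
    return " ".join(out)


print(spell_out(["Tom", ("T", "Past"), "v*", "see", "Fred"]))   # Tom saw Fred
print(spell_out(["Tom", ("T", "Past"), "v*", "walk"]))          # Tom walked
```

The frontier is treated here as a flat list, which reflects the point that these rules need only adjacency and some syntactic category information.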
Table 2
Spell-Out PF Rules
T(Pres,3rd,sg)+will
will Tom read a book
3.6 FormCopy
Chomsky (2021) proposes a FormCopy operation which accounts for how nominals are
interpreted as Copies. Chomsky (2021) describes FormCopy as a rule that assigns “the
relation Copy to certain identical inscriptions” (p. 17) that are in a c-command relation,
and that presumably are in the same phase. Identical inscriptions refers to arguments
that are identical in form. Consider how this accounts for the Control construction
(Chomsky, 1981, 1986) in (13)a, as shown in (13)b. Here, I assume that the Control
construction is a TP.13 The NP many people1 is externally-Merged in the vP theta-position
and it undergoes IM to the non-finite TP specifier position. The NP many people2 is
separately externally-Merged in the matrix vP theta-position, and internally Merged with the
matrix T. FormCopy applies and all inscriptions of many people, except for the highest
many people2 in the matrix TP, are interpreted as Copies, which are not pronounced.
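FormCopy as applied to this kind of configuration can be sketched as follows. The sketch makes strong simplifying assumptions that go beyond the text: the inscriptions are given as a list ordered from structurally highest to lowest, each is taken to c-command the next, and all are taken to be accessible within the same phase. The names Inscription and form_copy are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Inscription:
    form: str
    position: str
    is_copy: bool = False


def form_copy(chain: List[Inscription]) -> List[Inscription]:
    """Mark every lower identical inscription as a Copy.

    `chain` is ordered highest first. Returns the inscriptions
    that remain to be pronounced.
    """
    seen = set()
    for ins in chain:
        if ins.form in seen:
            ins.is_copy = True      # interpreted as a Copy, not pronounced
        else:
            seen.add(ins.form)
    return [ins for ins in chain if not ins.is_copy]


chain = [
    Inscription("many people", "matrix TP specifier"),
    Inscription("many people", "matrix vP theta-position"),
    Inscription("many people", "non-finite TP specifier"),
    Inscription("many people", "embedded vP theta-position"),
]
pronounced = form_copy(chain)
print([p.position for p in pronounced])   # ['matrix TP specifier']
```

If the function is not called, all four inscriptions remain repetitions that would each need to be pronounced.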
13) It is crucial that this Control clause (the embedded non-finite clause) be treated as a TP. If it were a CP, then when
the C phase head is Merged, it would be transferred, thus making it inaccessible to FormCopy applications from the
matrix clause. Chomsky (2021) treats a Control clause as a TP. However, the status of a Control clause is not entirely
straightforward. It has been analyzed as a CP as well as a TP. See Radford (2016, Chapter 4), and references therein, as
well as Landau (2024) for arguments that a Control clause is a CP with a null infinitival complementizer. Note that if
a non-finite clause contains a C that for some reason is not treated as being a phase head, then the current FormCopy
account could be maintained.
If FormCopy were not to apply in (13), the lower many people1 would be treated as a
separate NP (a repetition, not a Copy) from the higher many people2. This is fine, as it
has a separate theta-role from the matrix many people. However, the construction would
then crash. Why exactly it would crash is then an issue. I assume that it would crash at
Spell-Out due to Case reasons, not due to issues with the syntax. The relevant argument
many people1, not being a Copy, would need to be pronounced, but it would not be in a
position in which it could obtain Case, in the edge of a non-finite TP, and the matrix try
is not the kind of verb that assigns accusative Case.
Note that FormCopy has the advantage of eliminating the need for memory of movement
operations. In (14), assuming the standard view of the VP-internal subject hypothesis,
the subject John is externally-Merged in the v*P. Then it internally-Merges with T in
subject position. With FormCopy, there is no need to retain memory of the movement
of John. When the construction is generated, John undergoes IM. Then, at the phase
level, FormCopy applies and the lower inscription of John is interpreted as a Copy of the
higher John.14
A number of issues arise regarding the FormCopy operation and its formulation in
the literature. First, the question arises of whether or not FormCopy can apply freely.
Chomsky (2021, p. 25) writes:
Let’s return to simple transitive sentences, such as John saw X. Suppose
X = John. With the subject inserted by EM [External Merge]
in the predicate-internal position, they are in an IM-configuration
[Internal-Merge-configuration]. If FC [FormCopy] applies, the expression
will crash at CI [Conceptual-Intentional interface] with a
θ-Theory violation. We conclude, then, that like other operations,
FC is optional, not applying in this case so there is no deletion, just
two repetitions of John.15
Chomsky indicates that FormCopy is optional, but I don’t take this to mean that it
applies freely. Rather, it applies in some, but not all, cases in which there are multiple
inscriptions with the same form.16 Specifically, it applies at the phase level. Chomsky
et al. (2023, p. 25) write that “[i]n technical terms, the point at which FC [FormCopy]
applies is referred to as the phase level.” Limiting FormCopy to the phase level accounts
clearly for why FormCopy cannot convert the lower John1 into a Copy in John2 saw
John1. As shown in (15), FormCopy should apply when the lower phase head v* is
Merged, before John2 is Merged. After John2 is Merged, the lower v* and its complement
are no longer accessible to FormCopy. Thus, FormCopy cannot apply between John2 and
John1.
14) This example raises some questions if FormCopy does not apply. If FormCopy does not apply, then the lower John
would have to be pronounced. This can potentially be ruled out by Theta Theory. If the lower John is not a Copy of
the higher John, then this construction would contain two instances of John that each require separate theta-roles,
but there is only one theta-role that is available in this clause.
15) In the case of John1 saw John2, Chomsky suggests that if FormCopy were to apply so that the lower
John2 were treated as a Copy of the higher John1, the same argument John would be interpreted as
having two separate theta-roles at CI, which is problematic. This argument isn’t clear to me, as FormCopy
can apply in Control constructions, like in (13), to convert arguments with separate theta-roles
into Copies. However, John1 saw John2 could be ruled out with respect to Case at Spell-Out. The lower
John2 is marked with accusative Case, and a Case marked argument in English appears to need to be
pronounced, but if it is a Copy it is not pronounced. Also, as I note below, FormCopy shouldn’t be able
to apply between John1 and John2 anyway because they are in separate phases. The higher John1 is
outside of the v*P phase.
16) Chomsky (2024) suggests that FormCopy can be used “for convenience” but that “it need not be listed among the
admissible operations.” How exactly FormCopy can be done away with, but still be adopted for convenience is not
clear to me. In this paper, I assume that FormCopy is an operation utilized by the Faculty of Language.
17) Strictly speaking, FormCopy should apply only if X c-commands Y. Note that I haven’t formally implemented the
c-command component in my model, as it is not crucial to the examples that I implemented.
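The phase-level restriction on FormCopy discussed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code of the model described in this paper: the `Inscription` class, the `form_copy` function, and the flat list standing in for the domain accessible at a phase level are all hypothetical, and c-command is approximated by list order (highest element first).

```python
from dataclasses import dataclass

@dataclass
class Inscription:
    form: str              # the form of the argument, e.g. "John"
    is_copy: bool = False  # set to True once FormCopy treats it as a Copy

def form_copy(accessible):
    """Apply FormCopy within one phase-level domain (hypothetical helper).

    `accessible` lists the inscriptions still visible at the phase level,
    ordered from highest to lowest. Every lower inscription whose form
    matches a higher one in the same domain is marked as a Copy.
    """
    seen = set()
    for item in accessible:
        if item.form in seen:
            item.is_copy = True
        else:
            seen.add(item.form)
    return accessible

# "John saw John": the lower John is no longer accessible once the higher
# John is Merged, so only the higher John is in the domain.
higher, lower = Inscription("John"), Inscription("John")
form_copy([higher])

# Subject raising: both inscriptions of John are in the same domain.
raised, base = Inscription("John"), Inscription("John")
form_copy([raised, base])
```

In this sketch the two repetitions in John saw John stay distinct simply because they are never passed to `form_copy` together, which mirrors the idea that accessibility, not a special stipulation, blocks the operation.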
richness of agreement is at play, then questions arise around languages (e.g., Japanese)
which do not appear to show any agreement, but allow null subjects.
If certain heads like the English T can be weak, then the question also arises of what
to do with non-finite T. Assume that finite T is weak and requires agreement to be
strengthened for Labeling. But non-finite T does not show agreement. In (16), non-finite
T, pronounced as to, and the embedded subject Mary are adjacent. One possibility is that
the embedded TP has the structure in (16)b with Mary and the non-finite T forming an
{XP, YP} structure. In (17), there is no overt argument in the embedded clause, but there
should be a Copy of John (or PRO) in the embedded clause, as in (17)b. In this case, since
the lower Copy of John is not pronounced, it should be invisible to Labeling, and thus
non-finite T cannot be strengthened. Presumably, non-finite T can label, and thus, it does
not require strengthening.
Chomsky (2015) assumes that in an ECM construction such as (16), the embedded subject
raises to the matrix object position. This follows work by Postal (1974) and Lasnik and
Saito (1991), among others. For example, the embedded subject in an ECM construction
can be passivized and it behaves like it is in the matrix clause with respect to binding
effects. Chomsky argues that the Root expect is weak and thus must be labeled by an
{XP, YP} structure. On this account, the embedded subject raises and forms an {XP, YP}
structure with the matrix verbal root. For example, in (16), the structure {Mary, {expect
Mary to arrive}} is formed. Problems with this approach are that the root expect must
inherit uPhi from the higher v* (if feature inheritance is assumed) and that expect has
to undergo some form of head movement over Mary. If head movement is a Spell-Out
phenomenon, then this would happen at Spell-Out.
Due to the complexities involved, I take a simpler approach—Mary is in the embedded
non-finite TP in (16), where it Agrees with the higher v*, resulting in accusative Case
appearing on Mary at Spell-Out. The evidence that the subject of a non-finite clause can
behave like the object of the matrix clause is strong, but whether or not this requires the
embedded subject to actually undergo IM with the matrix verbal root is not clear. I will
simply assume that, due to the lack of an intervening phase boundary, an ECM subject
can behave like it is a matrix object even if it is in the embedded clause.18
There is evidence that an overt subject can appear in a non-finite clause. In examples
such as (18), the subject him appears to be in the embedded clause. It seems to be a
fairly standard assumption that the embedded subject is in the non-finite T (e.g., see
Chomsky & Lasnik, 1977). If Labeling Theory is correct, then there must be some type of
agreement relation between non-finite T to and him.
18) As pointed out by an anonymous reviewer, there is evidence from a variety of languages, including English, that
an ECM subject behaves like a matrix object. Chen (2018) distinguishes two types of analyses of ECM constructions.
In one type of analysis, the subject of an embedded clause raises to matrix object position, as has been argued for
with respect to English as well as other languages such as Icelandic (e.g., Sigurðsson, 2006; Sigurðsson & Holmberg,
2008, etc.), and in languages such as Japanese, Korean, Romanian, and Zulu. In another type of analysis, an object
is base generated in a matrix clause and is co-indexed with an overt subject pronoun in an embedded clause, as in
languages such as Madurese, Tagalog, and Sundanese. See Chen (2018) and references therein. There are also analyses
in which an ECM subject remains in the ECM complement clause. For example, Chen (2018) argues that in Puyuma,
an ECM phrase (which does not necessarily have to be a subject) remains within an embedded clause. In addition,
there are a variety of conflicting analyses on these types of constructions. For example, Kuno (1976), Tanaka (2002), and
others (e.g., see Kishimoto, 2021) argue that an ECM subject in Japanese raises out of an embedded clause to a matrix
object position. However, Kishimoto (2021) argues that an ECM subject in Japanese remains within an embedded
clause. The facts and the various analyses regarding these constructions are complex, and thus I acknowledge that
my simple assumption that an ECM subject remains within a complement clause is certainly worthy of further
investigation.
d. (eles/elas) fala-r-em
(they) speak-Inf-3Pl (Pires, 2006, p. 92)
Inflected infinitives are also found in European Portuguese (Raposo, 1987), as well as
other Romance languages such as Galician and Old Neapolitan (Groothuis, 2015; Scida,
2004). Hungarian also has inflected infinitivals, as shown in (20).
An anonymous reviewer points out that the distribution of subjects with agreeing infinitives
and with non-agreeing infinitives is different. According to Pires (2006, p. 93),
in Brazilian Portuguese a non-inflected infinitival requires a PRO subject (which has a
local antecedent), and an inflected infinitival has a pro subject (which does not require
a local antecedent). Furthermore, a non-agreeing infinitive requires a sloppy reading
under ellipsis, whereas an agreeing infinitive permits a strict or sloppy reading, and
a non-agreeing infinitive does not permit split-antecedence but an agreeing infinitive
does. Although an in-depth analysis is beyond the scope of this work, I suggest that
these differences boil down to whether or not the non-finite T Agrees partially or fully
with an argument. In some Portuguese infinitivals, there may be full agreement with an
argument (the infinitival property is due to the lack of tense, not phi-features), and pro is
permitted. In non-agreeing infinitivals, there can only be partial agreement, which is not
sufficient to check Case on an argument and only PRO is permitted.
Even though there is no clear overt indication of agreement in modern English
infinitives, it is possible that there is partial agreement, as found in languages such as
Portuguese, Hungarian, etc. Thus, I assume that T can label either by itself or via shared
Person features.
Furthermore, I assume that heads can generally label. Mizuguchi (2017, p. 331) suggests
that “[h]eads can label only when they are without unvalued features.” If a head
has unvalued/unchecked features, it is incomplete, and thus it is reasonable to assume
that Labeling isn’t possible. I adopt this view in my model; heads can label by themselves
as long as they lack unchecked features. A root, however, needs to be categorized. Thus,
a root cannot label by itself. For example, the root walk can be labeled only after it
combines with a categorizer N or v.
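The condition on Labeling by heads stated above can be illustrated with a short Python sketch. The `Head` class, its field names, and the `can_label_alone` function are hypothetical names chosen for illustration; they are not taken from the model itself.

```python
from dataclasses import dataclass, field

@dataclass
class Head:
    name: str
    is_root: bool = False                        # an uncategorized root such as "walk"
    unchecked: set = field(default_factory=set)  # e.g. {"uPhi"}

def can_label_alone(head):
    """A head labels by itself only if it is not a bare root and
    carries no unchecked features (assumed encoding of the condition)."""
    return not head.is_root and not head.unchecked

walk = Head("walk", is_root=True)            # root: needs a categorizer first
v = Head("v")                                # categorizer with nothing unchecked
t_unchecked = Head("T", unchecked={"uPhi"})  # T before its uPhi are checked
```

Under these assumptions, `walk` and `t_unchecked` cannot label, whereas `v` can.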
Timing of insertion into the Box is an issue that arises. Chomsky (2024) suggests that
boxing is contingent upon IM. Chomsky writes that segregation of a boxed element is “established
by IM [Internal Merge], which carries the derivation from the propositional to
the clausal domain.” Chomsky further writes that “we can think of the element E that
is IM-ed to the phase edge as being put in a box, separate from the ongoing derivation
D.” Chomsky appears to be proposing that boxing results from IM of a particular SO to
the phase edge. Note that it is simpler to put an SO into the Box, without boxing being
contingent on IM, rather than to do IM of the SO followed by boxing. Furthermore, I
also assume that IM of arguments is free (see Section 4). If IM to a phase edge results in
boxing, then there could be overgeneration of boxed SOs.
Assuming that boxing happens without IM, it could be that as soon as a wh-phrase
is externally Merged with an SO, it goes into the Box, or it could be that it goes into the
Box at the phase-level. Also, when a phrase is accessed from the Box, its base position
should be treated as a Copy. This means that FormCopy must apply. FormCopy could
apply as soon as an SO goes into the Box, or it could apply as soon as the SO is accessed
from the Box. Also, consider (22). In this case, CQ and the subject who are within the
same phase. So whether or not who has to go into the Box isn’t clear as CQ should be able
to access who without looking into the Box.
19) Note that if the Box can store multiple elements and the last element in is the first element that can be accessed,
then the Box is similar to a Stack structure that is commonly used in computer science, and that has been used in
some linguistics work (e.g., see Fong & Ginsburg, 2014, 2019).
In my implementation, I had to make decisions about the timing of the Box operation.
From an implementational perspective, it is easier to put a wh-phrase into the Box as
soon as possible, rather than to wait until a phase is complete, since waiting requires
checking an already formed structure for wh-phrases (or other phrases that need to go
into the Box). Thus, my model places a wh-phrase into the Box as soon as possible. Since
the Box is assumed to exist, this applies to wh-subjects too. The model places wh-phrases
in the Box and CQ can only see into the Box. FormCopy applies when CQ accesses a
wh-phrase from the Box.20
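The timing decisions described above can be sketched in Python. This is only an illustration: the `Box` class, the dictionary representation of SOs, and the functions `external_merge` and `merge_cq` are hypothetical, and the last-in, first-out behavior is an assumption based on the Stack analogy rather than a confirmed property of the model.

```python
class Box:
    """Toy Box: a last-in, first-out store for boxed phrases (assumed)."""
    def __init__(self):
        self._items = []

    def put(self, so):
        self._items.append(so)

    def access(self):
        """Return the most recently boxed SO, or None if the Box is empty."""
        return self._items.pop() if self._items else None

def external_merge(so, box):
    """Box a wh-phrase as soon as it is externally Merged."""
    if so.get("wh"):
        box.put(so)
    return so

def merge_cq(box):
    """C_Q looks only into the Box; on access, FormCopy marks the
    base position of the retrieved phrase as a Copy."""
    found = box.access()
    if found is not None:
        found["copy_in_base"] = True
    return found

box = Box()
what = external_merge({"form": "what", "wh": True}, box)
mary = external_merge({"form": "Mary", "wh": False}, box)
found = merge_cq(box)
```

Boxing at External Merge avoids a later search through an already built phase for wh-phrases, which is the implementational advantage noted above.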
Assuming the existence of the Box, successive-cyclic wh-movement is potentially
an issue. If an argument is in the Box, there is no reason for it to undergo IM to an
intermediary position. However, there is evidence for successive-cyclic wh-movement.
Some well-known evidence is the existence of partial wh-movement (McDaniel, 1989).
For example, in German and Albanian, a wh-phrase can appear in an intermediary
position and a wh-phrasal scope marker (or some type of question element) can appear
in the relevant scope position, as shown in (23) and (24). In Malay, as shown in (25), a
wh-phrase can move to the edge of a clause in which it does not have scope, and be
interpreted with scope in a clause in which there is no overt wh-marker.
(23) [Was1 glaubst du [was1 Hans meint [[mit wem]1 Jakob t1 gesprochen hat]] (German)
Wh believe you Wh Hans thinks with whom Jakob talked has
‘With whom do you believe that Hans thinks that Jakob talked?’ (Cheng, 2000, pp. 78–79)
(25) Ali memberitahu kamu tadi [apa1 (yang) Fatimah baca t1 ] (Malay)
Ali told you just.now what that Fatimah read
‘What did Ali tell you just now that Fatimah was reading?’ (Cole & Hermon, 2000, p. 105)
20) The Box can also be used for non-wh-phrasal focused elements. For example, Chomsky (to appear) gives the
example “Bill, John met yesterday” with the topicalized phrase Bill.
21) Chomsky cites Riny Huijbregts for pointing this problem out.
22) I assume that phi-features are present on N only, but see Danon (2011) for arguments that a D can share the
phi-features of an NP complement.
23) Much further examination of the categorial status of arguments is warranted, but this topic is beyond the scope
of this work. For further recent discussion of the categorial status of arguments, see Blümel and Holler (2021), and
references therein.
Figure 4
4 Free Merge
Assume that IM is completely free, so that elements within an SO can be freely internally
set-Merged to the root of the SO. This is untenable as an infinite number of possible SOs
can be formed for every phrase and sentence.24, 25
If the discussions in the previous sections are correct, head movement is not a
possible syntactic operation, and this greatly limits the Free Merge possibilities. For
example, if head movement could apply freely, then illicit derivations such as those in
Figure 5 would be possible; heads such as n, book, and will would be able to undergo
IM. The resulting structures, however, would have to crash. Although structures of this
sort could be ruled out as involving failures of Labeling and/or interpretation, generating
them would involve a great deal of unnecessary and wasteful work. We can deal with
this issue by simply assuming that head-movement (IM of heads) is not a possibility.
Free Merge, if it exists, must apply to IM of arguments at the phase level only. Assuming
Box Theory, only topicalized/focused arguments such as wh-phrases can escape from
a phase, and escape is via the Box. Not permitting non-focused/topicalized arguments to
escape greatly constrains Free Merge. Thus, I assume that Free Merge is limited by phase
boundaries.
Given the constraints of the language module, as presented in this paper, it turns out
that allowing Merge of arguments (NPs) to apply freely within a phase is not necessarily
a problem. Ill-formed constructions can generally be ruled out as Labeling failures.
24) For example, starting with 4 lexical items, if internal and external Merge are allowed to apply freely, then given 8
possible Merge operations, there are more than 7 million SOs that can be generated (Ginsburg & Fong, 2018). If there
is no limit on the number of Merge operations, then the number of SOs that can be generated is infinite.
25) Although external Merge could be free in some sense, in my model, external Merge is not free, since lexical items
are selected from an input stream and externally Merged together. In language, there are clearly constraints on which
lexical items can be externally Merged together.
Figure 5
Consider the derivation of the simple statement in Figure 6. When v* is Merged, the
lower v*P phase is transferred. Then the subject Tom is externally set-Merged. After
Tense (past tense Tpast) is Merged, Tom undergoes IM with Tpast. After C is Merged,
at the phase-level, FormCopy applies and converts the lower inscription of Tom into a
Copy. The Spell-Out is computed as shown, whereby the frontier of the tree structure
is converted via pronunciation rules (PF rules) into the correct output. Tpast and read
combine to form the past tense read and functional elements such as v* and C are not
pronounced.
Figure 6
Tom Read a Book
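The Spell-Out step for Tom read a book can be approximated with the following Python sketch. It assumes a frontier ordered as C, Tom, Tpast, Tom (Copy), v*, read, a, book; the `SILENT` set, the `PAST` lookup table, and the `spell_out` function are hypothetical simplifications of the pronunciation rules, not the rules used in the model.

```python
SILENT = {"C", "v*"}                      # functional heads that are not pronounced
PAST = {"read": "read", "buy": "bought"}  # toy past-tense forms (assumed)

def spell_out(frontier):
    """Pronounce a frontier given as (form, is_copy) pairs.

    Copies and silent heads are skipped; Tpast combines with the
    next overt element to yield a past-tense form.
    """
    overt = [form for form, is_copy in frontier
             if not is_copy and form not in SILENT]
    out = []
    i = 0
    while i < len(overt):
        if overt[i] == "Tpast" and i + 1 < len(overt):
            verb = overt[i + 1]
            out.append(PAST.get(verb, verb + "ed"))
            i += 2
        else:
            out.append(overt[i])
            i += 1
    return " ".join(out)

frontier = [("C", False), ("Tom", False), ("Tpast", False), ("Tom", True),
            ("v*", False), ("read", False), ("a", False), ("book", False)]
sentence = spell_out(frontier)
```

Because the lower Tom has been converted into a Copy, it is skipped, and `sentence` comes out as "Tom read a book".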
Next, consider the derivation of the wh-construction in Figure 7. When the v*P phase
is completed, what is inside of the Box. The interrogative CQ is Merged, and then it
looks into the Box and finds what. Importantly, what does not undergo IM to the CP.
When what is accessed by CQ, FormCopy applies to what in its base position. Assume
that FormCopy can still access the lower phase via the Box. FormCopy also converts the
lower inscription of the subject Mary into a Copy—this FormCopy operation happens
at the phase level. The frontier of the derivation is shown in Figure 7b. At Spell-Out, CQ
forces Tense to combine with it, forming CQ+T, and also T forces the auxiliary will to
combine with it, so the result is C-T-Aux. This is not movement in the syntax, but rather
displacement in the pronunciation of lexical items, as discussed in Section 3.5 above.
Figure 7
What Will Mary Buy?
I next turn to crashed derivations that result from Free Merge. Two failed derivations
(crashed derivations) of Tom read a book are shown in Figure 8. As Free Merge of
nominals is permitted at the phase level, it is possible for the object a book to undergo IM
with the SO headed by read, and it is also possible for the subject Tom to simply remain
in its base position (Tom is free to not undergo IM). In each case, there are Labeling
failures due to {XP, YP} structures that lack shared features.
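The Labeling failures just described come down to a check for shared features in an {XP, YP} structure. The Python sketch below illustrates that check; the function name `label_xp_yp` and the string encoding of features are hypothetical, and feature identity is deliberately kept simple here (no unique identifiers).

```python
def label_xp_yp(xp_features, yp_features):
    """Return the shared features that can label {XP, YP},
    or None if there are none (a Labeling failure)."""
    shared = xp_features & yp_features
    return shared or None

# Tom in the TP: Tpast has Agreed with Tom, so features are shared.
tom = frozenset({"Person:3rd", "Number:sg"})
t_past = frozenset({"Person:3rd", "Number:sg"})

# "a book" internally Merged with the SO headed by the root "read":
# the root carries no phi-features, so nothing is shared.
a_book = frozenset({"Person:3rd", "Number:sg"})
read_so = frozenset()
```

Here `label_xp_yp(tom, t_past)` returns the shared phi-features, while `label_xp_yp(a_book, read_so)` returns `None`, corresponding to a crashed derivation.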
Figure 8
In some cases, there can be a large number of crashed derivations. Consider two crashed
derivations of (27). These result from IM of an argument to a position in which Labeling
cannot occur. In Figure 9a–d, the passivized object the book does not undergo IM to
the TP. In each derivation, there is a Labeling failure at the position in which the book
has undergone IM, due to a lack of shared features—the results are unlabelable {XP, YP}
structures.
Figure 9
Labeling Failures for “The Book Will Have Been Being Read”
Next, consider What did John say that Mary will buy? which contains long-distance
wh-movement. A successful derivation is shown in Figure 10. This construction contains
3 phases. The verb say takes a clausal complement, but it is not an ECM verb. So I
assume that it occurs with the non-phasal v (Chomsky, 2001), which does not Agree with
an argument. After what is initially Merged, it is inserted into the Box. At the embedded
CP phase level, after that is Merged, FormCopy applies to the lower inscriptions of Mary.
When the matrix CQ is Merged, it looks into the Box and finds what. At Spell-Out, what
is pronounced together with CQ, and Tpast is pronounced adjacent to CQ, resulting in
pronunciation of did.
Figure 10
Given Free Merge, there are five crashed derivations of What did John say that Mary will
buy?. All of these are shown in Figure 11. These crash because of unlabelable {XP, YP}
structures. In Figure 11a, what undergoes IM with the SO headed by the root buy to form
an unlabelable {XP, YP} structure. In Figure 11b, Mary remains in-situ, resulting in an
unlabelable {XP, YP} structure because Mary and v* do not share features. In Figure 11c,
Mary undergoes IM with the SO headed by will and remains in this position, resulting
in an unlabelable {XP, YP} structure because Mary and will do not share features. In
Figure 11d-e, the derivations crash in the matrix clause because John remains in its base
position forming an {XP, YP} structure with v, with which it does not share features.
These two derivations are almost identical except that Mary has undergone IM from v* to
Tpres in the embedded clause in Figure 11d, whereas in Figure 11e, Mary undergoes IM
to the SO headed by will before it lands in the TP.
I next turn to a typical Control construction, such as John tried to win. In this case,
there are crucially two separate arguments John in the same phase, assuming that the
lower non-finite TP is not a phase. A convergent derivation is shown in Figure 12.
John1 is externally-Merged in theta-position in the embedded clause. John2 is externally
Merged with the matrix v in theta-position. Both John1 and John2 undergo IM to their
respective TPs. In this case, FormCopy applies three times. FormCopy explains how
John1 and John2 have the same referent, but separate theta-roles.
Given Free Merge, for John tried to win, a number of potentially problematic situations
arise involving IM of the “wrong argument” as well as involving multiple instances
of John (multiple specifiers) in the same phrase edge.
Figure 11
Figure 12
In Figure 13, Tpast undergoes an Agree relation with John2 (not John1). The uPhi on
Tpast probe and Agree with the phi-features on John2. Then John1 (from the embedded
clause) undergoes IM to the matrix TP. This derivation appears strange, since the wrong
John moves to TP. However, this is permitted if Merge is truly free.26
Figure 13
John Tried to Win (John1 in TP and John2 in vP)
26) Note that this issue only arises when there are multiple arguments in the same phase. For example, in What
did John say that Mary will buy? (Figure 10), Mary is contained within a separate phase from John, so Mary cannot
undergo IM to the matrix clause.
One possibility is that this derivation in Figure 13 crashes because the phi-features of
John1 and John2, although identical in terms of person, number, and gender, are treated
differently because they are associated with separate arguments. This can be modeled
with what I refer to as a Unique Feature Rule: John1 comes with iPerson:3rd1, iNumber:sg1,
and iGender:1, where the final 1 is a unique feature identifier.27 John2 comes with
person, number and gender features that are identically valued to those of John1, but the
unique feature identifier is 2 instead of 1, so the features are iPerson:3rd2, iNumber:sg2,
and iGender:2. Utilizing this Unique Feature Rule, this derivation can be ruled out. The
uPhi on Tpast Agree with the phi-features of John2. Then after John1 undergoes IM to the
TP, Minimal Search finds the phi-features on Tpast and on John1, but they are treated as
being different, due to the Unique Feature Rule. This is ruled out as a Labeling failure,
shown in Figure 14a.
I also modeled this construction in my computer model without the Unique Feature
Rule. When the Unique Feature Rule does not apply, then this derivation converges, as
shown in Figure 14b. FormCopy converts all lower instances of John into Copies, and the
highest John1 only is pronounced. Minimal Search finds equally valued person, number,
and gender features on John1 and on Tpast—it does not matter that Tpast has obtained
these phi-features via agreement with John2 instead of John1. Crucially, if this derivation
is permitted, there is no problem for Spell-Out—the correct John tried to win results.
Figure 14
27) Although gender is not important in English, I still assume that it is a feature of a noun. Eliminating gender
would not change this analysis.
Although the Unique Feature Rule sounds like an added, and possibly unnecessary,
complexity, it is necessary. Consider the derivation in Figure 15. Without the Unique
Feature Rule, if Tom remains within the v*P and does not undergo IM to the matrix TP,
Labeling would be possible within the v*P. This is because the uPhi of v* are checked by
the phi-features of Fred. The person, number, and gender features of Tom and Fred are
identical. If features are not treated as unique, Labeling should be possible within the v*P.
Figure 15
Tom Saw Fred
In order to rule out superfluous Labeling as in Figure 15, the Unique Feature Rule,
defined in (28), is required. Features that are valued the same way, but that are associated
with different lexical items, are not treated as being identical by the language module.
(28) Unique Feature Rule: Features associated with a particular lexical item are
unique from identically valued features associated with a separate lexical item.
(For example, iPerson:3rd1 of X are not identical to iPerson:3rd2 of Y.)
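The effect of (28) on Labeling can be sketched in Python. In this illustration each feature carries an identifier for the lexical item it comes from; the `Feature` tuple, the `phi`, `agree`, and `can_label` functions, and the `unique` switch are hypothetical names, and gender is omitted for brevity.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    attribute: str
    value: str
    owner: int   # unique identifier of the lexical item bearing the feature

def phi(owner, person="3rd", number="sg"):
    """Phi-features of an argument, tagged with its unique identifier."""
    return frozenset({Feature("Person", person, owner),
                      Feature("Number", number, owner)})

def agree(goal_features):
    """Checking uPhi copies the goal's features, identifiers included (assumed)."""
    return frozenset(goal_features)

def can_label(xp_features, head_features, unique=True):
    """Shared-feature check, with or without the Unique Feature Rule."""
    if unique:
        return bool(xp_features & head_features)
    strip = lambda fs: {(f.attribute, f.value) for f in fs}
    return bool(strip(xp_features) & strip(head_features))

john1, john2 = phi(1), phi(2)
t_past = agree(john2)    # Tpast Agrees with John2
```

With the rule active, `can_label(john1, t_past)` is false, so a derivation in which John1 sits in the TP crashes. With `unique=False` the same check succeeds, matching the convergent derivation obtained when the rule is switched off.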
The derivations in Figure 16a–b below involve what would traditionally be referred to
as multiple specifiers. In Figure 16a, John1 is initially Merged in theta-position in the
non-finite clause. Then John1 undergoes IM to the matrix vP theta position, followed by
EM of John2. Assume that there are no problems for theta-role assignment, in accord
with Theta Theory (Chomsky, 1981), so John2 is able to obtain a theta role.28 Then John1
undergoes IM to the TP. Figure 16b is similar. In this case, John2 is successfully Merged
in matrix theta position. Then John1 undergoes IM to the vP. John2 undergoes IM to the
TP, but Tpast Agrees with the closest NP that it c-commands, John1. In both of these
derivations, Tpast Agrees with a different John from the John that appears in the TP.
These derivations are ruled out by the Unique Feature Rule, so that the phi-features of
John2 are treated as being different from the phi-features of John1, as shown in Figure 17.
28) If John2 needs to be directly Merged with v (without anything intervening) in order to obtain a theta-role, then
this is potentially a violation of Theta Theory. But whether or not this is the case is not clear. If one assumes that as
long as John2 is Merged with v (even if there is an intervening element), a theta-role can be assigned, then John2 can
obtain a theta-role.
Figure 16
Figure 17
Figure 18
Derivations With Successive Applications of IM for ‘John Tried to Win’
To block derivations such as these, which can potentially result in infinite loops, there
needs to be a rule that blocks successive IM of multiple arguments to the same phrase.
Generation of these ill-formed structures can be blocked by the following rule that
simply bans consecutive applications of IM. After one application of IM of an argument,
the next operation cannot be IM. This solves the relevant problem and structures such
as those in Figure 18 cannot be generated. Thus, I will assume that this constraint No
Successive IM holds.
(29) No Successive IM: *IM IM (An IM operation cannot directly follow another IM
operation.)
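The constraint in (29) can be stated as a simple check over the sequence of operations in a derivation. The Python sketch below assumes that a derivation is recorded as a list of operation names ("EM" or "IM"); this encoding and the function name are hypothetical.

```python
def violates_no_successive_im(operations):
    """True if any IM operation directly follows another IM operation."""
    return any(a == "IM" and b == "IM"
               for a, b in zip(operations, operations[1:]))

licit = ["EM", "IM", "EM", "IM"]   # each IM is separated by another operation
illicit = ["EM", "IM", "IM"]       # two consecutive applications of IM
```

A generator of candidate derivations could apply this check before each step and discard any continuation that would add an IM directly after an IM, which is one way to avoid the potential infinite loops mentioned above.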
Note that if (29) holds and successive applications of IM are not permitted, then constructions
in which there are consecutive applications of IM should not appear in language.
Whether or not this is truly the case is an open question. Some languages have multiple
wh-fronting (e.g. Bulgarian, Serbo-Croatian) that could potentially be formed by multiple
applications of IM (e.g., see Boeckx & Grohmann, 2003; Bošković, 2002; Rudin, 1988), as
in the following examples.
Box Theory offers an explanation. If these arguments are actually in the Box, from where
they are accessed, they are not treated like typical arguments that are set-Merged with
the core SO. Thus, their presence may be permitted at Spell-Out, with language-related
idiosyncrasies that are beyond the scope of this work. They are pronounced together, but
they do not actually involve consecutive applications of IM.
If the arguments in this paper are correct, Free Merge of arguments can generally be
constrained by Labeling, but Free Merge also produces multiple convergent derivations
for target constructions, which I turn to next.
5 Overgeneration
The main problem that Free Merge raises is that of overgeneration. Given Free Merge
of arguments, a large number of crashed derivations can occur. Furthermore, a single
construction can have multiple convergent derivations. As discussed in the previous
sections, I used a computer model to implement Free Merge of arguments within a
particular phase. The model also incorporates the Unique Feature Rule, which requires
features associated with a particular argument to be uniquely identified, and the No
Successive IM rule, which blocks consecutive applications of IM. The total numbers of
convergent and crashed derivations for the main sentences generated by the model used
for this paper are shown in Tables 3–6, which list the numbers of derivations that
crash and converge for each target construction. All complete crashed and convergent
derivations are available in the Supplementary Appendix (see Ginsburg, 2024).
Table 3
Basic Statements
Table 4
Control Constructions
Table 5
Wh-Questions
Table 6
Yes/No Questions
The question arises of whether or not it is reasonable for there to be multiple crashed
derivations for a single construction. For example, the following example (see discussion
of (27) above) has 63 crashed derivations.
The ideal model would most likely be one that does not generally produce crashed
derivations. That said, it is crucial to note that if Merge is free, then derivations of this
sort can be generated. Given Labeling though, they generally crash, which is desired.
Figure 19
Multiple Derivations of ‘Tom Will Read a Book’
Four possible derivations (out of 65) for The book will have been being read are shown in
Figure 20. In Figure 20a, the book undergoes IM to Psv, Prog, and Tpres. In Figure 20b, it
undergoes IM to Psv, will, and Tpres. In Figure 20c, the book undergoes IM to Perf and
Tpres. In Figure 20d, it undergoes IM to read, will, and Tpres.
Figure 20
Six successful derivations (out of 12) of John tried to win are shown in Figure 21. Note
that the lower John1 can remain in-situ in theta-position, or it can undergo IM to a
higher position. FormCopy converts John1 into a Copy, so there are no Labeling problems
in these positions. Both John1 and John2 are Merged in theta-positions and FormCopy
results in only John2 being pronounced. The two derivations in Figure 21e–f involve IM
of both John1 and John2 with v. Since Tpast Agrees with the same inscription of John
that is present in the TP, Labeling is possible (without violating the Unique Feature Rule).
None of these derivations cause problems for Labeling, and all converge successfully.
Figure 21
be stranded as in (33)b–c, which can be accounted for if all the children raises through
a VP-internal position. Assuming that arrive is unaccusative, all the children should
originate as the complement of arrive. In particular, in (33)c, if the adverbial quickly is
within the VP, then all must also be in a VP-internal position.29
McCloskey (2000) gives the following examples with quantifier stranding from West
Ulster English, which support the idea that there is internal set-Merge (successive-cyclic
IM) of an argument in intervening positions.30
29) See Sportiche (1988) and Stroik (2009), among others, for evidence of successive-cyclic movement of subjects.
30) How exactly Labeling works in cases in which a quantifier is stranded remains unclear, as discussed in Blümel
(2018). For example, it isn’t clear how all and the VP arrived are labeled in (33), as they potentially form an {XP,
YP} structure {{all the children}1, {arrive t1}} with no shared features. Blümel discusses several possibilities. One
possibility is that the quantifier functions as an adverbial, in which case it could be an adjunct, and thus potentially
not cause problems for Labeling. Another possibility, referred to as Distributed Deletion (Fanselow & Ćavar, 2002), is
that copies can be selectively pronounced so that “pronunciation of members of a movement chain can be scattered”
(Blümel, 2018, p. 67). In my model, the easiest way to implement Labeling with quantifiers would probably be
to treat them in a similar manner to determiners, and make them adjuncts that are pair-Merged to an SO. As
pair-Merged adjuncts they would not cause problems for Labeling.
the subject undergoes IM with Prog. In both cases, the result is an unlabelable {XP, YP}
structure which crashes due to a lack of shared features.
Figure 22
Figure 23
There are issues, however, with unaccusative and passive constructions. Given the standard
assumption that the surface subject of an unaccusative or passive originates as
an object, derivations in which an object remains in situ are not necessarily ruled out.
The model incorrectly generates arrived Mary, shown in Figure 24, was read the book
in Figure 25, and John expects to arrive Mary in Figure 26. Crucially, all of the other
convergent derivations for these examples that are generated by this model result in the
well-formed output (4 other derivations successfully converge as Mary arrived, 8 other
derivations as The book was read, and 4 other derivations as John expects Mary to arrive).
Thus, the model does produce the correct derivations most of the time.
Figure 24
*Arrived Mary (Mary Arrived)
Figure 25
Figure 26
In Figures 24–26, T probes and Agrees with the underlying object (which is the
closest argument), and T’s uPhi are checked. Then T is not part of an {XP, YP} structure,
so if it labels, it must label by itself. As discussed in Section 3.1, Chomsky deals with
the EPP requirement for an overt subject in English by relying on strength, with the
proposal that English T is too weak to label by itself, and so it requires an {XP, YP}
configuration for Labeling. If T is weak and is stipulated to require an {XP, YP} structure
for Labeling, then these derivations are correctly ruled out. However, strength is an
unclear stipulation, which I do not adopt. See Section 3.7 above. In my model, non-finite
T may or may not have an overt specifier. In the derivation for John tried to win, shown
in Figure 27a, John1 remains in-situ (although it can also raise to toT, where it will be
converted into a Copy). In the derivation of John expects Mary to arrive shown in Figure
27b, Mary appears in the non-finite clause forming an {XP, YP} structure with toT (where
shared Person features label). If non-finite T were weak, then toT would always require
an overt “specifier”, contrary to fact.
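The difference between labeling by a head alone and labeling through shared features in an {XP, YP} configuration can be made concrete with a minimal sketch. It is not the Labeling algorithm of the model in this paper. The classes Head and Phrase, the function label, and the feature sets are hypothetical, and the weak_heads parameter is included only to show what a strength stipulation would add (the model described here does not adopt it).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    name: str
    features: frozenset = frozenset()


@dataclass(frozen=True)
class Phrase:
    name: str
    features: frozenset = frozenset()


def label(a, b, weak_heads=frozenset()):
    """Return a label for the set {a, b}, or None if Labeling fails."""
    heads = [x for x in (a, b) if isinstance(x, Head)]
    if len(heads) == 1:
        # {H, XP}: the head labels, unless it is stipulated to be weak.
        head = heads[0]
        return None if head.name in weak_heads else head.name
    if not heads:
        # {XP, YP}: only features shared by both members can label.
        shared = a.features & b.features
        return "<" + ", ".join(sorted(shared)) + ">" if shared else None
    raise ValueError("{H, H} structures are outside this sketch")


# Hypothetical objects; the feature assignments are assumptions.
T = Head("T", frozenset({"Person"}))
vP = Phrase("vP")
Mary = Phrase("Mary", frozenset({"Person"}))
TP = Phrase("TP", frozenset({"Person"}))
ProgP = Phrase("ProgP")
```

With these objects, label(T, vP) succeeds when no head is weak, which is the situation that allows an in-situ object to converge, whereas label(T, vP, weak_heads={"T"}) fails. An argument paired with ProgP has no shared features and fails to label, while the same argument paired with TP labels through the shared Person feature.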
Figure 27
The simplest assumption is that there are no strong or weak heads in the syntax, thus
suggesting that Labeling Theory does not provide an explanation for the need for a
subject to be in the traditional specifier of TP position (at least not in certain cases). I do
not have a clear solution to this issue (which is the long-debated problem of the EPP), but
one possibility is that the requirement for an overt subject in languages such as English
is primarily a constraint on Spell-Out.
Richards (2016) develops what he calls Contiguity Theory, which takes the position
that movement operations can be influenced by phonological structures, so that syntax
and phonology are heavily connected. Richards proposes that in English, T is a suffix
that must follow a metrical boundary, and a subject in the TP provides this metrical
boundary. A metrical boundary is the edge of a metrical foot, where a foot contains
one or more syllables, one of which receives more stress than the others. In a language
such as Spanish, which does not require an overt subject, the vowel that precedes a
tense morpheme is stressed, and thus the syllable before the tense morpheme follows
a metrical boundary. In Spanish, a metrical boundary can occur within a word. For
example, in (35)a–b, the boldfaced tense morphemes follow a metrical boundary at the
end of the verbal root.
Richards (2016, p. 15) writes that in languages such as English, “metrical boundaries
occur only on complete words, which are in turn found in specifiers.” For example, in
(36), there is supposedly a metrical boundary at the edge of the subject there, which
precedes the verb containing the tense morpheme suffix -ed.
Note that Richards argues that phonological constraints have specific effects on a syntactic
derivation, and not necessarily only at Spell-Out, writing “the narrow syntax can
make reference to, for instance, metrical boundaries” (Richards, 2016, p. 27). Consider
how this approach can deal with examples in which T does not overtly follow a specifier
at Spell-Out. In (37)a–b, I assume that T raises at Spell-Out. But in the syntactic structure,
T still follows the subject, which has a metrical boundary. If the requirement for the affix
T to follow a metrical boundary applies at the level of narrow syntax, then these examples can be
accounted for.
31) Contiguity Theory, as developed in Richards (2016), is not necessarily compatible with the model developed in this
paper. For example, it makes extensive use of head movement and phrasal movement that my model would consider
to be suspect. In addition, the requirement that T follow a metrical boundary in English does not directly rule out a
non-argument from appearing in the typical specifier of TP position. Thus, (i)-(ii) are not directly ruled out, although
I note that these sound better to my ears than their counterparts without the adverbials; in particular (ii) sounds like
it might be possible (to my ears).
(i) *John expects quickly to arrive Mary.
(ii) *?Quickly was read the book.
Richards is able to limit the initial specifier position to arguments in English, thus ruling out examples such as
(i)–(ii), by means of complex proposals that the English affix T must follow a metrical boundary and that T must
be in the same prosodic domain as a goal that it Agrees with. T is a probe and a subject is its goal. If I understand
correctly, if the subject does not move to the specifier of the TP then it will be in a different prosodic domain from
its probe. If the subject moves to the specifier of TP, then both the probe and goal are in the same prosodic domain
and T follows a metrical boundary created by the subject. The notion that phonological constraints can account for
perplexing properties of language, such as the EPP property, may be promising, but the complexity of the Contiguity
Theory Approach is potentially problematic from the perspective of the Strong Minimalist Thesis. Further analysis is
required.
Figure 28
*Who Do You Think That Read the Book?
32) Chomsky (2015) develops an analysis of the that-trace effect. See Ginsburg (2016) for discussion of potential
problems with that analysis. If insertion into the Box requires IM, then it might be possible to claim that who
undergoes IM to the CP phase edge and then enters the Box, after which it is invisible to Labeling, which creates a
problem for Labeling of the TP. However, as noted above, requiring IM for insertion into the Box seems superfluous.
Also, following Chomsky (2015), if one assumes that a root is weak and requires Labeling via an {XP, YP} structure (a
position which I do not take), then the question arises of how a wh-phrasal object and a verbal root are labeled after
the wh-phrasal object is inserted into the Box. If an object and a root label before the object is inserted into the Box,
then it isn’t clear why Labeling couldn’t also occur before a wh-subject is inserted into the Box.
I suggest that the ill-formedness of that-trace effect constructions may have to do with
extra-syntactic factors applying at Spell-Out. First of all, when that is not pronounced,
the that-trace effect goes away. Chomsky (2015) relies on dephasing to account for
this (in the absence of that, the embedded CP is no longer a phase), but dephasing
is an extraordinarily complex process that also violates the No Tampering Condition.
A simpler assumption is that the that-trace effect depends on whether or
not C is pronounced overtly. A promising possibility is that the that-trace effect is not
syntactic, but has to do with phonological factors. Sato and Dobashi (2016) propose that
the that-trace effect results from constraints on prosodic phrasing. They propose that
there is a “PF condition” that “[f]unction words cannot form a prosodic phrase on their
own” (2016, p. 1), and that when that is followed by a trace, that ends up forming a
prosodic phrase by itself, which is not permitted. When that is followed by a subject, as
well as other types of phrases such as adverbials, it does not form a prosodic phrase on
its own, and there is no problem. It is also notable that the that-trace effect is not found
in certain English dialects as well as in many other languages. For example, Sobin (1987)
points out that some English speakers do not find some that-trace constructions to be
ill-formed. This suggests that the cause of the that-trace effect may not be syntactic in
nature. If the that-trace effect lacks a syntactic cause, then it needs to be accounted for at
Spell-Out. I leave in-depth examination of this issue for future work.
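The PF condition quoted above can be stated as a simple check over a prosodic phrasing. The following is a minimal sketch, assuming that the phrasing is supplied by hand as a list of word lists. It makes no claim about how Sato and Dobashi (2016) derive prosodic phrases, and the function-word list and the two example phrasings are hypothetical.

```python
# Hypothetical, deliberately tiny list of function words.
FUNCTION_WORDS = frozenset({"that", "the", "to"})


def violates_pf_condition(prosodic_phrases, function_words=FUNCTION_WORDS):
    """Return True if some non-empty prosodic phrase consists only of function words."""
    return any(
        bool(phrase) and all(word.lower() in function_words for word in phrase)
        for phrase in prosodic_phrases
    )


# Hand-supplied phrasings (assumptions, not derived by any algorithm).
# In the first, "that" is followed by a trace and ends up alone in its phrase.
that_trace = [["who", "do", "you", "think"], ["that"], ["read", "the", "book"]]
# In the second, "that" is phrased together with an overt subject.
that_subject = [["you", "think"], ["that", "John"], ["read", "the", "book"]]
```

Under these assumptions, violates_pf_condition(that_trace) is true because that forms a phrase on its own, whereas violates_pf_condition(that_subject) is false.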
6 Conclusion
In this paper, I have discussed Free Merge as implemented by a computer model. I presented
the basic components of this model, which attempts to take a “simple” approach
(although not necessarily as simple as possible) to language generation, dispensing with
complex mechanisms. I have shown that in general, Labeling, combined with phase
boundaries, is sufficient to constrain Free Merge. Also, note that Theta Theory and Case
Theory play no clear role in ruling out derivations, and Labeling alone is generally
sufficient to constrain Free Merge.
There is a certain amount of overgeneration that is a potential problem. In order
to deal with overgeneration, I needed to propose the Unique Feature Rule and No
Successive IM. If these are truly principles of language, they require further examination.
Overgeneration of ill-formed structures ideally should not occur or should be severely
limited, probably more so than presented in this paper. Furthermore, overgeneration of
well-formed structures is an issue, but the potentially problematic examples discussed
in this paper can possibly be eliminated at Spell-Out. The beauty of Free Merge is
that IM requires no trigger, and a variety of attested IM operations fall out from the
model. Note, however, that feature-driven IM has the advantage of doing away with this
overgeneration problem. On the other hand, feature-driven Merge is complicated by the
need for a variety of features to trigger IM. Whether or not feature-driven Merge should
truly be eliminated from the theory requires further examination.
Funding: This work was supported by the Japan Society for the Promotion of Science Grant-in-Aid for Scientific
Research (C), Grants #20K00664 and #24K03964.
Acknowledgments: I would like to thank Kleanthes K. Grohmann and two anonymous reviewers for their
extremely helpful comments. I would also like to thank Hiroshi Terada and Sandiway Fong for helpful comments and
discussion. All errors are my own.
Competing Interests: The author has declared that no competing interests exist.
Author Note: This paper is an extensively revised and expanded version of Ginsburg (2022), a proceedings paper
from the Joint Conference on Language Evolution (JCole) Kanazawa, Japan 2022.
Trees for this paper were created with a Tree Drawing Program that I created. This program runs directly in the
browser and it is available for anyone to use. Note that there are bugs.
Tree Drawing Program: https://ginsburg-lab.h.kyoto-u.ac.jp/JGTreeDrawingProgram.html
Data Availability: This paper discusses derivations that were modeled with a computer program. The complete
derivations that are generated by this computer program are available in the Supplementary Appendix (see Ginsburg,
2024).
Supplementary Materials
The Supplementary Materials consist of webpages (see Ginsburg, 2024) that display:
• information about how the computer model that the author used works
• the complete derivations that were produced by the computer model that is presented in this
paper
References
Abney, S. (1987). The English noun phrase in its sentential aspect [Doctoral dissertation].
Massachusetts Institute of Technology.
Ackema, P., & Neeleman, A. (2001). Context-sensitive spell-out and adjacency [Unpublished
manuscript]. Utrecht University and University College London.
Baker, M. (1988). Incorporation. Chicago University Press.
Berwick, R. (2011). All you need is Merge: Biology, computation, and language from the bottom up.
In A. M. Di Sciullo & C. Boeckx (Eds.), The biolinguistic enterprise: New perspectives on the
evolution and nature of the human language faculty (pp. 461–491). Oxford University Press.
Blümel, A. (2018). Q-float in West Ulster English and labeling. Yearbook of the Poznan Linguistic
Meeting, 4(1), 55–73.
Blümel, A., & Holler, A. (2021). DP, NP, or neither? Contours of an unresolved debate. Glossa: A
Journal of General Linguistics, 7(1), Article 153. https://doi.org/10.16995/glossa.8326
Bobaljik, J. D. (2008). Where’s phi? Agreement as a postsyntactic operation. In D. Harbour, D.
Adger & S. Béjar (Eds.), Phi theory: Phi features across modules and interfaces (pp. 295–328).
Oxford University Press. https://doi.org/10.1093/oso/9780199213764.003.0010
Boeckx, C., & Grohmann, K. K. (Eds.) (2003). Multiple wh-fronting. John Benjamins.
Boeckx, C., & Grohmann, K. K. (2007). Putting phases in perspective. Syntax, 10(2), 204–222.
https://doi.org/10.1111/j.1467-9612.2007.00098.x
Borer, H. (2005a). In name only (Structuring sense, Vol. 1). Oxford University Press.
Borer, H. (2005b). The normal course of events (Structuring sense, Vol. 2). Oxford University Press.
Borer, H. (2013). Taking form (Structuring sense, Vol. 3). Oxford University Press.
Bošković, Ž. (2002). On multiple wh-fronting. Linguistic Inquiry, 33(3), 351–383.
https://doi.org/10.1162/002438902760168536
Bruening, B. (2009). Selectional asymmetries between CP and DP suggest that the DP hypothesis is
wrong. University of Pennsylvania Working Papers in Linguistics, 15(1), 27–35.
https://repository.upenn.edu/pwpl/vol15/iss1/5
Bruening, B. (2020). The head of the nominal is N, not D: N-to-D Movement, Hybrid Agreement,
and conventionalized expressions. Glossa: A Journal of General Linguistics, 5(1), Article 15.
https://doi.org/10.5334/gjgl.1031
Bruening, B., Dinh, X., & Kim, L. (2018). Selection, idioms, and the structure of nominal phrases
with and without classifiers. Glossa: A Journal of General Linguistics, 3(1), Article 42.
https://doi.org/10.5334/gjgl.288
Carstens, V. (2003). Rethinking complementizer agreement: Agree with a case-checked goal.
Linguistic Inquiry, 34(3), 393–412. https://doi.org/10.1162/002438903322247533
Cecchetto, C., & Donati, C. (2015). (Re)Labeling. MIT Press.
Chen, V. (2018). The raising-to-object construction in Puyuma and its implications for a typology of
RTO. Glossa: A Journal of General Linguistics, 3(1), Article 111. https://doi.org/10.5334/gjgl.423
Cheng, L. (2000). Moving just the feature. In U. Lutz, G. Müller, & A. von Stechow (Eds.), Wh-scope
marking (pp. 77–99). John Benjamins.
Chomsky, N. (1957). Syntactic structures. Mouton.
Chomsky, N. (1981). Lectures on government and binding. Foris.
Chomsky, N. (1986). Knowledge of language: Its nature, origins, and use. Praeger.
Chomsky, N. (1993). A Minimalist Program for linguistic theory. In K. Hale & S. J. Keyser (Eds.),
The view from building 20: Essays in linguistics in honor of Sylvain Bromberger (pp. 1–52). MIT
Press.
Embick, D., & Marantz, A. (2008). Architecture and blocking. Linguistic Inquiry, 39(1), 1–53.
https://doi.org/10.1162/ling.2008.39.1.1
Embick, D., & Noyer, R. (2001). Movement operations after syntax. Linguistic Inquiry, 32(4), 555–
595. https://doi.org/10.1162/002438901753373005
Epstein, S. D., Kitahara, H., & Seely, T. D. (2014). Labeling by Minimal Search: Implications for
successive-cyclic A-Movement and the conception of the postulate “phase”. Linguistic Inquiry,
45(3), 463–481. https://doi.org/10.1162/LING_a_00163
Epstein, S. D., Kitahara, H., & Seely, T. D. (2016). Phase cancellation by external pair-merge of
heads. Linguistic Review, 33(1), 87–102. https://doi.org/10.1515/tlr-2015-0015
Epstein, S. D., Kitahara, H., & Seely, T. D. (2022). A simpler solution to two problems revealed about
the composite operation Agree. In S. D. Epstein, H. Kitahara, & T. D. Seely (Eds.), A Minimalist
theory of simplest Merge (pp. 111–115). Routledge.
Fanselow, G., & Ćavar, D. (2002). Distributed deletion. In A. Alexiadou (Ed.), Theoretical approaches
to universals (pp. 65–107). John Benjamins.
Fong, S., & Ginsburg, J. (2014). A new approach to tough-constructions. In R. E. Santana-LaBarge
(Ed.), Proceedings of the 31st West Coast Conference on Formal Linguistics (pp. 180–188).
Cascadilla Proceedings Project.
Fong, S., & Ginsburg, J. (2019). Towards a Minimalist Machine. In R. E. Berwick & E. P. Stabler
(Eds.), Minimalist parsing (pp. 16–38). Oxford University Press.
https://doi.org/10.1093/oso/9780198795087.003.0002
Fong, S., & Ginsburg, J. (2023). On the computational modeling of English relative clauses. Open
Linguistics, 9(1), Article 20220246. https://doi.org/10.1515/opli-2022-0246
Gallego, Á. J. (2010). Phase theory. John Benjamins.
Gallego, Á. J. (2017). Remark on the EPP in Labeling Theory: Evidence from Romance. Syntax,
20(4), 384–399. https://doi.org/10.1111/synt.12139
Georgi, D., & Müller, G. (2010). Noun-phrase structure by reprojection. Syntax, 13(1), 1–36.
https://doi.org/10.1111/j.1467-9612.2009.00132.x
Ginsburg, J. (2016). Modeling of problems of projection: A non-countercyclic approach. Glossa: A
Journal of General Linguistics, 1(1), Article 7. https://doi.org/10.5334/gjgl.22
Ginsburg, J. (2022). Constraining free Merge: Labeling and the theta-criterion. In A. Ravignani, R.
Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D.
Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), Proceedings of the Joint Conference on Language
Evolution (JCole) Kanazawa, Japan 2022, 237–244.
Ginsburg, J., & Fong, S. (2018, June 10). On constraining Free Merge. 43rd Meeting of the Kansai
Linguistic Society. Konan University, Kobe, Japan.
https://ginsburg-lab.h.kyoto-u.ac.jp/WebPresentations/KLS43Pres-vers7.pdf
Ginsburg, J., & Fong, S. (2019). Combining linguistic theories in a Minimalist Machine. In R. E.
Berwick & E. P. Stabler (Eds.), Minimalist parsing (pp. 39–68). Oxford University Press.
https://doi.org/10.1093/oso/9780198795087.003.0003
McDaniel, D. (1989). Partial and multiple wh-movement. Natural Language and Linguistic Theory, 7,
565–604. https://doi.org/10.1007/BF00205158
Miyagawa, S. (2005). On the EPP. In M. McGinnis & N. Richards (Eds.), Perspectives on phases (pp.
201–236). MIT Working Papers in Linguistics.
Mizuguchi, M. (2017). Labelability and interpretability. Studies in Generative Grammar, 27(2), 327–
365. https://doi.org/10.15860/sigg.27.2.201705.327
Müller, G. (2004). Phrase impenetrability and wh-intervention. In A. Stepanov, G. Fanselow, &
R. Vogel (Eds.), Minimality effects in syntax (pp. 289–326). Mouton de Gruyter.
https://doi.org/10.1515/9783110197365
Oishi, M. (2015). The hunt for a label. In H. Egashira, H. Kitahara, K. Nakazawa, T. Nomura, M.
Oishi, A. Saizen, & M. Suzuki (Eds.), Untiring pursuit of better alternatives (pp. 222–334).
Kaitakusha.
Pesetsky, D., & Torrego, E. (2001). T-to-C movement: Causes and consequences. In M. Kenstowicz
(Ed.), Ken Hale: A life in language (pp. 355–426). MIT Press.
Pesetsky, D., & Torrego, E. (2011). Case. In C. Boeckx (Ed.), The Oxford handbook of linguistic
Minimalism (pp. 52–72). Oxford University Press.
Pires, A. (2006). The Minimalist syntax of defective domains: Gerunds and infinitives. John
Benjamins.
Platzack, C. (2013). Head movement as a phonological operation. In L. L. Cheng & N. Corver (Eds.),
Diagnosing syntax (pp. 21–43). Oxford University Press.
Postal, P. M. (1974). On raising: One rule of English grammar and its theoretical implications. MIT
Press.
Radford, A. (2016). Analyzing English sentences (2nd ed.). Cambridge University Press.
Raposo, E. (1987). Case theory and Infl-to-Comp: The inflected infinitive in European Portuguese.
Linguistic Inquiry, 18(1), 85–109. https://www.jstor.org/stable/4178525
Richards, M. D. (2006). Object shift, phases, and transitive expletive constructions in Germanic.
Linguistic Variation Yearbook, 6(1), 139–159. https://doi.org/10.1075/livy.6.07ric
Richards, M. D. (2007). On feature inheritance: An argument from the Phase Impenetrability
Condition. Linguistic Inquiry, 38(3), 563–572. https://doi.org/10.1162/ling.2007.38.3.563
Richards, M. D. (2011). Deriving the edge: What’s in a phase? Syntax, 14(1), 74–95.
https://doi.org/10.1111/j.1467-9612.2010.00146.x
Richards, N. (2016). Contiguity theory. MIT Press.
Roberts, I. (2010). Agreement and head movement: Clitics, incorporation, and defective goals. MIT
Press.
Roberts, I. (2011). Head movement and the minimalist program. In C. Boeckx (Ed.), The Oxford
handbook of linguistic Minimalism (pp. 195–219). Oxford University Press.
Rudin, C. (1988). On multiple questions and multiple wh-fronting. Natural Language and Linguistic
Theory, 6, 445–501. https://doi.org/10.1007/BF00134489
Sato, Y., & Dobashi, Y. (2016). Prosodic phrasing and the that-trace effect. Linguistic Inquiry, 47(2),
333–349. https://doi.org/10.1162/LING_a_00213