Dear Stevan,

We have now revised our BBS manuscript "Toward a mechanistic psychology of dialogue" and would like to resubmit it to the journal. Following your advice we have made major revisions in response to the very helpful and positive comments of the 9 referees. We describe how we have dealt with each of their ‘in bold’ comments in detail below. First, here is a summary of how we have dealt with the main points that you raised in the letter.

1. SIGNIFICANCE: There was some concern as to whether the paper would elicit wide-ranging commentaries from across the BBS specialities. In the revision we have done two things to rectify this. First, following a suggestion from referee 6, we have included an additional table (Table 2, section 10) in which we contrast our dialogue-based approach to language processing with the traditional monologue-based approach. This should make it easier for commentators to identify points of commentary (see response 6.1 below). Second, we have included a new section 9 that briefly sketches the wider implications of our account for such areas as automaticity in social interaction, imitation, language acquisition, theory of mind, and human-computer interaction. We also make reference to the literature on automaticity within social psychology in section 3.2, when discussing channels of alignment. This was in response to referee 9 (see response 9.3).

In fact, having presented this work at various meetings we have been contacted by quite a number of researchers in the general areas of cognitive science, linguistics and philosophy who have expressed interest in writing commentaries on the paper.

2. PRESENTATION: There were two main points about presentation that the referees raised. The first concerned the interpretation and discussion of Figures 2 & 3. In response to referees 3, 7 and 9 we have revised the figures and also substantially revised section 3 of the paper, in which the figures are discussed. The second concern was with our discussions of ‘situation models’ and ‘alignment’. In response to referees 2 and 3 we have clarified what we mean by a situation model and given a more extensive discussion of the nature of the alignment process at this and other levels (sections 2.1 and 2.2 of the revised paper and responses 2.2, 2.3, 3.3 & 6.2).

3. SCHOLARSHIP: We have taken into account the literature that the referees very helpfully pointed to and also included additional references to relevant work that has come out since the original submission. There is now nearly 30% more work referred to in the paper.

4. REASONING: A few referees raised specific points about our argument. In particular, referee 1 was concerned about the arguments in section 7 of the original. We have completely revised this section, taking into account many of the helpful suggestions from this referee (see responses 1.1-1.3, 1.5-1.12 below). Other small points were raised by referee 5 and we have taken these into account (see responses 5.1-5.3 below).

Detailed responses:

Referee 1: Jonathan Ginzburg

1.1. role and structure of context in dialogue interpretation receives little attention

We have handled this in the revised version in Section 7.1, with a more extended treatment of examples, taken from work by Ginzburg, Clark, and Morgan.

1.2. Just and Carpenter (and incrementality)

At the end of section 7.1, we have downplayed the importance of incrementality. We still regard it as highly relevant to accounts of dialogue, but we accept that it is equally relevant to accounts of monologue (including reading).

1.3. model is not presented in sufficient detail repair/clarification, which hinge precisely on mismatches between interlocuters, are given a cursory discussion weakest part of the paper, and requires significant modification

We have significantly revised and clarified the model in sections 2-4, and have related it to linguistic theory in a substantially revised section 7. Section 7.1 now includes an extensive discussion of repairs and clarifications. We are very grateful to the reviewer for bringing these issues to our attention, as they are all highly relevant to our model, and some provide us with considerable support for it. In particular, we now argue that the parallelism constraints on non-sentential utterances in dialogue contexts follow directly from the process of alignment.

1.4 Ross 1969 and Morgan 1973

These are referred to at the beginning of Section 7.1. We particularly acknowledge the importance of Morgan’s work in relation to parallelism constraints on short answers.

1.5 work by formal and computational semanticists precisely on developing theories of information states and their dynamics in dialogue [see e.g. work within the EU TRINDI project, and the annual series of conferences on the formal semantics and pragmatics of dialogue (MUNDIAL, TWENDIAL, AMSTELOGUE, GOTALOG)].

We have included references to recent work in this tradition in Section 7.1.

1.6. One problem with this section is that the sole example considered by the authors (A: Mary donated B: ...A book to John) is unnatural

We fully accept this criticism, and now quote naturally occurring examples (e.g., from Ginzburg, Clark, and Garrod and Anderson).

1.7. some evidence from a corpus like the British National Corpus, or the London Lund etc concerning the frequency and nature of such utterances would be useful.

We now discuss recent corpus work by Fernandez and Ginzburg, which provides a rough indication of the prevalence of different kinds of non-sentential utterances. This work demonstrates the importance of such phenomena in real dialogue.

1.8. interactive alignment (IA) model on its own does not quite provide a full explication of such utterances …the authors do not discuss this type of interaction

See Response 1.3.

1.9 Poesio and Traum, 1997

We have referred to this work in section 7.1.

1.10 certain key aspects of the model are not presented in enough detail to enable one to draw strong conclusions or counterexamples …. Whether these are counterexamples to the IA model can be judged only once it is stated in a more algorithmic fashion. Ginzburg 1997, Ginzburg and Cooper 2001

See Responses 1.3 and 1.5. We have referred to a number of examples similar to those discussed in the review in Section 7.1 and showed how they are consistent with our model.

1.11 This section (e.g., 7.2) is problematic in a number of ways …repeated discussion of a single example …. unclear what relevance .. these have for the various cases of dialogue ellipsis where there is no such intuition, e.g. short answers, sluicing, clarification ellipsis,

See Responses 1.3 and 1.6.

1.12 no argument for this is provided and no indication …. Ginzburg and Sag 2000

As noted in Response 1.2, we have downplayed the importance of incrementality for the model (see end of Section 7.1). Our preference is for a grammar that treats well-formed dialogue turns as constituents, and we mention CCG as a theoretical framework that is consistent with this. We point out the value of integrating context into the grammar earlier (second paragraph, Section 7.1).

1.13 "according to standard linguistics, many of the utterances are not grammatical sentences (e.g. only one of the first six contains a verb)". This criticism is a bit vague

…. see Ginzburg, Purver, and Sag

We have removed the words "according to standard linguistics." (beginning of Section 2). Here we are simply pointing out that the language looks disorganized (compared to well-crafted monologue), and are not making a point about linguistic theory.

1.14 Clark 1996: iterative conceptions of common ground

We think that we were simply not clear enough in our original discussion on this point. We have now rectified this, and incorporate a reference to Barwise (1989). This discussion is now at the beginning of Section 4.1.

1.15. Ginzburg 1999

We deal with this point in Footnote 6 (Section 7.1).

1.16. Johnson and Postal 1980, Ades and Steedman 1982, Kaplan and Bresnan 1982, Gazdar, Klein, Pullum, and Sag 1985, Pollard and Sag 1994), I think the reference to Jackendoff 1999 as a basic reference concerning constraint-based grammar is a bit misleading chronologically

We completely accept that Jackendoff’s framework is not the first to employ multiple generative components, and have now included a number of earlier references in Section 7.2. However, Jackendoff’s account integrates particularly well with the interactive alignment model. We make this clear in the second paragraph of Section 7.2 and Footnotes 9 and 10.

Referee 2: Ellen Bard

2.1. sell short contributions by others … Clark and Marshall, 1981 … Brown and Dell (1987)

At the beginning of Section 4.1, we explicitly discuss Clark and Marshall’s contribution to the theory of common ground. Apart from giving appropriate credit to their work, our discussion allows us to contrast it with our new notion of implicit common ground. Later in this section, we credit Brown and Dell for pointing out that speakers who have similar representations to listeners may appear to be sensitive to listener knowledge when in fact they are only concerned with their own mental states.

2.2. "situation models" is critical to the argument, but is not defined here.

In Sections 2.1 and 2.2, we have considerably expanded our discussion of situation models and of how interlocutors align aspects of their situation models.

2.3. "alignment" … the authors think otherwise and they should defend their view.

Our view of priming is that it underpins the alignment mechanism and should not simply be regarded as a behavioral effect. Although priming occurs between items that have similar but not identical representations (e.g., semantic priming), it is strongest between items that have identical representations (repetition priming). Our view is that alignment is primarily driven by repetition priming. We first discuss this at the beginning of Section 2.2. We expand the discussion in Sections 2.3, 2.4, and 3.2 (particularly the final paragraph).

Referee 3: Art Glenberg

3.1. whether commentary will be elicited from a broader readership

We take up this issue at the beginning of the letter.

3.2. more thorough description of the task

We provide this in Footnote 1, at the beginning of Section 2.

3.3. what the authors mean by a "mental model" and alignment

The first part is the same as Ellen Bard’s point 2.2 (note that we now consistently use the term situation model). We explicitly define alignment when we introduce the notion in the penultimate paragraph of the introduction to Section 2.

3.4. Goldinger

We very much thank Art Glenberg for bringing this extremely interesting paper to our attention. As we point out in the penultimate paragraph of Section 3, it provides support for alignment at a sound-based level and for the assumption of parity between comprehension and production. We also relate it to Ginzburg’s work on clarification ellipsis in Section 7.1.

3.5. not certain as to the two levels being referred to.

We clarify the contrast in the second paragraph of section 2.4 and in Footnote 3.

3.6 missing an important point or Figure 3 is giving me the wrong idea

We cover this at the beginning of the letter.

Referee 4: Bernhard Hommel

4.1. slightly more extended advanced organizer somewhere at the end of 2.0.

We completely agree that such an organizer would be helpful. We decided to include it at the end of Section 1 (final three paragraphs), as this seemed to be the most appropriate location for it after revision.

4.2. make explicit that A and B in Figures 2 and 3 refer to different people.

We have now done this in the figure captions.

Referee 5: Gerard Kempen

5.1 this is a non sequitur

Originally we said "As a result, these accounts offer limited and inadequate theories of the mechanisms that underlie language processing in general." We have changed this to the slightly weaker claim "As a result, these accounts may only offer limited theories of the mechanisms that underlie language processing in general."

5.2. Levelt would argue, I presume, that the standard sequence of levels is traversed FOR EVERY INCREMENT separately.

We disagree with this point, because "That picture" presumably functions as a single increment. We argue that its acoustic form is produced before it has been assigned a grammatical function. However, our argument is made clearer by replacing "That picture" with the single word "Pictures," in order to avoid issues about the syntactic combination of the words. We have made this change in the text.

5.3. "in monologue there is no opportunity to call on aligned linguistic representations". This may be interpreted as denial of the authors' earlier claim (Section 6) that during monitoring speakers engage in self-alignment.

In the new Section 8 (paragraph 4), we now state that the problem in monologue is that there is no opportunity to align with your addressee. Self-alignment via monitoring still takes place, but is no substitute for other-alignment.

Referee 6: Art Markman

6.1. While I am enthusiastic about this paper overall, I think the authors could do more to make this paper easier to comment on. Two things would be particularly helpful in this regard. First, it would be helpful to have some sort of summary table that contrasted the authors' theory with traditional linguistic assumptions

In the revised version, we include such a summary table (Table 2) in the concluding section 10.

6.2. describe the mechanisms that might achieve alignment at different levels of representation

One of our main changes has been to increase the explication of situation models (in our responses 2.2 and 3.3 above). As part of this discussion, we have considered the mechanisms of alignment at different levels (see Sections 2.2 and 2.3). However, Section 3.2 (first paragraph) points out that we do not have fully specified theories of alignment for every level. We now explicitly consider Boroditsky’s important work on structural priming from the spatial to the temporal component of situation models, and we thank Art Markman for bringing this to our attention.

6.3. Gentner and Markman (1997, American Psychologist; Markman & Gentner, 1993, Cognitive Psychology).

We address these papers as part of our extended discussion of the alignment of situation models, at the end of Section 2.2.

6.4. flesh out the alignment processes at least to the degree that it will enable people to write specific commentaries

See response 6.2.

Referee 7: Keith Rayner

7.1 who would take issue with the underlying theoretical claims being made here? … do a better job of clearly spelling out which points that they make would be accepted by most people and which claims are controversial.

Throughout the paper, we have tried to indicate why our account will be seen as controversial (see beginning of this letter). One group who will be troubled by the implications of this work are those who are committed to a monological approach, either for theoretical or methodological reasons. Levelt and his collaborators will have particular problems with our proposal that their models of isolated production will not transfer easily into dialogue, as will other researchers such as Bock and Garrett. The mechanistic basis for our account is highly problematic for the far more "intentional" approach to dialogue advocated by Clark and his colleagues, and for many sociolinguistic approaches to dialogue, such as that of conversation analysts like Schegloff and Sacks. Note that we are not attempting to propose a series of novel "micro-mechanisms" (though the claim that alignment at one level leads to more alignment at other levels has not, to our knowledge, been made before). Instead, our intention is to show how relatively well established mechanisms can explicate aspects of real language processing in a highly novel way.

7.2 didn't find Figures 2 and 3 very helpful in trying to figure out the authors' claims, and I don't think that they do much to enlighten their readers as to what exactly is supposed to be conveyed via these figures … provide a much tighter description of what the figures are intended to convey.

We cover this at the beginning of the letter.

7.3. important recent paper by Ferreira and Dell (2000, Cognitive Psychology).

This was simply an oversight on our part. We now mention it at the beginning of Section 4.2 when discussing Brown and Dell.

7.4. Maybe all that is needed is a boost at the end again saying why understanding dialogue is so important.

The revised version includes a new section (section 8) that explicitly relates dialogue and monologue, in order to demonstrate that monologue can be regarded as a special case and that understanding dialogue is fundamental. We also point out that monologue should be regarded as one end of a continuum. Also, it now does not appear as the final section of the paper, and therefore is not given undue emphasis.

7.5. I do think that a revision is needed to make certain points more apparent/explicit. I also think the authors need to do a better job of making whatever their controversial points are more explicit.

We have attempted to do this throughout the revised paper. In particular, we explicitly contrast our claims with traditional claims in Table 2.

Referee 8

8.1. apparent ignorance of a substantial literature that is relevant to its central theme

We have now made reference to a much broader range of literature (indeed, our reference section has increased by about a third). Within this, we include a number of additional references to talk in interaction.

Referee 9

9.1. What kind of link is this exactly?

We have addressed this issue in Section 3.2, where we discuss what we now call channels of alignment.

9.2. characterization of monologue vs. dialogue needs to be clarified.

The revision includes a new section 8 that explicitly addresses this issue (see response 7.4).

9.3. additional discussion of the implications of the authors' arguments for the domains in cognitive science that study linguistic interaction

We include a new section 9 that briefly sketches the wider implications of our account, for such areas as automaticity in social interaction, imitation, language acquisition, theory of mind, and human-computer interaction. We also make reference to the literature on automaticity within social psychology in section 3.2, when discussing channels of alignment. The reviewer is highly positive about the broad relevance of our paper, and the inclusion of additional discussion provides many examples of this.

9.4. Clark & Wilkes-Gibbs, 1986 need to be cited in the first paragraph of 2.2 and in the first paragraph of 2.2.1.

We agree that Clark and Wilkes-Gibbs should be cited at these points and now do so (note that section 2.2.1 is now section 2.3).

9.5. it is odd to cite Isaacs & Clark 1987 as having anything to do with producing a particular emotional reaction in a listener or persuading a listener

We accept that this reference was inappropriate and have removed it.

9.6. Schober's 1993 paper on spatial perspective-taking

As suggested, we have included a discussion of this paper at the very end of section 4. Note that we have also referred to the work in section 2.2, and in section 2.4 Footnote 4.

9.7 DuBois, J. W. (1974).

Because we included reference to Ross (1969) and discussion of Morgan (1973), we did not feel it necessary to include further reference to such work.
 
 

We hope that the revision produces a paper that is now appropriate for BBS commentary.

Yours truly,

Martin and Simon