
Grounding in computer-supported collaborative problem solving



Synthesis


The observations indicate the relationship between grounding and problem solving. Interestingly, the acknowledgment rate seems more related to problem solving variables (e.g. low acknowledgment rates are associated with long-term cross-redundancy) than to other dialogue variables (e.g. frequency of talk, delay or symmetry). The content of grounding mechanisms is related to the problem solving stage: task management during planning episodes, facts during data acquisition, and inferences during data analysis/synthesis. The dominant mode of grounding is related to the content being negotiated: the MOO is largely preferred to the whiteboard for management, the inverse is true for facts, and inferences are grounded both in the MOO and on the whiteboard, but still more often in the MOO. The division of labor also varies across these episodes: data acquisition is essentially individual, except when the data seem particularly important, while the two other processes, planning and synthesis, are more collective. These three parameters (content, mode and division of labor) define pair profiles which we related to problem solving strategies.

Regarding the specific role of the whiteboard, the observations presented in this report contradict our initial expectations. Our global hypothesis was that the whiteboard would help to disambiguate MOO dialogues. Disambiguation could be performed by simple deictic gestures or by drawing explanatory graphics. We observed almost no deictic gestures, probably for four reasons: (1) the MOO dialogues contain few spatial references ('there', 'here', ...) and few pronouns referring to an antecedent outside the utterance ('his', 'she', ...); (2) the sender cannot simultaneously type the message and point on the whiteboard; (3) the receiver cannot look at the MOO window and at the whiteboard window at the same time; (4) the partner's cursor was not visible on the whiteboard. We also observed very few explanatory graphics (4 timelines and 3 graphs across 20 pairs).

Actually, our observations reverse the expected functional relationship between the dialogues and the whiteboard. Most cross-modality interactions are oriented towards the whiteboard. In talk/whiteboard interactions, information is sometimes negotiated before being put on the whiteboard. Grounding is not really achieved through the whiteboard; rather, grounding appears as a pre-condition for displaying information in a public space. Conversely, whiteboard/talk interactions often aim to ground the information put on the whiteboard ("Why did you put a cross on...?", "What do you mean by..."). We also observed that pairs with a structured whiteboard have a higher acknowledgment rate for inferences (in any mode).

If the whiteboard is often the central space of coordination, it is probably because it retains the context, as suggested by Whittaker et al. (1995). This context is established at the cognitive level: the whiteboard increases mutual knowledge with respect to what has been done and how to do the rest, both during data acquisition and data synthesis. The context is not established at the linguistic level: the mutual understanding of MOO utterances does not seem to rely on the mutual visibility of whiteboard information. We even observed several cases in which two different contexts are established, i.e. the subjects participate in parallel in two conversations, one on the whiteboard and the other in MOO dialogues.

If experienced MOO users can participate in parallel conversations, it means that they maintain distinct contexts of interaction. If the context were unique, the interwoven turns reported in the previous section would lead to complete misunderstanding, which was not the case. The existence of multiple concurrent contexts appears to be an important avenue for research. Intuitively, it may provide greater flexibility both in negotiation and in problem solving, but this hypothesis should be validated; alternatively, the extra cognitive load needed to disambiguate among several contexts may degrade conversational performance. The existence of several contexts might also bring into question current theories of situated cognition in which context is perceived as a whole.

The ability to maintain multiple contexts is not a property of the individuals, but of the whole user-MOO-user cognitive system. The semi-persistence of information displayed on the MOO modifies the communication constraints, probably by reducing the cognitive load of context maintenance. This confirms the relevance of distributed cognition theories in studies of computer-supported collaborative problem solving.



This distributed cognition perspective also enables us to generalize the observations from different pairs as different configurations of a similar system. The abstract cognitive system is described by the [function X tool] matrix below. Basically, the task involves six functions (collecting facts, sharing facts, sharing inferences, storing facts, storing inferences and coordinating action) and four tools (MOO dialogue, MOO action, whiteboard and merging/reading notebooks). Table 6 indicates how these different tools support different functions.

Function             MOO dialogue   MOO action   Whiteboard   Notebooks

Collecting facts                    X
Sharing facts        X                           X            X
Sharing inferences   X                           X
Storing facts                                    X            X
Storing inferences                               X
Coordinating action  X              X            X

Table 6: Configuration of a cognitive system: matrix of possible allocations of functions to tools.

The matrix in Table 6 is theoretical. A matrix corresponding to an actual pair has to be less redundant: for instance, if a pair communicates all facts through dialogues, the whiteboard will be globally more available for inferences; if a pair exchanges many facts through notebooks, it will communicate fewer facts through dialogues, and so forth. The actual [function X tool] matrix varies from one pair to another. It may also vary within a pair as the collaboration progresses, one function being for instance progressively abandoned as the detectives become familiar with another one. This plasticity, this ability to self-organize along different configurations, justifies the description of a pair as a single cognitive system.
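To make this notion of configuration concrete, here is a minimal Python sketch. The function and tool names follow Table 6; representing a pair's actual profile as a subset of the theoretical matrix is our illustration, not part of the original analysis.

```python
# Table 6 as a data structure: each function maps to the set of tools
# that can support it in the theoretical (maximally redundant) matrix.
THEORETICAL_MATRIX = {
    "collecting facts":    {"MOO action"},
    "sharing facts":       {"MOO dialogue", "whiteboard", "notebooks"},
    "sharing inferences":  {"MOO dialogue", "whiteboard"},
    "storing facts":       {"whiteboard", "notebooks"},
    "storing inferences":  {"whiteboard"},
    "coordinating action": {"MOO dialogue", "MOO action", "whiteboard"},
}

def is_valid_configuration(pair_matrix):
    """Check that a pair only allocates functions to tools allowed by Table 6."""
    return all(tools <= THEORETICAL_MATRIX.get(function, set())
               for function, tools in pair_matrix.items())

# Hypothetical pair: facts grounded through dialogue only, leaving the
# whiteboard globally more available for inferences (the trade-off above).
pair_a = {"sharing facts": {"MOO dialogue"},
          "sharing inferences": {"whiteboard"}}
assert is_valid_configuration(pair_a)
```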

This plasticity raises a fundamental question, however. These experiments revealed that small details, for instance a syntactic constraint in a MOO command or a sub-optimal feature of the whiteboard, may change the active [function X tool] matrix. Some apparently minor system features seem to cause major reconfigurations of how people interact. It is hence likely that we would observe different results with a slightly modified design. This is a methodological problem. Let us illustrate its importance with three examples.



  • With the chosen task, the main features of the whiteboard seem to be the persistence of information and the possibility to organize the information spatially, rather than its intrinsic graphical features. Actually, it would be fairly easy to augment the MOO with these features: the MOO window could include a pane in which information would be persistent, in which users could directly paste an utterance from the MOO, move it to a specific location and add pre-defined marks to existing notes. The basic distinction we drew in this research between MOO interactions and whiteboard interactions would thus be erased by re-designing the system. And if, for instance, in this re-designed system, the subjects could not paste their own utterances but only their partner's utterances, we would have, by definition, an acknowledgment rate of 100% in talk/whiteboard interactions.

  • The distinction between the MOO notebook and the whiteboard is also arbitrary: instead of displaying the notebook content in the MOO window, we could have decided to display it on the whiteboard. The display could even be structured 'by suspect'.

  • We explained that the small number of acknowledgments of action or by action was due to (1) the strict conditions for such acknowledgments in terms of mutual visibility, co-presence and MOO expertise (reasoning about what the other can see) and (2) the semantics of action being quite reduced in the current system design. Both aspects could be different: (1) the MOO could inform A more systematically about what B can see, could make the consequences of actions visible from different rooms, etc.; (2) one could enrich the semantics of action by including commands at the task knowledge level (such as putting handcuffs on a suspect).

These examples show that the CMC system is not just a neutral communication channel. It carries out some functions of the problem solving process. Hence, if we want to abstract observations beyond the particular technological environment that has been used, we have to reason from a distributed cognition perspective, i.e. to consider the software tools and the users as components of a single cognitive system. Without this systemic view, we would continue to ask why a cyclist moves faster than a runner, despite a similar rate of leg movements per minute.

We hence attempted to express our results in terms which are more general than the features of the particular technical components (persistence, shared visibility, ...). In this distributed cognitive system, the actual allocation of a function to a tool depends on three criteria: the operation (acquiring, sharing, storing, comparing, ...), the information (facts, inferences, management, meta-communication, ...) and the tool itself. We consider here only the 'sharing' operation. The choice of a tool (or mode) for grounding information is influenced by the difference between two levels: how much the information is already grounded (Prob.grounded) and how much it has to be grounded (Need.grounding). If the difference is small, i.e. if the information is already shared more or less as much as it should be, the probability is lower that the subjects make additional efforts towards grounding it. This probability also depends on how much effort is required. It may be the case that a small difference leads to a grounding act because this act is very cheap, or, conversely, that a larger difference does not lead to a grounding act because its cost is too high. Hence the difference is compared to the cost of grounding the information. The relationship between these three parameters is expressed by the following expression:

Prob.grounding.act [i, m] ≈ (Need.grounding [i] - Prob.grounded [i, m]) / Cost.grounding [m]



This formula does not express a real equation, since none of its parameters can be accurately measured. It expresses semi-quantitative relations, i.e. how the probability of a grounding act for information i through medium m (Prob.grounding.act [i, m]) varies if one of the following parameters increases or decreases (a computational sketch follows the parameter list below):

  • Need.grounding (i) is the necessity that information i be grounded for solving the task, which corresponds to Clark and Wilkes-Gibbs's concept of the grounding criterion. It can also be expressed as the cost of non-grounding, i.e. the probability that the task is not solved (or takes more time) if information i is not grounded. In these protocols, only the final inference "the killer is XX" has to be agreed upon. All other interactions are instrumental to that goal. The different categories of content vary according to the cost of non-grounding: how dramatically the problem solution is affected if some information is not received or is misunderstood. For instance, misunderstanding one's partner's suggestion regarding conversation rules is impolite but often not dramatic for the success of the task. The importance of information is often a function of its persistence, i.e. how long it will remain valid. It is often not justified to engage in costly interactions to share information such as one's MOO position when this may change the next second. This point is important because the persistence of information validity must be related to the persistence of information availability on the medium provided.

  • Prob.grounded (i, m) is the extent to which information i is already grounded before any specific interaction. This probability depends on different factors according to the level of mutuality of knowledge. For instance, the low acknowledgment rate for facts on the whiteboard can be explained by the high probability that they are grounded at level 2 (whiteboard = shared visibility) and the low probability that they are not grounded at levels 3 or 4 (misunderstanding or disagreement).

  • At grounding levels 1 and 2 (access/perception), this probability mainly depends on the medium. The medium may guarantee that some information is more or less permanently grounded. On the whiteboard, shared access can be inferred permanently. The MOO includes many messages which provide mutual information about current positions, which are rarely discussed in the protocols, but not about future positions, a very frequent object of interaction.

  • At levels 3 and 4 (understanding/agreement), the probability depends more on the intrinsic features of the information. For instance, the disagreement probability for inferences is higher than for facts (there is little room for disagreement about facts). This probability may vary within a category. The probability of misunderstanding varies according to the way information is presented, namely how it is expressed: clarity of verbal utterances, explicitness of graphical representations, etc. The probability of disagreement varies according to the context of interaction: an inference has a lower probability of disagreement if the elements which immediately support it have already been grounded.

  • Cost.grounding (m) is the cost of interaction through m. In section 3.1, we explained that the cost of grounding depends on the medium in terms of production costs, formulation costs, reception costs and repair costs.
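The following Python sketch renders the semi-quantitative relation above. The 0-to-1 scales for need and groundedness and the positive cost scale are assumptions for illustration only; as stated above, none of these parameters can be accurately measured, so only relative comparisons are meaningful.

```python
def grounding_act_tendency(need_grounding, prob_grounded, cost_grounding):
    """(Need.grounding[i] - Prob.grounded[i, m]) / Cost.grounding[m].
    Scales are arbitrary; only relative comparisons are meaningful."""
    return (need_grounding - prob_grounded) / cost_grounding

# A small need/grounded difference can still trigger a grounding act if
# the medium is cheap...
cheap_medium = grounding_act_tendency(need_grounding=0.6,
                                      prob_grounded=0.5,
                                      cost_grounding=0.1)   # ~1.0
# ...while a larger difference may not, if grounding through m is costly.
costly_medium = grounding_act_tendency(need_grounding=0.9,
                                       prob_grounded=0.4,
                                       cost_grounding=2.0)  # 0.25
assert cheap_medium > costly_medium
```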

More specific research has to be carried out to validate this model. The research of Montandon (1996), described hereafter, confirmed the relations expressed in this model for one type of information: MOO positions. It is clear, however, that this model only grasps some aspects of grounding. The probability of a grounding act being performed cannot be considered in isolation. Since the participants have limited resources to act (their own attention, two hands, and the serialized input interface), the partners will at any given moment choose only a subset of the actions that could be immediately performed. Even if Prob.grounding.act [i, m] is high, there might be some i' or m' for which it is even higher and which requires more immediate action. Context and timing also play a role in which grounding acts actually get performed. It may be cheaper to perform one action a over another action a' if the context is right for a at the time of action. Moreover, since the context is dynamic, changing with each action, and also with the actions of the partner, the choice of actions can depend on the perception of how the context will change: a relevant but less urgent action may be preferable, if performing another action first would make it less relevant (and thus more expensive). The toy example below illustrates this competition among candidate grounding acts.
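A toy continuation of the previous sketch: under limited resources, only the candidate act with the highest tendency is performed at this moment. The information items, media and numbers are hypothetical.

```python
def tendency(need, grounded, cost):
    # (Need.grounding[i] - Prob.grounded[i, m]) / Cost.grounding[m]
    return (need - grounded) / cost

# Competing candidate grounding acts (information, medium) at one moment.
candidates = {
    ("killer inference", "MOO dialogue"): tendency(0.9, 0.2, 0.5),  # 1.4
    ("killer inference", "whiteboard"):   tendency(0.9, 0.2, 1.0),  # 0.7
    ("own MOO position", "MOO dialogue"): tendency(0.2, 0.1, 0.5),  # 0.2
}
next_act = max(candidates, key=candidates.get)
print(next_act)  # ('killer inference', 'MOO dialogue')
```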

We will end this synthesis with a methodological comment. Most of the quantitative observations presented here have been obtained by counting events across the whole protocol. With such synthetic variables, the variation of processes over time is lost. We illustrated in section 6.7.2 that a pair may show different communication profiles at different stages of problem solving. Similarly, we cannot, by definition, expect the grounding process to be constant over time. Our protocols show that interactions are structured in 'episodes', unified semantically (e.g. talking about Oscar) and even syntactically (e.g. segments of utterances in which "he" = Oscar). However, the length of these problem solving stages or communication episodes is variable. We hence need new data analysis methods which preserve the "sequential integrity of data" (Sanderson & Fisher, 1994). It may be the case that these techniques are very specific to the task and the technological environment.

This computation could be carried out in real time by artificial agents in the MOO. An interesting research direction is to display this information in real time to the experimenters and to the users, to support reflective evaluation of their actions. We mentioned several variables which could be automatically computed: the rate of redundancy, the number of facts obtained in the notebooks, what each agent could see at any time, etc. Moreover, with a structured whiteboard, the agents could monitor when and how many facts are displayed in the shared space. With a semi-structured interface (such as in Baker & Lund, 1996, or Jermann, 1996), the agents could automatically compute the acknowledgment rate (since the user indicates which previous utterance he refers to) and the structure of grounding patterns (interwoven turns, types of speech acts, ...). These features would not only be interesting as observation functionalities; they also constitute the first layer of skills to be provided to artificial collaborators. A sketch of such an acknowledgment-rate computation is given below.
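Here is a minimal sketch of the automatic acknowledgment-rate computation envisioned above. It assumes a semi-structured interface in which each utterance records which previous utterance it refers to, as in the systems cited; the log format and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Utterance:
    author: str
    text: str
    refers_to: Optional[int] = None  # index of the acknowledged utterance

def acknowledgment_rate(log):
    """Fraction of utterances acknowledged by a later utterance
    of the other participant."""
    acknowledged = {
        u.refers_to for u in log
        if u.refers_to is not None and log[u.refers_to].author != u.author
    }
    return len(acknowledged) / len(log) if log else 0.0

# Hypothetical dialogue fragment.
log = [
    Utterance("A", "Hans was in the bar at 8."),
    Utterance("B", "OK, noted on the whiteboard.", refers_to=0),
    Utterance("A", "I'll check the kitchen."),
]
print(acknowledgment_rate(log))  # one of three utterances acknowledged
```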


