Grounding in computer-supported collaborative problem solving

Date: 25.06.2016


CONTENT OF ACKNOWLEDGMENT

Figures 5 (a) and 5 (b) show the global distribution of these different categories in talk/talk interactions (a) and in interactions around the whiteboard (b), respectively. These figures reveal that task management is mainly performed through explicit verbal interactions and that the balance between facts and inferences is not the same in MOO dialogues and on the whiteboard.

Figure 5 (a): Categories of content in MOO messages

Figure 5 (b): Categories of content on the whiteboard

Variations of the acknowledgment rate for different contents

The grounding behavior varies according to the content of interactions. Table 4 gives the average rate of acknowledgment for the different categories of content. The rate is computed inside content categories, i.e. as the percentage of acknowledged interactions inside one category with respect to the total number of interactions in that content category.

Content category	Acknowledgment rate
Knowledge: facts	26%
Knowledge: inferences	46%
Knowledge (facts and inferences together)	38%
Management	43%
Meta-communication	55%
Technique	30%
All categories	41%

Table 4: Acknowledgment rate in different content categories²⁷
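The within-category rate defined above can be sketched in a few lines of Python. This is only an illustration: the record format (a category label paired with a flag saying whether the interaction was acknowledged) and the sample values are assumptions, not data from the study.

```python
from collections import Counter

def acknowledgment_rates(interactions):
    """Return {category: percentage of acknowledged interactions in that category}.

    `interactions` is an iterable of (category, acknowledged) pairs.
    This record format is an assumption made for illustration.
    """
    totals = Counter()
    acknowledged = Counter()
    for category, was_acknowledged in interactions:
        totals[category] += 1
        if was_acknowledged:
            acknowledged[category] += 1
    # The rate is computed inside each category, not over all interactions.
    return {c: 100.0 * acknowledged[c] / totals[c] for c in totals}

# Hypothetical coded interactions (not data from the study).
sample = [
    ("facts", True), ("facts", False), ("facts", False), ("facts", False),
    ("inferences", True), ("inferences", False),
]
rates = acknowledgment_rates(sample)
print(rates)
```

Because each rate uses its own category total as denominator, a rare category such as meta-communication can have a high rate even though it contributes few interactions overall.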

The acknowledgment rate for interactions about the problem itself (the 'knowledge level') is 38%. It is worth distinguishing the sub-categories 'facts' and 'inferences', since they have very different acknowledgment rates, 26% and 46% respectively. The main difference between the two is the probability of disagreement: there is nothing to disagree about in facts. Acknowledgment of facts basically means "I have seen it", rather than "I understand" or "I agree". The probability of misunderstanding or disagreement will be taken into account in our final model (section 7).

The acknowledgment rate for meta-communication is well above the average. This category represents only 8% of all verbal interactions in the MOO (see figure 5 a). On average, a pair interacts at this level once every 15 minutes. Hence, a candidate explanation for the high rate of acknowledgment is that these interactions are better noticed because they are rare. Another explanation is that they often have a strong social aspect (expressing impatience, apologizing for long delays, ...).

The acknowledgment rate computed on technical aspects is based on a small amount of data (on average 4.5 interactions per pair) and hence should be interpreted with caution. In particular, the technical problem being discussed in these utterances sometimes itself perturbs the interaction about this problem.

The acknowledgment rate for the 'management' category is higher than we expected. The task requires some strategy for coping with the information overflow (many suspects, many rooms, many motives, many times, ...). A sub-optimal strategy does not dramatically reduce the chances of success, as it would in a non-reversible task. However, a low rate of acknowledgment increases redundancy in data collection. The subjects with a high rate of acknowledgment for the category 'management' ask significantly fewer redundant questions than those with a low acknowledgment rate. This difference is significant if we contrast the two extreme thirds of the sample: the average number of redundant questions is 12.6 for the five subjects with the lowest rate and 4.8 for the five subjects with the highest rate (F = 5.79; df = 1; p = .05). We do not obtain a significant difference if we split the sample into two halves (despite a difference in the average: 12.6 and 4.8).

It is interesting to note that the two groups have almost the same mean with respect to self-redundancy: 3.4 for the low acknowledgment group and 3.2 for the high acknowledgment group. This reinforces the hypothesis that the redundancy is due to mis-coordination rather than to memory management, since memory failure would affect both self-redundancy and cross-redundancy. The high acknowledgers for the 'management' category ask on average almost the same number of immediately redundant questions as the low acknowledgers, the means being 1.20 and 1.40 respectively. The difference between the two groups comes from long-term redundancy (mean = 11.40 for the low group, mean = 3.40 for the high group).

Figure 6: Relationship between the rate of acknowledgment and different indicators of redundancy in questions

Immediate redundancy is not always an indicator of mis-coordination. It may sometimes be the result of explicit coordination: we observed several cases in which one subject, instead of summarizing the information for his partner, simply invites him to ask the same question again. In these cases redundancy is no longer a waste of energy, but an economical way of sharing information: it may take less time for an agent to type a question than for his partner to summarize the answer.

The cost of redundancy is difficult to estimate. Typing a question such as "ask Marie about last night" takes a very short time. One must add to it the time necessary to reach the room, i.e. the time for typing another command (move), and the time for reading the answer. This time may be relatively short in the case of self-redundancy, but not in cross-redundancy. If the global waste of time amounts to one minute per question, the global cost of redundancy may be up to 30 minutes for some pairs. On average the pairs ask 12 redundant questions. The redundancy rate (number of redundant questions / number of questions) varies between 6% and 51%; its mean for the group is 23%.
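The arithmetic behind these estimates can be made explicit with a minimal sketch. The one-minute cost per question is the assumption stated in the text; the function names and the total of 50 questions in the example are hypothetical and chosen only for illustration.

```python
def redundancy_rate(redundant_questions, total_questions):
    """Redundancy rate as a percentage: redundant questions / all questions."""
    if total_questions <= 0:
        raise ValueError("total_questions must be positive")
    return 100.0 * redundant_questions / total_questions

def redundancy_cost_minutes(redundant_questions, minutes_per_question=1.0):
    """Estimated time lost, assuming a fixed cost per redundant question."""
    return redundant_questions * minutes_per_question

# Average pair in the text: 12 redundant questions at one minute each.
average_cost = redundancy_cost_minutes(12)

# Hypothetical pair asking 50 questions in total, 12 of them redundant.
example_rate = redundancy_rate(12, 50)

print(average_cost, example_rate)
```

Under the one-minute assumption, the upper bound of 30 minutes corresponds to a pair asking about 30 redundant questions.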

In summary, the rate of acknowledgment of task management interactions has a clear impact on the problem solving strategy, an impact measured here by an increase in long-term cross-redundancy when this rate is low.

Relationship between the content, the mode and the strategy

The content category most often mentioned in interactions around the whiteboard (talk/whiteboard, whiteboard/talk and whiteboard/whiteboard) is task knowledge. We have few cases (1%) of meta-communication (agreeing on graphical codes), no technical talk and some interactions (9%) concerning task management (discussed later). Facts and inferences together represent the remaining 90% of interactions around the whiteboard. Hence, our comparison across modes is restricted to the 'task knowledge' category. We observe (figure 7) an interesting and highly significant interaction effect between the acknowledgment rate and the mode (F = 6.09; df = 4; p = .001).

Figure 7: Interaction effect between the acknowledgment rate and the mode of interaction when the content of interaction concerns task knowledge.

The difference between the acknowledgment rates of inferences in the two modes reflects the general difference between the MOO and the whiteboard. This difference can be explained by the characteristics of each mode: interaction in the MOO window is sequential, i.e. commands are displayed in the order of their introduction, while the whiteboard is not sequential. Without sequentiality, the mechanisms of turn taking break down. Hence, acknowledgment only plays a role in negotiation. And, precisely, facts do not have to be negotiated:

In talk/whiteboard interaction, a fact mentioned by Hercule can simply be written down by Sherlock (facts are often written down without having been mentioned at all);
In whiteboard/talk, facts do not trigger negotiation or repair since they are, in this task, not ambiguous.

Hence, our interpretation is the following. Since there is little need for grounding facts (low probability of misunderstanding or disagreement), their acknowledgment in MOO conversation (37%) basically means "ok, I read your message": acknowledgment simply aims to inform one's partner about shared visibility. Shared visibility could also be inferred by comparing the communication command ('say'/'page') with the MOO position (same room or not), but such an inference increases the cognitive load. By contrast, in interaction around the whiteboard, mutual visibility is the default assumption. Hence, grounding access to information (as defined in section 3.4) does not have to be performed for information on the whiteboard.
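The inference about shared visibility mentioned above can be written as a small rule. This is a sketch under an explicit assumption: that a 'say' message is only seen by someone in the same room, whereas a 'page' message reaches its addressee in any room. The function and parameter names are hypothetical.

```python
def message_is_visible(command, sender_room, receiver_room):
    """Infer whether the receiver could see a message.

    Assumption for illustration: 'say' is only visible inside the sender's
    room, while 'page' reaches the addressee wherever he is.
    """
    if command == "page":
        return True
    if command == "say":
        return sender_room == receiver_room
    raise ValueError(f"unknown communication command: {command!r}")

# A 'say' in the bar is not seen by a partner who is in the kitchen.
seen_say = message_is_visible("say", "bar", "kitchen")
# A 'page' is seen regardless of position.
seen_page = message_is_visible("page", "bar", "kitchen")
print(seen_say, seen_page)
```

Running this check mentally for every message requires tracking the partner's position, which is the extra cognitive load that an explicit acknowledgment avoids.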

To verify this hypothesis, we would have to recode the dialogues and distinguish different categories of acknowledgment (e.g. describing a set of dialogue moves such as simple acknowledgment, agreement, refutation, request for clarification, ...).

Sometimes what is acknowledged, both in whiteboard/talk and in talk/talk interactions, is not the fact itself, but its importance in problem solving ("Ah ah!"). Actually, reflecting on the importance of a fact already implies inferring how this fact helps to prove or disprove suspicions. From this point of view, the whiteboard can be considered as a filter which marks the importance of collected information.

We now relate the mode and the content of grounding to the process of constructing the solution itself. In 'shared solution', there is not only the word 'shared', but also 'solution'. Problem solving behavior is characterized by two factors: a social factor, the division of labor, and a cognitive factor, the problem solving strategy, reified through the subjects' sequences of actions. The notion of problem solving strategy is a hypothetical construct which helps us to account for some consistency between patterns of interaction and patterns of action. We do not claim, however, that these strategies are explicit, nor that they are stable.

Most pairs split data collection, i.e. each partner goes to ask questions in a different room, while a few pairs tend to stay together more. The criterion for dividing the task is often spatial, one partner focusing on suspects in the upper corridor and the other on those in the lower corridor. However, this criterion leaves some ambiguity regarding the suspects in the other rooms, such as the bar, the kitchen and the restaurant. Actually, this division of labor was in general respected only for a very limited amount of time. We did not observe that the detectives stuck to their initial territory during the whole task (Gaffie, 1996). Two non-spatial criteria have been (partially) used (each by one pair): males versus females (pair 20, a mixed pair) and staff versus guests (pair 2). Some pairs divided the work for some time into functions, one detective collecting data and the other updating the whiteboard (notably for intensive whiteboard activity like drawing a map).

The problem solving strategy is observable through the sequence of MOO commands. We postulate here that 'move' commands are subordinated to information commands ('ask', 'read', 'look', ...) since moves mainly aim to reach a room where a suspect or an object can be found. Hence, the sequence of questions reflects the problem solving strategy, at least for the data acquisition stage. A question has two parameters, the suspect and the object of the question (e.g. 'ask Oscar about last night', 'ask Helmut about gun', ...). The matrix of all questions (suspect X object) can be explored along these two axes, i.e. by suspect or by object, but also in different orders:

Data collection "by suspect" is the most frequent strategy. It does not imply that the detectives ask all possible questions, since some objects, such as the insurance contract, are generally discovered late. It implies that subjects ask at least the two basic questions: 'ask suspect about last night' and 'ask suspect about the victim'. In the strategy "by suspect", the most common criterion for sequencing suspects is space: the suspects are considered one by one according to the position of rooms in corridors. This is an efficient strategy for guaranteeing exhaustivity. The data collection is often conducted separately by the two subjects. The line of division of labor is easy to draw since the Auberge includes two corridors.
Data collection "by object", i.e. asking several suspects a particular question, is less frequent because it is not economical: the detectives have to move to another room after each question. Episodes of data collection by object often occur late in the interaction, when the detectives suddenly discover a key object. There is no real sequencing criterion: the choice of a new question simply results from the discovery of a new object (gun, jacket, painting, contract).
Data collection "by hypothesis": when a new hypothesis appears, the subjects choose new questions (suspect X object).
Data collection "by question": the subjects set a sub-goal, often "find who could get the gun between 8 and 9 p.m.".
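The difference between exploring the question matrix "by suspect" and "by object" can be illustrated by looking at how consecutive questions move through the matrix. The sketch below is not the coefficient used in this study; it is a hypothetical indicator that simply counts transitions keeping the same suspect versus transitions keeping the same object, with questions represented as (suspect, object) pairs.

```python
def transition_counts(questions):
    """Count how consecutive questions move through the suspect x object matrix.

    `questions` is a list of (suspect, object) pairs in the order asked.
    Illustrative indicator only, not the coefficient used in the study.
    """
    counts = {"same_suspect": 0, "same_object": 0, "other": 0}
    for (s1, o1), (s2, o2) in zip(questions, questions[1:]):
        if s1 == s2 and o1 != o2:
            counts["same_suspect"] += 1   # moving along a suspect's row
        elif o1 == o2 and s1 != s2:
            counts["same_object"] += 1    # moving along an object's column
        else:
            counts["other"] += 1          # both change, or an exact repeat
    return counts

# Hypothetical sequences built from the example questions in the text.
by_suspect = [("Oscar", "last night"), ("Oscar", "victim"),
              ("Helmut", "last night"), ("Helmut", "victim")]
by_object = [("Oscar", "gun"), ("Helmut", "gun"), ("Heidi", "gun")]

suspect_counts = transition_counts(by_suspect)
object_counts = transition_counts(by_object)
print(suspect_counts, object_counts)
```

A sequence dominated by same-suspect transitions corresponds to data collection by suspect, and one dominated by same-object transitions to data collection by object.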

We explained in section 5.8.4 how we computed a coefficient, QMP (questions matrix path), which indicates how the subjects explore the matrix of questions (suspect X object). This coefficient will be related to the type of problem solving strategy. We describe strategies with parameters formalized in artificial intelligence:

Breadth-first search (try to collect all data before drawing inferences from these data) versus depth-first search (on finding an interesting fact, try to push inferences as far as possible);
Forward chaining (draw inferences from facts) versus backward chaining (start from assumptions and try to collect supporting evidence).

These two dimensions enable us to describe four pair profiles (table 5), in terms of problem solving processes, and to relate them to grounding. We now review these four profiles.

Strategy	Breadth-first search	Depth-first search
Forward chaining	"methodical" detectives	"opportunistic" detectives
Backward chaining	"direct" detectives	"intuitive" detectives

Table 5: Problem solving strategies in data acquisition
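Table 5 can be read as a lookup from the two strategy dimensions to a profile name. The following sketch encodes it as a dictionary; the string labels used as keys and the function name are choices made here for illustration.

```python
# Table 5 as a lookup: (chaining direction, search order) -> profile name.
PROFILES = {
    ("forward", "breadth-first"): "methodical",
    ("forward", "depth-first"): "opportunistic",
    ("backward", "breadth-first"): "direct",
    ("backward", "depth-first"): "intuitive",
}

def detective_profile(chaining, search):
    """Return the profile name for a pair of strategy parameters."""
    try:
        return PROFILES[(chaining, search)]
    except KeyError:
        raise ValueError(f"unknown strategy: {chaining!r}, {search!r}") from None

profile = detective_profile("forward", "depth-first")
print(profile)
```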

The "methodical" detectives explore the Auberge room by room, collecting all data without exchanging inferences. Pair 6 is very illustrative. During the first hour, they have only one utterance classified as 'inference'. They split spatially and communicate facts through the whiteboard (in figure 8, we noted 13 facts, but these are actually 13 notes, each of them including between 2 and 13 facts). During this first stage they ask 85% of all the questions. They use MOO conversation mainly (66%) for task management. The first inference is drawn after 58 minutes. The QMP coefficient is .56, which clearly indicates a method "by suspect". After 63 minutes, Sherlock suggests "let make a list of the persons who could steal the gun". This pivot-sentence is typical of this strategy. It introduces a second stage in which collected data are organized and inferences are drawn. During this second stage, 83% of MOO conversation consists of inferences. Almost no more facts are discussed. The strategy is not discussed much either, since it mainly concerns data collection. The QMP coefficient during this second stage is 0.25, i.e. much closer to the method "by object". This interaction pattern is illustrated by figure 8.

Figure 8: Variation of interactions at different problem solving stages for Pair 6
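The share of questions asked in the first stage depends on where the protocol is split. A minimal sketch of that computation follows, assuming each question carries a timestamp in minutes; the timestamps below are hypothetical, and only the pivot at 63 minutes comes from the description of Pair 6.

```python
def share_before_pivot(question_times, pivot_minute):
    """Percentage of questions asked strictly before the pivot time."""
    if not question_times:
        raise ValueError("no questions recorded")
    first_stage = sum(1 for t in question_times if t < pivot_minute)
    return 100.0 * first_stage / len(question_times)

# Hypothetical question timestamps in minutes (not data from the study).
times = [5, 12, 20, 31, 40, 47, 55, 61, 70, 88]
first_stage_share = share_before_pivot(times, 63)
print(first_stage_share)
```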

The "opportunistic" detectives collect data more or less systematically, but as soon as they get something interesting, they look for new specific data. In pair 15, they agree on a division of labor based on space, but they find facts which lead them to depart from this plan. For instance, they find the fake painting quite early. Hercule tells Sherlock that "this would be a track..." (24.6, our translation) and goes to ask suspects about it. The QMP coefficient is 0.05. The idea of a track is important in depth-first search, since the quality of the heuristics plays a key role in this search method. For most pairs, the first track is the murder weapon: for instance in pair 5, Sherlock says after 16 minutes "It seems that the kolonel had the gun. I go to room 5 and ask him.". Pair 15 subjects communicate facts in the MOO, these facts being important since they may re-orient data acquisition (or conversely, they re-orient data acquisition because they are important). Since data collection departs from systematicity, more management interactions are required for re-planning. During the first stage, they ask only 68% of the total number of questions. After 60 minutes, we find the same kind of 'pivot-sentence': "What about using the whiteboard to put on a grid what they did a what time and the possible motives" (60.8, our translation). During the second stage, they continue collecting information, since the first stage was not conducted systematically. The QMP coefficient is .33, i.e. higher than in the first stage, which indicates that they then collect the data more systematically.

Figure 9: Variations of interactions at different problem solving stages for Pair 15

The profile of the "intuitive" detectives is almost the same; the difference is that the heuristics they use are very directly connected to the goal, i.e. expressed in terms of suspicions. For instance, in pair 13, Hercule says after 16 minutes "I have a first hypothesis. In fact, Heidi had an affair with Hans. She was jealous of his taking to Mona. So she killed this woman, left Hans' jacket in the room in order to make the police think that it was Hans, when in fact it was her". Then Hercule goes to ask Heidi about the gun and to ask Lucie, who is Heidi's alibi, what she did the night before. After 30 minutes, Hercule makes a second hypothesis, concerning Rolf Loretan, which will survive until the end of the interaction, despite the discovery of other elements accusing other suspects. We split the protocol into two parts, around the pivot-sentence indicating the need for organizing information: "don't you think we should make a kind of resume". These two stages have different durations (27 / 54 minutes), hence the comparison between stages is biased. During the second stage, the number of new facts mentioned and the number of management interactions are very high compared to the number of inferences (figure 10). The QMP coefficient during the first stage is 0.28, and the first stage includes 74% of the questions.

Figure 10: Variation of interactions at different problem solving stages for Pair 13

The "direct" detectives collect data in a very systematic way, like the 'methodical' pairs. The QMP coefficient is 0.51 and 90% of questions are asked during the first stage. However, the interpretation of data is not delayed. Pair 10 collects facts and pastes them directly into a table [suspect X key] in which each piece of information, by its position in the table, becomes more than a fact: it is used to prove or disprove that some suspect has a motive, the opportunity to take the gun or the opportunity to kill. The interaction profile in figure 11 is obtained by splitting the protocol into two halves (72 minutes), since we found no pivot sentence as we did for the previous profiles. The first stage is characterized by a high level of interaction, notably many management interactions, but also many inferences (37). The low number of facts in the second stage indicates that the data collection in the first part has been quite exhaustive.

Figure 11: Variation of interactions at different problem solving stages for Pair 10

In general, the strategy is not consistent across the whole protocol and across subjects. For instance, in Pair 10, Hercule first uses a data collection method "by object" (QMP coefficient = -0.03!) during the first 20 minutes. The profiles presented above, established by splitting protocols into two periods, do not fully account for the dynamics of collaborative problem solving. The best granularity for describing problem solving would be a sequence made of different episodes:

planning: decide which action to do next;
data collection: move to rooms, ask questions, read objects;
data analysis/synthesis: draw inferences from collected data.

In forward chaining, a typical sequence is:

Short planning episode, generally splitting rooms among subjects;
Long data acquisition episode (asking about 80% of questions);
Short data synthesis;
Sometimes, a short planning episode;
Short data acquisition episode: asking complementary questions that arose during the intermediate synthesis;
Long data synthesis.

In backward chaining, we observe the same alternation of cycles planning-acquisition-discussion, but with a higher frequency, an episode of data acquisition being often limited to a few questions.

In summary, if one considers short episodes, the content of grounding mechanisms is directly related to the type of episode: 'management', 'facts' and 'inferences' for 'planning', 'data acquisition' and 'data analysis/synthesis' respectively. The dominant mode of grounding is related to the content being negotiated: MOO >> Whiteboard for management, Whiteboard > MOO for facts and MOO > Whiteboard for inferences. The division of labor also varies across these episodes, data acquisition being essentially individual, except when the data seem particularly important, while the two other processes, planning and synthesis, are more collective.
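The relationships stated in this summary can be collected in a small data structure and applied to the typical forward-chaining sequence of episodes described earlier. This is only a sketch: the dictionary layout and names are choices made here, and the dominant mode is reduced to a single label, ignoring the difference between ">>" and ">".

```python
# Episode type -> (content being grounded, dominant mode), from the summary.
EPISODE_GROUNDING = {
    "planning": ("management", "MOO"),
    "data acquisition": ("facts", "whiteboard"),
    "data analysis/synthesis": ("inferences", "MOO"),
}

# Typical forward-chaining sequence, including the optional second planning.
FORWARD_CHAINING = [
    "planning", "data acquisition", "data analysis/synthesis",
    "planning", "data acquisition", "data analysis/synthesis",
]

def expected_grounding(episodes):
    """Map a sequence of episodes to the expected (content, mode) pairs."""
    return [EPISODE_GROUNDING[episode] for episode in episodes]

grounding = expected_grounding(FORWARD_CHAINING)
contents = [content for content, _mode in grounding]
print(grounding)
```

In backward chaining the same mapping would apply, but to a longer sequence of shorter cycles.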
