Computing Point-of-View: Modeling and Simulating Judgments of Taste



2.5 Model application

With a person model acquired and generalized, two remaining issues are now foregrounded: 1) how a generalized person model can be applied to simulate reactions to arbitrary input; and 2) how these models can be embedded in perspective-based applications, and what interactions such applications should afford. The rest of this section 1) presents a simplistic model of simulation, also considering its limitations; 2) builds a context of related work for perspective-based applications; and 3) reflects upon design lessons learned from the experience of building perspective-based applications.
§
A simplistic model of simulation. Since the generalized model has propagated the implications of a person’s textual traces as far as possible, simulation is reduced to a matter of mapping arbitrary input into the generalized model, and reading the reaction off the model. The same computational reading approach that reduces a person’s everyday texts into a set of explicit textual traces (e.g. attitudes, interest & identity descriptors, etc.) is applied to each input, which likewise yields n textual traces. The interpretation of these traces is additive, meaning that the reaction to the n textual traces is a linear combination of the reactions to the individual traces.
In semantic fabric representations (i.e. realm of cultural taste, realm of food), simulation looks like distance measurement. A reaction to some input is positive if the input’s nodes are proximal to the individual’s taste ethos, and is neutral or negative if the input is distant, or if the input is proximal to a negative activation (in the food realm). The scalar value of this reaction, though, is not likely to be meaningful in isolation; the value becomes meaningful when reactions to two different inputs are compared, e.g. when a person is predicted to like one thing more than something else. Hence, the design of applications like Identity Mirror and virtual tastebuds encourages constant reaction to a variety of input, in order to tease out the value of comparison for users of these applications. This notion of simulation via distance measurement is consistent with some work in experimental psychology. Montgomery’s (1994) study argued that the valence of reaction generated by someone’s perspective was proportional to the psychological distance between the experimental subject and object.
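The distance-based reading of simulation can be illustrated with a minimal sketch. It is not the thesis implementation: the node names, the activation values, and the helper names `react` and `prefers` are all hypothetical, and the equal-weight average is only one possible choice of linear combination. The sketch assumes the generalized model is available as a mapping from nodes to the activation that generalization left on them, so that higher activation stands for greater proximity to the taste ethos.

```python
from typing import Dict, Iterable

# Hypothetical generalized model: node -> activation left behind by
# generalization from the person's taste ethos. Higher means closer to the
# ethos; negative values stand for negative activations (food realm).
ethos_activation: Dict[str, float] = {
    "jazz": 0.9,
    "film noir": 0.7,
    "espresso": 0.5,
    "cilantro": -0.6,
}


def react(input_nodes: Iterable[str], model: Dict[str, float]) -> float:
    """Additive reaction: equal-weight linear combination over the input's nodes.

    Nodes absent from the model are treated as distant and contribute 0.
    """
    nodes = list(input_nodes)
    if not nodes:
        return 0.0
    return sum(model.get(node, 0.0) for node in nodes) / len(nodes)


def prefers(a: Iterable[str], b: Iterable[str], model: Dict[str, float]) -> bool:
    """Compare two inputs; only the ordering of the scores is interpreted."""
    return react(a, model) > react(b, model)


if __name__ == "__main__":
    liked = ["jazz", "film noir"]
    disliked = ["cilantro", "polka"]  # "polka" is not in the model
    print(prefers(liked, disliked, ethos_activation))
```

The `prefers` helper reflects the point made above: the scalar returned by `react` is not interpreted on its own, only in comparison with the reaction to another input.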
In semantic sheet representations (i.e. realm of attitudes and realm of humor), reactions are simulated via memory-based reasoning. An input is processed via computational reading and either a set of topics is extracted (if the input is simple e.g. ‘bush, war’), or a set of attitudes is extracted (if the input is opinionated e.g. ‘war is bad’). Topics evoke a set of attitudes in the generalized model, and the overall reaction is a linear combination of the PAD values for these evoked attitudes. Inputs containing attitudes also evoke their corresponding attitudes in the generalized model, but the Pleasure scale in the overall reaction is instead replaced with the degree of agreement. For example, a pro-war model will react pleasurably to the input ‘war’, but will react displeasurably to the input ‘war is bad’.
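A minimal sketch of this memory-based scheme follows, under stated assumptions. The `PAD` container, the example values, and the function names are hypothetical. In particular, the thesis text does not say how the degree of agreement is computed; here it is assumed to be the product of the model’s Pleasure value for a topic and the valence the input expresses toward that topic, averaged over the evoked attitudes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PAD:
    pleasure: float
    arousal: float
    dominance: float


# Hypothetical generalized model of a pro-war person: topic -> PAD attitude.
model: Dict[str, PAD] = {
    "war": PAD(0.8, 0.6, 0.5),
    "bush": PAD(0.6, 0.2, 0.4),
}


def mean_pad(values: List[PAD]) -> PAD:
    """Equal-weight linear combination of PAD values (values must be non-empty)."""
    n = len(values)
    return PAD(
        sum(v.pleasure for v in values) / n,
        sum(v.arousal for v in values) / n,
        sum(v.dominance for v in values) / n,
    )


def react_to_topics(topics: List[str], model: Dict[str, PAD]) -> Optional[PAD]:
    """Simple input: each known topic evokes its attitude; combine linearly."""
    evoked = [model[t] for t in topics if t in model]
    return mean_pad(evoked) if evoked else None


def react_to_attitudes(
    attitudes: List[Tuple[str, float]], model: Dict[str, PAD]
) -> Optional[PAD]:
    """Opinionated input: (topic, valence in [-1, 1]) pairs.

    Pleasure is replaced by the degree of agreement, assumed here to be
    model pleasure * input valence, averaged over the evoked attitudes.
    """
    evoked = [(model[t], valence) for t, valence in attitudes if t in model]
    if not evoked:
        return None
    base = mean_pad([pad for pad, _ in evoked])
    agreement = sum(pad.pleasure * valence for pad, valence in evoked) / len(evoked)
    return PAD(agreement, base.arousal, base.dominance)


if __name__ == "__main__":
    print(react_to_topics(["war"], model))            # pleasure is positive
    print(react_to_attitudes([("war", -1.0)], model))  # 'war is bad': pleasure is negative
```

With these assumed values, the pro-war model reacts with positive Pleasure to the topic ‘war’ and with negative Pleasure to the attitude ‘war is bad’, matching the example in the text.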
Limitations of simulation. At least two criticisms can fairly be leveled at this simple simulation strategy. At worst, generalization methods like 1) spreading activation and 2) analogy may in fact lead to incorrect inferences; and 3) even at best, a generalized model captures only a stereotype of a person, and while persons defy their own stereotypes, straightforward memory-based simulation ignores this level of human dynamism. These three points are taken up in turn.
First, some doubt can be cast on the foundation of generalization. We have supposed that spreading activation can work because choosing from the field of cultural interests is additive. However, suppose that some cultural interests acted as contextual operators. An example is when a person lists in his profile a bunch of kitschy and over-popular interests, but lists one devastatingly insightful and hip interest. For a real person who is reading this profile, the presence of the latter acts as a contextual operator, forcing re-interpretation of the kitschy interests as a conscious act, and this transforms understanding about the tastes of the person represented by the profile. Bayesian modeling can handle contextual operators, but simple spreading activation cannot.
Second, generalization via topic hierarchies and analogy may also go awry. Ideology and other forces may violate the general assumption that attitudes toward a topic and its analogs are mutually consistent. ‘Tree’ and ‘rock’ share the super-topic of ‘nature’, so aesthetic consistency presumes that positive attitudes about ‘tree’ can induce symmetrically positive attitudes about the sister concept of ‘rock’. However, aesthetic consistency is violated in the instance of ‘dogs’ and ‘cats’, which share the super-topic of ‘pets’. It is far from clear that a sympathetic attitude toward ‘dogs’ can predict a sympathetic attitude toward ‘cats’. Looking at empirical data handled in attitude modeling, ‘dogs’ and ‘cats’ tend actually to form an aesthetic opposition—dog lovers tend toward distaste for cats, and vice versa. Pet preference seems to be a politicized and ideological space exempt from aesthetic consistency—perhaps because pets are so often invoked in the present culture to signify the personality of their owners. Dog lovers are presumed to be social, whereas cat lovers are presumed to be asocial. A hack to minimize these effects is to avoid spreading activation directly across sister nodes (i.e. nodes sharing a parent) in topic hierarchies, and also to focus on attributes (e.g. ‘propertyOf’) rather than taxonomic features (e.g. ‘isA’) when generalizing via analogies.
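The first part of this hack can be sketched as follows. This is an illustration under assumptions, not the thesis code: the toy hierarchy, the decay factor, and the function names are hypothetical, and the sketch takes a strict reading in which activation travels only up the ancestor chain and down into the seed’s own descendants, so that it never reaches a sister node at all.

```python
from typing import Dict, List

# Hypothetical topic hierarchy: child -> parent (assumed acyclic).
parent_of: Dict[str, str] = {
    "dogs": "pets",
    "cats": "pets",
    "terriers": "dogs",
    "pets": "animals",
}


def children_of(node: str, parent_of: Dict[str, str]) -> List[str]:
    return [child for child, parent in parent_of.items() if parent == node]


def spread_without_sisters(
    seed: str, activation: float, parent_of: Dict[str, str], decay: float = 0.5
) -> Dict[str, float]:
    """Spread activation from `seed` to ancestors and descendants only.

    Activation is multiplied by `decay` at every hop. It is never pushed
    from a parent back down to the seed's sisters.
    """
    result = {seed: activation}

    # Upward: walk the ancestor chain.
    node, level = seed, activation
    while node in parent_of:
        node = parent_of[node]
        level *= decay
        result[node] = level

    # Downward: visit only the seed's own descendants.
    frontier = [(seed, activation)]
    while frontier:
        current, level = frontier.pop()
        for child in children_of(current, parent_of):
            result[child] = level * decay
            frontier.append((child, level * decay))

    return result


if __name__ == "__main__":
    print(spread_without_sisters("dogs", 1.0, parent_of))
```

In this toy hierarchy, a positive attitude toward ‘dogs’ reaches ‘pets’, ‘animals’, and ‘terriers’ at decayed strength, while ‘cats’ receives no activation.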
Third, even when a generalized model has successfully captured a stereotype of a person, the formation of reactions cannot be so simplistic; reactions may in real life deviate from expectation due to human dynamism and dialogical nature. A person’s reaction is more sophisticated than simply reiterating their views. Most views are complex enough that a person may adopt both positive and negative attitudes about a topic under different social or political contexts, and views also evolve over time. Future work should thus look to a Bayesian dimension to modeling. An input may also provoke a particularly clever, even self-contradictory reaction, as an individual hopes to achieve sarcasm, irony, or self-overcoming. In “Principles for a Sociology of Cultural Works”, cultural theorist Bourdieu (1993) explained that theorists, for example, are keenly aware of their location in the space of criticism, and thus their reactions often deviate from what is expected of them because they are constantly playing games with their self-stereotype. The sort of reflexivity that causes reactions to play with expectation is related to the idea of dialogism (Bakhtin 1935). Unfortunately, the sophistication of this dynamism makes it difficult to simulate. At present it is beyond the computational scope of this thesis work, though thoughts on how future work might address this aspect are presented in Chapter 6.
§
Having presented a simplistic model of simulation, we now discuss perspective-based applications, which are enabled by taste and attitude simulation. Six such applications were implemented, and they were introduced in Chapter 1—virtual mentors; virtual tastebuds; perspective-based art; an identity mirror; a system that facilitates social introductions; and a Freudian joke teller. These perspective mirrors and simulators employ the just-in-time information retrieval (JIT-IR) paradigm in their interface, as a way to create a journey for users through someone else’s perspective. Related work on JIT-IR systems and perspectives in the interface is now surveyed.
Related work. Just-in-time information retrieval (JIT-IR) systems are software systems that monitor the user’s context, build queries from this context in the background, and push content to the desktop that a user may find relevant at this particular moment. Most technology users have experienced basic examples of JIT-IR, such as auto-completion in operating systems and word completion in mobile phones. Drummond (1992) pioneered auto-fetching of relevant programming libraries based on observations of a user of a programming shell, terming the technique active browsing. LETIZIA (Lieberman 1995) embedded an observational agent in a web browser, and loaded other web pages of potential interest to the user in a side pane. The Remembrance Agent (Rhodes & Starner 1996) observed users writing emails and papers, and displayed past ‘remembered’ documents based on word-frequency similarity.
Many other systems followed the JIT-IR paradigm (Joachims, Freitag & Mitchell 1997; Badue, Vaz & Albuquerque 1998; Budzik & Hammond 1999; Kulyukin 1999; Maglio et al. 2000), leading to recapitulation and formalization of the paradigm as autonomous interface agents (Lieberman 1997), and just-in-time information retrieval agents (Rhodes & Maes 2000). One criticism of JIT-IR systems is that pushing documents to the interface detracts from rather than enhances a user’s task performance, because it forces users out of the flow that is their task context. Another criticism is that JIT-IR systems incorrectly assume that document ‘similarity’ entails ‘relevance’ and ‘usefulness’, with Budzik et al. (2000) demonstrating experimentally some tasks in which similarity-based retrieval was not particularly useful.

At the confluence of interface design and storytelling, we also identify a body of work that explored the notion of perspectives in the interface. Apple Computer’s Guides project (Oren et al. 1990) was a multi-character interface that assisted users in browsing a hypermedia database. Each guide embodied a specific character (e.g. preacher, miner, settler) with a unique “life story.” Presented with the current document that a user was browsing, each guide suggested a follow-up document, motivated by the guide’s own point-of-view. Don’s (1989) “We Make Memories” was an interactive multimedia installation featuring a great-grandmother who would tell stories in fragments. The particular trajectory of storytelling was triggered by a viewer’s sensed context. Guides and “We Make Memories” distinguish themselves from other character-based interfaces like Microsoft’s Bob (and, more recently, the Microsoft Office paper clip) because they tried to impart the psychology and memories of specific persons onto a user’s present context, whereas Bob did not impart any subjective substance but was merely an anthropomorphic device. Point-Counterpoint (Budzik & Hammond 2000) was a proposed JIT-IR system that would retrieve documents both by a similarity query, for documents in accord with the user’s textual context, and by an opposing query, for documents by experts who dissented from the present context.

§
We reflect upon lessons learned in the development of the thesis’s six perspective-based applications. The aim of this class of applications is not so much to improve user performance in conventional tasks. Rather, they explore new capabilities for computers such as supporting self-reflection, person learning, and deep customization. Previous insight into JIT-IR systems is now parlayed into a working framework for thinking about how models of persons’ perspectives can be communicated through the interface effectively. This framework is articulated as three design principles—1) continuous observation and feedback, 2) just-in-time and just-in-context, and 3) tinkerability. These lessons introduce and incorporate related work in intelligent user interfaces.
Continuous observation and feedback. It may be said that just as light has no rest mass, perspective is not intelligible in stasis. For someone’s simulated perspective, as captured in a generalized model, to be fully appreciated and intuited, it should be animated and allowed to react to a broad range of things. In the interaction design literature, Bill Gaver (1991) has foregrounded the idea that an artifact is easy to use when its affordances are perceptible. According to Gaver, perceptual psychologist J.J. Gibson (1979) coined the term ‘affordance’ to describe the ability of people to intuit an object’s potential for actions from perceptual cues and feedback. For example, thin vertical door handles afford pulling, while wide horizontal doorplates afford pushing. A person’s aesthetic perspective as embodied in a generalized person model, however, affords more complex actions. According to Gaver, systems of complex actions require more active perception, and many exploratory engagements with the system. Complex objects can often be dissected into nested affordances revealed over time. Thus, the affordances of a perspective can be conceptualized as consisting of semantically nested affordances; that is, aesthetic consistencies latent in the perspective can suggest divisions of affordance-space. A useful way to expose the affordances of a person’s tastes, attitudes, ways of perceiving, etc. could be for an application to continuously observe a user’s current textual context, and proactively offer feedback. Because information push treats all user browsing and writing activity as implicit input, it avoids the time cost of users manually formulating queries. Thus, affordances can be explored with less effort, and animating a model resembles an exploratory walk through a perspective. What exploratory walks achieve is an increase in the surface area of a perspective’s lessons.
Just-in-time, just-in-context. Perspective-based applications continuously observe the user’s context and offer feedback. The nature of this feedback should be just-in-time, meaning that the feedback given should pertain to the user’s present textual context, such as the text on a webpage that the user is scanning over with the mouse, or the sentence that the user just finished typing. JIT-IR interaction suits perspective-based applications because the reactions they simulate make the best sense when they can be readily bound to the user’s present context. But more than just-in-time, the nature of feedback should also be just-in-context, meaning that the reaction should pertain to the gestalt of the user’s present context, rather than considering only a subset of the context. For example, Lumiere (Horvitz et al. 1998) aimed to be just-in-context because its Bayesian model tried to make sense of all the pieces in the user’s present context to infer the user’s goal, thereby achieving synesthetic reasoning. As a counterexample, a Google AdWords advertisement is embedded in a webpage and is dynamically populated based on the presence or absence of particular keywords on the webpage. But these ads are not just-in-context because they appeal to only fragments of the text. Following Budzik et al.’s (2000) reasoning that similarity does not entail relevance or usefulness, we argue that for a perspective to be communicated effectively, it should attempt reactions to the gestalt of a user’s context. Such reactions are exciting: they play with the idea of how prior attitudes and tastes might combine and compose under novel contexts. Tastes and attitudes, in their general sense, are intertextual, so they demand to be explored as such. Finally, just-in-context reactions can provoke users and motivate them to further investigation.
Tinkerability. While proactive feedback gives users of perspective-based applications many perceptual entry points into a perspective, there also needs to be a way for users to tinker with the perspective itself. If just-in-context reactions are the provocation for critical thinking, then tinkering is the follow-through. For example, WWTT allows a perspective to be changed in a text editor, and allows a user to dig deeper into any reaction by asking a virtual mentor to justify a reaction with a corpus of quotes from memory. The Aesthetiscope allows the user to tweak the perceptual perspective by moving horizontal sliders to change location in the Jungian perception space, and then immediately visualizes the consequences of this change. Avatars in the Synesthetic Cookbook can be reprogrammed with food keywords on-the-fly by clicking on the mouth to reveal the tastebuds. Tinkerability also ensures that users learn about the limitations of computational person modeling; for example, seeing quotes from a virtual mentor that just reacted allows users to verify the accuracy of the generalized model.
In summary, this chapter opened with a survey of related work in user modeling. An approach to modeling persons from everyday texts was introduced and compared with related work in user modeling. The approach was structured into three phases: acquire, generalize, and apply. Apropos model acquisition, knowledge representations for five modeled realms were explained, and they were inter-related along the dimensions of connectedness and consistency. Apropos generalization, it was suggested that convergence and truth maintenance are characteristics of the generalization process. Details for spreading activation, analogy, and impriming were then presented. Finally, apropos model application, a simplistic model for simulating reactions was presented and framed by related work on just-in-time information retrieval systems, and three design considerations for the class of perspective-based applications were articulated.
