Computing Point-of-View: Modeling and Simulating Judgments of Taste



2.3 Model acquisition

Three key tasks are identified in the acquisition phase of person modeling.¹⁰ First, knowledge representations are devised to best describe each of the five aesthetical realms being addressed: cultural tastes, attitudes, ways of perceiving, taste for food, and sense of humor. Second, the strategy of reading persons’ everyday texts for affective themes is concretized for each realm as a reading schema, a process that yields explicit textual traces. Third, the cultural topology is acquired through culture mining. Representational choice is the primary decision, while reading strategy and culture mining are entailments of that choice. Thus, the second and third tasks will be developed as techniques in Chapter 3. This section, then, focuses on the first task: introducing the rationale for, and features of, each realm’s chosen representation.
Devising knowledge representations for the aesthetical realms was treated as a series of design problems. Three considerations were kept in mind throughout the process: 1) what information could be extracted from source texts, 2) how representational features might be consistent with humanistic scholars’ assessments of the aesthetical realms, and 3) how the representation could facilitate the later tasks of generalization and simulating reactions. Below is a discussion of knowledge representation issues for the five realms, followed by a comparative discussion of the commonalities exhibited across realms.
§

Realm of cultural taste. For the realm of cultural taste, social network profiles were chosen because several cultural theorists had already framed the issue of cultural taste in terms of patterns of consumer choices (Bourdieu 1984; Haug 1986), which is precisely what is captured in each social network profile. To capture a person’s particular interests, a representation for the realm needs to enumerate all possible interest and identity descriptors, so a hierarchy of 21,000 interest descriptors and 1,000 identity descriptors was assembled from various folksonomies. These descriptors are interconnected by a web of metadata relationships (e.g. parentOf(book author, book title)). But the folksonomic tree fails as a knowledge representation, because many serendipitous taste affinities between descriptors are not captured by the metadata relationships in the tree. Thus, inspired by recent consumerist theories that view a person’s consumer choices as gestalts (McCracken 1988; Solomon & Assael 1987), a method was derived to learn the mutual information between each pair of descriptors. After pruning, this resulted in 12,000² pairwise numerical affinities. Overlaying these affinities with the metadata relationships yields an almost fully connected graph.
I term this a semantic fabric representation to reintroduce the spatial metaphor. We should think of a person’s explicitly stated interests as a pattern of activation in this fabric. Furthermore, the fabric terminology is consistent with terms in cultural theory—for example, Geertz described culture as webs—“man is an animal suspended in webs of significance he himself has spun, I take culture to be those webs” (Geertz, 1973: 4-5).
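The text says only that mutual information is learned between each pair of descriptors; it does not give the estimator. As a minimal sketch, assuming pointwise mutual information computed from co-occurrence within profiles, the affinities could be derived as follows. The function name `pairwise_pmi` and the sample descriptors are hypothetical, not taken from the original system.

```python
import math
from collections import Counter
from itertools import combinations

def pairwise_pmi(profiles):
    """Return {(a, b): PMI in bits} for descriptor pairs that co-occur in a profile.

    Assumption: each profile is a list of descriptors, and probabilities are
    estimated as the fraction of profiles containing a descriptor or a pair.
    """
    n = len(profiles)
    single = Counter()
    pair = Counter()
    for profile in profiles:
        items = sorted(set(profile))          # sorted so each pair has one canonical key
        single.update(items)
        pair.update(combinations(items, 2))
    pmi = {}
    for (a, b), n_ab in pair.items():
        p_ab = n_ab / n
        p_a = single[a] / n
        p_b = single[b] / n
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi

# Hypothetical toy profiles, for illustration only.
profiles = [
    ["sonic youth", "indie rock", "zines"],
    ["sonic youth", "zines"],
    ["yoga", "indie rock"],
    ["yoga", "tea"],
]
affinities = pairwise_pmi(profiles)
```

Pairs that co-occur more often than chance receive a positive score, pairs that co-occur at chance level score zero, and pairs that never co-occur are simply absent, which is one plausible reading of the pruning step.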
The topology of the semantic fabric is lumpy. In particular, identity nodes and taste cliques can be identified as semantic mediators in the fabric, acting as connector hubs and thus exerting disproportionate influence. When a person’s pattern of interests activates the fabric, activation tends to flow into and out of these attractors; thus these attractors arguably introduce an aspect of prototype-based inference into the person model generalization process. The existence of taste cliques and identity hubs is consistent with Solomon & Assael’s (1987) prediction of ‘consumption constellations’.
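The flow of activation through hubs can be illustrated with a small spreading-activation sketch. This is an assumption about mechanics rather than the thesis’s implementation: the `decay` factor, the step count, and the toy fabric (including the ‘indie kid’ identity hub) are hypothetical.

```python
def spread_activation(edges, seeds, steps=2, decay=0.5):
    """Propagate energy from seed nodes along weighted edges.

    edges: {node: {neighbor: affinity weight}}
    seeds: {node: initial energy}, e.g. a person's stated interests
    Each step passes energy * weight * decay to every neighbor.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor, weight in edges.get(node, {}).items():
                passed = energy * weight * decay
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation

# Hypothetical fabric fragment: two interests connected through one identity hub.
fabric = {
    "sonic youth": {"indie kid": 0.8},
    "zines": {"indie kid": 0.6},
    "indie kid": {"thrift stores": 0.5, "sonic youth": 0.8, "zines": 0.6},
}
activation = spread_activation(fabric, {"sonic youth": 1.0, "zines": 1.0})
```

In this toy case both stated interests feed the hub in the first step, and the hub then passes energy on to ‘thrift stores’, a descriptor the person never mentioned. That is the prototype-like inference the paragraph describes.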
Realm of food. In this realm, a person’s generalized model might well correspond to the common sense of the word tastebuds. The field of foodstuffs is as broad and as richly connected as the field of cultural interests, so it was decided that the same representation would be adopted. The realm of taste for food is represented as a semantic fabric of interweaving recipes, ingredients, cooking procedures, sensorial keywords, and so on. Again, hierarchies of food metadata were assembled from a variety of web sources, and from the Thought for Food corpus, which contains thousands of sentences embodying cooking common sense. In addition, the taste coherency assumption was again made over a corpus of 60,000 recipes in order to learn the implied affinities between foodstuffs. Combining metadata with mined affinities resulted in a richly connected semantic fabric of foodstuffs, whose nodes include 60,000 recipes, 5,000 ingredient keywords, 1,000 sensorial keywords (e.g. ‘spicy’, ‘chewy’, ‘silky’, ‘colorful’), 400 cooking procedures, and 400 nutritional keywords.
Just as identity hubs and taste cliques acted as the semantic mediators for cultural taste, in the realm of food, cuisine nodes (e.g. ‘Chinese’, ‘dessert’) and basic flavor nodes (e.g. ‘sour’, ‘spicy’) act as semantic mediators for tastebuds. Tastebuds, like cultural tastes, are represented as activations of the food fabric. In the case of tastebuds, both positive (food likes) and negative (food dislikes, food allergies) activations are allowed.
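Signed activations can be pictured with a short sketch. It assumes, hypothetically, that a recipe is scored by summing the person’s activation on each linked node, weighted by the link strength; the recipes, weights, and the name `score_recipe` are illustrative only.

```python
def score_recipe(recipe_links, tastebuds):
    """Sum signed activations over the nodes a recipe links to.

    recipe_links: {node: link weight} for one recipe
    tastebuds: {node: activation}, positive for likes, negative for dislikes or allergies
    """
    return sum(weight * tastebuds.get(node, 0.0) for node, weight in recipe_links.items())

# Hypothetical person: likes spicy food, must avoid peanuts.
tastebuds = {"spicy": 1.0, "peanut": -1.0}
recipes = {
    "mapo tofu": {"spicy": 0.9, "Chinese": 0.8},
    "satay": {"spicy": 0.4, "peanut": 0.9},
}
scores = {name: score_recipe(links, tastebuds) for name, links in recipes.items()}
```

Here the negative activation on ‘peanut’ outweighs the positive activation on ‘spicy’ for the satay recipe, so a dislike can veto an otherwise attractive dish.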
Realm of attitudes. There are several possible interpretations of attitude. Jung, for example, introduced ‘extraversion’ and ‘introversion’ as two basic psychological attitudes. However, we define attitude in a simple, common-sense way: an attitude is your feeling toward some topic, which may be a thing, a person, or an event. Regarding attitude as topic+affect is consistent with the cognitive appraisal theory of emotions (Ortony, Clore & Collins 1988), which states that an emotion results from the cognitive appraisal of some thing, person, or event, and is affect directed toward that cognitive target. Weblog diaries were chosen as a promising source for the acquisition of personal attitudes. Just as cultural interests were backed by metadata hierarchies, topic hierarchies were mined from DMOZ (e.g. subtopicOf(feminism, philosophy)) and from ConceptNet (e.g. isA(burger, food)).
However, the application of ‘culture mining’ to acquire a semantic fabric was not effective over the corpus of weblog diaries. First, whereas 20-50 descriptors were available with high confidence in each of the 100,000 social network profiles, attitudes gotten from reading were fewer and more tenuous—due to the relative sparseness of cultural interests being mentioned in weblog diaries, and due to the difficulty in assessing with certitude that cultural descriptors mentioned in the weblog diary are intended as expressions of personal interest. Second, we hypothesize that the self-reflective context of a social network profile, which leads to a desired coherent enumeration of interests, is not as pronounced in weblog diaries. Third, whereas cultural interests and tastes for food may be described in everyday texts as clearly a preference or dispreference, an attitude’s affect is not given as binary choices in weblog diaries.

Figure 2-7. A semantic sheet representation for the realm of attitudes

As a result, a semantic sheet representation (Figure 2-7) was designed to facilitate visual understanding of the realm.¹¹ In it, a sheet of topics supposes an enumeration of all possible topics, linked by metadata drawn from the topic hierarchies. Then, each person’s attitudes are represented as a sheet of affects which corresponds to the sheet of topics. Affect is represented with Mehrabian’s (1995b) three-dimensional PAD model. In this representation, both persons and cultures can be represented as sheets. As will be demonstrated in a political culture scenario in Chapter 4, a person’s affinity toward a culture can be measured (and visualized) as the degree of alignment (shown as vertical dashed lines) between their two sheets of attitudes.
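The alignment measure is not specified at this point in the text. As a minimal sketch, assuming a sheet is a mapping from topic to a PAD triple and that alignment is the mean cosine similarity over shared topics, the comparison might look like this. All names and PAD values below are hypothetical.

```python
import math

def pad_similarity(a, b):
    """Cosine similarity between two (pleasure, arousal, dominance) triples."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def sheet_alignment(person, culture):
    """Mean PAD similarity over the topics present on both sheets (0.0 if none)."""
    shared = person.keys() & culture.keys()
    if not shared:
        return 0.0
    return sum(pad_similarity(person[t], culture[t]) for t in shared) / len(shared)

# Hypothetical sheets: the person agrees with the culture on taxes, disagrees on unions.
person = {"taxes": (-0.6, 0.4, 0.2), "unions": (0.5, 0.1, 0.0)}
culture = {"taxes": (-0.6, 0.4, 0.2), "unions": (-0.5, -0.1, 0.0), "trade": (0.3, 0.2, 0.1)}
alignment = sheet_alignment(person, culture)
```

Topics that appear on only one sheet, such as ‘trade’ here, are ignored by this sketch; a fuller treatment would first generalize the sparser sheet along the topic hierarchy.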
The semantic mediators conceived for this realm are Minskian imprimers: mentors, parents, or cultures whose attitudes prime ours. As was shown in Figure 1-4, imprimer models are sheets, and in generalization, these sheets supplement a person’s missing attitudes. For example, the model of a person who is imprimed by Warren Buffett in topics relating to business can be supplemented by attitudes on those topics from Warren Buffett’s person model.
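Supplementing a sheet from an imprimer can be sketched as a guarded merge. The assumptions here are that sheets are topic-to-PAD mappings, that impriming is limited to a named set of topics, and that a person’s own attitudes always take precedence; the function name and values are hypothetical.

```python
def supplement(person, imprimer, imprimed_topics):
    """Fill a person's missing attitudes from an imprimer's sheet.

    Only topics in imprimed_topics are borrowed, and an attitude the person
    already holds is never overwritten.
    """
    merged = dict(person)
    for topic in imprimed_topics:
        if topic not in merged and topic in imprimer:
            merged[topic] = imprimer[topic]
    return merged

# Hypothetical PAD values, for illustration only.
person = {"index funds": (0.2, 0.0, 0.1)}
imprimer = {
    "index funds": (0.8, 0.1, 0.6),
    "day trading": (-0.7, 0.3, 0.4),
    "baseball": (0.5, 0.5, 0.0),
}
business_topics = {"index funds", "day trading"}
generalized = supplement(person, imprimer, business_topics)
```

The person inherits the imprimer’s attitude toward ‘day trading’, keeps their own attitude toward ‘index funds’, and gains nothing on ‘baseball’, which lies outside the imprimed domain.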
Realm of humor. A Freudian approach was adopted to representing sense of humor. To model a person’s sense of humor, again, weblog diaries are a preferred source. The space of humor is likewise modeled using semantic sheets, except that instead of PAD values, there is a unary value standing for tension level. A person’s sheet is her own pattern of psychic tensions about various topics, which are presumed to be dually rooted in conditions of her upbringing, as well as in her more recent life frustrations.
There are many accounts of humor in literatures computational and humanistic. Some postures are that jokes are 1) motivated by feelings of superiority (traced back to Burke) and thus, derisive; 2) motivated by need to relieve frustration (cf. Freud); 3) motivated by expectation violation (cf. Kant); 4) delightful and informative as a mechanistic caricature (cf. Bergson) and thus examples of how not to be (cf. Minsky); and 5) a defense mechanism (cf. Piddington). The experiment developed here proceeds from the relief premise. Freud’s (1905) account of tendentious jokes reflects early psychoanalysis’s conceptualization of the affective unconscious as an expanding and contracting hydraulic bag, which alternatingly stores and gives catharsis to psychic energies and tensions. Freud distinguished between innocent and tendentious jokes. Innocent jokes elicit just a smile or chuckle and are not emotionally heated, while tendentious jokes have a sexual or aggressive character, and can draw out aggressed, howling laughter. Tendentious jokes are a boon to the psychic economy of the listener, says Freud. They relieve psychic tension that has pent up around particular topics.
Freud theorized that people of similar cultural background, such as members of a lifestyle or ethnic group, also tend to share a pattern of social inhibition and psychic tension, due to shared upbringing and experiences with family, gender, sex, and money. Cultural humor, then, effects aggressive laughter as a way to give catharsis to tensions created by social inhibition. Sexual jokes are prevalent, in part because almost all societies inhibit sexuality to some degree. Inspired by Freud’s account, we represent niche humors (e.g. ‘Bush jokes’, ‘Jewish jokes’, etc.) as pre-formed sheets of tension; in fact, they are archetypal tensions because they are shared by a cultural grouping of persons. We may also think of the space of tension/topic humor as being semantically mediated by niche humors. Finally, generalization can be performed by calculating the degree of alignment between a person’s sheet and the sheets of the various niche humors. For example, a person with a sheet of tension on topics such as ‘Iraq’, ‘freedom’, and ‘fascism’ would best align with the archetypal sheets for ‘Bush jokes’ and ‘political jokes’, while a person tense about topics such as ‘virtue’, ‘scandal’, and ‘government’ would best align with ‘Clinton jokes’ and ‘political jokes’.
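The alignment of a person’s tension sheet against niche humors can be sketched as a ranking. This assumes a sheet maps each topic to a single tension level and uses cosine similarity as the alignment measure; both the measure and the tension values are assumptions, and only the topic and niche names come from the example above.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse topic -> tension sheets."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_niche_humors(person, niches):
    """Return niche humor names ordered from best to worst alignment."""
    scored = [(cosine(person, sheet), name) for name, sheet in niches.items()]
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical tension levels on the topics named in the text.
niches = {
    "Bush jokes": {"Iraq": 0.9, "freedom": 0.6, "fascism": 0.7},
    "Clinton jokes": {"virtue": 0.8, "scandal": 0.9, "government": 0.5},
    "political jokes": {
        "Iraq": 0.5, "freedom": 0.5, "fascism": 0.5,
        "virtue": 0.5, "scandal": 0.5, "government": 0.5,
    },
}
person = {"Iraq": 0.8, "freedom": 0.5, "fascism": 0.6}
ranking = rank_niche_humors(person, niches)
```

With these values the person aligns most closely with ‘Bush jokes’, then with the broader ‘political jokes’, and not at all with ‘Clinton jokes’, matching the example in the paragraph.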
Realm of perception. We were inspired by the following question: why do realists and romantics perceive the world so differently? Among existing theories, Jung’s (1921) theory of psychological type directly addresses this question. In his theory of type, Jung proposed four fundamental psychological functions (sense, intuit, think, feel) to account for all the differences in human perception. In this vocabulary, the difference between realists and romantics can be re-viewed as a difference in disposition: when asked to interpret the word sunset, a romantic leans on feelings (e.g. ‘romance’, ‘warmth’) and intuitions (e.g. ‘embrace’), while a realist leans on ratiocinations (e.g. ‘off work’, ‘dinner’) and sensations¹² (e.g. ‘dark’).
More formally, we decided to represent the space of possible perceptions as a dimensional space whose axes are Jung’s four fundamental psychological functions—sense, intuit, think, feel. A person’s particular disposition is formalized as a coordinate location in the space. For example, a romantic is located in high feeling, high intuition, low sensing, low thinking.
A first-pass experiment to model a person’s rough location in this space considers the patterns of affective communication evident in that person’s weblog diaries: patterns such as PAD affect associated with ego (e.g. ‘I’, ‘me’), PAD affect associated with alters, and PAD affect that is pushed from ego to alters and from alters to ego. The success of this approach was mixed. Though not yet implemented, the notion of archetypes as semantic mediators of perception can also be considered, as identifying mediators has worked well in the other realms. Archetypes such as ‘realist’ and ‘romantic’ could be identified as locations in the space, each associated with some textual signature, and these could facilitate the location of persons in the four-dimensional space.
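Since archetypes are described as unimplemented, the following is only a sketch of how they might work: a disposition is a point in the four-dimensional space, archetypes are fixed points, and a person is assigned to the nearest one by Euclidean distance. The coordinates, the 0 to 1 scale, and the distance measure are all assumptions.

```python
import math
from typing import NamedTuple

class Disposition(NamedTuple):
    """A location in the four-function space; each axis is assumed to run from 0 to 1."""
    sense: float
    intuit: float
    think: float
    feel: float

# Hypothetical archetype locations, following the text's description of a
# romantic as high feeling, high intuition, low sensing, low thinking.
ARCHETYPES = {
    "romantic": Disposition(sense=0.2, intuit=0.8, think=0.2, feel=0.8),
    "realist": Disposition(sense=0.8, intuit=0.2, think=0.8, feel=0.2),
}

def nearest_archetype(person):
    """Return the name of the archetype closest to the person's location."""
    return min(ARCHETYPES, key=lambda name: math.dist(person, ARCHETYPES[name]))

person = Disposition(sense=0.3, intuit=0.7, think=0.1, feel=0.9)
label = nearest_archetype(person)
```

A textual signature per archetype, as the paragraph suggests, would replace the hand-set coordinates used here.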
§
A comparative discussion now reviews some shared themes exposed in devising knowledge representation for the above realms.
One theme that crosses knowledge representations and realms is the notion of semantic mediators. In the cultural taste realm and the food realm, supernodes like ‘identities’ and ‘cuisines’ helped to organize the cultural topology; more than other nodes, these tended to be vehicles of consistency and coherency in spaces that were otherwise non-hierarchical. In the realm of attitudes and the realm of humor, ‘imprimers’ and ‘niche humors’ likewise served as semantic mediators, helping to organize and prime the attitudes and tensions of individuals. Finally, in the realm of perception, we speculated that archetypes such as ‘realist’ and ‘romantic’, though unimplemented, could be useful semantic mediators.
Another theme is a preference for the maximum amount of connectedness possible, as topological connectedness bolsters the power of model generalization, and affords perspective-based artifacts the comprehensiveness necessary to react to arbitrary inputs. One reason why rule-based user modelers tended to be prescriptive rather than reactive is that prescription is easier than reaction. To react to arbitrary input implies that the semantic distance between the user’s model and the input is known. A space wired with a few connections, such as stereotype rules, does not guarantee a sound distance calculation because its connections make too many assumptions. This is related to the semantic distance fallacy, often raised against the practice of relying on semantic network link-hops as a measure of semantic distance (Collins & Quillian 1969). By contrast, a semantic fabric is a connectionist representation, and affords a much better distance calculation from a person’s model to any input. Hence, a high degree of connectedness enables the reactive mode for perspective-based artifacts. Of course, the semantic sheet knowledge representation is far less connected and less ideal than semantic fabrics, but we make do. Generalization over semantic sheets is opportunistic, and connections can be made via spreading activation along topic hierarchies, via analogy, and via imprimers.
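The contrast between link-hop counting and a weighted distance can be made concrete. The sketch below is one possible reading, not the thesis’s method: it assumes affinities lie in (0, 1] and converts each to a cost of -log(affinity), then runs Dijkstra’s algorithm. The graph and its weights are hypothetical.

```python
import heapq
import math

def weighted_distance(graph, source, target):
    """Shortest-path cost from source to target, with edge cost -log(affinity).

    graph: {node: {neighbor: affinity in (0, 1]}}
    Returns math.inf when the target is unreachable.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, affinity in graph.get(node, {}).items():
            new_cost = cost - math.log(affinity)
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return math.inf

# Hypothetical affinities: both neighbors are exactly one link-hop from 'jazz'.
graph = {"jazz": {"blues": 0.9, "polka": 0.1}}
near = weighted_distance(graph, "jazz", "blues")
far = weighted_distance(graph, "jazz", "polka")
```

A hop count would treat ‘blues’ and ‘polka’ as equally close to ‘jazz’, whereas the weighted distance separates them, which is the kind of distinction that dense numerical affinities make available.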

Figure 2-8. A semantic diversity matrix
To make sense of these two themes (that semantic mediators lend consistency and that there is a preference for connectedness), we attempt to plot some knowledge representations along these two dimensions, to see what insight might be developed. Inspired by Marvin Minsky’s (1992) “causal diversity matrix,” in which Minsky plots the space of intelligence problems along the axes of number of causes and number of effects, Figure 2-8 presents a “semantic diversity matrix,” which summarizes the representational tradeoffs.¹³ I ask you, the reader, to indulge me for a moment in the following speculation. Representations in the top row (i.e. ontology, folksonomy, dimensional space) are symbolic and neat, while representations in the bottom row (i.e. sheets, networks, fabrics) are phenomenal and scruffy. Neat representations seem to be centralized and prescriptive, and are thus prevalent in symbolic AI; scruffy representations seem to be decentralized and descriptive, and are thus prevalent in corpus-based research. Now interpreting the columns: the leftmost column (i.e. ontology, semantic sheets) tends toward the political by emphasizing disconnectedness, ideology, denotation, and control, whereas the rightmost column (i.e. dimensional spaces, semantic fabrics) tends toward the aesthetical, emphasizing both weak and strong connections, which afford connotation and evocation. If these are true trends, then perhaps the semantic diversity matrix can be of assistance to researchers in computational modeling who would like to reflect upon the appropriate knowledge representation for their problem. Consider, for example, that Sack (1994) chose actor-role analysis for representing ideology. As actor-role bindings may be plotted somewhere between ontology and semantic sheets, we could, without knowing the problem being modeled, guess that the problem domain was political, and somewhat scruffy.
Having characterized the knowledge representation issues in model acquisition, the next section discusses the generalization phase of person modeling.
