SUMI Yasuyuki, MASE Kenji,
TSUCHIKAWA Megumu,
ITO Sadanori,
and IWASAWA Shoichiro

Media Information Science Laboratories
Department of Interaction Media




1. Experience Medium
Human beings have invented and used various media such as paper, books, newspapers, radio, and television. The Internet, the most recent mass medium, has also become one of the most indispensable. Its most important advantages over its predecessors are that everyone can instantly send out his/her ideas and experiences, and that such information can be searched anytime and anywhere. These advantages, however, rely heavily on individuals verbalizing their knowledge and experiences. Accordingly, the tacit knowledge (awareness, common sense, nebulous ideas, atmosphere, etc.) behind the verbalized experiences tends to be omitted, and consequently it is often difficult to convey the essence of experiences and skills through current media.
  The reason we still have to rely on media based on verbalized knowledge is that we lack a medium that can handle our experiences represented not only by verbalized information but also by its contextual information. In the coming ubiquitous computing era, we expect to realize a so-called "experience medium" through which we can exchange our experiences directly. The experience medium is a medium for capturing, interpreting, and creating our experiences.
  This article shows ATR's attempts to build such an experience medium using ubiquitous computing technologies.

2. Capturing Experiences
As the first step toward the experience medium, we prototyped a system that captures human interactions with environmental sensors as well as wearable sensors. Figure 1 shows a snapshot of our first prototype, deployed to capture interactions between exhibitors and visitors at the exhibition site of the ATR research exposition in November 2002. Each exhibit booth had sets of sensors (video cameras and microphones) on the ceiling. We also asked visitors to volunteer to wear our wearable sensor sets to capture their tour experiences, thus providing us with their first-person perspective.


Figure 1: Experience capturing at an exhibition.

  More importantly, our system incorporates ID tags with infrared LEDs (LED tags) and an infrared signal tracking device (IR tracker) to record positional context along with the audio/video data. The IR tracker gives the position and identity of any tag attached to an artifact or human in its field of view. LED tags were attached to likely focal points of social interaction, such as posters and displays. By wearing an IR tracker, a user's gaze can be determined. This approach assumes that gaze is a good index of human interaction.
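  To make this concrete, the sketch below shows one plausible way to turn a stream of raw IR-tracker detections into discrete gaze events. The record format, field names, and thresholds are illustrative assumptions, not the actual ATR data format.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One IR-tracker sighting of an LED tag (hypothetical format)."""
    timestamp: float  # seconds since the start of capture
    tag_id: str       # LED tag on an exhibit, poster, or another person

def gaze_events(detections, min_duration=2.0, max_gap=0.5):
    """Merge consecutive detections from one worn tracker into gaze events.

    A run of detections of the same tag with no gap longer than max_gap
    seconds counts as one gaze; runs shorter than min_duration seconds
    are discarded as passing glances. Returns (tag_id, start, end) tuples.
    """
    events, current = [], None  # current = (tag_id, start, end)
    for d in sorted(detections, key=lambda d: d.timestamp):
        if current and d.tag_id == current[0] and d.timestamp - current[2] <= max_gap:
            current = (current[0], current[1], d.timestamp)  # extend the run
        else:
            if current and current[2] - current[1] >= min_duration:
                events.append(current)
            current = (d.tag_id, d.timestamp, d.timestamp)
    if current and current[2] - current[1] >= min_duration:
        events.append(current)
    return events
```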

3. Interpreting Experiences
Stored experience data must be interpreted to become meaningful knowledge. We used "interactions" among people and artifacts as footholds for interpreting the semantics of "experiences." Specifically, "gazing," observed in the IR tracker data, and "utterances," observed in the data from microphones attached to the users, were used to measure human interactions.
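  As a minimal illustration of the utterance side, the sketch below detects speech intervals by thresholding per-frame microphone energy. The article does not describe the actual detector, so the frame length and threshold here are assumptions.

```python
def utterance_intervals(frame_energy, frame_sec=0.1, threshold=0.2, min_len=0.5):
    """Rough voice-activity detection over normalized per-frame RMS energy.

    frame_energy: one value per frame_sec window from a worn microphone.
    Returns (start, end) intervals, in seconds, where the user is speaking.
    """
    intervals, start = [], None
    for i, e in enumerate(frame_energy):
        if e >= threshold and start is None:
            start = i * frame_sec                 # speech onset
        elif e < threshold and start is not None:
            if i * frame_sec - start >= min_len:  # keep only real utterances
                intervals.append((start, i * frame_sec))
            start = None
    if start is not None and len(frame_energy) * frame_sec - start >= min_len:
        intervals.append((start, len(frame_energy) * frame_sec))
    return intervals
```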
  Figure 2 illustrates the basic components that we considered for representing interactions, e.g.: "the user stays at booth X"; "the user gazes at exhibit Y"; and "the user speaks to another user A." By grouping spatio-temporal co-occurrences of these basic components, we could automatically infer more abstract interaction patterns, such as "the user joins a discussion with A and B" and "the user talks with C about exhibit Z," as the sketch after Figure 2 illustrates.

 

Figure 2: Interpretation of interaction components.
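  Given gaze events and utterance intervals of the kind sketched above, the grouping step reduces to checking temporal overlap. The following sketch infers "the user talks with another user about exhibit Z" as two users gazing at the same tag while at least one of them speaks; the rule and data shapes are our illustrative reading of the co-occurrence idea, not ATR's published algorithm.

```python
def overlaps(a, b):
    """True if two (start, end) intervals overlap in time."""
    return a[0] < b[1] and b[0] < a[1]

def infer_conversations(gazes_by_user, utterances_by_user):
    """Infer 'u talks with v about exhibit tag' from co-occurring components.

    gazes_by_user:      {user: [(tag_id, start, end), ...]}
    utterances_by_user: {user: [(start, end), ...]}
    """
    results = []
    users = list(gazes_by_user)
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            for tag_u, s_u, e_u in gazes_by_user[u]:
                for tag_v, s_v, e_v in gazes_by_user[v]:
                    if tag_u != tag_v or not overlaps((s_u, e_u), (s_v, e_v)):
                        continue  # not looking at the same exhibit together
                    window = (max(s_u, s_v), min(e_u, e_v))
                    if any(overlaps(utt, window)
                           for w in (u, v)
                           for utt in utterances_by_user.get(w, [])):
                        results.append((u, v, tag_u))
    return results
```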

  As a result, we were able to extract "highlight scenes" from the viewpoint of each individual and provide him/her with a summary video by chronologically assembling the extracted video clips. Figure 3 shows an example of the personalized Web pages that presented a summarized diary to an individual visitor to the ATR research exposition.


Figure 3: Automated video summary.
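  The selection behind such a summary can be sketched as scoring and ordering the inferred interaction events. The event tuple and the importance weights below are illustrative assumptions; the article does not specify how highlight scenes were actually scored.

```python
def highlight_clips(events, importance, max_clips=10):
    """Pick a visitor's highlight scenes and order them chronologically.

    events:     [(kind, start, end), ...] interaction events for one visitor
    importance: {kind: weight}, e.g. a joint discussion outweighing a solo gaze
    Returns (start, end) boundaries for clips to cut from the recordings.
    """
    scored = sorted(events,
                    key=lambda e: importance.get(e[0], 0.0) * (e[2] - e[1]),
                    reverse=True)
    return sorted((start, end) for _, start, end in scored[:max_clips])

# Example weighting (assumed): conversations matter more than mere stays.
clips = highlight_clips(
    [("stay", 0, 300), ("gaze", 40, 90), ("talk_about", 50, 120)],
    {"stay": 0.2, "gaze": 1.0, "talk_about": 3.0},
)
```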

4. Creating Experiences
We foresee a new medium that enables us not only to record and interpret experiences, but also to create new, augmented experiences. One characteristic of our efforts is developing a methodology to facilitate experiences as well as to observe them. For instance, we prototyped a guide robot that proactively addresses visitors using the data collected by the ubiquitous sensors (Figure 4).
  Generally, it is difficult for a lone robot even to recognize the person standing in front of it. With the data captured by our environmental/wearable sensors, however, it becomes much easier to identify people and to recognize which exhibits they have visited so far. Using these data, our robot could greet by name visitors it was meeting for the first time and recommend an exhibit they had not yet visited. Such guidance ability, superior to that of a human guide, is one example of how the medium facilitates the creation of new experiences.


Figure 4: Guide robot using ubiquitous sensors.
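  The robot's proactive guidance can be sketched as a lookup over the same shared records. The name registry and visit log below are assumed data structures; the article does not detail how the robot's dialogue was actually composed.

```python
def greet_and_recommend(visitor_id, names, visit_log, exhibits):
    """Compose a proactive greeting from data shared by the sensor network.

    names:     {visitor_id: display name}, e.g. from visitor registration
    visit_log: {visitor_id: set of exhibit ids}, accumulated from gaze events
    exhibits:  ordered list of exhibit ids at the site
    """
    name = names.get(visitor_id, "there")
    unvisited = [e for e in exhibits
                 if e not in visit_log.get(visitor_id, set())]
    if unvisited:
        return (f"Hello, {name}! You have not seen {unvisited[0]} yet. "
                "Shall I show you the way?")
    return f"Hello, {name}! You have already seen every exhibit."
```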

5. Conclusions
Progress in computational resources for natural language processing, i.e., machine-readable dictionaries and grammars, has enabled many useful tools, e.g., Web search engines, machine translation systems, and spoken dialogue systems, to spread through our daily lives. The attempts presented in this article can be regarded as a new challenge: building a new dictionary and grammar for treating "experience data" represented by non-verbal as well as verbal information. We predict that a future experience medium will deeply impact our society and industries such as the Internet. It is our hope that many researchers and practitioners will join us in realizing experience media.