


Scene Generation for Communication
TETSUTANI Nobuji
Facial expressions and gestures are indispensable to improving the communication
of verbal information. However, when speaking with a person in a distant
location using conventional telecommunications tools, communication is not
always easy. We found that nonverbal information helps people understand
one another and brings remote communication much closer to face-to-face communication.
We have been studying ways of extracting nonverbal information with computers.
In addition, we have sought ways of improving and supporting virtual environments.
Recognition and Generation of Human Faces
When people recognize the facial expressions of others, they effectively grasp
nonverbal information. Our research has focused on extracting facial expressions
with computers. To achieve this, it is necessary to detect facial motions. Therefore,
we developed a technique for predicting the position of the face in real time
by pinpointing the location of the forehead [1]. This system is robust and functions
well even when the face rotates or the person
is wearing glasses.
For facial recognition, we developed a system that estimates the characteristics
of a facial expression by dividing the facial area into several partial areas.
In addition, we proposed a deformation technique based on artistic anatomy to
generate facial expressions in real time. From this research, we built "Virtual
Kabuki," the virtual metamorphosis system shown in the figure [2].
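The idea of dividing the facial area into partial areas can be illustrated with a minimal sketch. It is not the system described above: the grid layout, the function names (split_face_region, region_motion, expression_features), and the use of mean frame-to-frame intensity change as the per-area feature are all assumptions made for illustration.

```python
def split_face_region(box, rows=3, cols=2):
    """Split a face bounding box (x, y, width, height) into rows * cols sub-boxes.

    The regular grid is an assumption; the original system may divide the
    face along anatomical features instead.
    """
    x, y, w, h = box
    regions = []
    for r in range(rows):
        for c in range(cols):
            x0 = x + (w * c) // cols
            x1 = x + (w * (c + 1)) // cols
            y0 = y + (h * r) // rows
            y1 = y + (h * (r + 1)) // rows
            regions.append((x0, y0, x1 - x0, y1 - y0))
    return regions


def region_motion(prev_frame, curr_frame, region):
    """Mean absolute intensity change inside one region between two frames."""
    x, y, w, h = region
    if w == 0 or h == 0:
        return 0.0  # degenerate region: nothing to measure
    total = 0
    for row in range(y, y + h):
        for col in range(x, x + w):
            total += abs(curr_frame[row][col] - prev_frame[row][col])
    return total / (w * h)


def expression_features(prev_frame, curr_frame, face_box, rows=3, cols=2):
    """One motion value per partial area, ordered top-left to bottom-right."""
    return [
        region_motion(prev_frame, curr_frame, region)
        for region in split_face_region(face_box, rows, cols)
    ]
```

Frames here are plain lists of rows of grayscale values. A feature vector of this kind could then be passed to any classifier; which classifier the original system used is not stated in this text.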
Human Motion Recognition
We also proceeded with the study of hand gesture recognition. One challenge was
to devise a way of dealing with occlusion when the hands overlap. To overcome
this difficulty, we used multiple cameras to recognize the gesture. Furthermore,
we employed multiple cameras to recognize human motions, including standing,
sitting, and taking an object from a shelf. However, the cameras had to be
readjusted every time a camera was added. As a solution, we developed an automatic
correction technique.
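One way to picture how multiple cameras help with occlusion is a simple vote across views, sketched below. This is an illustrative assumption, not the method used in the research: the function name fuse_camera_votes, the (label, confidence) observation format, and the confidence-sum rule are all hypothetical.

```python
from collections import defaultdict


def fuse_camera_votes(observations):
    """Combine per-camera gesture guesses into a single label.

    observations: one entry per camera, either a (label, confidence) tuple
    or None when that camera's view is occluded. Returns the label with the
    highest summed confidence, or None if every view is occluded.
    """
    scores = defaultdict(float)
    for observation in observations:
        if observation is None:
            continue  # occluded view contributes nothing
        label, confidence = observation
        scores[label] += confidence
    if not scores:
        return None
    return max(scores, key=scores.get)
```

For example, fuse_camera_votes([("wave", 0.6), None, ("point", 0.4), ("wave", 0.3)]) returns "wave": the occluded second camera is skipped and the remaining views still agree on a result.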
Our motion capture method monitors the movement of the whole body with three
cameras [3]. Combining neural networks and heuristic information, we succeeded in
building a real-time processing system that requires no markers [4]. One application
of this system is "Shall We Dance?," which allows people to dance together in
a virtual space.
Synthesized Space
We explored new directions by mixing real and virtual spaces. "Magic Book" is
a system in which virtual objects are superimposed on a book placed in real space
using see-through video glasses. "Augmented Groove" is a somesthetic music interface
that produces a virtual experience with recorded music. With these, we were able
to introduce systems through which people create new sensations. In addition,
we developed a technique for placing real images of a person and trees in the
virtual space.
Concept Proposal
We proposed "Network Theater," a new kind of interactive movie that enables a play
to be created over networks. Another system we proposed was "Bunshin Communication,"
which enables a person to participate in multiple simultaneous events, such as
meetings and lectures, as a virtually metamorphosed avatar and/or as agents. The
system switches between virtual metamorphosis and agents according to the recognized
nonverbal data generated by the other participants at each event.
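The switching idea can be sketched as follows. The text does not specify the actual switching rule, so this is only a guess at one possible policy: the function name assign_representations, the per-event "attention" score, and the rule of sending the live avatar to the highest-scoring event are all assumptions.

```python
def assign_representations(attention_by_event):
    """Decide how the person is represented in each simultaneous event.

    attention_by_event maps an event name to a hypothetical score derived
    from the other participants' recognized nonverbal data (for example,
    how strongly they are addressing this person). The event with the
    highest score gets the live avatar; every other event gets an agent.
    """
    if not attention_by_event:
        return {}
    live_event = max(attention_by_event, key=attention_by_event.get)
    return {
        event: "avatar" if event == live_event else "agent"
        for event in attention_by_event
    }
```

Calling assign_representations({"meeting": 0.8, "lecture": 0.2}) yields {"meeting": "avatar", "lecture": "agent"}. Re-running the function as the scores change would move the live avatar between events.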
Reference

