Scene Generation for Communication




TETSUTANI Nobuji



Facial expressions and gestures are indispensable to improving the communication of verbal information. However, when one speaks with a person in a distant location using conventional telecommunications tools, communication is not always easy. We found that nonverbal information helps people understand one another and brings remote communication much closer to face-to-face communication. We have therefore been studying ways of extracting nonverbal information with computers. In addition, we have sought ways of improving and supporting virtual environments.


Recognition and Generation of Human Faces

When people recognize the facial expressions of others, they effectively grasp nonverbal information. Our research has focused on extracting facial expressions with computers. To achieve this, it is necessary to detect facial motions. Therefore, we developed a technique for predicting the position of the face in real time by pinpointing the location of the forehead [1]. This system is robust and functions well even when the face rotates or the person is wearing glasses.

For facial expression recognition, we developed a system that estimates the characteristics of a facial expression by dividing the facial area into several partial areas. In addition, we proposed a deformation technique based on artistic anatomy in order to generate facial expressions in real time. From this research, we built the "Virtual Kabuki" system, the virtual metamorphosis system shown in the figure [2].
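The idea of dividing the facial area into partial areas can be illustrated with a minimal sketch. The source does not describe how the areas are chosen or which features are measured, so everything here is an assumption made for illustration: a uniform grid over a face bounding box, grayscale frames stored as nested lists, and mean absolute frame difference as a stand-in for a per-area motion feature. The names `split_face_region` and `region_motion` are hypothetical.

```python
from typing import List, Tuple

# (left, top, width, height) in pixels; an assumed representation.
Box = Tuple[int, int, int, int]


def split_face_region(face: Box, rows: int, cols: int) -> List[Box]:
    """Divide a face bounding box into rows x cols partial areas."""
    left, top, width, height = face
    regions = []
    for r in range(rows):
        # Integer boundaries are chosen so the areas tile the box exactly.
        y0 = top + (height * r) // rows
        y1 = top + (height * (r + 1)) // rows
        for c in range(cols):
            x0 = left + (width * c) // cols
            x1 = left + (width * (c + 1)) // cols
            regions.append((x0, y0, x1 - x0, y1 - y0))
    return regions


def region_motion(prev: List[List[int]], curr: List[List[int]], region: Box) -> float:
    """Mean absolute intensity change inside one area between two frames.

    This is an assumed stand-in feature, not the measure used by the authors.
    """
    left, top, width, height = region
    total = 0
    for y in range(top, top + height):
        for x in range(left, left + width):
            total += abs(curr[y][x] - prev[y][x])
    area = width * height
    return total / area if area else 0.0
```

A classifier could then take the list of per-area values as its input vector, so that motion around the mouth and motion around the eyes are treated as separate cues.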


Human Motion Recognition

We also studied hand gesture recognition. One challenge was to devise a way of dealing with the occlusion that occurs when the hands overlap. To overcome this difficulty, we used multiple cameras to recognize the gesture. Furthermore, we employed multiple cameras to recognize human motions including standing, sitting, and taking an object from a shelf. However, we had to adjust the cameras every time the number of cameras increased. As a solution, we developed an automatic correction technique.
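One way to see why multiple cameras help with occlusion is a simple fusion step: each camera reports its own guess, views that cannot see the hands are skipped, and the remaining views vote. The source does not say how the camera views were combined, so the following is only a sketch under stated assumptions: confidence-weighted voting, a hypothetical `CameraObservation` record, and an arbitrary `min_confidence` threshold.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class CameraObservation(NamedTuple):
    """Per-camera result; this structure is assumed for illustration."""
    camera_id: str
    label: Optional[str]  # None when the view could not classify (e.g. hands occluded)
    confidence: float     # assumed to lie in [0, 1]


def fuse_gesture(
    observations: List[CameraObservation],
    min_confidence: float = 0.3,  # arbitrary illustrative threshold
) -> Optional[str]:
    """Return the gesture label with the highest summed confidence, or None."""
    scores: Dict[str, float] = defaultdict(float)
    for obs in observations:
        # Skip occluded or unreliable views instead of letting them vote.
        if obs.label is None or obs.confidence < min_confidence:
            continue
        scores[obs.label] += obs.confidence
    if not scores:
        return None
    return max(scores, key=lambda label: scores[label])
```

With this scheme, a camera whose view is blocked by the overlapping hand simply drops out of the vote, and the decision rests on the views that still see the gesture.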

Our motion capture method monitors the movement of the whole body with three cameras [3]. By combining neural networks and heuristic information, we succeeded in building a real-time processing system that requires no markers [4]. One application of this system is "Shall We Dance?", which allows people to dance together in a virtual space.


Synthesized Space

We explored new directions by mixing real and virtual spaces. "Magic Book" is a system in which virtual objects are superimposed, through see-through video glasses, on a book placed in real space. "Augmented Groove" is a somesthetic music interface that produces a virtual experience with recorded music. With these, we were able to introduce systems through which people create new sensations. In addition, we developed a technique for placing real images of a person and trees in the virtual space.


Concept Proposal

We proposed the "Network Theater" as a new kind of interactive movie that enables a play to be created over networks. Another system we proposed was "Bunshin Communication," which enables a person to participate in multiple simultaneous events, such as meetings and lectures, as a virtually metamorphosed avatar and/or as agents. The system switches between virtual metamorphosis and agents according to the recognized nonverbal data generated by the other participants at each event.


Reference