It is indeed a great honor to write an article for ATR Up-to-Date, and it
is indeed a great moment in time to do so, for it is just about 15 years since
the joint adventure between ATR in Japan and our laboratories in Japan, the US
and Europe to tackle a difficult problem began. Thanks to ATR's early (1987) leadership,
its support, vision and friendship, a new research field could begin. ATR had
just begun as a company and speech translation was chosen as one of four key
technology areas. To address the difficulties and to consider the many aspects
of many languages, the international Consortium for Speech Translation Advanced
Research (C-STAR) was formed in 1991. It included just four partners (including
ATR and our laboratories at CMU and UKA) at first, but it has since grown to
include 20 of the leading laboratories around the world.
The problem of speech translation is enormously difficult due to the combined
difficulty of accurate speech recognition, acceptable machine translation and
natural-sounding speech synthesis, none of which can be considered a solved
problem by any measure. Researchers in the component fields often derided the
early efforts as unmanageable, intractable, impractical, and even not useful (!),
given the poor solutions and poor performance that existed in the component
technologies at the time. Undeterred, however, those early researchers persisted,
first demonstrating feasibility (1991), then capabilities for spontaneous speech (1993-1999) and
now practical fieldable solutions. In an age of globalization, where business,
humanitarian, healthcare and security needs have rapidly grown beyond national
boundaries, the enormous importance, indeed the absolute necessity, of cross-lingual
technologies in every form (text, speech, image) was bound to be recognized before
too long.
Now, 15 years later, virtually all governments and research sponsoring agencies
in the developed world support significant efforts in the area of speech translation.
Indeed, perhaps the largest ongoing research programs that still fund speech
and language research at all, in Europe (TC-STAR, CHIL) and in the US (DARPA-GALE,
DARPA-TRANSTAC, NSF-STR-DUST), are now committed to cracking the speech translation
problem and related cross-lingual language requirements. The rapidly growing
need for fast, effective response to multilingual information and the need for
effective cross-lingual communication call for technical solutions that deliver
greater speed, broader language coverage and lower cost than human language
services alone can possibly provide.
Given the 15-year history of speech translation research, and the tremendous effort
and investment currently underway, one might ask whether the problem is almost solved
and, if not, what challenges remain. In my view there are four remaining challenges:
●Robustness - Speech Translation must be reliable in all circumstances
for which it is to be employed and it must deliver trusted output. How can the
output of a technical device be trusted? Unlike humans, machines are woefully
inadequate in judging the plausibility of their own output and articulating
their own self-doubt. Robustness also remains a challenge when we consider
not only clean speech input but also highly disfluent conversational speech, noisy
environments, distant microphones and stressed or emotional speech.
●Domain Unlimited Capability - While a number of practical applications
can be fielded that require only translation capability in limited domains,
the domain restriction of most of today's systems must be removed. This is necessary
if we wish to provide translation of open-domain spoken language tasks such
as Broadcast News, Lectures/Seminars, Parliamentary Speeches, Meetings and Telephone
Conversations. Domain unlimited speech translation in turn must cope with disfluent,
conversational speech as well as large open domain language and vocabulary
coverage.
●Language Portability - Sadly, most current efforts are concentrated
around only a few languages of general interest: English, Chinese, Arabic, Spanish,
Japanese, German, ... Perhaps the greatest social impact of translation technology,
however, could come from capabilities in less commonly spoken languages and
language pairs, where language tools and human translation services are less
readily available. Even short of covering the 6,000 languages of the world,
managing just the 20 languages of an expanding Europe already presents great difficulty
and cost. Can more advanced machine learning techniques help to lower the cost
of development and language portability?
●Human Delivery - For the language barrier to become invisible, we also
have to be concerned with appropriate human interfaces that deliver language
services in an unobtrusive way. Clearly, spoken input is preferable in mobile
situations or meeting situations, but images may require photo or video input,
or a mixture of image and voice. How should output be presented? By voice? By
text? Should it be delivered via headphones, heads-up displays, speakers? Should
it run on a PDA, mobile phone or laptop, or be embedded in a ubiquitous intelligent
environment? Numerous intriguing possibilities exist.
Speech translation as a research field has grown up. It has been a privilege
to collaborate with ATR for 15 years on the problem of speech translation and
help build foundations in a field of growing importance. Looking toward the
future, many open challenges remain, but the excitement does not let up: Where
else can scientists be offered the dual benefit of scientific fascination with
a grand challenge problem and a guaranteed opportunity to change the world for
the better? As our efforts have been so fruitful, growing international recognition
of the problem also brings increasingly intense worldwide competition over new systems,
solutions and standards. ATR's pioneering benchmarking exercises, as launched
under IWSLT, provide a mechanism and a forum for these forces and for the best
laboratories around the world to advance the state of the art rapidly and jointly.
International exchange will continue to refresh and deepen our understanding
of the problems and accelerate the turn-around in implementing viable solutions.
Our laboratories look forward to continuing and further deepening the strong
collaboration and friendship that we began with ATR 15 years ago.