Whither Speech Recognition? Deep Learning to Deep Thinking

“Whither Speech Recognition?” is the title of a well-known paper written by J. R. Pierce in 1969, which unfortunately stopped Bell Labs from continuing speech recognition research for several years at the beginning of the 1970s. Even though 45 years have passed since then and we have seen various successes in speech recognition research, the paper is still worth reading. Pierce wrote, “Speech recognition has glamor. Funds have been available. Results have been less glamorous. … General-purpose speech recognition seems far away. Special-purpose speech recognition is severely limited. It would seem appropriate for people to ask themselves why they are working in the field and what they can expect to accomplish.”

It is still true that what we can do with speech recognition technology is very limited. Even though DNNs (Deep Neural Networks) using “deep learning” have significantly raised performance over the past several years, we still face many challenges that cannot be solved simply by relying on their capability. We need to think deeply about and analyze how human beings recognize and understand speech, and to implement various knowledge sources in speech recognition systems using advanced machine learning techniques in order to achieve innovations. This talk focuses on my personal perspectives on future speech recognition research.

May 11, 2016

12:30 pm (1h)

Discovery Building, Orchard View Room

Sadaoki Furui
