My PhD focuses on Deep Learning for Distant-Talking (Far-Field) Speech Recognition, with particular attention to the domestic environment.
Most state-of-the-art systems perform satisfactorily only in close-talking scenarios, where the user is forced to speak very close to a microphone-equipped device. Given the growing interest in speech recognition and the increasing use of this technology in everyday life, it is reasonable to expect that users will prefer to access speech recognition services without holding or wearing any device. This calls for technologies that can cope with distant-talking interaction, even in challenging acoustic environments.
A challenging but worthwhile scenario is far-field speech recognition in the domestic environment, where users may prefer to interact freely with their home appliances without wearing or holding any microphone-equipped device.
A promising approach to improving current distant-talking automatic speech recognition (ASR) systems is the use of Deep Neural Networks (DNNs). In particular, designing a suitable DNN paradigm for a multi-channel far-field scenario could help overcome the major limitations of current distant-talking technologies.