Andyʼs working notes
Visual speech recognition
i.e. machine-assisted lip-reading. One approach to Silent speech interface.
A recent comprehensive review (also includes coverage of visual speech generation):
Sheng, C., Kuang, G., Bai, L., Hou, C., Guo, Y., Xu, X., Pietikäinen, M., & Liu, L. (2022). Deep Learning for Visual Speech Analysis: A Survey (arXiv:2205.10839). arXiv.
http://arxiv.org/abs/2205.10839
SotA (as of 2022-07-05):
K R Prajwal, Triantafyllos Afouras, Andrew Zisserman. Sub-word Level Lip Reading With Visual Attention. 2021.
word error rate of 22.6%
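For reference, a minimal sketch of how a word error rate like the one quoted above is usually computed, assuming the standard definition: word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The function name `word_error_rate` is hypothetical, and this is not the evaluation code from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Assumes whitespace tokenisation and exact string matching; real
    evaluations typically normalise case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Under this definition, a WER of 22.6% corresponds to roughly one word-level error for every four to five reference words. WER can exceed 100% when the hypothesis contains many insertions.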
Last updated 2023-07-13.