
Yogesh Virkar
Articles
-
Jun 25, 2024 |
amazon.science | Yogesh Virkar | Prashant Mathur | Proyag Pal | Brian Thompson
To translate speech for automatic dubbing, machine translation needs to be isochronous, i.e., the translated speech needs to be aligned with the source in terms of speech durations. We introduce target factors in a transformer model to predict durations jointly with target-language phoneme sequences. We also introduce auxiliary counters to help the decoder keep track of timing information while generating target phonemes.
-
Nov 22, 2023 |
amazon.science | Surafel Melaku Lakew | Yogesh Virkar | Prashant Mathur | Marcello Federico
Automatic dubbing (AD) is among the machine translation (MT) use cases where translations should match a given length to allow for synchronicity between source and target speech. For neural MT, generating translations of length close to the source length (e.g., within ±10% in character count) while preserving quality is a challenging task.