MPS Word Timestamps Fail Due to Incorrect .double() Conversion Order, Says OpenAI
Source: Mastodon
Word-level timestamps in Whisper fail on the MPS backend because of a tensor conversion-order issue. A fix for the Apple Silicon problem has been reported.
A recent issue has been discovered with OpenAI's Whisper model on Apple Silicon, specifically with the MPS backend. Transcription fails when word-level timestamps are requested. The root cause is attributed to conversion order: the tensor is converted to float64 with .double() while it is still on the MPS device, before it is moved to the CPU, and the MPS backend does not support float64. A one-line fix has been proposed that swaps the order, moving the tensor to the CPU first and converting it to float64 there, which preserves the original intent of performing the dynamic time warping calculation in double precision.
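The ordering issue can be illustrated with a minimal PyTorch sketch. This is not Whisper's actual code: the helper names `pick_device` and `to_float64_numpy` and the random `cost` matrix are hypothetical, chosen only to show the pattern of moving a tensor to the CPU before upcasting it.

```python
import numpy as np
import torch


def pick_device() -> torch.device:
    """Use MPS when it is available, otherwise fall back to the CPU."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


def to_float64_numpy(x: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: move to the CPU first, then upcast to float64."""
    # The reverse order, x.double().cpu(), attempts the float64 cast on the
    # tensor's current device, which fails on MPS (no float64 support).
    return x.detach().cpu().double().numpy()


device = pick_device()
# Stand-in for a float32 cost matrix produced on the accelerator.
cost = torch.rand(4, 6, dtype=torch.float32, device=device)
cost_np = to_float64_numpy(cost)
print(cost_np.dtype, cost_np.shape)
```

On a machine without MPS the sketch runs entirely on the CPU, where either order works. The difference only shows up when `cost` lives on an MPS device.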
This matters because accurate word-level timestamps are crucial for applications such as video editing, where precise timing is needed to remove specific frames tied to individual words. OpenAI's Whisper model is a popular choice for speech recognition tasks, and resolving this issue would restore word-level timestamps for users running it on the MPS backend.
Because the fix has so far only been reported in a discussion thread, it remains to be seen how quickly the OpenAI team will address and implement it. Users of Whisper on Apple Silicon with the MPS backend should watch for future updates, as a resolution would make word-level timestamps usable on that backend.