fix:The timestamps for the second and subsequent segments are incorrect #980

caiwuu · 2024-08-29T10:32:47Z

When VAD detects multiple speech segments in an audio clip, the timestamps from the second segment onward are incorrect, as shown in the images below.

Incorrect:

Correct:

MahmoudAshraf97 · 2024-09-03T10:32:28Z

faster_whisper/transcribe.py

+            middle = (segment.start + segment.end) / 2
+            chunk_index = ts_map.get_chunk_index(middle)


This is not necessarily correct. As you noticed, it fails the tests because it produces wrong timestamps for the test file. You need to identify why the timestamps are wrong in the first place, so you can decide whether the error comes from the pre/post-processing or from the Whisper model itself.
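To make the concern concrete, here is a minimal, self-contained sketch of how a midpoint-based chunk lookup can fix one case and break another. It is a simplified stand-in, not the actual `ts_map` implementation in `faster_whisper`: `build_map`, `chunk_index`, `original_time`, and both `restore_*` helpers are hypothetical names, and the 16 kHz sample rate and the chunk boundaries are assumed values chosen for illustration.

```python
from bisect import bisect_right

SAMPLING_RATE = 16000  # assumed sample rate in Hz


def build_map(chunks):
    """Build a toy timestamp map from VAD speech chunks.

    `chunks` holds (start_sample, end_sample) pairs in the original audio.
    Returns (ends, silence): per chunk, its end position in the speech-only
    audio (samples) and the total silence removed before it (seconds).
    """
    ends, silence = [], []
    removed = 0       # silent samples dropped so far
    previous_end = 0  # end of the previous chunk in the original audio
    for start, end in chunks:
        removed += start - previous_end
        previous_end = end
        ends.append(end - removed)
        silence.append(removed / SAMPLING_RATE)
    return ends, silence


def chunk_index(ts_map, time):
    """Index of the chunk containing `time` (seconds, speech-only audio)."""
    ends, _ = ts_map
    sample = int(time * SAMPLING_RATE)
    return min(bisect_right(ends, sample), len(ends) - 1)


def original_time(ts_map, time, index=None):
    """Shift a speech-only time back onto the original timeline."""
    _, silence = ts_map
    if index is None:
        index = chunk_index(ts_map, time)
    return round(time + silence[index], 3)


def restore_per_endpoint(ts_map, start, end):
    # Each endpoint is looked up in its own chunk.
    return original_time(ts_map, start), original_time(ts_map, end)


def restore_by_midpoint(ts_map, start, end):
    # Both endpoints use the chunk that contains the segment midpoint.
    index = chunk_index(ts_map, (start + end) / 2)
    return original_time(ts_map, start, index), original_time(ts_map, end, index)


# Speech at 0-1 s and 3-5 s of the original audio (2 s of silence removed).
ts_map = build_map([(0, 16000), (48000, 80000)])

# Case A: the model reports a start slightly before the chunk boundary.
print(restore_per_endpoint(ts_map, 0.98, 2.0))  # (0.98, 4.0)
print(restore_by_midpoint(ts_map, 0.98, 2.0))   # (2.98, 4.0)

# Case B: the segment really does span both chunks.
print(restore_per_endpoint(ts_map, 0.2, 2.5))   # (0.2, 4.5)
print(restore_by_midpoint(ts_map, 0.2, 2.5))    # (2.2, 4.5)
```

In case A the midpoint lookup pulls the start into the second chunk, which is probably what the PR intends. In case B it moves a start that was already correct from 0.2 s to 2.2 s. Which case the failing test file hits is the thing to check before choosing a fix.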

fix:The timestamps for the second and subsequent segments are incorrect

3c9e49e

MahmoudAshraf97 reviewed Sep 3, 2024
