fix: The timestamps for the second and subsequent segments are incorrect #980

Open
wants to merge 1 commit into
base: master


@caiwuu caiwuu commented Aug 29, 2024

When VAD detects multiple speech segments in an audio clip, the timestamps from the second segment onward are incorrect, as shown in the images below.
This is incorrect:
image
This is correct:
image

Comment on lines +2088 to +2089
middle = (segment.start + segment.end) / 2
chunk_index = ts_map.get_chunk_index(middle)
Contributor

This is not necessarily correct. As you noticed, it fails the tests because it produces wrong timestamps for the test file. You need to identify why the timestamps are wrong in the first place, so you can decide whether the error comes from the pre/post-processing or from the Whisper model itself.
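
To make the pre/post-processing side of this question concrete, here is a minimal sketch of how a timestamp measured on silence-removed audio might be mapped back to the original clip. It is not the project's implementation: `SpeechChunk`, `build_offsets`, `chunk_index_at`, `restore_time`, and `restore_segment` are hypothetical names, and the `ts_map.get_chunk_index` call from the diff is not reproduced here. The sketch assumes VAD returns speech chunks as start/end times in seconds and that the model sees those chunks concatenated back to back.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class SpeechChunk:
    """Hypothetical VAD result: one speech span in the ORIGINAL audio (seconds)."""
    start: float
    end: float


def build_offsets(chunks):
    """Start time of each chunk on the concatenated (silence-removed) timeline."""
    offsets, elapsed = [], 0.0
    for chunk in chunks:
        offsets.append(elapsed)
        elapsed += chunk.end - chunk.start
    return offsets


def chunk_index_at(offsets, t):
    """Index of the chunk that contains time t on the concatenated timeline."""
    return max(bisect_right(offsets, t) - 1, 0)


def restore_time(chunks, offsets, t, index=None):
    """Map a concatenated-timeline time back to the original timeline."""
    if index is None:
        index = chunk_index_at(offsets, t)
    return chunks[index].start + (t - offsets[index])


def restore_segment(chunks, offsets, start, end):
    """Midpoint heuristic: choose ONE chunk from the segment midpoint, then
    shift both boundaries by that chunk's offset (assumption, for illustration)."""
    index = chunk_index_at(offsets, (start + end) / 2)
    return (
        restore_time(chunks, offsets, start, index),
        restore_time(chunks, offsets, end, index),
    )


# Two speech chunks at 1-3 s and 10-12 s; the model sees 4 s of audio.
chunks = [SpeechChunk(1.0, 3.0), SpeechChunk(10.0, 12.0)]
offsets = build_offsets(chunks)

# A segment fully inside the second chunk maps cleanly.
print(restore_segment(chunks, offsets, 2.0, 3.5))

# A segment that straddles the chunk boundary: the midpoint picks the second
# chunk, so the start lands inside the silence that VAD removed.
print(restore_segment(chunks, offsets, 1.0, 3.5))
print(restore_time(chunks, offsets, 1.0))  # start mapped on its own
```

In this sketch a segment that lies inside a single chunk is restored correctly either way, while a segment that crosses a chunk boundary gets a different start depending on whether the chunk is chosen from the midpoint or from each boundary separately. Comparing the two on the failing test file is one way to check whether the wrong timestamps originate in the mapping step or in the model output.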

2 participants