Unsupported features for 16-bit PCM WAV #55

Open
jasondraether opened this issue Feb 4, 2022 · 3 comments

Comments

@jasondraether

Using opensmile to process a WAV file saved as 16-bit PCM at a 16000 Hz sampling rate, I get 0.0 for some features (mainly fundamental frequency / F0), which of course cascades to 0.0 for jitter and shimmer as well. Does opensmile not support 16-bit PCM? The problem appears to vanish if I convert the speech signal array to 32-bit floating point before running it through "smile.process_signal()". This has happened for all feature sets, for both Functionals and LLDs.
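
The workaround described in this post can be sketched roughly as follows. This is an illustration, not code from the thread: the helper name `pcm16_to_float32` is hypothetical, and dividing by 32768 to scale samples into [-1.0, 1.0) is an assumption, since the post only says the array was converted to 32-bit floating point.

```python
import numpy as np


def pcm16_to_float32(signal: np.ndarray) -> np.ndarray:
    """Convert int16 PCM samples to float32.

    Assumption: samples are scaled into [-1.0, 1.0) by dividing by 32768.
    """
    if signal.dtype != np.int16:
        raise TypeError(f"expected int16 samples, got {signal.dtype}")
    return signal.astype(np.float32) / np.float32(32768.0)


# Possible usage (requires the opensmile package; the feature set is an example):
#
#   import opensmile
#   smile = opensmile.Smile(
#       feature_set=opensmile.FeatureSet.eGeMAPSv02,
#       feature_level=opensmile.FeatureLevel.Functionals,
#   )
#   features = smile.process_signal(pcm16_to_float32(samples), 16000)
```

Here `samples` stands for the int16 array read from the WAV file.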

@frankenjoe
Collaborator

opensmile expects 32-bit float as input. You can use process_file(file) to directly process 16-bit PCM, though.
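
A minimal sketch of the file-based route suggested in this reply. The function name `functionals_from_file` and the choice of the eGeMAPSv02 feature set are illustrative assumptions; the thread does not say which configuration was used.

```python
def functionals_from_file(path):
    """Extract functionals from an audio file via process_file().

    Passing the file path lets opensmile read the audio itself, so the
    caller never handles the raw int16 array.
    """
    # Imported here so the sketch can be loaded without opensmile installed.
    import opensmile

    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.eGeMAPSv02,  # example feature set
        feature_level=opensmile.FeatureLevel.Functionals,
    )
    return smile.process_file(path)


# Possible usage, with a placeholder file name:
#
#   features = functionals_from_file("speech_16bit.wav")
```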

@chausner-audeering

If 32-bit float input is a requirement, I guess we should validate that the input format matches and throw an exception if not. Otherwise the chance that you get bogus results without noticing is high. Or we could convert implicitly to 32-bit float before writing to openSMILE, or have openSMILE do the conversion by correctly setting the format settings of cExternalAudioInput.
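
The first two options mentioned here (validate and raise, or convert implicitly) could look roughly like the sketch below. It is not the library's actual code: the helper `ensure_float32` and its `strict` flag are hypothetical, and the integer scaling rule (divide by the type's maximum plus one) is an assumption. The third option, configuring cExternalAudioInput, is not covered.

```python
import numpy as np


def ensure_float32(signal, *, strict=True):
    """Return `signal` as float32, or raise if `strict` and the dtype differs.

    With strict=False, signed integer PCM is scaled into [-1.0, 1.0) and
    other floating-point types are cast to float32.
    """
    signal = np.asarray(signal)
    if signal.dtype == np.float32:
        return signal
    if strict:
        raise ValueError(
            f"expected float32 input, got {signal.dtype}; "
            "convert the signal or pass strict=False"
        )
    if np.issubdtype(signal.dtype, np.signedinteger):
        # Assumption: full-scale integer maps to just under 1.0.
        scale = float(np.iinfo(signal.dtype).max) + 1.0
        return (signal / scale).astype(np.float32)
    if np.issubdtype(signal.dtype, np.floating):
        return signal.astype(np.float32)
    raise TypeError(f"unsupported sample type: {signal.dtype}")
```

Raising by default makes a silent all-zero F0 result less likely, while `strict=False` corresponds to the implicit-conversion option.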

@frankenjoe
Collaborator

frankenjoe commented Feb 7, 2022

I checked and actually we forward int16 to opensmile, so we need to bypass the following line:
