Is it possible to use the model for feature extraction from audio, without any text as input?
@aretii In that case, why don't you use Whisper instead, since whisper-large-v3 was used as the audio encoder in Qwen2-Audio?
@thanhtvt I thought there were modifications to the whisper-large-v3 used for Qwen.
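For reference, a minimal sketch of the Whisper-only route suggested above, assuming the Hugging Face `transformers` and `torch` packages are installed. It loads the stock `openai/whisper-large-v3` checkpoint, not the encoder weights shipped with Qwen2-Audio, so if the encoder was modified for Qwen the features may differ. The function names `whisper_encoder_frames` and `extract_audio_features` are hypothetical helpers for illustration.

```python
def whisper_encoder_frames(window_seconds: float = 30.0,
                           sample_rate: int = 16000,
                           hop_length: int = 160,
                           conv_stride: int = 2) -> int:
    """Number of encoder output frames for one padded Whisper input window.

    Whisper pads or trims audio to a fixed window, computes log-mel frames
    with the given hop length, then downsamples in time by the conv stride.
    """
    mel_frames = int(window_seconds * sample_rate) // hop_length
    return mel_frames // conv_stride


def extract_audio_features(waveform, sample_rate: int = 16000,
                           model_id: str = "openai/whisper-large-v3"):
    """Return encoder hidden states for a mono waveform; no text input needed.

    `waveform` is assumed to be a 1-D float array sampled at `sample_rate`.
    """
    # Imported lazily so the helper above works without these packages.
    import torch
    from transformers import WhisperFeatureExtractor, WhisperModel

    feature_extractor = WhisperFeatureExtractor.from_pretrained(model_id)
    # Only the encoder half is used; the text decoder is never called.
    encoder = WhisperModel.from_pretrained(model_id).get_encoder().eval()

    # Converts raw audio to padded log-mel input features.
    inputs = feature_extractor(waveform, sampling_rate=sample_rate,
                               return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(inputs.input_features).last_hidden_state
    # Shape: (batch, time frames, hidden size)
    return hidden
```

Calling `extract_audio_features(audio_array)` on 16 kHz mono audio should give one feature vector per encoder frame; `whisper_encoder_frames()` gives the length of that time axis for the default padded window. To check whether the Qwen2-Audio encoder differs, the same features could be compared against those from the audio encoder inside the Qwen2-Audio checkpoint.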