Nkululeko could be multimodal if a transcript field is added to the audio files; linguistic feature extractors could then be added to the feature_sets.
Some datasets already include transcriptions (I skipped them so far because I did not think they would be needed). A transcription can be added as an additional column in the CSV or audformat data. If no transcription is available, we can use a Hugging Face model (such as Whisper) to generate transcripts during the pre-processing of each dataset. The "linguistic feature extractor" would then process the text in the transcript column (I propose "transcript" as the column header) to generate word embeddings (the linguistic features).
Using speech together with its transcription is useful for detecting degenerative conditions such as Alzheimer's.
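The pre-processing step proposed above could be sketched roughly as follows. This is only an illustration, not Nkululeko code: the `file` and `emotion` column names, the `add_transcripts` helper, and the `fake_transcribe` stand-in are all assumptions made for the example. In practice the `transcribe` callable would wrap a real speech recognizer such as Whisper.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List


def add_transcripts(
    rows: Iterable[Dict[str, str]],
    transcribe: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Return copies of the rows with a filled "transcript" column.

    Rows that already carry a transcription are kept as-is; the
    transcribe callable is only invoked when the value is missing or empty.
    """
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay untouched
        if not (new_row.get("transcript") or "").strip():
            # "file" is an assumed column name holding the audio path
            new_row["transcript"] = transcribe(new_row["file"])
        filled.append(new_row)
    return filled


def fake_transcribe(path: str) -> str:
    """Hypothetical stand-in for a real ASR model (e.g. Whisper)."""
    return f"<transcript of {path}>"


# Toy dataset: the first row already has a transcription, the second does not.
raw = "file,emotion,transcript\na.wav,happy,hello there\nb.wav,sad,\n"
rows = list(csv.DictReader(io.StringIO(raw)))
filled = add_transcripts(rows, fake_transcribe)

for row in filled:
    print(row["file"], "->", row["transcript"])
```

Passing the recognizer in as a callable keeps the dataset handling independent of any particular ASR backend, so existing transcriptions are reused and only the gaps are filled.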