A tool for using Llama3 to create an instance based on a (json-) schema from speech.

hampusnasstrom/speech-schema-filling

Speech to ELN Structured Data Entry

This repository contains a solution for the LLM Hackathon for Materials and Chemistry 2024. The project aims to facilitate the documentation of lab experiments using audio and structured data entries in an Electronic Lab Notebook (ELN) based on the NOMAD schema.

Overview

The solution uses speech recognition to transcribe audio recordings of lab experiments into text. The transcribed text is then processed and structured according to a JSON schema and written into an ELN entry in NOMAD. An example of a speech-generated entry in a NOMAD Oasis can be seen here. This allows for efficient, hands-free documentation of lab experiments.

Key Components

  • Speech Recognition: The speech_to_instance.py script uses the speech_recognition and whisper libraries to transcribe audio into text. The audio is captured using a microphone, and the transcription is done in real time.

  • Text Processing and Structuring: The transcribed text is processed and structured according to the NOMAD schema defined in nomad_schema.archive.yaml. The create_solution_entry function is used to create structured data entries for NOMAD.

  • ELN Entry: The structured data is written into an ELN as a JSON file. This is done in the main function.
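
The structuring and writing steps above can be illustrated with a minimal sketch. It assumes the LLM has already returned a plain dictionary. The schema, its field names (solute, solvent, concentration_mol_per_l), and the helper names check_instance and write_entry are hypothetical stand-ins: the actual schema is defined in nomad_schema.archive.yaml and is not reproduced here. The sketch uses only the standard library and is not the repository's implementation.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in schema; the real one lives in nomad_schema.archive.yaml.
SOLUTION_SCHEMA = {
    "type": "object",
    "properties": {
        "solute": {"type": "string"},
        "solvent": {"type": "string"},
        "concentration_mol_per_l": {"type": "number"},
    },
    "required": ["solute", "solvent"],
}

# Maps the JSON schema type names used above to Python types.
_PY_TYPES = {"string": str, "number": (int, float)}


def check_instance(instance: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the instance fits."""
    problems = []
    for name in schema.get("required", []):
        if name not in instance:
            problems.append(f"missing required field: {name}")
    for name, value in instance.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, _PY_TYPES[spec["type"]]):
            problems.append(f"wrong type for field: {name}")
    return problems


def write_entry(instance: dict, path: Path) -> None:
    """Validate the instance, then write it to a JSON file."""
    problems = check_instance(instance, SOLUTION_SCHEMA)
    if problems:
        raise ValueError("; ".join(problems))
    path.write_text(json.dumps(instance, indent=2), encoding="utf-8")


if __name__ == "__main__":
    entry = {"solute": "NaCl", "solvent": "water", "concentration_mol_per_l": 0.5}
    out = Path(tempfile.mkdtemp()) / "solution_entry.json"
    write_entry(entry, out)
    print(out.read_text(encoding="utf-8"))
```

Checking the instance before writing means a misheard or incomplete dictation raises an error and does not produce a malformed entry. A full JSON Schema validator would cover more cases than this hand-written check.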

Usage

To run the script, use the following command:

python speech_to_instance.py

The script will start recording audio and transcribing it into text. It will then process the text and write structured data entries into an ELN.

An example notebook demonstrating the conversion of the text extracted from the audio into a structured instance of the JSON schema is available in text_to_instance.ipynb.

Dependencies

The project depends on several Python libraries, including speech_recognition, whisper, langchain, pygame, gtts, and pydub. It is recommended to create a virtual environment first. Then the dependencies can be installed using the requirements file:

pip install -r requirements.txt

Additionally, make sure you have ffmpeg installed. On Windows, we recommend using the Chocolatey package manager to install ffmpeg.
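
To confirm that ffmpeg is reachable before running the script, a small check like the following can help. This is a sketch using only the Python standard library; the helper name find_missing_tools is hypothetical and not part of this repository.

```python
import shutil


def find_missing_tools(tools=("ffmpeg",)):
    """Return the names of required executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing executables:", ", ".join(missing))
    else:
        print("All required executables found.")
```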

Conclusion

This solution provides a hands-free and efficient way to document lab experiments and write structured data entries into an ELN. It is particularly useful in labs where manual documentation is cumbersome or impractical, for example when scientists need both hands in the glovebox while experimenting. Using a local instance of an LLM is important because it keeps the experimental data local and so protects the IP of the scientist. In this implementation, we used the Llama3:70b model served via Ollama, which preserves privacy while still offering an efficient solution for the text processing and structuring (via function calling).
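
To make the function-calling step more concrete, here is a minimal sketch of how a JSON schema can be wrapped as a tool definition and how the structured arguments can be read back from a model's tool call. It uses the widely adopted OpenAI-style tool layout and does not use langchain or contact any model server, so it is not the repository's implementation. The helper names schema_to_tool and extract_arguments, the example schema, and the simulated tool call are assumptions for illustration. Using create_solution_entry as the tool name is also an assumption.

```python
import json


def schema_to_tool(name: str, description: str, schema: dict) -> dict:
    """Wrap a JSON schema in the OpenAI-style function-calling tool layout."""
    return {
        "type": "function",
        "function": {"name": name, "description": description, "parameters": schema},
    }


def extract_arguments(tool_call: dict, expected_name: str) -> dict:
    """Pull the structured arguments out of one tool call returned by a model."""
    function = tool_call["function"]
    if function["name"] != expected_name:
        raise ValueError(f"unexpected function: {function['name']}")
    arguments = function["arguments"]
    # Some servers return the arguments as a JSON string, others as a dict.
    return json.loads(arguments) if isinstance(arguments, str) else arguments


if __name__ == "__main__":
    # Hypothetical schema; the real one is defined in nomad_schema.archive.yaml.
    schema = {
        "type": "object",
        "properties": {"solute": {"type": "string"}, "solvent": {"type": "string"}},
        "required": ["solute", "solvent"],
    }
    tool = schema_to_tool("create_solution_entry", "Record a prepared solution.", schema)
    # Simulated model output, standing in for a real response from the server.
    simulated_call = {
        "function": {
            "name": "create_solution_entry",
            "arguments": '{"solute": "NaCl", "solvent": "water"}',
        }
    }
    print(json.dumps(tool, indent=2))
    print(extract_arguments(simulated_call, "create_solution_entry"))
```

Because the schema is passed to the model as the tool's parameters, the model is steered toward returning exactly the fields the ELN entry needs and no free-form text.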
