Crokket

- The "I didn't come here to be a stenographer" ASR AI, powered by pretrained models from OpenAI

Requirements

Windows
Python 3.11
Venv (Should be included with the Python install, otherwise: pip install venv)
Cuda-enabled graphics card (Only necessary if you want it to be.. not slow.)
Cuda toolkit

Updates

If there has been an update, there's a possibility that some dependencies changed, if you're unsure you can run conda env update to update any dependency changes that has been made.

Installation

There are a couple of steps, but it should be a fairly quick job. This guide assumes you are at the root of the repository when running commands.

Install Python and CUDA Toolkit

winget install Python.Python.3.11
winget install Nvidia.CUDA

Create and activate your virtual environment

python -m venv .venv
.venv/Scripts/activate

Install the PIP requirements

python -m pip install -r requirements.txt

Getting started

We suggest using the terminal inside Visual Studio Code, but if you have a preferred way of command-line usage - go right ahead.

Take your audio file, (only .wav files are supported as of right now), and place it somewhere you can easily copy or remember the file path.

To transcribe the audio, use python main.py path/to/file.wav. This runs Crokket in direct invokation mode, skipping the interactive part and performing slightly better in terms of speed. (Be sure to use paths that look/something/like/this.wav and not\like\this.wav, there's no support for backslash paths yet) If you would prefer to use the graphical interface, just run python main.py and press enter on the message about using the GUI. The transcript with the same name as your audio file can be found under the data folder in this directory.

NOTE: If it is the first time you launch the program, it will download around 3 GB (assuming you're using the default medium model).

A somewhat modern consumer computer with a 30-series or newer graphics card is expected to transcribe in the ranges of 4-6 hours of audio per hour of runtime. The output is of course not perfect but it should save you some time.

ALWAYS CHECK YOUR OUTPUT AND MAKE THE NECESSARY CORRECTIONS

Best of luck.

Name		Name	Last commit message	Last commit date
Latest commit History 49 Commits
config		config
src		src
test		test
.gitattributes		.gitattributes
.gitignore		.gitignore
.whitesource		.whitesource
LICENSE		LICENSE
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Crokket

Requirements

Updates

Installation

Install Python and CUDA Toolkit

Create and activate your virtual environment

Install the PIP requirements

Getting started

About

Releases

Packages

Contributors 2

Languages

License

sondelll/crokket

Folders and files

Latest commit

History

Repository files navigation

Crokket

Requirements

Updates

Installation

Install Python and CUDA Toolkit

Create and activate your virtual environment

Install the PIP requirements

Getting started

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages