This is a plugin for the Godot game engine that allows the user to load and run a local Large Language Model (LLM) in-engine using the LLamaSharp (v0.13.0) C# library.
- Install CUDA Toolkit 12.1 if you haven't already (12.1 recommended for compatibility with other projects like Unsloth).
- Download the latest Mind Game release for your platform and run the executable
- Download a .gguf model of the Llama, Phi, or Mistral families
- Load your model and have fun!
Smallest well-performing model: Phi-3
Larger, high-quality model: Llama3-8B-Instruct
Another 7B model: Mistral-7B-Instruct-v0.2
- Install CUDA Toolkit 12.1 if you haven't already (12.1 recommended for compatibility with other projects like Unsloth).
- Download and extract Godot 4.3 (.NET version)
- Download and install .NET 8
- Clone or download this repo (or the most recent dev branch for the newest features) and open it with Godot 4.3 (.NET version)
- Go to Project > Project Settings > Plugins and tick Enabled for Mind Game. Then open the Autoload tab and make sure MindManager is enabled.
- Load a .gguf file of the Llama, Mistral, Mixtral, or Phi families to get going!
The lower the quantization level (the number after "Q"), the smaller the model and the less memory it needs, but at the cost of accuracy. Llama-3-8B-Instruct.Q4_K_M is a great middle ground for those with 8GB of VRAM. The absolute smallest model, Phi-3-mini-4k-instruct.IQ1_S.gguf, can run on less than 1GB of VRAM. Note: quantizations that start with 'I' run very slowly on CPU, so I recommend them for GPU inference only.
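To get a feel for how quantization affects size, the weights of a .gguf file can be estimated as parameter count times average bits per weight, divided by 8. The sketch below is a rough standalone estimate, not part of the plugin. The bits-per-weight figures and parameter counts are approximate values assumed for illustration, and `estimate_size_gb` is a hypothetical helper name.

```python
# Back-of-the-envelope GGUF size estimate: parameters * bits-per-weight / 8.
# The bits-per-weight values are approximate averages (assumed, not measured
# here); real files differ slightly because some tensors use other precisions.
APPROX_BITS_PER_WEIGHT = {
    "IQ1_S": 1.56,
    "Q4_K_M": 4.85,
    "Q8_0": 8.5,
}


def estimate_size_gb(n_params: float, quant: str) -> float:
    """Return the approximate size of the quantized weights in GB (10^9 bytes)."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return n_params * bits / 8 / 1e9


if __name__ == "__main__":
    # Parameter counts are rounded: ~8.03B for Llama-3-8B, ~3.82B for Phi-3-mini.
    examples = [
        ("Llama-3-8B-Instruct", 8.03e9, "Q4_K_M"),
        ("Phi-3-mini-4k-instruct", 3.82e9, "IQ1_S"),
    ]
    for name, n_params, quant in examples:
        size = estimate_size_gb(n_params, quant)
        print(f"{name} {quant}: ~{size:.2f} GB")
```

Under these assumptions the Llama-3-8B Q4_K_M weights come out a little under 5 GB and the Phi-3-mini IQ1_S weights under 1 GB. Actual VRAM use is higher than the weight size alone, because the context (KV cache) and runtime buffers also need memory.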
This plugin revolves around the MindManager autoload, which handles all backend model loading and is accessible from every scene. You never have to interact with the MindManager directly, because its configuration is handled on dedicated screens.
To send input to the model and receive its output, add a MindAgent node to your scene; it talks to the MindManager node. The MindAgent can emit the text it receives as a signal, so you can connect it to a Label3D, as in the example. By attaching a MindAgent to a CharacterBody3D (or anything else that moves), you can give the agent a body (the example scene is MindAgent3D). Once I transition to the BatchedExecutor, you will be able to hold n conversations simultaneously, limited only by your hardware.
- Make a singleton to be able to access the currently loaded model in-game (complete 0.2.0)
- Transition to BatchedExecutor (begun 0.3-dev)
- Add conversation save/load, forking, and rewinding (begun 0.3-dev)
- Add network graph generation (begun 0.3-dev)
- Implement LLaVa support, including live viewport analysis
- Make Download Manager functional
- Add project script crawling
- Expose LLamaSharp methods like quantization
- Integrate Kernel Memory for document ingestion
- MindManager is now an autoload
- Model configurations can be saved/loaded
- MindAgent nodes can be added in the inspector
- 3D chat example with MindAgent3D
- First release, model loading and chat enabled in engine bottom bar