Repositories list
14 repositories
konduktor (Public): cluster/scheduler health monitoring for GPU jobs on k8s
examples (Public)
torchtune (Public)
helm-charts (Public)
unsloth (Public)
llm-atc (Public archive)
training (Public)
vllm (Public)
airoboros (Public)
FastChat (Public)
RWKV-LM (Public): RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
nodify (Public archive)
trainy (Public)
dynolog (Public): Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the linux kernel, CPU, disks, Intel PT, GPUs etc. Dynolog also integrates with pytorch and can trigger traces for distributed training applications.