Releases: sunzeyeah/RLHF

v2.0

26 May 01:56

Pipelined implementation of SFT, reward model, and RLHF training, built on transformers, DeepSpeed, and DeepSpeed-Chat. Supported models: Pangu, GLM, ChatGLM.