
CatGPT: InstructGPT Replication Based on Chinese Dataset

| English | 中文 |

CatGPT is an open-source project that aims to replicate the PPO algorithm of InstructGPT and train it on a Chinese dataset. The name "CatGPT" is derived from "ChatGPT": dropping the "h" signals that the project is somewhat less "Helpful" and "Harmless" than ChatGPT. "Cat" also stands for "Concatenate," since the project combines several existing projects. The code, models, and data are all open source.

CatGPT is intended as a starting point for beginners who want to explore the InstructGPT training process. It ships with ready-to-use data, code, and models, so you can run the training stages described below (SFT, reward model, PPO) yourself and build an understanding of how they fit together.

Pretrained model: Bloomz-1b1

Features

  • Fully PPO-trained on a Chinese dataset
  • Open-source code for research and improvement
  • Open-source model for ease of use and deployment
  • Open-source data using a Chinese corpus

TODO

The following are the to-do items for the project. We will continue working to improve and perfect CatGPT:

  • Implement PPO-PTX algorithm: Due to trlx limitations, the current version of CatGPT only supports the PPO algorithm. We plan to add native support for the PPO-PTX algorithm in future versions to provide users with more options.
  • Implement LoRA: We plan to apply LoRA technology in future versions to train the CatGPT model more efficiently.
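
For reference, the two planned techniques can be written down compactly. The formulas below follow the InstructGPT and LoRA papers; they are not a description of how CatGPT will implement either item. Here r_θ is the reward model, π^SFT the supervised fine-tuned policy, β the KL penalty coefficient, and γ the pretraining-loss coefficient (γ = 0 recovers plain PPO). In the LoRA update, W_0 is the frozen pretrained weight and only the low-rank factors A and B are trained.

```latex
% PPO-ptx objective (InstructGPT): RL reward with a KL penalty, plus a pretraining term
\mathrm{objective}(\phi) =
  \mathbb{E}_{(x,y)\sim D_{\pi_\phi^{\mathrm{RL}}}}
    \left[ r_\theta(x,y)
      - \beta \log \frac{\pi_\phi^{\mathrm{RL}}(y \mid x)}{\pi^{\mathrm{SFT}}(y \mid x)} \right]
  + \gamma \, \mathbb{E}_{x \sim D_{\mathrm{pretrain}}}
    \left[ \log \pi_\phi^{\mathrm{RL}}(x) \right]

% LoRA: low-rank update added to a frozen weight matrix
h = W_0 x + \frac{\alpha}{r} B A x,
\qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)
```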

Before Training

# Clone the repository
git clone https://github.com/cauyxy/CatGPT.git

# Enter the project directory
cd CatGPT

# Create a virtual environment
conda create -n catgpt python=3.8

# Activate the virtual environment
conda activate catgpt

# Install trlx
git clone https://github.com/CarperAI/trlx.git
cd trlx
pip install torch==2.0.0 --extra-index-url https://download.pytorch.org/whl/cu116 # for cuda
pip install -e .
cd ../

# Install dependencies
pip install -r requirements.txt
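
Before starting a long training run, it can help to confirm that the main packages are importable in the active environment. The following is a minimal sketch, not part of the CatGPT repository: the helper name `check_environment` is hypothetical, and the module list is an assumption based on the tools invoked in this README (`torch`, `trlx`, `deepspeed`, `accelerate`).

```python
import importlib.util
import sys

# Import names assumed from the commands used in this README.
REQUIRED_MODULES = ("torch", "trlx", "deepspeed", "accelerate")


def check_environment(modules=REQUIRED_MODULES):
    """Return {module_name: bool}, True if the module can be located.

    Uses importlib.util.find_spec, so nothing heavy is actually imported.
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, found in check_environment().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Run it inside the activated `catgpt` environment; any module reported as missing points to an installation step that did not complete.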

Training Process

  1. Train SFT:

    cd sft/ && deepspeed train_sft.py

    Checkpoint: SFT

  2. Train Reward Model:

    cd reward_model/ && deepspeed train_rm.py

    Download reward model checkpoint:

    mkdir reward_model/rm_checkpoint
    wget https://huggingface.co/xinyu66/catgpt-rewardmodel/resolve/main/pytorch_model.bin -O reward_model/rm_checkpoint/pytorch_model.bin
    
  3. PPO training:

    mkdir ppo
    accelerate launch --config_file configs/default_accelerate_config.yaml trlx_ppo.py

    Checkpoint: PPO

    🩹 Warning: This training configuration requires at least 55GB of VRAM and is set up to use 8 GPUs. Decrease batch_size if you run out of memory.
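
To make steps 2 and 3 more concrete, here is a minimal scalar sketch of the two objectives usually associated with them: the pairwise ranking loss used to train an InstructGPT-style reward model, and the clipped surrogate objective from PPO. This is an illustration under assumptions, not the code in `train_rm.py` or trlx; the function names and the default `clip_eps=0.2` are hypothetical choices for the example.

```python
import math


def pairwise_reward_loss(r_chosen: float, r_rejected: float) -> float:
    """Ranking loss -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the preferred response scores higher
    than the rejected one, and large when the order is reversed.
    """
    diff = r_chosen - r_rejected
    # -log(sigmoid(d)) == log(1 + exp(-d)); branch for numerical stability.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def ppo_clipped_objective(logp_new: float, logp_old: float,
                          advantage: float, clip_eps: float = 0.2) -> float:
    """PPO clipped surrogate for a single action (to be maximized).

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
    Clipping the ratio to [1 - eps, 1 + eps] limits how far one update
    can move the policy away from the one that generated the samples.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    print(pairwise_reward_loss(2.0, 0.5))             # preferred response ranked higher
    print(ppo_clipped_objective(-1.0, -1.5, 1.0))     # ratio above 1 + eps gets clipped
```

In actual training these quantities are computed per token over batches of tensors and averaged; the scalar form only shows the shape of each objective.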

Results

Below are some example results generated using CatGPT. These images showcase the performance of the generated text under various input conditions.

Sample 1: Advice on learning methods

Sample 2: Advice on comforting a friend

Sample 3: Trend star question

Sample 4: Python program (needs fix)

Acknowledgments

We would like to thank everyone who has contributed to this project, including but not limited to those who have submitted code, reported issues, and provided ideas. Special thanks to the following projects and their teams, whose research and achievements have provided us with valuable inspiration and technical support:

Thanks to these excellent projects, we were able to build CatGPT on their foundation and contribute to the Chinese NLP field.

Join Us

We warmly welcome you to join the CatGPT project! Your contribution will have a profound impact on the project. You can participate in the project in the following ways:

  • Submit code: Optimize the model structure, improve algorithm implementation, etc.
  • Provide high-quality datasets: Provide high-quality Chinese datasets to improve model performance.
  • Report issues: Submit any problems you encounter, or suggestions you have, in the Issues section.
  • Improve documentation: Help us improve project documentation to make it more understandable and user-friendly.

If you are interested in contributing to the project, please open a Pull Request. We look forward to your participation and working together to make CatGPT even stronger!
