This repository contains the implementation of a GPT-based model for text generation, including the training and inference pipeline. The project is built using PyTorch and includes custom dataset preparation, the model architecture, and utilities for training and testing the model.
- Clone the repository:

  `git clone https://github.com/dykyivladk1/GPT-X.git && cd GPT-X`

- Install the required dependencies:

  `pip install -r requirements.txt`

  Ensure that PyTorch is installed. You can install it by following the official PyTorch installation guide.
- Prepare your dataset (text file):

  Place your text data in a `.txt` file. The default path is `./data/sample.txt`.
To train the model, use the following command:
`python train.py --data_path /path/to/your/dataset.txt --max_iters 10000 --block_size 128 --batch_size 64 --learning_rate 3e-4 --betas 0.9 0.95 --weight_decay 0.1 --grad_norm_clip 1.0 --mode training`
Arguments:
- `--data_path`: Path to the text file containing training data.
- `--max_iters`: Maximum number of training iterations.
- `--block_size`: Length of the text sequence to process.
- `--batch_size`: Number of samples per batch.
- `--learning_rate`: Learning rate for the optimizer.
- `--betas`: Betas for the Adam optimizer.
- `--weight_decay`: Weight decay for regularization.
- `--grad_norm_clip`: Maximum norm for gradient clipping.
- `--mode`: Mode of operation; set to `training` for training.
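These flags typically map onto a standard AdamW-plus-gradient-clipping training loop. The sketch below is a minimal illustration of that wiring, not the actual code in `train.py`; the model, batch, and loss are placeholders:

```python
import torch

# Placeholders standing in for the real GPT-X model and data pipeline.
model = torch.nn.Linear(128, 128)
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=3e-4,              # --learning_rate
    betas=(0.9, 0.95),    # --betas
    weight_decay=0.1,     # --weight_decay
)

for step in range(10_000):                # --max_iters
    x = torch.randn(64, 128)              # dummy batch: --batch_size x --block_size
    loss = model(x).pow(2).mean()         # placeholder loss

    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    # --grad_norm_clip: cap the global gradient norm before the optimizer step
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
```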
To test the model and generate text, use:
`python train.py --mode test --context "Sample Input" --weights_path weights/model.pth`
Arguments:
- `--mode`: Set to `test` for text generation.
- `--context`: Initial text to start the generation.
- `--weights_path`: Path to the model weights file.
- `--max_tokens`: Maximum number of tokens to generate.
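In test mode the model generates text autoregressively: the `--context` string is encoded to indices, fed through the model, and new tokens are sampled one at a time until `--max_tokens` is reached. The function below is a hedged character-level sketch of that loop; the `model`, `encode`, and `decode` interfaces are assumptions, not the repository's exact API:

```python
import torch

def generate(model, encode, decode, context, max_tokens, block_size=128):
    """Autoregressive sampling from a character-level language model (illustrative)."""
    idx = torch.tensor([encode(context)], dtype=torch.long)   # shape (1, T)
    for _ in range(max_tokens):
        idx_cond = idx[:, -block_size:]            # crop to the model's context window
        logits = model(idx_cond)                   # assumed shape (1, T, vocab_size)
        probs = torch.softmax(logits[:, -1, :], dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)     # append the sampled token and repeat
    return decode(idx[0].tolist())
```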
GPT-X/
├── src/
│ ├── dataset.py
│ ├── encoder_.py
│ ├── main.py
│ ├── model.py
│ └── utils.py
├── weights/
│   └── model.pth
└── README.md
The GPT-X model is a simplified version of the GPT architecture, designed to generate text based on a given context. It includes the following components:
- Multi-head Self-Attention
- Positional Encoding
- Feedforward Layers
- Layer Normalization
The model is built with flexibility in mind, allowing easy adjustments to the number of layers, heads, and embedding size.
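As a point of reference, one such block could be assembled roughly as follows. This is a hedged sketch of a standard pre-norm GPT block using built-in PyTorch layers, not necessarily the exact layout in `model.py`; `n_embd` and `n_head` stand for the adjustable embedding size and head count mentioned above, and the causal mask is omitted for brevity:

```python
import torch.nn as nn

class Block(nn.Module):
    """One pre-norm transformer block: LayerNorm -> self-attention -> LayerNorm -> MLP."""
    def __init__(self, n_embd=256, n_head=8):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        # Self-attention with a residual connection
        a = self.ln1(x)
        attn_out, _ = self.attn(a, a, a, need_weights=False)
        x = x + attn_out
        # Position-wise feedforward with a residual connection
        x = x + self.mlp(self.ln2(x))
        return x
```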
The dataset is a simple text file from which the model learns to predict the next character in a sequence based on the previous characters. The dataset is tokenized and prepared using the `GPT_XDataset` class.
Dataset Class:
- `GPT_XDataset`: Prepares sequences of text for training the GPT-X model. It converts characters into indices and creates input-target pairs for training.
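For intuition, a character-level dataset of this kind can be sketched as below. This is illustrative only; the actual `GPT_XDataset` in `src/dataset.py` may differ in naming and details:

```python
import torch
from torch.utils.data import Dataset

class CharDataset(Dataset):
    """Illustrative character-level dataset: chars -> indices, yielding (input, target) pairs."""
    def __init__(self, text, block_size=128):
        chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(chars)}    # char -> index
        self.itos = {i: ch for ch, i in self.stoi.items()}   # index -> char
        self.block_size = block_size
        self.data = torch.tensor([self.stoi[c] for c in text], dtype=torch.long)

    def __len__(self):
        return len(self.data) - self.block_size

    def __getitem__(self, i):
        chunk = self.data[i : i + self.block_size + 1]
        # Input is the sequence, target is the same sequence shifted one character ahead
        return chunk[:-1], chunk[1:]
```

Wrapped in a `DataLoader`, each batch then yields input and target tensors of shape `(batch_size, block_size)`.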
Contributions are welcome! Please open an issue or a pull request for any improvements or suggestions.
- Fork the repository.
- Create a new branch for your feature/bugfix.
- Commit your changes.
- Open a pull request.