A seq2seq model that integrates a knowledge graph to assist the generation process, using a static graph attention mechanism.

Leadlegend/Commonsense-based-Question-Generation

Commonsense-based Neural Question Generation Model

This is a seq2seq question generation model. It builds on this repository for the basic data interface and the evaluation code. The model was modified to integrate external information from a knowledge graph to assist decoding, which gave better test results.

Dependencies

To train or test our model, you should install the following Python Packages:

  • python >= 3.7
  • pytorch >= 1.5
  • nltk (nltk_data files are also needed)
  • tqdm
  • pytorch_scatter
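
A quick way to confirm the environment before training is to check that each dependency can be imported. The sketch below is not part of the repository. It assumes the import names `torch` (for pytorch) and `torch_scatter` (for pytorch_scatter), and the helper name `missing_modules` is hypothetical.

```python
import importlib.util
import sys

# Import names assumed for the packages listed above
# (pytorch -> torch, pytorch_scatter -> torch_scatter).
REQUIRED_MODULES = ["torch", "nltk", "tqdm", "torch_scatter"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if sys.version_info < (3, 7):
    print("Python 3.7 or newer is required, found", sys.version.split()[0])

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

This only checks that the modules can be located; it does not verify versions or that the nltk_data files are present.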

Data Preprocess

The knowledge graph data has already been preprocessed; the original KG data is included in ./data/resource.json.

Because of the corpus size, we cannot provide the SQuAD data on GitHub, but you can download the corpus and the GloVe embeddings as follows:

mkdir squad
wget http://nlp.stanford.edu/data/glove.840B.300d.zip -O ./data/glove.840B.300d.zip 
unzip ./data/glove.840B.300d.zip 
wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json -O ./squad/train-v1.1.json
wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json -O ./squad/dev-v1.1.json
cd data
python process_data.py
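
After downloading, it can help to sanity-check the SQuAD files before running the preprocessing. The sketch below walks the standard SQuAD v1.1 JSON layout (`data` → `paragraphs` → `qas` → `answers`) and yields (context, question, answer) triples. It is only an illustration and makes no claim about what process_data.py does internally; the function name `iter_squad_examples` and the `SAMPLE` record are hypothetical.

```python
import json


def iter_squad_examples(squad):
    """Yield (context, question, answer_text) triples from SQuAD v1.1 data."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                # A question can carry more than one reference answer.
                for answer in qa["answers"]:
                    yield context, qa["question"], answer["text"]


# Tiny made-up record in the SQuAD v1.1 layout, used here instead of the real file.
SAMPLE = {
    "version": "1.1",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "Paris is the capital of France.",
            "qas": [{
                "id": "q1",
                "question": "What is the capital of France?",
                "answers": [{"text": "Paris", "answer_start": 0}],
            }],
        }],
    }],
}

for context, question, answer in iter_squad_examples(SAMPLE):
    print(question, "->", answer)

# For the downloaded corpus, load the file first, for example:
#   with open("./squad/train-v1.1.json", encoding="utf-8") as f:
#       squad = json.load(f)
```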

Configuration

You may need to change the model configuration in ./config.py.
If you want to train on a GPU, set the GPU device in config.py. Other model configurations and hyper-parameters can also be customized there.
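
The option names inside config.py are not shown here, so the following is only a sketch of how a GPU device setting is commonly resolved in PyTorch, falling back to the CPU when the requested GPU is unavailable. The function `pick_device` and its `gpu_index` parameter are hypothetical and do not come from this repository.

```python
def pick_device(gpu_index=0):
    """Return 'cuda:<gpu_index>' when that GPU is usable, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is nothing to select.
    if torch.cuda.is_available() and 0 <= gpu_index < torch.cuda.device_count():
        return f"cuda:{gpu_index}"
    return "cpu"


print("selected device:", pick_device(0))
```

The returned string can be passed to `torch.device(...)` or to `.to(...)` on a model or tensor.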

Usage

To train the model, use the following command:

python main.py --train (--model_path=<your_model_savepoint_path>)

The --model_path parameter is optional. To train from scratch, use python main.py --train

Whenever the model achieves the best development-set result so far in the current training run, its parameters are saved to ./save/train_<timestamp>/<epoch_number>_<dev_loss>
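
Since each saved file name encodes the epoch and the development loss, the best checkpoint of a run can be located by parsing the names. The sketch below assumes the name is exactly an integer epoch, an underscore, and a decimal loss (for example `12_3.45`) with no file extension; the real naming may differ. The helper names `parse_checkpoint_name` and `best_checkpoint` are hypothetical.

```python
import os


def parse_checkpoint_name(name):
    """Split '<epoch_number>_<dev_loss>' into (epoch, dev_loss), or return None."""
    epoch_text, sep, loss_text = name.partition("_")
    if not sep:
        return None
    try:
        return int(epoch_text), float(loss_text)
    except ValueError:
        return None


def best_checkpoint(run_dir):
    """Return the path of the checkpoint with the lowest dev loss, or None."""
    candidates = []
    for name in os.listdir(run_dir):
        parsed = parse_checkpoint_name(name)
        if parsed is not None:
            epoch, dev_loss = parsed
            # Sort key: lowest loss first; on ties, prefer the later epoch.
            candidates.append((dev_loss, -epoch, os.path.join(run_dir, name)))
    if not candidates:
        return None
    return min(candidates)[2]
```

The returned path can then be passed as the value of --model_path.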

To test the model, use the following command:

python main.py --model_path=<your_model_paras_path> --output_file=<output_file_path>

To evaluate, use the scripts taken from this repository:

cd qgevalcap
python2 eval.py --out_file <prediction_file> --src_file <src_file> --tgt_file <target_file>

Current Best Results

BLEU_1  BLEU_2  BLEU_3  BLEU_4
46.30   30.85   22.76   17.47
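
For readers unfamiliar with the metric, BLEU_n is commonly reported as the cumulative score over 1-grams through n-grams: the geometric mean of clipped n-gram precisions, multiplied by a brevity penalty, on a 0 to 100 scale. The sketch below is a simplified pure-Python version with one reference per hypothesis and no smoothing. It is not the qgevalcap script, whose tokenization and details may differ, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n):
    """Corpus-level BLEU (0-100), uniform weights up to max_n, one reference each."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clipping: credit an n-gram at most as often as it occurs in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # No smoothing: any zero precision gives a zero score.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


reference = "what is the capital of france".split()
hypothesis = "what is the capital".split()
for n in range(1, 5):
    print(f"BLEU_{n}:", round(corpus_bleu([reference], [hypothesis], n), 2))
```

Because higher-order n-grams are harder to match, the cumulative score typically decreases from BLEU_1 to BLEU_4, which is the pattern in the results above.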
