Commonsense-based Neural Question Generation Model

This is a seq2seq question generation model. The basic data interface and the evaluation code are based on this repository. We modified the model so that it integrates external information from a knowledge graph to assist decoding, which gave us better test results.

Dependencies

To train or test the model, you need Python and the following packages:

python >= 3.7
pytorch >= 1.5
nltk (nltk_data files are also needed)
tqdm
pytorch_scatter
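
A quick way to confirm the environment is to check that each package can be found by the interpreter. The sketch below is a minimal, non-authoritative helper: the function name find_missing is ours, and it assumes the import names torch (for pytorch) and torch_scatter (for pytorch_scatter). It does not check versions or which nltk_data files are present.

```python
import importlib.util

# Import names for the packages listed above. pytorch is imported as
# "torch" and pytorch_scatter as "torch_scatter" (assumed import names).
REQUIRED_MODULES = ["torch", "nltk", "tqdm", "torch_scatter"]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages can be located.")
```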

Data Preprocess

The knowledge graph data has already been processed. The original KG data is included in ./data/resource.json

Due to the corpus size, we cannot provide the SQuAD data on GitHub, but you can download the corpus as follows:

mkdir squad
wget http://nlp.stanford.edu/data/glove.840B.300d.zip -O ./data/glove.840B.300d.zip 
unzip ./data/glove.840B.300d.zip 
wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json -O ./squad/train-v1.1.json
wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json -O ./squad/dev-v1.1.json
cd data
python process_data.py
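
Before running process_data.py, it can help to confirm that the downloaded SQuAD files are complete and parse correctly. The following is a minimal sketch and not part of this repository: count_squad and count_squad_file are hypothetical helpers that walk the SQuAD v1.1 JSON layout (data, paragraphs, qas) and count articles, paragraphs, and questions.

```python
import json


def count_squad(dataset):
    """Count articles, paragraphs and questions in a SQuAD v1.1 dict."""
    articles = dataset["data"]
    paragraphs = [p for a in articles for p in a["paragraphs"]]
    questions = [q for p in paragraphs for q in p["qas"]]
    return len(articles), len(paragraphs), len(questions)


def count_squad_file(path):
    """Load a SQuAD JSON file from disk and count its contents."""
    with open(path, encoding="utf-8") as f:
        return count_squad(json.load(f))


# Tiny in-memory example with the same nesting as the real files.
sample = {
    "version": "1.1",
    "data": [
        {
            "title": "Example",
            "paragraphs": [
                {"context": "A short passage.", "qas": [{"id": "q1"}, {"id": "q2"}]},
                {"context": "Another passage.", "qas": [{"id": "q3"}]},
            ],
        }
    ],
}
print(count_squad(sample))
```

For the files downloaded above, the call would be count_squad_file("./squad/train-v1.1.json"), run from the directory where the squad folder was created.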

Configuration

You might need to change the model configuration in ./config.py.
If you want to train on a GPU, set the GPU device in config.py. Other model configurations and hyperparameters can also be customized there.
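
As an illustration only, a GPU setting of this kind usually reduces to choosing a device string. The names below (GPU_ID, pick_device) are hypothetical and are not taken from config.py, so check the actual variable names in that file before changing anything.

```python
def pick_device(gpu_id, cuda_available):
    """Return a PyTorch-style device string.

    gpu_id: index of the GPU to use, or None to force the CPU.
    cuda_available: whether CUDA can be used on this machine.
    """
    if gpu_id is None or not cuda_available:
        return "cpu"
    return f"cuda:{gpu_id}"


# Hypothetical setting; the real option in config.py may be named differently.
GPU_ID = 0
print(pick_device(GPU_ID, cuda_available=False))
```

With PyTorch installed, torch.cuda.is_available() can be passed as the second argument, and the resulting string can be given to torch.device.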

Usage

To train the model, use the following command:

python main.py --train [--model_path=<your_model_savepoint_path>]

The --model_path parameter is optional. To train from scratch, omit it and run python main.py --train

Whenever the model achieves the best development set result of the current training run, its parameters are saved to ./save/train_<timestamp>/<epoch_number>_<dev_loss>
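
Because the save path encodes the epoch and the development loss, the best checkpoint of a run can be picked from the file names alone. The sketch below assumes names of the form <epoch_number>_<dev_loss> with an integer epoch and a decimal loss (for example 12_3.45) and no file extension. The exact formatting used by the trainer is not documented here, so adjust the parsing if it differs. All three functions are hypothetical helpers, not part of this repository.

```python
import os


def parse_checkpoint_name(name):
    """Split '<epoch_number>_<dev_loss>' into (epoch, dev_loss)."""
    epoch, loss = name.split("_", 1)
    return int(epoch), float(loss)


def best_checkpoint(names):
    """Return the name with the lowest dev loss, or None if none parse."""
    parsed = []
    for name in names:
        try:
            epoch, loss = parse_checkpoint_name(name)
        except ValueError:
            continue  # skip entries that do not follow the pattern
        parsed.append((loss, epoch, name))
    if not parsed:
        return None
    return min(parsed)[2]


def best_checkpoint_in(run_dir):
    """Apply best_checkpoint to the entries of one run directory."""
    name = best_checkpoint(os.listdir(run_dir))
    return None if name is None else os.path.join(run_dir, name)
```

The returned path can then be passed to --model_path when testing.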

To test the model, use the following command:

python main.py --model_path=<your_model_paras_path> --output_file=<output_file_path>

Evaluation from this repository

cd qgevalcap
python2 eval.py --out_file <prediction_file> --src_file <src_file> --tgt_file <target_file>

Current Best Results

BLEU_1	BLEU_2	BLEU_3	BLEU_4
46.30	30.85	22.76	17.47
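
For readers unfamiliar with the metric, BLEU is built on clipped (modified) n-gram precision. The sketch below computes that precision for a single tokenized sentence pair. It is an illustration only and is not the qgevalcap implementation: full BLEU additionally combines the precisions for several n-gram orders over the whole corpus and applies a brevity penalty. The function names are ours.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


candidate = "the the the the the the the".split()
reference = "the cat is on the mat".split()
# Only 2 of the 7 candidate unigrams are credited after clipping.
print(clipped_precision(candidate, reference, 1))
```

Clipping is what stops a degenerate output that repeats a common word from scoring well.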
