This repo contains a simplified question generator model pipeline. It is part of the Quizzy project, where you can see the model in action. Questions are generated by finetuning a T5 transformer on the SQuAD and SciQ datasets, with PyTorch Lightning used for the finetuning.

SQuAD is a reading comprehension dataset, so the model trained on it is used for general-purpose question generation. SciQ contains science exam questions with context, so the model trained on it is used specifically for physics, chemistry and biology question generation in the Quizzy application.

`Context`, `Question` and `Answer` are extracted from the datasets. `Context` and `Answer` are given to the model as input in order to generate the `Question`, using the following format:
"context: {context} answer: {answer}"
- Python 3.7 or newer.
All dataset links are available in the notebook. The SciQ dataset contains physics, chemistry and biology exam questions with supporting context, while the SQuAD dataset is a reading comprehension dataset. Choose the dataset that suits your use case.
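Both datasets have to be reduced to the same context, question and answer fields before training. The following is a minimal sketch of that step, not the code from `data_extraction.ipynb`. It assumes the field names used by the Hugging Face versions of the datasets (`context`, `question`, `answers` for SQuAD and `support`, `question`, `correct_answer` for SciQ); the helper names and sample records are hypothetical, and the notebook may read the data differently.

```python
from typing import Dict, List

Triple = Dict[str, str]


def squad_to_triples(record: dict) -> List[Triple]:
    """Return one triple per reference answer in a SQuAD-style record."""
    return [
        {
            "context": record["context"],
            "question": record["question"],
            "answer": answer_text,
        }
        for answer_text in record["answers"]["text"]
    ]


def sciq_to_triples(record: dict) -> List[Triple]:
    """Return a triple for a SciQ-style record, or nothing if it has no context."""
    # Assumption: rows with an empty supporting paragraph are not useful
    # for context-based question generation, so they are skipped.
    if not record["support"].strip():
        return []
    return [
        {
            "context": record["support"],
            "question": record["question"],
            "answer": record["correct_answer"],
        }
    ]


# Hypothetical sample records, for illustration only.
squad_record = {
    "context": "The Nile flows north into the Mediterranean Sea.",
    "question": "Into which sea does the Nile flow?",
    "answers": {"text": ["the Mediterranean Sea"], "answer_start": [26]},
}
sciq_record = {
    "support": "Mitochondria produce most of the ATP used by the cell.",
    "question": "Which organelle produces most of the cell's ATP?",
    "correct_answer": "mitochondria",
}

triples = squad_to_triples(squad_record) + sciq_to_triples(sciq_record)
```

Keeping both datasets in one shared shape means the rest of the pipeline does not need to know which dataset a record came from.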
- Export the training dataset from `data_extraction.ipynb`.
- Run `train.ipynb` to start training.
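As a rough illustration of what one training example looks like, the sketch below turns a context, question and answer record into a source string and a target string using the input format `context: {context} answer: {answer}`. The function name, the whitespace stripping, and the choice of the bare question text as the target are assumptions made for this example; the notebooks may prepare the data differently.

```python
from typing import Dict, Tuple

# Input template described in this README.
INPUT_TEMPLATE = "context: {context} answer: {answer}"


def build_example(triple: Dict[str, str]) -> Tuple[str, str]:
    """Build a (source, target) text pair for sequence-to-sequence finetuning."""
    # Assumption: surrounding whitespace is stripped before formatting.
    source = INPUT_TEMPLATE.format(
        context=triple["context"].strip(),
        answer=triple["answer"].strip(),
    )
    # Assumption: the target is the question text with no extra prefix.
    target = triple["question"].strip()
    return source, target


# Hypothetical record, for illustration only.
example_triple = {
    "context": "Water boils at 100 degrees Celsius at sea level.",
    "question": "At what temperature does water boil at sea level?",
    "answer": "100 degrees Celsius",
}

source_text, target_text = build_example(example_triple)
```

The source string is what the tokenizer would receive as model input, and the target string is what the model would be trained to generate.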
Skip the ONNX conversion and quantization steps if you are using a GPU for inference. fastT5 is used to convert the PyTorch model to ONNX, and it currently supports only the CPU version of onnxruntime.
👤 Karthick T. Sharma
- GitHub: @Karthick47v2
- LinkedIn: @Karthick47
@inproceedings{rajpurkar-etal-2016-squad,
title = "{SQ}u{AD}: 100,000+ Questions for Machine Comprehension of Text",
author = "Rajpurkar, Pranav and
Zhang, Jian and
Lopyrev, Konstantin and
Liang, Percy",
booktitle = "Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2016",
address = "Austin, Texas",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/D16-1264",
doi = "10.18653/v1/D16-1264",
pages = "2383--2392",
}
@inproceedings{SciQ,
title={Crowdsourcing Multiple Choice Science Questions},
author={Johannes Welbl and Nelson F. Liu and Matt Gardner},
year={2017},
journal={arXiv:1707.06209v1}
}
Contributions, issues and feature requests are welcome!
Feel free to check the issues page.
Give a ⭐️ if this project helped you!