PPP-QuestionParsing-Grammatical

Question Parsing module for the PPP using a grammatical approach.

Introduction

The purpose of this Question Parsing module is to transform questions into trees of triples, as described in our data model, which can then be handled by backend modules.

For instance, we aim to produce the triple (Rwanda, president, ?) from the question "Who is the president of Rwanda?". The formal representation of this triple is:

{
    "type": "triple",
    "subject": {
        "value": "Rwanda",
        "type": "resource"
    },
    "predicate": {
        "value": "president",
        "type": "resource"
    },
    "object": {
        "type": "missing"
    }
}
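
To make the structure above concrete, here is a minimal Python sketch that builds the same triple as a plain dictionary and serializes it to JSON. The helper names (`resource`, `missing`, `triple`) are hypothetical and are not part of this module's API; they only mirror the field names shown in the JSON above. The nested example at the end assumes that a triple may stand in for a subject, which is one reading of "trees of triples".

```python
import json


def resource(value):
    """Build a resource node (hypothetical helper, mirrors the JSON above)."""
    return {"value": value, "type": "resource"}


def missing():
    """Build a node for the part of the triple that the question asks for."""
    return {"type": "missing"}


def triple(subject, predicate, obj):
    """Build a triple node from three already-built nodes."""
    return {
        "type": "triple",
        "subject": subject,
        "predicate": predicate,
        "object": obj,
    }


# "Who is the president of Rwanda?" -> (Rwanda, president, ?)
question_tree = triple(resource("Rwanda"), resource("president"), missing())

# Assumption: a triple can be nested in place of a subject to form a tree,
# e.g. (((Rwanda, president, ?), birth date, ?)).
nested_tree = triple(question_tree, resource("birth date"), missing())

if __name__ == "__main__":
    print(json.dumps(question_tree, indent=4))
```

Plain dictionaries keep the sketch independent of any library; the actual data model classes used by the project may differ.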

See our website to learn more about the project and test our question answering tool.

How to install

Stanford CoreNLP

Run the script dependencies.sh to install and launch the CoreNLP server.

QuestionParsing module

With a recent version of pip:

pip3 install git+https://github.com/ProjetPP/PPP-QuestionParsing-Grammatical.git

With an older one:

git clone https://github.com/ProjetPP/PPP-QuestionParsing-Grammatical.git
cd PPP-QuestionParsing-Grammatical
python3 setup.py install

Use the --user option if you want to install it only for the current user.
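
After either installation route, a quick way to check that the package is visible to your interpreter is sketched below. It assumes the installed package name matches the folder `ppp_questionparsing_grammatical/` listed in the overview; the helper `is_installed` is hypothetical and only wraps the standard library.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumed package name, taken from the main source folder of the repository.
    name = "ppp_questionparsing_grammatical"
    status = "found" if is_installed(name) else "not found"
    print(f"{name}: {status}")
```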

You can install the main dependencies (CoreNLP in particular) with the bootstrap_corenlp.sh script from https://github.com/ProjetPP/Scripts: clone the Scripts repository and run bash bootstrap_corenlp.sh.

How to test

The demo folder contains some demo files and a README.md that explains how to use them.

Overview of the main folders

  • ppp_questionparsing_grammatical/: main code of the project
  • demo/: demo files to test the algorithms
  • nounification/: scripts used to compute the nounification database. See its README to help us improve this part of the algorithm!
  • tests/: unit tests of the project
  • documentation/: files describing our current thinking on the project (mainly drafts)