Currently, we are using SpaCy for low-level NLP tasks (tokenization, sentence segmentation, POS tagging, and parsing). However, these models were trained on general-domain text.
The folks at AllenNLP have released SciSpaCy, a SpaCy model trained on biomedical text. We should check whether this model improves performance and, if so, switch to it. This would also allow us to drop our custom tokenizer (less code).
Preliminary results look good, with SciSpaCy also appearing to boost the performance of coreference resolution (NeuralCoref relies on the underlying SpaCy model for preprocessing).