Releases: ThilinaRajapakse/simpletransformers
Multiprocessing Support for `QuestionAnsweringModel`
Added
- Added multiprocessing support for Question Answering tasks, giving a substantial performance boost for CPU-bound tasks (e.g. prediction, especially with long contexts)
- Added `multiprocessing_chunksize` (default 500) to `global_args` for finer control over chunking. Usually, the optimal value will be (roughly) `number of examples / process count`.
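The chunksize heuristic above can be sketched in plain Python. Note that `suggested_chunksize` is a hypothetical helper used for illustration, not part of the library:

```python
import multiprocessing

def suggested_chunksize(num_examples, process_count=None):
    """Hypothetical helper: the optimal chunksize is roughly
    number of examples / process count, with a floor of 1."""
    if process_count is None:
        process_count = multiprocessing.cpu_count()
    return max(1, num_examples // process_count)

# e.g. 4000 prediction examples split across 8 worker processes
print(suggested_chunksize(4000, 8))  # 500
```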
Fixed
- Fixed bug in `NERModel.predict()` method when `split_on_space=False`. @alexysdussier
Option to disable model saving
Added
- Added `no_save` option to model `args`. Setting this to `True` will prevent models from being saved to disk.
- Added minimal training script for `Seq2Seq` models in the examples directory.
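A minimal sketch of wiring `no_save` into a model's `args` dict. The model construction line is commented out because it downloads pretrained weights; the second key is illustrative:

```python
# from simpletransformers.classification import ClassificationModel

model_args = {
    "no_save": True,  # keep the model in memory only; nothing is written to disk
    "overwrite_output_dir": True,
}

# model = ClassificationModel("bert", "bert-base-cased", args=model_args)
```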
Docs update and Bug fixes
Fixed
- Fixed potential bugs in loading weights when fine-tuning an ELECTRA language model. Fine-tuning an ELECTRA language model now requires both `model_name` and `model_type` to be set to `electra`.
- Bug fix for generic `Seq2SeqModel`
- Bug fix when training language models from scratch
Changed
- Updated `Seq2SeqModel` to use `MarianTokenizer` with MarianMT models. @flozi00
Sequence-to-Sequence task support added
Added
- Sequence-to-Sequence task support added. This includes the following models:
- BART
- Marian
- Generic Encoder-Decoder
- The `args` dict of a task-specific Simple Transformers model is now saved along with the model. When loading the model, these values will be read and used. Any new `args` passed into the model initialization will override the loaded values.
Improvements and Bug Fixes
Added
- Support for `AutoModel` in NER, QA, and Language Modeling. @flozi00
Fixed
- The `NERModel.predict()` method now also returns `model_outputs`: a Python list of lists, with dicts mapping each word to its raw model output. @flaviussn
- Fixed T5 `lm_labels` not being masked properly
- Fixed issue with custom evaluation metrics not being handled correctly in `MultiLabelClassificationModel`. @galtay
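The nested shape of the new NER `model_outputs` return value can be illustrated as follows; the words and scores here are made up for the sake of the example:

```python
# One inner list per input sequence; each dict maps a word to the
# raw model output (one score per label) for that word.
model_outputs = [
    [
        {"Simple": [0.1, 0.9, 0.0]},
        {"Transformers": [0.8, 0.1, 0.1]},
    ],
]

print(len(model_outputs[0]))  # 2 words in the first (and only) sequence
```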
Changed
- Torchvision import is now optional. It only needs to be installed if MultiModal models are used.
- Pillow import is now optional. It only needs to be installed if MultiModal models are used.
T5 Model Added
Added
- Added support for T5 Model.
- Added `do_sample` arg to language generation.
- `NERModel.predict()` now accepts an optional `split_on_space` argument. If set to `False`, `to_predict` must be a list of lists, with each inner list being a list of strings consisting of the split sequences. The outer list is the list of sequences to predict on.
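The `split_on_space=False` input format described above, sketched with made-up sequences. The `model.predict` call is commented out since it requires a trained `NERModel`:

```python
# Each sequence is pre-split into its tokens, so to_predict is a
# list of lists of strings rather than a list of strings.
to_predict = [
    ["Simple", "Transformers", "supports", "NER"],
    ["ELECTRA", "models", "were", "added", "recently"],
]

# predictions, raw_outputs = model.predict(to_predict, split_on_space=False)
```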
Changed
- `eval_df` argument in `NERModel.train_model()` renamed to `eval_data` to better reflect the input format. Added a deprecation warning.
ELECTRA model support for Classification and Question Answering tasks
Added
- Added Electra model support for sequence classification (binary, multiclass, multilabel)
- Added Electra model support for question answering
- Added Roberta model support for question answering
Changed
- Reduced logger messages during question answering evaluation
Language Generation
Language Generation is now supported!
Supported model types:
- GPT-2
- CTRL
- OpenAI-GPT
- XLNet
- Transformer-XL
- XLM
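A minimal sketch of configuring language generation with one of the supported model types. The construction and generation calls are commented out since they download pretrained weights, and the arg names follow the `do_sample` addition noted in a later release:

```python
# from simpletransformers.language_generation import LanguageGenerationModel

generation_args = {
    "do_sample": True,  # sample from the distribution instead of greedy decoding
    "max_length": 50,
}

# model = LanguageGenerationModel("gpt2", "gpt2", args=generation_args)
# model.generate("Despite recent progress in NLP,")
```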
Custom Metrics for Question Answering
Added
- Added support for custom metrics with `QuestionAnsweringModel`.
Fixed
- Fixed issue with passing proxies to ConvAI models. @Pradhy729
Easier configuration for models and support for getting hidden layer outputs
Added
- Added option to get hidden layer outputs and embedding outputs with the `ClassificationModel.predict()` method.
- Setting `config: {"output_hidden_states": True}` will automatically return all embedding outputs and hidden layer outputs.
Changed
- `global_args` now has a `config` dictionary which can be used to override default values in the config class.
- This can be used with `ClassificationModel`, `MultiLabelClassificationModel`, `NERModel`, `QuestionAnsweringModel`, and `LanguageModelingModel`
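Putting the two pieces together, a sketch of overriding config values through a model's `args` dict. Model construction is commented out since it downloads pretrained weights:

```python
# from simpletransformers.classification import ClassificationModel

model_args = {
    # entries here are forwarded to the underlying Hugging Face config class
    "config": {"output_hidden_states": True},
}

# model = ClassificationModel("roberta", "roberta-base", args=model_args)
# model.predict(...) will now also return embedding outputs and
# hidden layer outputs alongside the usual predictions.
```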