Since version 1.1.0, Simplemma has supported trie-backed data structures, which substantially reduce memory requirements at the cost of runtime performance.
I think this could be a good trade-off for Annif, because lemmatization is probably not the main bottleneck in processing, whereas memory can be costly.
We should investigate how enabling this support would affect Annif, and then either make it available as an option or, if the performance isn't too bad, switch to it entirely.
One question is how to initialize the tries. Simplemma does this lazily the first time a language is needed, but this could be problematic for Annif, especially if it's running as a service. So maybe there should be a separate CLI operation that performs the initialization just once for all languages.
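
To make the lazy versus up-front distinction concrete, here is a minimal sketch. It does not use Simplemma's or Annif's actual API: `LazyTrieCache`, `fake_build` and `warm_up` are hypothetical names, and `fake_build` is only a stand-in for the expensive per-language trie construction.

```python
from typing import Callable, Dict, Iterable


class LazyTrieCache:
    """Hypothetical cache: builds a per-language structure on first use."""

    def __init__(self, build: Callable[[str], dict]) -> None:
        self._build = build              # expensive builder (assumed)
        self._cache: Dict[str, dict] = {}
        self.build_count = 0             # how many times the builder ran

    def get(self, lang: str) -> dict:
        # Lazy path: the first request for a language pays the build cost.
        if lang not in self._cache:
            self._cache[lang] = self._build(lang)
            self.build_count += 1
        return self._cache[lang]

    def warm_up(self, langs: Iterable[str]) -> None:
        # Eager path: build everything once, e.g. from a separate CLI command,
        # so that later requests never trigger a build.
        for lang in langs:
            self.get(lang)


def fake_build(lang: str) -> dict:
    # Stand-in for real trie construction; returns a trivial placeholder.
    return {"lang": lang}
```

With something like this, a hypothetical initialization command would call `warm_up` for the configured languages before the service starts, and request handling would only ever hit the already-built entries.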