proposal: OOVSkills #273

Open
JarbasAl opened this issue Feb 7, 2023 · 11 comments
Labels
discussion do we want this?

Comments

@JarbasAl
Member

JarbasAl commented Feb 7, 2023

these are some notes i had in a gist, just an idea not something in roadmap, but i figured it would be cool to discuss

OOVSkills framework proposal

  • out-of-vocabulary intents skill class - OOVSkills
  • triggered before common_query framework
  • match an intent using something like https://github.com/GT4SD/zberta
    • this gives us an intent classification without a registered handler
    • "handle this question" becomes "handle this intent somehow"
    • OOVSkills can use this info for a richer common_query experience
  • send the intent_tag + utterance to OOVSkills
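
A minimal sketch of what the bullets above could look like in code. Nothing here is an existing OVOS API: `OOVMatch`, `OOVSkill`, `handle_oov` and `dispatch` are hypothetical names, and the intent tag is assumed to come from some classifier running in core.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OOVMatch:
    """An intent predicted by core that no skill registered a handler for."""
    intent_tag: str    # label from the classifier, e.g. "weather_query"
    utterance: str     # the original user utterance
    confidence: float  # classifier score, assumed to be in [0, 1]


class OOVSkill:
    """Hypothetical base class: a skill that answers a set of intent tags."""

    def __init__(self, name: str, supported_tags: List[str]):
        self.name = name
        self.supported_tags = set(supported_tags)

    def can_handle(self, match: OOVMatch) -> bool:
        return match.intent_tag in self.supported_tags

    def handle_oov(self, match: OOVMatch) -> Optional[str]:
        raise NotImplementedError


class WeatherOOVSkill(OOVSkill):
    def __init__(self):
        super().__init__("weather", ["weather_query"])

    def handle_oov(self, match: OOVMatch) -> Optional[str]:
        return f"[{self.name}] handling '{match.utterance}' as {match.intent_tag}"


def dispatch(match: OOVMatch, skills: List[OOVSkill]) -> Optional[str]:
    """Send intent_tag + utterance to the first skill that claims the tag."""
    for skill in skills:
        if skill.can_handle(match):
            return skill.handle_oov(match)
    return None  # nobody claimed it: fall through to common_query
```

Returning `None` when no skill claims the tag keeps the "triggered before common_query" ordering: the utterance simply continues down the pipeline.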

@ChanceNCounter

I like the notion. I couldn’t help noticing that the image above depicts an overcapture that might not be avoidable.

@NeonDaniel
Member

To summarize some notes from a chat with @JarbasAl:

I could see using this system to handle "common" intents ("time", "weather", "jokes", etc.) where the intent matching is all done in some intent engine (or bundled adapt/padatious intents) and skills simply register handlers for these known intents.

The advantage here is that skill processing is separated from intent matching, so someone wanting to make a different "joke" skill or a specialized alarm handler doesn't need to write intents or manage intent conflicts

@JarbasAl
Member Author

JarbasAl commented Feb 23, 2023

my design would be that we add an OOV plugin in there, any downstream (such as neon-core) can then do whatever it wants such as loading default padatious/adapt intents

i think a good output of this new plugin class would be a simple list of top 5 intents + confidence
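
As a rough illustration of that output shape, assuming the plugin already has a score per intent label (the `top_intents` name and the dict input are made up for this sketch):

```python
import heapq
from typing import Dict, List, Tuple


def top_intents(scores: Dict[str, float], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (intent, confidence) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])
```

A plain list of `(intent, confidence)` tuples keeps the plugin output independent of whichever classifier produced the scores.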

as for plugin implementations it would be nice to team up with our friends at @secretsauceai and have 1 plugin for each of their implementations https://github.com/secretsauceai/NLU-engine-prototype-benchmarks. we can also collaborate on intent datasets.

ping @AmateurAcademic

@AmateurAcademic

Gladly. All the intent tags have been cleaned for the data set and are ready. I am still working through the entity tags, that one takes a while. You are welcome to use whatever data you like, and if I can help further with data, please let me know. This is where I spend most of my leisure time. LOL

As for NLU engines, I do think the easiest solution is actually still Snips. It is fast, easy to run. I don't think you could run anything faster that performs so well. I have done many tests with logistic regression with BOWs/TFIDF, as well as other stuff like random forest, etc. and I have found nothing really beats logistic regression when it comes to these low powered engines. That CRFs for entity extraction are also based on LR is a great bonus. Snips uses this, plus it has a deterministic classifier so that it ensures that the training set works.
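
To make the bag-of-words plus logistic regression idea concrete, here is a toy, standard-library-only sketch. It is not Snips and not the benchmark code: the two-class setup, the binary word features, the function names and the training sentences are all assumptions chosen for illustration. A real engine would use a proper library and TF-IDF weighting.

```python
import math
from typing import Dict, List, Tuple

Model = Tuple[Dict[str, int], List[float], float]  # vocab, weights, bias


def featurize(text: str, vocab: Dict[str, int]) -> List[float]:
    """Binary bag-of-words vector: 1.0 if the word occurs, else 0.0."""
    vec = [0.0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] = 1.0
    return vec


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: List[Tuple[str, int]], epochs: int = 200, lr: float = 0.1) -> Model:
    """Fit a binary logistic regression with plain stochastic gradient descent."""
    vocab: Dict[str, int] = {}
    for text, _ in samples:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, label in samples:
            x = featurize(text, vocab)
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - label  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return vocab, weights, bias


def predict_proba(text: str, model: Model) -> float:
    """Probability that the utterance belongs to class 1."""
    vocab, weights, bias = model
    x = featurize(text, vocab)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Toy data: label 1 = "time" intent, label 0 = "joke" intent.
SAMPLES = [
    ("what time is it", 1),
    ("tell me the time", 1),
    ("tell me a joke", 0),
    ("say something funny", 0),
]
```

Usage would be `model = train(SAMPLES)` followed by `predict_proba("what time is it", model)`, which gives a probability above 0.5 for the "time" class on this toy data.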

However, I have a few others I have been experimenting with that would provide the best results possible. The biggest, baddest on the block would be the DistilBERT joint intent and entity classifier. I would highly recommend this one, based on my benchmarks. @JarbasAl, I can perhaps help you out there with this one. This will probably be the NLU engine I use in the end, since it gives the best performance. This is a PyTorch implementation I currently use: https://github.com/monologg/JointBERT. One drawback here is the lack of few-shot, to quickly train a classifier without fine-tuning.

I have also tried some few-shot models from Hugging Face. For intent tagging, it works pretty well. However, for entity tagging, it worked very poorly.

For a really good few-shot intent AND entity classifier, I am experimenting with several models, including few-shot with a T5 model. In theory, T5 could do it all:

  • Few-shot for intent classification (intents, domains, or both).
  • Few-shot for entity extraction.
  • Maybe: few-shot for response generation (NLG) (I am still working on this).

For at least intent and entities it could very likely perform well with a t5-small (possibly even tiny) model, so it could be run by pretty much anyone in real-time. I am not sure when this will be ready. I want to finish cleaning the entity data first.

If there is anything else we can do, kindly let me know. Always happy to help!

@NeonDaniel
Member

> my design would be that we add an OOV plugin in there, any downstream (such as neon-core) can then do whatever it wants such as loading default padatious/adapt intents

I'm not sure I understand what "out of vocabulary" means in this context; I would think that any intent engine could be plugged in wherever it wants relative to the other intent plugins.

> i think a good output of this new plugin class would be a simple list of top 5 intents + confidence

Maybe this should be a broader intent refactoring? I could see the flow looking more like:

  1. Converse
  2. Collect intents with some normalized confidence (adapt, padatious, CommonQuery, any other engines)
  3. Select best response with minimum confidence
  4. Fallback
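
The four steps above could be sketched roughly as follows. Everything here is hypothetical: `Candidate`, `resolve`, the engine callables and the 0.5 threshold are illustrative, and the sketch assumes each engine already reports confidence on a common 0 to 1 scale, which is the hard part in practice.

```python
from typing import Callable, List, NamedTuple


class Candidate(NamedTuple):
    engine: str        # which engine produced the match, e.g. "padatious"
    intent: str        # matched intent name
    confidence: float  # assumed already normalized to [0, 1]


Engine = Callable[[str], List[Candidate]]


def resolve(utterance: str,
            converse: Callable[[str], bool],
            engines: List[Engine],
            min_confidence: float = 0.5) -> str:
    # 1. Converse: an active skill may consume the utterance outright.
    if converse(utterance):
        return "converse"
    # 2. Collect candidates from every engine.
    candidates = [c for engine in engines for c in engine(utterance)]
    # 3. Select the best candidate if it clears the minimum confidence.
    best = max(candidates, key=lambda c: c.confidence, default=None)
    if best is not None and best.confidence >= min_confidence:
        return f"{best.engine}:{best.intent}"
    # 4. Fallback.
    return "fallback"
```

In this shape no engine has built-in priority: whichever candidate scores highest wins, regardless of where it came from.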

@JarbasAl
Member Author

JarbasAl commented Feb 28, 2023

> > my design would be that we add an OOV plugin in there, any downstream (such as neon-core) can then do whatever it wants such as loading default padatious/adapt intents
>
> I'm not sure I understand what "out of vocabulary" means in this context; I would think that any intent engine could be plugged in wherever it wants relative to the other intent plugins.
>
> > i think a good output of this new plugin class would be a simple list of top 5 intents + confidence
>
> Maybe this should be a broader intent refactoring? I could see the flow looking more like:
>
> 1. Converse
> 2. Collect intents with some normalized confidence (adapt, padatious, CommonQuery, any other engines)
> 3. Select best response with minimum confidence
> 4. Fallback

the current flow is:

  • Converse
  • Skill Intents
  • CommonQA
  • Fallbacks

the new flow is:

  • Converse
  • Skill Intents
  • OOV framework
  • CommonQA
  • Fallbacks
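
A sketch of this ordered flow, where the first stage that returns a match wins and later stages are never consulted. The stage functions below are toy stand-ins and `run_pipeline` is a hypothetical name, not ovos-core code.

```python
from typing import Callable, List, Optional, Tuple

Matcher = Callable[[str], Optional[str]]


def run_pipeline(utterance: str,
                 stages: List[Tuple[str, Matcher]]) -> Tuple[str, Optional[str]]:
    """Try each stage in order; the first one that returns a match wins."""
    for name, matcher in stages:
        result = matcher(utterance)
        if result is not None:
            return name, result
    return "unhandled", None


# Toy matchers, for illustration only.
def converse(utt: str) -> Optional[str]:
    return None  # no skill is in an active conversation

def skill_intents(utt: str) -> Optional[str]:
    return "alarm.set" if "alarm" in utt else None

def oov_framework(utt: str) -> Optional[str]:
    return "weather_query" if "rain" in utt else None

def common_qa(utt: str) -> Optional[str]:
    return "answer" if utt.endswith("?") else None

def fallbacks(utt: str) -> Optional[str]:
    return "unknown"


NEW_FLOW = [
    ("converse", converse),
    ("skill_intents", skill_intents),
    ("oov_framework", oov_framework),
    ("common_qa", common_qa),
    ("fallbacks", fallbacks),
]
```

Because the list is ordered, a regular skill intent always beats an OOV match for the same utterance, and the OOV stage only sees what skill intents did not claim.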

OOV means you get an intent classification without a specified handler, it was provided by core but not by any specific skill, hence "out of vocabulary" from the registered skills POV

OOV service would itself be a plugin, several implementations for testing and benchmarks would be:

  • default (hardcoded) adapt/padatious intents <- the easiest way for downstreams to use existing OVOS/Neon projects for testing
  • https://github.com/OpenJarbas/little_questions <- a simple but useful question classifier, this is the sort of output I am personally interested in
  • collect different intent datasets, provide models for all implementations from @secretsauceai benchmarks under a plugin (model to load defined in config)
  • Zberta/DistilBERT/JointBERT and other BERT style plugins, just to have a SOTA implementation of sorts to stimulate research into other similar and lightweight options
  • Snips as reference baseline as suggested by @AmateurAcademic

This should cover all use cases
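
One way the "service is itself a plugin, model defined in config" idea might look. The config keys (`oov`, `module`), the `PLUGINS` registry and the keyword-based plugin are all invented for this sketch; a real plugin would wrap an actual classifier and be discovered through the plugin manager.

```python
from typing import Dict, List, Tuple, Type

Prediction = List[Tuple[str, float]]  # (intent_tag, confidence) pairs


class OOVPlugin:
    """Hypothetical plugin interface: map an utterance to scored intent tags."""

    def __init__(self, config: dict):
        self.config = config

    def predict(self, utterance: str) -> Prediction:
        raise NotImplementedError


class KeywordOOVPlugin(OOVPlugin):
    """Toy stand-in for a real classifier: one keyword per intent tag."""

    def predict(self, utterance: str) -> Prediction:
        keywords: Dict[str, str] = self.config.get("keywords", {})
        words = utterance.lower().split()
        return [(tag, 1.0) for tag, kw in keywords.items() if kw in words]


# Hypothetical registry of available implementations.
PLUGINS: Dict[str, Type[OOVPlugin]] = {"keyword": KeywordOOVPlugin}


def load_oov_plugin(config: dict) -> OOVPlugin:
    """Pick the plugin named in config and pass it its own config section."""
    oov_cfg = config["oov"]
    module = oov_cfg["module"]
    if module not in PLUGINS:
        raise ValueError(f"unknown OOV plugin: {module}")
    return PLUGINS[module](oov_cfg.get(module, {}))


EXAMPLE_CONFIG = {
    "oov": {
        "module": "keyword",
        "keyword": {"keywords": {"weather_query": "rain"}},
    }
}
```

Swapping implementations for a benchmark would then be a one-line config change rather than a code change.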

@NeonDaniel
Member

NeonDaniel commented Feb 28, 2023

> the new flow is:
>
>   • Converse
>   • Skill Intents
>   • OOV framework
>   • CommonQA
>   • Fallbacks
>
> OOV means you get an intent classification without a specified handler, it was provided by core but not by any specific skill, hence "out of vocabulary" from the registered skills POV

I think it would be better to implement Skill Intents as part of the OOV framework with confidence values used to determine which handler is used. i.e. a Padatious high conf match from a skill should not always take priority over a Padatious match from the OOV service.

In practice, you might end up with Adapt/Padatious intents taking top priority anyway, since you'd have to weight the different OOV plugins in any case.

From an implementation standpoint too, giving skill intents priority would mean that "legacy" skills would always take priority over skills implementing OOV.

@JarbasAl
Member Author

JarbasAl commented Feb 28, 2023

> I think it would be better to implement Skill Intents as part of the OOV framework with confidence values used to determine which handler is used. i.e. a Padatious high conf match from a skill should not always take priority over a Padatious match from the OOV service.
>
> In practice, you might end up with Adapt/Padatious intents taking top priority anyway, since you'd have to weight the different OOV plugins in any case.
>
> From an implementation standpoint too, giving skill intents priority would mean that "legacy" skills would always take priority over skills implementing OOV.

this is indeed intended behavior, different setups may use different OOV frameworks (by design, classifiers trained for a specific domain) which means OOV skills will NOT work everywhere.

regular skills should take precedence, that is the default way to add new intents to the OVOS intent framework without retraining anything at runtime

a skill can ofc use both frameworks, like it can use converse + intents + fallback all at once

since by default ovos-core does not force skills, any derivative voice assistant (such as classic core or neon) could use exclusively the OOV framework and dedicated skills if desired. this seems to align with what mycroft-dinkum intended and could be implemented via something like https://github.com/OpenVoiceOS/my-assistant

on a related note hardcoded adapt/padatious are also roadmapped to be pluginified with a lot of work done already, i'd like to keep that discussion in #100. the skill intents framework itself will soon also be replaceable

@JarbasAl
Member Author

@AmateurAcademic just made this repo public https://github.com/OpenVoiceOS/ovos-classifiers

together with https://github.com/OpenVoiceOS/ovos-datasets this will allow some proof of concept implementations of framework above

@JarbasAl
Member Author

JarbasAl commented Oct 6, 2023

here is an approach for this https://gist.github.com/JarbasAl/e07e17a2d98a80eb6bf60139acc1b9c7

@JarbasAl JarbasAl added this to the 0.2.0 milestone Apr 8, 2024
@JarbasAl JarbasAl removed this from the 0.2.0 milestone Oct 14, 2024