proposal: OOVSkills #273

JarbasAl · 2023-02-07T21:44:48Z

these are some notes i had in a gist, just an idea, not something on the roadmap, but i figured it would be cool to discuss

OOVSkills framework proposal

out-of-vocabulary intents skill class - OOVSkills
triggered before common_query framework
match an intent using something like https://github.com/GT4SD/zberta
- this gives us an intent classification without a registered handler
- "handle this question" becomes "handle this intent somehow"
- OOVSkills can use this info for a richer common_query experience
send the intent_tag + utterance to OOVSkills
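
A minimal sketch of the dispatch described above, under stated assumptions: `OOVService`, `keyword_classifier` and `weather_skill` are hypothetical names made up for illustration, not existing ovos-core APIs, and the keyword check only stands in for a zero-shot model such as zberta.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: a handler receives (intent_tag, utterance) and
# returns an answer string, or None to decline.
OOVHandler = Callable[[str, str], Optional[str]]
Classifier = Callable[[str], Tuple[str, float]]


class OOVService:
    """Illustrative only; not an existing ovos-core class."""

    def __init__(self, classifier: Classifier, min_conf: float = 0.5):
        self.classifier = classifier
        self.min_conf = min_conf
        self.handlers: List[OOVHandler] = []

    def register(self, handler: OOVHandler) -> None:
        self.handlers.append(handler)

    def handle(self, utterance: str) -> Optional[str]:
        intent_tag, conf = self.classifier(utterance)
        if conf < self.min_conf:
            return None  # not confident enough; let common_query try next
        # Send intent_tag + utterance to each registered OOV skill in turn.
        for handler in self.handlers:
            answer = handler(intent_tag, utterance)
            if answer is not None:
                return answer
        return None


def keyword_classifier(utterance: str) -> Tuple[str, float]:
    # Stand-in for a zero-shot model; a real plugin would call the model here.
    if "weather" in utterance.lower():
        return "weather_query", 0.9
    return "unknown", 0.0


def weather_skill(intent_tag: str, utterance: str) -> Optional[str]:
    if intent_tag == "weather_query":
        return f"handling weather for: {utterance}"
    return None


service = OOVService(keyword_classifier)
service.register(weather_skill)
print(service.handle("what is the weather like"))
print(service.handle("play some music"))
```

Returning `None` is what would let the utterance continue on to common_query and the fallbacks.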

ChanceNCounter · 2023-02-07T22:18:20Z

I like the notion. I couldn’t help noticing that the image above depicts an overcapture that might not be avoidable.

NeonDaniel · 2023-02-23T19:01:58Z

To summarize some notes from a chat with @JarbasAl:

I could see using this system to handle "common" intents ("time", "weather", "jokes", etc.) where the intent matching is all done in some intent engine (or bundled adapt/padatious intents) and skills simply register handlers for these known intents.

The advantage here is that skill processing is separated from intent matching, so someone wanting to make a different "joke" skill or a specialized alarm handler doesn't need to write intents or manage intent conflicts.

JarbasAl · 2023-02-23T19:18:00Z

my design would be that we add an OOV plugin in there; any downstream (such as neon-core) can then do whatever it wants, such as loading default padatious/adapt intents

i think a good output of this new plugin class would be a simple list of top 5 intents + confidence
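
One possible shape for that output, as a sketch only: `IntentPrediction` and `top_intents` are hypothetical names, and the scores are made up. It assumes the plugin already produces one normalized confidence per intent tag.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class IntentPrediction:
    intent_tag: str
    confidence: float  # assumed to be normalized to [0, 1]


def top_intents(scores: Dict[str, float], n: int = 5) -> List[IntentPrediction]:
    """Rank raw classifier scores and keep the n most confident intents."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [IntentPrediction(tag, conf) for tag, conf in ranked[:n]]


# Made-up scores, for illustration only.
scores = {"weather": 0.71, "time": 0.12, "joke": 0.08,
          "alarm": 0.05, "news": 0.03, "music": 0.01}
for prediction in top_intents(scores):
    print(prediction.intent_tag, prediction.confidence)
```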

as for plugin implementations, it would be nice to team up with our friends at @secretsauceai and have 1 plugin for each of their implementations: https://github.com/secretsauceai/NLU-engine-prototype-benchmarks. we can also collaborate on intent datasets.

ping @AmateurAcademic

AmateurAcademic · 2023-02-24T17:09:46Z

Gladly. All the intent tags have been cleaned for the data set and are ready. I am still working through the entity tags, that one takes a while. You are welcome to use whatever data you like, and if I can help further with data, please let me know. This is where I spend most of my leisure time. LOL

As for NLU engines, I do think the easiest solution is actually still Snips. It is fast and easy to run. I don't think you could run anything faster that performs so well. I have done many tests with logistic regression with BoW/TF-IDF features, as well as other approaches like random forests, and I have found nothing really beats logistic regression when it comes to these low-powered engines. That CRFs for entity extraction are also based on LR is a great bonus. Snips uses this, plus it has a deterministic classifier that ensures the training set works.
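
To make the idea concrete, here is a rough pure-Python sketch of that pairing: an exact-match lookup so training utterances always resolve, backed by logistic regression over bag-of-words counts. This is not Snips code and does not use its API; the training set, the names and the hyperparameters are all made up for illustration, and it only handles two intents.

```python
import math
from collections import Counter

# Tiny made-up training set, for illustration only.
TRAIN = [
    ("what time is it", "time"),
    ("tell me the time", "time"),
    ("tell me a joke", "joke"),
    ("say something funny", "joke"),
]
VOCAB = sorted({word for text, _ in TRAIN for word in text.split()})
LABELS = sorted({label for _, label in TRAIN})  # ["joke", "time"]


def bag_of_words(text):
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(epochs=200, lr=0.5):
    """Binary logistic regression fitted with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(VOCAB), 0.0
    for _ in range(epochs):
        for text, label in TRAIN:
            x = bag_of_words(text)
            target = 1.0 if label == LABELS[1] else 0.0
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - target
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


WEIGHTS, BIAS = train()
EXACT = {text: label for text, label in TRAIN}  # deterministic lookup


def classify(utterance):
    text = utterance.lower().strip()
    if text in EXACT:  # training utterances always resolve to their label
        return EXACT[text], 1.0
    p = sigmoid(BIAS + sum(w * xi for w, xi in zip(WEIGHTS, bag_of_words(text))))
    return (LABELS[1], p) if p >= 0.5 else (LABELS[0], 1.0 - p)


print(classify("what time is it"))
print(classify("tell me a funny joke"))
```

The first call is answered by the lookup; the second is an unseen utterance and goes through the learned weights.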

However, I have a few others I have been experimenting with that would provide the best results possible. The biggest, baddest on the block would be the DistilBERT joint intent and entity classifier. I would highly recommend this one, based on my benchmarks. @JarbasAl, I can perhaps help you out with this one. This will probably be the NLU engine I use in the end, since it gives the best performance. This is a PyTorch implementation I currently use: https://github.com/monologg/JointBERT. One drawback here is the lack of few-shot support for quickly training a classifier without fine-tuning.

I have also tried some few-shot models from Hugging Face. For intent tagging, it works pretty well. However, for entity tagging, it worked very poorly.

For a really good few-shot intent AND entity classifier, I am experimenting with several models, including few-shot with T5. In theory, T5 could do it all:

Few-shot for intent classification (intents, domains, or both).
Few-shot for entity extraction.
Maybe: few-shot for response generation (NLG) (I am still working on this).

For at least intent and entities it could very likely perform well with a t5-small (possibly even tiny) model, so it could be run by pretty much anyone in real-time. I am not sure when this will be ready. I want to finish cleaning the entity data first.

If there is anything else we can do, kindly let me know. Always happy to help!

NeonDaniel · 2023-02-25T03:06:20Z

> my design would be that we add an OOV plugin in there; any downstream (such as neon-core) can then do whatever it wants, such as loading default padatious/adapt intents

I'm not sure I understand what "out of vocabulary" means in this context; I would think that any intent engine could be plugged in wherever it wants relative to the other intent plugins.

> i think a good output of this new plugin class would be a simple list of top 5 intents + confidence

Maybe this should be a broader intent refactoring? I could see the flow looking more like:

1. Converse
2. Collect intents with some normalized confidence (adapt, padatious, CommonQuery, any other engines)
3. Select best response with minimum confidence
4. Fallback
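
A rough sketch of that flow, assuming every engine can report a confidence normalized to [0, 1]. The `route` function, the stub engines and their scores are hypothetical and only illustrate the selection step.

```python
from typing import Callable, List, Tuple

# (engine name, intent name, confidence normalized to [0, 1])
Candidate = Tuple[str, str, float]
Engine = Callable[[str], List[Candidate]]


def route(utterance: str,
          converse: Callable[[str], bool],
          engines: List[Engine],
          min_conf: float = 0.6) -> str:
    # 1. Converse gets the first chance to consume the utterance.
    if converse(utterance):
        return "converse"
    # 2. Collect candidates from every engine.
    candidates = [c for engine in engines for c in engine(utterance)]
    # 3. Pick the most confident candidate, if it clears the threshold.
    if candidates:
        engine_name, intent, conf = max(candidates, key=lambda c: c[2])
        if conf >= min_conf:
            return f"{engine_name}:{intent}"
    # 4. Nothing was confident enough.
    return "fallback"


# Stub engines with made-up scores.
def adapt_stub(utterance: str) -> List[Candidate]:
    return [("adapt", "weather", 0.55)] if "weather" in utterance else []


def padatious_stub(utterance: str) -> List[Candidate]:
    return [("padatious", "weather", 0.80)] if "weather" in utterance else []


def never_converse(utterance: str) -> bool:
    return False


print(route("weather today", never_converse, [adapt_stub, padatious_stub]))
print(route("weather today", never_converse, [adapt_stub]))
```

With only the low-confidence stub loaded, the same utterance drops through to the fallback, which is the point of the minimum confidence.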

JarbasAl · 2023-02-28T00:57:36Z

> > my design would be that we add an OOV plugin in there; any downstream (such as neon-core) can then do whatever it wants, such as loading default padatious/adapt intents
>
> I'm not sure I understand what "out of vocabulary" means in this context; I would think that any intent engine could be plugged in wherever it wants relative to the other intent plugins.
>
> > i think a good output of this new plugin class would be a simple list of top 5 intents + confidence
>
> Maybe this should be a broader intent refactoring? I could see the flow looking more like:
>
> 1. Converse
> 2. Collect intents with some normalized confidence (adapt, padatious, CommonQuery, any other engines)
> 3. Select best response with minimum confidence
> 4. Fallback

the current flow is:

1. Converse
2. Skill Intents
3. CommonQA
4. Fallbacks

the new flow is:

1. Converse
2. Skill Intents
3. OOV framework
4. CommonQA
5. Fallbacks

OOV means you get an intent classification without a specified handler: it was provided by core but not by any specific skill, hence "out of vocabulary" from the registered skills' POV
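
a small sketch of the new flow as a strictly ordered pipeline where the first stage that matches wins. this is only an illustration: `run_pipeline`, the stage names and the stub matchers are hypothetical, not ovos-core code.

```python
from typing import Callable, List, Optional, Tuple

Matcher = Callable[[str], Optional[str]]


def run_pipeline(utterance: str,
                 stages: List[Tuple[str, Matcher]]) -> Tuple[str, Optional[str]]:
    """Try each stage in order and stop at the first one that matches."""
    for name, matcher in stages:
        result = matcher(utterance)
        if result is not None:
            return name, result
    return "unhandled", None


# Stub matchers standing in for the real stages.
def converse(utterance: str) -> Optional[str]:
    return None  # no skill is in an active conversation


def skill_intents(utterance: str) -> Optional[str]:
    return "alarm.set" if "alarm" in utterance else None


def oov(utterance: str) -> Optional[str]:
    # Intent tag classified by core, with no skill-registered handler.
    return "joke" if "joke" in utterance else None


def common_qa(utterance: str) -> Optional[str]:
    return "answer" if utterance.startswith("what is") else None


def fallback(utterance: str) -> Optional[str]:
    return "sorry, i did not understand"


STAGES = [("converse", converse), ("skill_intents", skill_intents),
          ("oov", oov), ("common_qa", common_qa), ("fallback", fallback)]

print(run_pipeline("set an alarm", STAGES))
print(run_pipeline("tell me a joke", STAGES))
```

because the order is fixed, a registered skill intent always wins over an OOV match for the same utterance.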

OOV service would itself be a plugin, several implementations for testing and benchmarks would be:

- default (hardcoded) adapt/padatious intents <- the easiest way for downstreams to use existing OVOS/Neon projects for testing
- https://github.com/OpenJarbas/little_questions <- a simple but useful question classifier, this is the sort of output I am personally interested in
- collect different intent datasets, provide models for all implementations from @secretsauceai benchmarks under a plugin (model to load defined in config)
- Zberta/DistilBERT/JointBERT and other BERT style plugins, just to have a SOTA implementation of sorts to stimulate research into other similar and lightweight options
- Snips as reference baseline as suggested by @AmateurAcademic

This should cover all use cases

JarbasAl · 2023-02-28T01:03:17Z

related dataset: https://github.com/OpenJarbas/mycroft_intent_dataset

NeonDaniel · 2023-02-28T01:38:23Z

> the new flow is:
>
> 1. Converse
> 2. Skill Intents
> 3. OOV framework
> 4. CommonQA
> 5. Fallbacks
>
> OOV means you get an intent classification without a specified handler: it was provided by core but not by any specific skill, hence "out of vocabulary" from the registered skills' POV

I think it would be better to implement Skill Intents as part of the OOV framework with confidence values used to determine which handler is used. i.e. a Padatious high conf match from a skill should not always take priority over a Padatious match from the OOV service.

In practice, you might end up with Adapt/Padatious intents taking top priority anyway, since you'd have to weight the different OOV plugins in any case.

From an implementation standpoint too, giving skill intents priority would mean that "legacy" skills would always take priority over skills implementing OOV.

JarbasAl · 2023-02-28T01:49:33Z

> I think it would be better to implement Skill Intents as part of the OOV framework with confidence values used to determine which handler is used. i.e. a Padatious high conf match from a skill should not always take priority over a Padatious match from the OOV service.
>
> In practice, you might end up with Adapt/Padatious intents taking top priority anyway, since you'd have to weight the different OOV plugins in any case.
>
> From an implementation standpoint too, giving skill intents priority would mean that "legacy" skills would always take priority over skills implementing OOV.

this is indeed intended behavior, different setups may use different OOV frameworks (by design, classifiers trained for a specific domain) which means OOV skills will NOT work everywhere.

regular skills should take precedence, that is the default way to add new intents to the OVOS intent framework without retraining anything at runtime

a skill can ofc use both frameworks, like it can use converse + intents + fallback all at once

since by default ovos-core does not force skills, any derivative voice assistant (such as classic core or neon) could use exclusively the OOV framework and dedicated skills if desired. this seems to align with what mycroft-dinkum intended and could be implemented via something like https://github.com/OpenVoiceOS/my-assistant

on a related note hardcoded adapt/padatious are also roadmapped to be pluginified with a lot of work done already, i'd like to keep that discussion in #100. the skill intents framework itself will soon also be replaceable

JarbasAl · 2023-03-27T21:50:45Z

@AmateurAcademic just made this repo public https://github.com/OpenVoiceOS/ovos-classifiers

together with https://github.com/OpenVoiceOS/ovos-datasets this will allow some proof of concept implementations of the framework above

JarbasAl · 2023-10-06T20:54:18Z

here is an approach for this https://gist.github.com/JarbasAl/e07e17a2d98a80eb6bf60139acc1b9c7

JarbasAl added the discussion do we want this? label Feb 7, 2023

JarbasAl assigned ChanceNCounter, JarbasAl and NeonDaniel Feb 7, 2023

JarbasAl mentioned this issue Feb 23, 2023

roadmap - 0.0.9 #281

Closed

36 tasks

JarbasAl mentioned this issue Sep 20, 2023

feat/pipeline_plugins #349

Closed

15 tasks

JarbasAl added this to the 0.2.0 milestone Apr 8, 2024

JarbasAl mentioned this issue Apr 8, 2024

feat - pipeline plugins #437

Open

JarbasAl unassigned ChanceNCounter, JarbasAl and NeonDaniel Apr 8, 2024

JarbasAl removed this from the 0.2.0 milestone Oct 14, 2024
