Training Novel Concepts #71

adamgeddon1686 · 2024-02-21T13:01:22Z

Is there a way to train novel concepts into your BLIP model, similar to how textual inversion works for Stable Diffusion image generation? If so, is a training script provided, or would one need to be created?

Also, some recent advances in computer vision might prove useful, though I don't know how much of your model would need to change to use them. Kosmos-2 by Microsoft, for instance, has proved very promising at image captioning, and works much better than the BLIP model I used previously. Perhaps a more powerful language model would overcome some of BLIP's shortcomings in identifying novel concepts. There are also new approaches to scanning an image, such as SAHI (Slicing Aided Hyper Inference), that help a model find smaller objects in larger images. I've provided both links below for you to look at.

https://huggingface.co/docs/transformers/main/en/model_doc/kosmos-2

https://github.com/obss/sahi
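To make the slicing idea concrete, here is a rough sketch of how an image could be cut into overlapping windows before running a detector or captioner on each one. This is only an illustration of the general technique, not SAHI's actual API; the function name `slice_boxes`, the 512-pixel window size, and the 0.2 overlap ratio are all assumptions made for the example.

```python
def slice_boxes(image_w, image_h, slice_w=512, slice_h=512, overlap=0.2):
    """Return (x0, y0, x1, y1) windows that cover the image with some overlap.

    Hypothetical helper for illustration only; not part of SAHI or ImageReward.
    """
    # Distance between window origins; smaller step means more overlap.
    step_x = max(1, int(slice_w * (1 - overlap)))
    step_y = max(1, int(slice_h * (1 - overlap)))

    boxes = []
    y = 0
    while True:
        # Clamp the window to the image; shift it back so edge windows stay full size.
        y1 = min(y + slice_h, image_h)
        y0 = max(0, y1 - slice_h)
        x = 0
        while True:
            x1 = min(x + slice_w, image_w)
            x0 = max(0, x1 - slice_w)
            boxes.append((x0, y0, x1, y1))
            if x1 >= image_w:
                break
            x += step_x
        if y1 >= image_h:
            break
        y += step_y
    return boxes


if __name__ == "__main__":
    # Each window would be cropped and passed to the model separately,
    # then the per-window results mapped back using (x0, y0) as the offset.
    for box in slice_boxes(1024, 512):
        print(box)
```

Small objects occupy a larger share of each window than of the full image, which is why running the model per window can pick up things it misses on a downscaled full frame.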

xujz18 · 2024-03-05T02:35:32Z

Thank you so much for the discussion and for sharing! Regarding the first question, training new concepts into the model: we think new scripts would be needed for further training. As for the newer research directions you mention, such as MLLMs, we think they are well worth trying.

xujz18 added the training (How to train ImageReward) label on Mar 5, 2024
