⚠️ DEVELOPMENT REPO - NOT MAINTAINED OR EASILY DEPLOYABLE ⚠️

YAKbot🐂

A complex, multipurpose AI Discord bot.

[Info] [Environment] [Setup & Install] [Citations]

Info

YAKbot is a collection of AI models for image generation, analysis, and processing, all wrapped into one Discord bot. Its out-of-the-box commands range from image generation with VQGAN+CLIP or guided diffusion to image analysis and captioning with personality.

A full list of YAKbot's commands and uses can be found either in Discord with the .help command, or right here:

| Command | Syntax | Help | Examples |
| --- | --- | --- | --- |
| .rembg | [Attachment] | Removes the background from the attached image | Example |
| .esrgan | [Attachment] | Uses a pre-trained ESRGAN upscaler to upscale the resolution of your image up to 4x | Example |
| .status | | Sends an embed message with all relevant device stats for YAKbot | Example |
| .imagine | [Prompt] | Uses VQGAN+CLIP open generation to create an original image from your prompt | Example |
| .facehq, .wikiart, .default, .d1024 | | Switches YAKbot's VQGAN+CLIP model to one trained solely on faces, on art, or to the default configuration | Example |
| .square, .landscape, .portrait | | Updates YAKbot's size configuration for generations to the specified orientation | Example |
| .seed | [Desired Seed] | Changes YAKbot's seed for all open generation (0 sets a random seed) | Example |
| .faces | [Attachment] | Looks through your photo and tries to find any recognizable faces | Example |
| .latentdiffusion | [Prompt] | Another text-to-image method like .imagine, using latent diffusion | Example |
| .outline | [Prompt] | Contacts a GPT-3 model that synthesizes and searches for essays on your prompt, outputting an outline/list of ideas and facts to kickstart your project | Example |
| | [Attachment] | Decides whether your attachment is neutral, negative, or positive, then captions the image with both text and emoji | Example |

To see examples of all the different commands, click here: Examples

Environment

  • Windows 11 (10.0.22)
  • Anaconda
  • Nvidia RTX 2070 Super (8GB)

Typical VRAM requirements:

  • VQGAN (256x256 to 512x512) ~ 5-10GB
  • Diffusion 256x256 ~ 6-7GB
  • Diffusion 512x512 ~ 10-12GB
  • Image classification & captioning ~ 4GB
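
If you're not sure whether your GPU meets these requirements, a quick check like the following (a minimal Python sketch, assuming PyTorch is already installed) prints the detected card and its memory:

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")
else:
    print("No CUDA device found -- GPU-backed commands will be unavailable or very slow.")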

Setup and Installation

First, install PyTorch 1.9.0 and torchvision, along with a few small additional dependencies. On a CUDA GPU machine, the following will do the trick:

$ pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
$ pip install ftfy regex tqdm
$ pip install git+https://github.com/openai/CLIP.git
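
To confirm that PyTorch and CLIP installed correctly, a short sanity check like this (a sketch, not part of the bot itself) should run without errors and will download the ViT-B/32 weights on first use:

import torch
import clip

# Pick the GPU if one is visible, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
print("CLIP loaded on", device)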

Replace cu111 above with the appropriate CUDA version for your machine, or with cpuonly when installing on a machine without a GPU. Next, install the dependent packages:

Personality-CLIP install instructions:
$ git clone https://github.com/dzryk/cliptalk.git
$ cd cliptalk/
$ git clone https://github.com/dzryk/clip-grams.git
$ git clone https://github.com/openai/CLIP
$ pip install ftfy
$ pip install transformers
$ pip install autofaiss
$ pip install wandb
$ pip install webdataset
$ pip install git+https://github.com/PyTorchLightning/pytorch-lightning
$ curl -OL 'https://drive.google.com/uc?id=1fhWspkaOJ31JS91sJ-85y1P597dIfavJ'
$ curl -OL 'https://drive.google.com/uc?id=1PJcBni9lCRroFqnQBfOJOg9gVC5urq2H'
$ curl -OL 'https://drive.google.com/uc?id=13Xtf7SYplE4n5Q-aGlf954m6dN-qsgjW'
$ curl -OL 'https://drive.google.com/uc?id=1xyjhZMbzyI-qVz-plsxDOXdqWyrKbmyS'
$ curl -OL 'https://drive.google.com/uc?id=1peB-l-CWtwx0NKAIeAcwsnisjocc--66'
$ mkdir checkpoints
$ mkdir unigrams
$ mkdir bigrams
$ mkdir artstyles
$ mkdir emotions
$ unzip ./model.zip -d checkpoints #make sure the unzipped "model" Folder goes in ./YAKbot/Bot/checkpoints 
$ unzip ./unigrams.zip -d unigrams #make sure the unzipped "unigrams" Folder goes in ./YAKbot/Bot/unigrams 
$ unzip ./bigrams.zip -d bigrams #make sure the unzipped "bigrams" Folder goes in ./YAKbot/Bot/bigrams 
$ unzip ./artstyles.zip -d artstyles #make sure the unzipped "artstyles" Folder goes in ./YAKbot/Bot/artstyles 
$ unzip ./emotions.zip -d emotions #make sure the unzipped "emotions" Folder goes in ./YAKbot/Bot/emotions 
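
A short check (a hypothetical helper, not part of the repo) can confirm the unzipped folders ended up where the comments above say the bot expects them:

from pathlib import Path

# Expected layout under ./YAKbot/Bot, per the unzip comments above.
base = Path("YAKbot/Bot")
for name in ["checkpoints", "unigrams", "bigrams", "artstyles", "emotions"]:
    folder = base / name
    status = "ok" if folder.is_dir() and any(folder.iterdir()) else "missing or empty"
    print(f"{folder}: {status}")
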
VQGAN+CLIP (z+quantize) install instructions:
$ git clone https://github.com/openai/CLIP
$ git clone https://github.com/CompVis/taming-transformers.git
$ pip install ftfy regex tqdm omegaconf pytorch-lightning
$ pip install kornia
$ pip install imageio-ffmpeg   
$ pip install einops          
$ mkdir steps
#place all of the following model files in ./YAKbot/Bot
$ curl -L -o vqgan_imagenet_f16_1024.yaml -C - 'http://mirror.io.community/blob/vqgan/vqgan_imagenet_f16_1024.yaml' #ImageNet 1024
$ curl -L -o vqgan_imagenet_f16_1024.ckpt -C - 'http://mirror.io.community/blob/vqgan/vqgan_imagenet_f16_1024.ckpt'  #ImageNet 1024
$ curl -L -o vqgan_imagenet_f16_16384.yaml -C - 'https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/files/?p=%2Fconfigs%2Fmodel.yaml&dl=1' #ImageNet 16384
$ curl -L -o vqgan_imagenet_f16_16384.ckpt -C - 'https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/files/?p=%2Fckpts%2Flast.ckpt&dl=1' #ImageNet 16384
$ curl -L -o faceshq.yaml -C - 'https://drive.google.com/uc?export=download&id=1fHwGx_hnBtC8nsq7hesJvs-Klv-P0gzT' #FacesHQ
$ curl -L -o faceshq.ckpt -C - 'https://app.koofr.net/content/links/a04deec9-0c59-4673-8b37-3d696fe63a5d/files/get/last.ckpt?path=%2F2020-11-13T21-41-45_faceshq_transformer%2Fcheckpoints%2Flast.ckpt' #FacesHQ
$ curl -L -o wikiart_16384.yaml -C - 'http://mirror.io.community/blob/vqgan/wikiart_16384.yaml' #WikiArt 16384
$ curl -L -o wikiart_16384.ckpt -C - 'http://mirror.io.community/blob/vqgan/wikiart_16384.ckpt' #WikiArt 16384
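
After the checkpoints finish downloading, a quick check (a sketch using omegaconf, which the commands above already install) confirms each config/checkpoint pair is in place and readable:

from pathlib import Path
from omegaconf import OmegaConf

# Model names taken from the curl commands above; each needs a .yaml and a .ckpt.
for name in ["vqgan_imagenet_f16_1024", "vqgan_imagenet_f16_16384", "faceshq", "wikiart_16384"]:
    yaml_path, ckpt_path = Path(f"{name}.yaml"), Path(f"{name}.ckpt")
    if yaml_path.exists() and ckpt_path.exists():
        OmegaConf.load(yaml_path)  # raises if the config is malformed
        print(f"{name}: ok ({ckpt_path.stat().st_size / 1024**2:.0f} MB checkpoint)")
    else:
        print(f"{name}: missing .yaml or .ckpt")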

Once everything above is installed, just run:

$ git clone https://github.com/Frikallo/YAKbot.git
$ cd YAKbot 
$ pip install -r requirements.txt

Before YAKbot can start, make sure your bot token is set.

#The end of your bot.py file should look something like this.
bot.run('qTIzNTA4NjMhUJI3NzgzJAAy.YcOCbw.GMYbjBWdiIWBPFrm_IMlUTlMGjM') #Your Token Here
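
Hardcoding the token works, but since the .env file described below already has a bot_token entry, loading it from the environment keeps the secret out of source control. A minimal sketch of that last line, assuming python-dotenv is installed:

import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads the .env file from the working directory
bot.run(os.environ["bot_token"])  # same key as in the .env example below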

Now finally run the bot:

$ cd Bot
$ python3 bot.py
  • Enjoy!

Note: I will not provide support for self-hosting. If you are unable to self-host YAKbot, just join my Discord server, where YAKbot runs 24/7.

A successful startup will look like this:

(startup console screenshot)

.env Setup

If self-hosting, make sure you have a .env file within the ./Bot directory. Your environment file should look something like this:

"YAKbot/Bot/.env"
prompt = '--'
seed = '42'
infile = '--'
model = 'vqgan_imagenet_f16_16384' #Model checkpoint file for .imagine
width = '412' #Width in pixels for .imagine
height = '412' #Height in pixels for .imagine
max_iterations = '400' #Max iterations for .imagine
bot_token = '--' #Discord bot token, get yours at https://discord.com/developers/applications
input_image = '--'
CB1 = 'True'
CB2 = 'True'
CB3 = 'True'
webhook = '--' #Discord webhook url for startup info
OPENAI_KEY = '--' #OpenAI API token for GPT3 generation commands
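
Every value in this file is stored as a string, so numeric settings presumably get converted when read. A minimal sketch of how bot.py might consume them, assuming python-dotenv:

import os
from dotenv import load_dotenv

load_dotenv()  # pulls the .env values into os.environ

width = int(os.environ["width"])            # numeric settings arrive as strings
height = int(os.environ["height"])
max_iterations = int(os.environ["max_iterations"])
cb1_enabled = os.environ["CB1"] == "True"   # toggles are the strings 'True'/'False'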

Other repos

You may also be interested in https://github.com/afiaka87/clip-guided-diffusion

For upscaling images, try https://github.com/xinntao/Real-ESRGAN

Citations

@misc{unpublished2021clip,
    title  = {CLIP: Connecting Text and Images},
    author = {Alec Radford and Ilya Sutskever and Jong Wook Kim and Gretchen Krueger and Sandhini Agarwal},
    year   = {2021}
}

@InProceedings{wang2021realesrgan,
    author    = {Xintao Wang and Liangbin Xie and Chao Dong and Ying Shan},
    title     = {Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data},
    booktitle = {International Conference on Computer Vision Workshops (ICCVW)},
    year      = {2021}
}
  • Original caption notebook: Open In Colab, from dzryk
  • Original z+quantize notebook: Open In Colab, from crimeacs

This project was HEAVILY influenced and inspired by EleutherAI's Discord bot, BATbot.

License

MIT License

Copyright (c) 2021 Frikallo

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
