Skip to content

a generative language model for thai language based on opt350m

License

Notifications You must be signed in to change notification settings

MinSiThu/thai-opt350m

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

library_name tags license language
transformers
thai
opt
generative ai
SEA
southeast-asian
mit
th

thai-opt350m-instruct-logo

thai-opt350m-instruct

a generative language model for thai language based on opt350m

Thai-OPT350M-Instruct is a fine-tuned pretrained transformer for thai language based on facebook/opt-350m.

Dataset for thai-opt350m-instruct

  • yadapruk/thai-instructions-rallio

Base Model

  • Facebook Open Pretrained Transformer

Languages

  • mainly support Thai Language
  • a few English, Chinese, Arabic

Training

  • epochs - 12
  • training loss - 0.809200

Model Page

https://huggingface.co/jojo-ai-mst/thai-opt350m-instruct

Prompt Format

User: อะไรคือวิธีที่ดีที่สุดในการทําความสะอาดพรม Assistant:

How to use

# pip install transformers

from transformers import AutoModelForCausalLM,AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("jojo-ai-mst/thai-opt350m-instruct")
tokenizer = AutoTokenizer.from_pretrained("jojo-ai-mst/thai-opt350m-instruct")

def generate_text(prompt, max_length=200, temperature=0.8, top_k=50):
    input_ids = tokenizer.encode(prompt, return_tensors="pt").cuda() # remove .cuda() if only cpu
    output = model.generate(
        input_ids,
        max_length=max_length,
        temperature=temperature,
        top_k=top_k,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True
    )
    for result in output:
      generated_text = tokenizer.decode(result, skip_special_tokens=True)
      print(generated_text)

generate_text("User: อะไรคือวิธีที่ดีที่สุดในการทําความสะอาดพรม Assistant:")

Date of Release

22/03/2024

License

MIT

Author

Min Si Thu

Notes

This ai model is a movement of MyanmarGPT-Movement.

About

a generative language model for thai language based on opt350m

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages