Unable to run the fine-tuning code for Indic-English (it works for English to Indic) #18

Open
Gautam-Rajeev opened this issue Jan 13, 2023 · 3 comments

@Gautam-Rajeev

We were running into the following error while trying to run the fine-tuning code for Indic-English:

	size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([54, 256]) from checkpoint, the shape in current model is torch.Size([34, 256]).
	size mismatch for decoder.output_projection.weight: copying a param with shape torch.Size([54, 256]) from checkpoint, the shape in current model is torch.Size([34, 256]).

I have linked the modified notebook used to carry this out.

The notebook works for English-Indic fine-tuning.
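
The shapes suggest the decoder vocabulary built for this run is smaller than the one the checkpoint was trained with; if I read fairseq's multilingual task correctly, the decoder vocabulary is the target dictionary plus four special tokens plus one __lang__ token per entry in the --lang-dict file, so the gap of 20 plausibly comes from a different language-token inventory. A minimal sketch to compare the two numbers (paths are placeholders for the local setup):

# Sketch: compare the checkpoint's decoder vocabulary with the one
# fairseq will build for the current run. Paths are placeholders.
import torch

ckpt = torch.load("indicxlit.pt", map_location="cpu")
rows = ckpt["model"]["decoder.embed_tokens.weight"].shape[0]
print("decoder vocab in checkpoint:", rows)  # 54 in the error above

# decoder vocab = target dict entries + 4 specials (<s>, <pad>, </s>,
# <unk>) + one __lang__ token per entry in the --lang-dict file
n_dict = sum(1 for _ in open("corpus-bin/dict.en.txt"))
n_langs = sum(1 for _ in open("lang_list.txt"))
print("decoder vocab for this run:", n_dict + 4 + n_langs)  # 34 here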

@GokulNC-Sarvam commented Apr 21, 2024

It seems to be working for me (I was fine-tuning it for a new language).

Config:

export OPENBLAS_NUM_THREADS=1
export NUMEXPR_MAX_THREADS=1
export CUDA_VISIBLE_DEVICES=0

fairseq-train /home/gokul/IndicXlit/data/romanization/corpus-bin \
    --save-dir checkpoints/bho-rom \
    --arch transformer --layernorm-embedding \
    --task translation_multi_simple_epoch \
    --sampling-method "temperature" \
    --sampling-temperature 1.5 \
    --encoder-langtok "src" \
    --lang-dict /home/gokul/IndicXlit/app/ai4bharat/transliteration/transformer/models/indic2en/lang_list_new.txt \
    --lang-pairs bho-en \
    --decoder-normalize-before --encoder-normalize-before \
    --activation-fn gelu --adam-betas "(0.9, 0.98)"  \
    --batch-size 512 \
    --decoder-attention-heads 4 --decoder-embed-dim 256 --decoder-ffn-embed-dim 1024 --decoder-layers 6 \
    --dropout 0.5 \
    --encoder-attention-heads 4 --encoder-embed-dim 256 --encoder-ffn-embed-dim 1024 --encoder-layers 6 \
    --lr 0.00003 --lr-scheduler inverse_sqrt \
    --max-epoch 40 \
    --optimizer adam  \
    --num-workers 0 \
    --warmup-init-lr 0 --warmup-updates 200 \
    --skip-invalid-size-inputs-valid-test \
    --keep-last-epochs 5 \
    --save-interval 5 \
    --keep-best-checkpoints 1 \
    --distributed-world-size 1 \
    --patience 10 \
    --restore-file /home/gokul/IndicXlit/checkpoints/indicxlit-indic-en-v1.0/transformer/indicxlit.pt \
    --reset-lr-scheduler \
    --reset-meters \
    --reset-dataloader \
    --reset-optimizer
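
Before launching a long run, a sanity check along these lines (an untested sketch; the paths mirror the flags above, and dict.en.txt is my assumption for the target-side dictionary produced by fairseq-preprocess) confirms the dictionaries plus --lang-dict reproduce the vocabulary the restore file expects; if the assert fails, fairseq-train will die with the same size mismatch reported at the top of this thread:

# Untested pre-flight sketch; paths copied from the fairseq-train flags
# above, dictionary file name assumed from fairseq-preprocess defaults.
import torch

ckpt_path = "/home/gokul/IndicXlit/checkpoints/indicxlit-indic-en-v1.0/transformer/indicxlit.pt"
rows = torch.load(ckpt_path, map_location="cpu")["model"]["decoder.embed_tokens.weight"].shape[0]

corpus = "/home/gokul/IndicXlit/data/romanization/corpus-bin"
lang_dict = "/home/gokul/IndicXlit/app/ai4bharat/transliteration/transformer/models/indic2en/lang_list_new.txt"
n_dict = sum(1 for _ in open(f"{corpus}/dict.en.txt"))  # target dictionary entries
n_langs = sum(1 for _ in open(lang_dict))               # one __lang__ token each

expected = n_dict + 4 + n_langs  # +4 specials: <s>, <pad>, </s>, <unk>
assert rows == expected, f"decoder vocab mismatch: {rows} (checkpoint) vs {expected} (run)"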

@ahsanalidev

I am also getting the same error. Please help.

@ahsanalidev

@GautamR-Samagra Were you able to find a solution?
