train.py: error: unrecognized arguments: --t-mult 1 #13

fkjslee opened this issue Jan 10, 2022 · 0 comments
fkjslee commented Jan 10, 2022

Dear author: when I run the script
python nmt_wmt16_en2ro.py --d-m 384
I get the following error:
train.py: error: unrecognized arguments: --t-mult 1

What's more, when I read through the code in detail, I can't find the argument '--t-mult' anywhere.
My error log is below:

$ python nmt_wmt16_en2ro.py --d-m 384
2022-01-10 15:23:03 - LOGS - Training command: python train.py data-bin/wmt14_en_ro --arch delight_transformer_wmt16_en_ro --no-progress-bar --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 --weight-decay 0.0 --criterion label_smoothed_cross_entropy --label-smoothing 0.1 --min-lr 1e-09 --update-freq 1 --keep-last-epochs 10 --ddp-backend=no_c10d --max-tokens 4096 --max-update 100000 --warmup-updates 10000 --lr-scheduler linear --warmup-init-lr 1e-7 --lr 0.0009 --min-lr 1e-9 --t-mult 1 --save-dir ./results_wmt16_en2ro/delight_out_384 --distributed-world-size 8 --distributed-port 50786 --delight-emb-map-dim 128 --delight-emb-out-dim 384 --delight-enc-min-depth 4 --delight-enc-max-depth 8 --delight-enc-width-mult 2 --delight-dec-min-depth 4 --delight-dec-max-depth 8 --delight-dec-width-mult 2 | tee -a ./results_wmt16_en2ro/delight_out_384/logs.txt
usage: train.py [-h] [--no-progress-bar] [--log-interval N]
                [--log-format {json,none,simple,tqdm}]
                [--tensorboard-logdir DIR] [--seed N] [--cpu] [--fp16]
                [--memory-efficient-fp16] [--fp16-no-flatten-grads]
                [--fp16-init-scale FP16_INIT_SCALE]
                [--fp16-scale-window FP16_SCALE_WINDOW]
                [--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
                [--min-loss-scale D]
                [--threshold-loss-scale THRESHOLD_LOSS_SCALE]
                [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ]
                [--all-gather-list-size ALL_GATHER_LIST_SIZE]
                [--criterion {label_smoothed_cross_entropy,sentence_ranking,legacy_masked_lm_loss,composite_loss,label_smoothed_cross_entropy_with_alignment,adaptive_loss,adaptive_cross_entropy,nat_loss,sentence_prediction,masked_lm,cross_entropy,binary_cross_entropy}]
                [--tokenizer {moses,nltk,space}]
                [--bpe {fastbpe,subword_nmt,bert,sentencepiece,gpt2}]
                [--optimizer {adadelta,adamax,adagrad,adafactor,sgd,lamb,nag,adam}]
                [--lr-scheduler {cosine,inverse_sqrt,linear,triangular,fixed,reduce_lr_on_plateau,polynomial_decay,tri_stage}]
                [--task TASK] [--num-workers N]
                [--skip-invalid-size-inputs-valid-test]
                [--max-tokens N] [--max-sentences N]
                [--required-batch-size-multiple N] [--dataset-impl FORMAT]
                [--train-subset SPLIT] [--valid-subset SPLIT]
                [--validate-interval N] [--fixed-validation-seed N]
                [--disable-validation] [--max-tokens-valid N]
                [--max-sentences-valid N] [--curriculum N]
                [--distributed-world-size N]
                [--distributed-rank DISTRIBUTED_RANK]
                [--distributed-backend DISTRIBUTED_BACKEND]
                [--distributed-init-method DISTRIBUTED_INIT_METHOD]
                [--distributed-port DISTRIBUTED_PORT]
                [--device-id DEVICE_ID] [--distributed-no-spawn]
                [--ddp-backend {c10d,no_c10d}] [--bucket-cap-mb MB]
                [--fix-batches-to-gpus] [--find-unused-parameters]
                [--fast-stat-sync] [--broadcast-buffers] [--arch ARCH]
                [--max-epoch N] [--max-update N] [--clip-norm NORM]
                [--sentence-avg] [--update-freq N1,N2,...,N_K]
                [--lr LR_1,LR_2,...,LR_N] [--min-lr LR] [--use-bmuf]
                [--save-dir DIR] [--restore-file RESTORE_FILE]
                [--reset-dataloader] [--reset-lr-scheduler]
                [--reset-meters] [--reset-optimizer]
                [--optimizer-overrides DICT] [--save-interval N]
                [--save-interval-updates N] [--keep-interval-updates N]
                [--keep-last-epochs N] [--keep-best-checkpoints N]
                [--no-save] [--no-epoch-checkpoints]
                [--no-last-checkpoints] [--no-save-optimizer-state]
                [--best-checkpoint-metric BEST_CHECKPOINT_METRIC]
                [--maximize-best-checkpoint-metric] [--patience N]
                [--adaptive-input] [--adaptive-softmax-cutoff EXPR]
                [--adaptive-softmax-dropout D]
                [--adaptive-softmax-factor N] [--tie-adaptive-weights]
                [--tie-adaptive-proj]
                [--delight-emb-map-dim DELIGHT_EMB_MAP_DIM]
                [--delight-emb-out-dim DELIGHT_EMB_OUT_DIM]
                [--delight-emb-width-mult DELIGHT_EMB_WIDTH_MULT]
                [--delight-emb-max-groups DELIGHT_EMB_MAX_GROUPS]
                [--delight-emb-dropout DELIGHT_EMB_DROPOUT]
                [--delight-emb-depth DELIGHT_EMB_DEPTH]
                [--delight-enc-scaling {block,uniform}]
                [--delight-enc-layers DELIGHT_ENC_LAYERS]
                [--delight-enc-min-depth DELIGHT_ENC_MIN_DEPTH]
                [--delight-enc-max-depth DELIGHT_ENC_MAX_DEPTH]
                [--delight-enc-width-mult DELIGHT_ENC_WIDTH_MULT]
                [--delight-enc-ffn-red DELIGHT_ENC_FFN_RED]
                [--delight-enc-max-groups DELIGHT_ENC_MAX_GROUPS]
                [--delight-dec-scaling {block,uniform}]
                [--delight-dec-layers DELIGHT_DEC_LAYERS]
                [--delight-dec-min-depth DELIGHT_DEC_MIN_DEPTH]
                [--delight-dec-max-depth DELIGHT_DEC_MAX_DEPTH]
                [--delight-dec-width-mult DELIGHT_DEC_WIDTH_MULT]
                [--delight-dec-ffn-red DELIGHT_DEC_FFN_RED]
                [--delight-dec-max-groups DELIGHT_DEC_MAX_GROUPS]
                [--no-glt-shuffle] [--define-iclr] [--norm-type NORM_TYPE]
                [--act-type ACT_TYPE] [--delight-dropout DELIGHT_DROPOUT]
                [--ffn-dropout FFN_DROPOUT] [--print-stats]
                [--src-len-ps SRC_LEN_PS] [--tgt-len-ps TGT_LEN_PS]
                [--dropout D] [--attention-dropout D] [--pe-dropout D]
                [--activation-dropout D] [--encoder-normalize-before]
                [--decoder-normalize-before]
                [--share-decoder-input-output-embed]
                [--share-all-embeddings] [--decoder-learned-pos]
                [--encoder-learned-pos] [--no-token-positional-embeddings]
                [--no-scale-embedding] [--label-smoothing D]
                [--adam-betas B] [--adam-eps D] [--weight-decay WD]
                [--use-old-adam] [--warmup-updates N]
                [--warmup-init-lr LR] [-s SRC] [-t TARGET]
                [--load-alignments] [--left-pad-source BOOL]
                [--left-pad-target BOOL] [--max-source-positions N]
                [--max-target-positions N]
                [--upsample-primary UPSAMPLE_PRIMARY] [--truncate-source]
                [--eval-bleu] [--eval-bleu-detok EVAL_BLEU_DETOK]
                [--eval-bleu-detok-args JSON] [--eval-tokenized-bleu]
                [--eval-bleu-remove-bpe [EVAL_BLEU_REMOVE_BPE]]
                [--eval-bleu-args JSON] [--eval-bleu-print-samples]
                data
train.py: error: unrecognized arguments: --t-mult 1
Thank you.

It looks like GitHub distorted the formatting of the log, so I have also pasted it here:

https://paste.ofcode.org/fkBdqtjQdEFr6QeymGY49F

Please take a look if you need it.
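
For anyone reproducing this, here is a minimal sketch of how argparse produces this kind of error and one possible workaround. It is not the repository's code: the toy parser only stands in for train.py, and `strip_flag` is a hypothetical helper that drops `--t-mult` and its value from the argument list before parsing. In upstream fairseq, as far as I know, `--t-mult` is registered by the cosine scheduler rather than the linear one; whether that explains the failure here is an assumption, not something confirmed by the maintainers.

```python
import argparse


def strip_flag(argv, flag, n_values=1):
    """Return a copy of argv without `flag` and the n_values tokens after it.

    Hypothetical helper for illustration; not part of train.py.
    """
    out = []
    i = 0
    while i < len(argv):
        if argv[i] == flag:
            i += 1 + n_values  # skip the flag and its value(s)
            continue
        out.append(argv[i])
        i += 1
    return out


# Toy parser standing in for train.py: it registers --lr-scheduler and
# --min-lr, but (like the parser in the log above) not --t-mult.
parser = argparse.ArgumentParser(prog="train.py")
parser.add_argument("--lr-scheduler", default="linear")
parser.add_argument("--min-lr", type=float, default=1e-9)

argv = ["--lr-scheduler", "linear", "--min-lr", "1e-9", "--t-mult", "1"]

# parse_known_args() collects unregistered arguments instead of exiting,
# which makes it easy to see what parse_args() would reject.
known, unknown = parser.parse_known_args(argv)
print(unknown)  # ['--t-mult', '1']

# With the unregistered flag removed, strict parsing succeeds.
cleaned = strip_flag(argv, "--t-mult")
args = parser.parse_args(cleaned)
print(args.lr_scheduler, args.min_lr)
```

Applied to the real command, this would mean removing `--t-mult 1` from the training command that nmt_wmt16_en2ro.py builds. Whether dropping the flag changes the intended learning-rate schedule is something only the author can confirm.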
