huosan0123/seq2seq

My implementation of a seq2seq model using TensorFlow


Implementation of seq2seq using TensorFlow 1.4.0 and Python 3.6.2

I wrote an attention-based seq2seq model for neural machine translation. It can run on multiple GPUs (a single PC with multiple GPUs).
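Single-machine multi-GPU training is usually data-parallel: each batch is split into per-GPU shards, each GPU computes gradients on its shard, and the gradients are averaged before the update. A minimal sketch of those two pieces in plain Python (the function names are illustrative, not the repo's actual API):

```python
def shard_batch(batch, num_gpus):
    """Split one batch into num_gpus roughly equal shards (data parallelism).

    The last shard may be smaller when the batch size is not divisible.
    """
    shard = (len(batch) + num_gpus - 1) // num_gpus  # ceil division
    return [batch[i * shard:(i + 1) * shard] for i in range(num_gpus)]


def average_gradients(per_gpu_grads):
    """Average per-variable gradients computed on each GPU tower.

    per_gpu_grads is a list (one entry per GPU) of gradient lists.
    """
    return [sum(gs) / len(gs) for gs in zip(*per_gpu_grads)]
```

This is the same idea as the `average_gradients` helper in the CIFAR-10 multi-GPU tutorial listed in the references.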

1. Data

My data was downloaded from nlp.stanford.edu/projects/nmt/. The model was trained on the small English-Vietnamese dataset. Pickled data is available if you want it.

2. Script explanation

  1. build_dict.py preprocesses the dataset: it converts strings to int32 ids, filters sentences by length (3~50), and saves the result as pickle files. Mind your file paths.
  2. config.py holds the model parameters.
  3. model_topbah.py applies Bahdanau attention on the top layers of the decoder and encoder.
  4. train_vi.py is the entry point for training. Set your own parameters at the beginning of this file. At line 37, set the GPU ids like gpus = "5,6,7" (no spaces in the string).
  5. gpuloader.py is the data loader for multi-GPU training.
  6. dataloader.py is a data loader that feeds data into tf.placeholder. Not used.
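The preprocessing described for build_dict.py (strings to int ids, length filter 3~50, pickled output) can be sketched roughly like this; the vocabulary layout and special symbols are assumptions, not the repo's exact format:

```python
import pickle

MIN_LEN, MAX_LEN = 3, 50  # length filter mentioned for build_dict.py


def build_vocab(sentences, specials=("<pad>", "<unk>", "<s>", "</s>")):
    """Assign an integer id to every token, reserving low ids for specials.

    The special-symbol set here is a common convention, assumed for the sketch.
    """
    vocab = {tok: i for i, tok in enumerate(specials)}
    for sent in sentences:
        for tok in sent.split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(sentences, vocab):
    """Map tokens to ids, keeping only sentences whose length is 3~50."""
    unk = vocab["<unk>"]
    out = []
    for sent in sentences:
        ids = [vocab.get(tok, unk) for tok in sent.split()]
        if MIN_LEN <= len(ids) <= MAX_LEN:
            out.append(ids)
    return out


def save_pickle(obj, path):
    """Dump the preprocessed data so training can load it without re-tokenizing."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)
```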

3. Some tips

  1. You can try Luong attention as well, but I did not get good results with it; it is not so easy to train. RMSProp and Adam need a small learning rate (around 0.001), while SGD needs a larger one (around 1.0), and SGD is much harder to train.
  2. The best result used output att=False with RMSProp (lr=0.001, start_decay=8000, 0.8) and reached BLEU=20.5% on tst2012.vi without beam search.
  3. The decode phase is not tested.
  4. I spent a lot of time writing the multi-GPU training: how to feed in the data, and how to compute the loss and gradients.
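The triple (lr=0.001, start_decay=8000, 0.8) reads like a staircase learning-rate schedule. One plausible interpretation, assuming the rate is held at lr until step 8000 and then multiplied by 0.8 every further 8000 steps (the actual rule lives in config.py, so this is a guess at its shape):

```python
def decayed_lr(step, base_lr=0.001, start_decay=8000,
               decay_factor=0.8, decay_every=8000):
    """Staircase decay: keep base_lr until start_decay, then multiply by
    decay_factor once per decay_every steps.

    Hypothetical reading of the (lr, start_decay, factor) triple from the
    README; not taken from the repo's config.py.
    """
    if step < start_decay:
        return base_lr
    n = (step - start_decay) // decay_every + 1
    return base_lr * (decay_factor ** n)
```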

4. To be continued

My code, especially model.py and train.py, is not well organized; it may be updated if I have spare time.

I may also add comments.

I want to use the tf.data.Dataset API, but I am not sure how to set validation_per_train_step.
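Independently of which input pipeline feeds the model, the validation_per_train_step idea is just an interleaved loop: run a validation pass every N training steps. A framework-free sketch of that control flow (train_step/eval_step stand in for the real session calls):

```python
def train_loop(num_steps, validate_every):
    """Run num_steps training steps, inserting a validation pass every
    validate_every steps.

    Returns a log of ("train", step) / ("eval", step) events so the
    interleaving is easy to inspect; in real code the two branches would
    call the training op and the dev-set evaluation instead.
    """
    log = []
    for step in range(1, num_steps + 1):
        log.append(("train", step))        # train_step()
        if step % validate_every == 0:
            log.append(("eval", step))     # eval_step() on the dev set
    return log
```

With tf.data this usually means keeping separate train and validation datasets and switching between them at these points (e.g. via reinitializable iterators), but the loop structure is the same.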

5. References

  1. https://github.com/JayParks/tf-seq2seq/blob/master/seq2seq_model.py (good for beginners)
  2. https://github.com/tensorflow/nmt/tree/master/nmt
  3. https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10_multi_gpu_train.py
  4. Effective Approaches to Attention-based Neural Machine Translation (Luong et al., 2015)
  5. Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al., 2014)
