Skip to content

yflyzhang/bigData

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

bigData

big data course project

  1. Please download the database and models from our google drive, link below

https://drive.google.com/drive/folders/0ByehYI2Lxf1nQWZTQ05LbzV2SGM

1.) database/reviews.txt 2.) database/train_data.json 3.) doc2vec/Archive.zip

  1. Put 1.) and 2.) in database folder and unzip 3.) into doc2vec folder where you downloaded from github repository.

  2. Run tf_train.py (You can cahange the "learning_rate", "loops", "mode". Also you can change comparison sets by tweeking "CLASSES = [[1], [5]]" in myconstants.py)

  3. You don't need to run preprocess.py any more.

Note: Please Keep the folder structure the same as below, otherwise you might get unexpected Errors(Read this structure in your text editor)

--Folder structure:

bigData myconstants.py preprocess.py tf_train.py README.md LICENSE .gitignore --database train_data.json validate_data.json reviews.txt --test reviews_Electronics_5.json train_data.json validate_data.json reviews.txt --matrix b1_1234_5.npy w1_1234_5.npy ...... --ml init.py functions.py tf_training.py w2v.py d2v.py --word2vec w2v_model --test reviews.txt w2v_model w2v_model.txt --doc2vec d2v_model d2v_model.docvecs.doctag_syn0.npy d2v_model.syn0.npy d2v_model.syn1.npy --test d2v_model --results d2v_1234_5.txt mean_123_45.txt ......

About

big data course project

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%