Better support for large data #88

Merged
merged 17 commits into from
Aug 3, 2023

jiangyi15
Owner

Use tf.data.Dataset to handle large data. It is enabled with the following config option:

data:
    lazy_call: True
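
As an illustration of the idea behind this option, here is a minimal sketch of streaming a large array through tf.data.Dataset in fixed-size batches, so that only one batch is processed at a time. This is not tf_pwa's actual loading code: the function name `batched_sum` and the field name `m` are hypothetical and chosen only for the example.

```python
import numpy as np
import tensorflow as tf


def batched_sum(arrays, batch_size):
    """Sum the hypothetical field "m" batch by batch (illustrative only)."""
    # Slice the dict of arrays along the first axis, then group into batches.
    ds = tf.data.Dataset.from_tensor_slices(arrays).batch(batch_size)
    total = tf.constant(0.0, dtype=tf.float64)
    n_batches = 0
    for batch in ds:  # each batch is a dict of tensors
        total += tf.reduce_sum(batch["m"])
        n_batches += 1
    return float(total), n_batches


# Toy input standing in for a large data sample.
total, n_batches = batched_sum({"m": np.linspace(0.0, 1.0, 10)}, batch_size=4)
```

With 10 entries and a batch size of 4, the loop sees batches of 4, 4, and 2 entries; the last batch is simply smaller.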

@codecov

codecov bot commented Jul 30, 2023

Codecov Report

Merging #88 (cc8ba51) into dev (f3e7662) will increase coverage by 0.21%.
The diff coverage is 76.64%.

@@            Coverage Diff             @@
##              dev      #88      +/-   ##
==========================================
+ Coverage   73.83%   74.05%   +0.21%     
==========================================
  Files         104      104              
  Lines       14640    14768     +128     
  Branches     2717     2746      +29     
==========================================
+ Hits        10810    10937     +127     
+ Misses       3177     3163      -14     
- Partials      653      668      +15     
Flag Coverage Δ
unittests 74.05% <76.64%> (+0.21%) ⬆️

Flags with carried forward coverage won't be shown.

Files Changed Coverage Δ
tf_pwa/data_trans/helicity_angle.py 93.65% <ø> (ø)
tf_pwa/cal_angle.py 76.38% <20.00%> (+1.38%) ⬆️
tf_pwa/config_loader/multi_config.py 62.16% <33.33%> (-1.11%) ⬇️
tf_pwa/fit.py 22.58% <50.00%> (+0.35%) ⬆️
tf_pwa/config_loader/plot.py 69.10% <64.51%> (-0.23%) ⬇️
tf_pwa/amp/core.py 74.43% <66.66%> (-0.27%) ⬇️
tf_pwa/config_loader/data.py 61.66% <70.83%> (+1.12%) ⬆️
tf_pwa/utils.py 66.18% <77.77%> (+0.52%) ⬆️
tf_pwa/model/model.py 50.44% <81.25%> (+0.82%) ⬆️
tf_pwa/data.py 75.88% <83.11%> (+0.04%) ⬆️
... and 4 more

... and 1 file with indirect coverage changes

@jiangyi15
Owner Author

Currently, the best set of options (26) for a large dataset is:

data:
    lazy_call: True
    use_tf_function: True
    no_id_cached: True
    jit_compile: True
    cached_lazy_call: cached_data/

With batch=100000, one step (a single nll_grad() call) on 1M data events + 10M PHSP MC events takes about 10 s.
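
For context, batching a gradient evaluation usually means accumulating the per-batch value and gradient. Below is a minimal sketch of that pattern on a toy problem. It assumes an NLL that is a plain sum over events, which ignores the normalization term of a real amplitude fit. The names `nll_grad_batched` and `toy_nll` are hypothetical and are not tf_pwa's nll_grad() implementation.

```python
import tensorflow as tf


def nll_grad_batched(nll_fn, variables, dataset):
    """Accumulate NLL and its gradient over batches (illustrative only)."""
    total_nll = tf.constant(0.0, dtype=tf.float64)
    total_grads = [tf.zeros_like(v) for v in variables]
    for batch in dataset:
        with tf.GradientTape() as tape:
            nll = nll_fn(batch)  # NLL contribution of this batch only
        grads = tape.gradient(nll, variables)
        total_nll += nll
        total_grads = [acc + g for acc, g in zip(total_grads, grads)]
    return total_nll, total_grads


# Toy model: unit-width Gaussian with a single fit parameter mu.
mu = tf.Variable(0.0, dtype=tf.float64)


def toy_nll(x):
    return tf.reduce_sum(0.5 * (x - mu) ** 2)


x = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0], dtype=tf.float64)
dataset = tf.data.Dataset.from_tensor_slices(x).batch(2)
nll, (grad_mu,) = nll_grad_batched(toy_nll, [mu], dataset)
```

Because the toy NLL is additive over events, the accumulated result equals the full-sample value and gradient while only one batch is held in the gradient tape at a time.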

[Screenshot: Weights & Biases, 2023-07-31 13:48:08]

@jiangyi15 jiangyi15 merged commit 377a000 into dev Aug 3, 2023
12 checks passed