Contraction f16, bf16, f32_f16, f32_bf16, f64_f32 #158
Commits on Dec 8, 2023
Add f16 and bf16 support to contraction
- Support _Float16
- Support hip_bfloat16
- Add unit tests for _Float16 and hip_bfloat16
- Add samples for _Float16 and hip_bfloat16
Commit: c5fbcec
Add f32_f16, f32_bf16, and f64_f32 support to contraction
- Support ABCD data type f32 with compute type f16 or bf16
- Support ABCD data type f64 with compute type f32
- Fix bug: alpha and beta were passed with the wrong data type in the contraction unit test
- Create a sample template for contraction
Commit: 185a2ab
Add placeholder for solution unique_id
The solution unique_ids for Actor Critic are not ready yet, so placeholders were added to the new Actor Critic to let the unit tests pass.
Commit: ab8d557
Update contraction device instances
Update contraction device instances since CK has updated them.
Commit: df27e32
1. Initialize the data with 0.01, 0.02, ... by default
2. Print C
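A minimal sketch of the default initialization described above. It assumes element i receives (i + 1) * 0.01; the helper name `fillSequential` is hypothetical and is not taken from the hipTensor samples.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: fills a buffer with 0.01, 0.02, 0.03, ...
// Assumption: element i receives (i + 1) * 0.01, converted to T.
template <typename T>
std::vector<T> fillSequential(std::size_t count)
{
    std::vector<T> data(count);
    for(std::size_t i = 0; i < count; ++i)
    {
        // Compute in double first so that narrow types only round once.
        data[i] = static_cast<T>(static_cast<double>(i + 1) * 0.01);
    }
    return data;
}
```

Small, distinct values like these make a printed C easier to check by eye than random data, which is presumably why the two changes landed together.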
Commit: f85df83
Set CK contraction instances to run only once
When the logger level is set to HIPTENSOR_LOG_LEVEL_PERF_TRACE, CK instances measure their running time. The problem is that CK internally runs the contraction 10 times by default. This leads to an issue:
1. It returns the wrong result for C = alpha A x B + beta C
Setting StreamConfig.nrepeat_ = 1 makes the contraction run only once.
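The effect can be reproduced on the CPU with a small sketch. It assumes the repeated launches all read and write the same C buffer, which is what the commit message implies; the functions `bilinearInPlace` and `runRepeated` are illustrative stand-ins, not CK or hipTensor code.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for one in-place kernel launch: C = alpha * AB + beta * C.
// `ab` holds the already-contracted product A x B (illustrative only).
void bilinearInPlace(std::vector<double>& c,
                     const std::vector<double>& ab,
                     double alpha,
                     double beta)
{
    for(std::size_t i = 0; i < c.size(); ++i)
    {
        c[i] = alpha * ab[i] + beta * c[i];
    }
}

// Mimics a timing loop that launches the same kernel `nrepeat` times
// on the same output buffer and returns the final contents of C.
std::vector<double> runRepeated(std::vector<double> c,
                                const std::vector<double>& ab,
                                double alpha,
                                double beta,
                                int nrepeat)
{
    for(int r = 0; r < nrepeat; ++r)
    {
        bilinearInPlace(c, ab, alpha, beta);
    }
    return c;
}
```

With C = 4, A x B = 1, alpha = 1, and beta = 0.5, one run gives 3, but a second run feeds that 3 back in as the input C and gives 2.5. When beta is 0 the input C is ignored, so repeated runs are harmless in that case.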
Commit: 5c45a8c
1. ck::bhalf_t cannot be cast to float or double with static_cast. Use ck::type_convert() instead.
2. epsilon() is not a good threshold for measuring the relative difference of data. It is too small for double (eps < 10e-13).
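To illustrate why a plain cast is not enough for bfloat16, here is a sketch that assumes the value is held as its raw 16-bit pattern (an assumption about the storage, which the commit message does not state). Under that assumption a static_cast would read the bits as an integer, so the conversion has to reinterpret them instead. The helper names are hypothetical, and this is not the implementation of ck::type_convert().

```cpp
#include <cstdint>
#include <cstring>

// bfloat16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa
// bits of an IEEE-754 binary32 float, so widening is a 16-bit left shift.
float bf16BitsToFloat(std::uint16_t bits)
{
    std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
    float result;
    std::memcpy(&result, &widened, sizeof(result)); // bit-exact reinterpretation
    return result;
}

// Truncating conversion (no rounding), kept simple for illustration.
std::uint16_t floatToBf16Bits(float value)
{
    std::uint32_t raw;
    std::memcpy(&raw, &value, sizeof(raw));
    return static_cast<std::uint16_t>(raw >> 16);
}
```

For example, the pattern 0x3F80 represents 1.0, whereas casting the integer 0x3F80 directly to float would give 16256.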
Commit: f631818
Commit: e5cefe7
Rename contraction sample files
The naming pattern for contraction sample files is:
- bilinear: simple_bilinear_contraction_<A>_<B>_<C>_<D>_compute_<compute>.cpp
- scale: simple_scale_contraction_<A>_<B>_<C>_compute_<compute>.cpp
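The naming pattern can be expressed as two small helpers. This is only a sketch: the function names are hypothetical, and the type tags passed in (such as "f32" or "bf16") are assumed to match the short names used in the pull request title.

```cpp
#include <string>

// Hypothetical helper: builds a bilinear sample file name from type tags.
std::string bilinearSampleName(const std::string& a,
                               const std::string& b,
                               const std::string& c,
                               const std::string& d,
                               const std::string& compute)
{
    return "simple_bilinear_contraction_" + a + "_" + b + "_" + c + "_" + d
           + "_compute_" + compute + ".cpp";
}

// Hypothetical helper: builds a scale sample file name from type tags.
// The scale pattern has no separate <D> tag.
std::string scaleSampleName(const std::string& a,
                            const std::string& b,
                            const std::string& c,
                            const std::string& compute)
{
    return "simple_scale_contraction_" + a + "_" + b + "_" + c
           + "_compute_" + compute + ".cpp";
}
```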
Commit: 4345a1c
Improve CPU reference accuracy
The relative difference between the contraction result and the CPU reference is less than 0.1% after the improvement.
Commit: 43f33ee
Commit: fec9065
1. Revert the default threshold of the relative difference to (100 * std::numeric_limits<T>::epsilon())
2. Update the CPU reference so that the difference between the CPU reference and the output of the contraction instance is less than (100 * std::numeric_limits<T>::epsilon()).
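A sketch of a comparison using this threshold follows. The definition of relative difference used here (absolute difference divided by the larger magnitude) is an assumption, since the commit message does not spell it out, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Relative difference between a result and its reference value.
// Assumption: normalized by the larger magnitude; two zeros compare as equal.
template <typename T>
double relativeDifference(T result, T reference)
{
    const double r     = static_cast<double>(result);
    const double ref   = static_cast<double>(reference);
    const double denom = std::max(std::fabs(r), std::fabs(ref));
    if(denom == 0.0)
    {
        return 0.0;
    }
    return std::fabs(r - ref) / denom;
}

// True when the relative difference is below 100 * epsilon of type T.
template <typename T>
bool withinTolerance(T result, T reference)
{
    const double threshold
        = 100.0 * static_cast<double>(std::numeric_limits<T>::epsilon());
    return relativeDifference(result, reference) < threshold;
}
```

Because the threshold scales with the epsilon of T, the same deviation can pass for float and fail for double: a relative error of about 1e-6 is within 100 * epsilon for float (about 1.2e-5) but far outside it for double (about 2.2e-14).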
Commit: b21fe0b