- Rewrite the HDF5 model and format for parallel processing on distributed systems.
- Optimize the HDF5 library and tools for parallel processing on distributed systems.
- Improve security and reliability for parallel processing on distributed systems.
The existing Parallel HDF5 library, which is built on MPI-IO, has several problems:
- Can't build.
- Can't test.
- Can't scale.
PnetCDF can't create NetCDF-4/HDF5 files, only NetCDF-3. You need a separate NetCDF-3 to NetCDF-4 conversion tool.
Parquet is great for distributed systems, but you need pandas to convert Parquet to HDF5.
Hide MPI/Dask/Spark calls behind a single backend-neutral API:
#include <h5p.h>

/* Select the backend once; replace "mpi" with "dask" or "spark". */
h5p_use("mpi");

/* Write a dataset to /g/d. */
H5P_FILE* fp = h5p_open("s3://test.h5p", "w");
h5p_write(fp, "/g/d", data);
h5p_close(fp);

/* Read the dataset back; fp is reused rather than redeclared. */
fp = h5p_open("s3://test.h5p", "r");
data = h5p_read(fp, "/g/d");
h5p_close(fp);
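
One way the backend hiding above might work is a small dispatch table keyed by the name passed to `h5p_use`. The sketch below is only an illustration under stated assumptions: the `h5p_backend` struct, the stub functions, the `int` return value of `h5p_use`, and the helper `h5p_write_via` are all hypothetical and not part of any existing library. The stubs stand in for real MPI-IO, Dask, or Spark code paths and only report which engine was selected.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical backend descriptor: one entry per execution engine. */
typedef struct {
    const char *name;                        /* key accepted by h5p_use() */
    const char *(*write)(const char *path);  /* stub: reports which engine ran */
} h5p_backend;

/* Stubs standing in for real MPI-IO, Dask, and Spark code paths. */
static const char *write_mpi(const char *path)   { (void)path; return "mpi"; }
static const char *write_dask(const char *path)  { (void)path; return "dask"; }
static const char *write_spark(const char *path) { (void)path; return "spark"; }

static const h5p_backend backends[] = {
    { "mpi",   write_mpi   },
    { "dask",  write_dask  },
    { "spark", write_spark },
};

/* Currently selected backend; NULL until h5p_use() succeeds. */
static const h5p_backend *active = NULL;

/* Select a backend by name. Returns 0 on success, -1 if the name is unknown
 * (in which case the previous selection is left unchanged). */
int h5p_use(const char *name)
{
    for (size_t i = 0; i < sizeof backends / sizeof backends[0]; i++) {
        if (strcmp(backends[i].name, name) == 0) {
            active = &backends[i];
            return 0;
        }
    }
    return -1;
}

/* Forward a write to the active backend; returns NULL if none is selected. */
const char *h5p_write_via(const char *path)
{
    return active ? active->write(path) : NULL;
}
```

With this layout, caller code never names an engine-specific function: after `h5p_use("spark")` succeeds, `h5p_write_via("/g/d")` is routed to the Spark stub, and adding a new engine means adding one row to the table.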
- bin/h.bat: test script for Intel oneAPI on Windows
- bin/d.bat: debugging script for Intel oneAPI on Windows