-
Notifications
You must be signed in to change notification settings - Fork 114
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
4 changed files
with
128 additions
and
9 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
TBD |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
# Topic | ||
|
||
Syncing about the next steps for CM, CM4MLOps, CM4MLPerf, CM4ABTF, etc. | ||
|
||
# People | ||
|
||
* Grigori Fursin | ||
* Arjun Suresh | ||
|
||
# Discussion | ||
|
||
## Need proper attribution | ||
|
||
Need to agree on the common text in the CM documentation and CM, CM4MLOps and CM4MLPerf GitHub | ||
using these examples: | ||
|
||
* https://github.com/spack/spack?tab=readme-ov-file#authors | ||
* https://cython.org (see Financial contributions section) | ||
|
||
* Author/creator | ||
* Core developers | ||
* Contributors (from the community, MLCommons and the Automation and Reproducibility TaskForce): | ||
See https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md . | ||
* Sponsorship & financial Contributions | ||
|
||
Should add to main GitHub and docs.mlcommons.org ... | ||
|
||
## Remove/reduce dependencies on non-MLCommons GitHub repositories | ||
|
||
At this moment, various non-MLCommons GATEOverflow GitHub repositories are used | ||
in the official MLPerf workflows by default - that creates many possible legal issues | ||
for CM and MLPerf users. | ||
|
||
We should either move all such repositories to MLCommons | ||
or, if it's not easily possible, create another neutral GitHub ID | ||
such as mlcommons-aux with a clear governances and agreement | ||
with MLCommons to keep all dependencies in the MLCommons space. | ||
|
||
## Improve cm4mlops package | ||
|
||
Current cm4mlops package hides extra installation of various system dependencies | ||
and CM repositories while using non-default branches and is difficult to debug if something goes wrong. | ||
|
||
A most standard way is to install cmind package and have a function to bootstrap cm4mlops | ||
with a proper control over the flow, CM repositories and branches that can be changed via flags. | ||
|
||
For example: | ||
```bash | ||
pip install cmind | ||
cm bootstrap cm4mlperf | ||
cm bootstrap cm4mlops --branch=mlperf-inference | ||
... | ||
``` | ||
|
||
That will perform the same functions as cm4mlops package but will be easy to debug and will have easy to trace errors | ||
that can be used in GitHub actions or other CI | ||
|
||
To be brainstormed further ... | ||
|
||
## Coordinate further developments | ||
|
||
* Document the roadmap and responsibilities for Q3-Q4 2024 | ||
* Regular dev meetings (once a week or every two weeks)? | ||
* Resume Discord channel discussions or mailing list (to be able to track discussions)? | ||
|
||
## CM4MLPerf inference v4.1 & v5.0 automation | ||
|
||
* Add CM for as many v4.1 submissions as possible to make it easier for everyone to reproduce results shortly after publication of results. | ||
* Sync on the plans for inf v5.0 with MLCommons | ||
|
||
## CM4ABTF automation | ||
|
||
* Sync on the next steps during next meetings | ||
|
||
## Collaboration with Croissant | ||
|
||
* Sync on the next steps during next meetings | ||
|
||
## Testing infrastructure for CM4MLOps and CM4MLPerf | ||
|
||
* GitHub actions are not enough to test all dependencies and their versions for diverse hardware for CM-MLPerf workflows. | ||
Brainstorm infrastructure for continuous testing (Grigori started prototyping some infrastructure). | ||
|
||
## Optimize MLPerf inference reference implementations | ||
|
||
* We need to add known optimizations to the MLPerf inference implementations | ||
|
||
## Support MLPerf training | ||
|
||
* We should start prototyping the unified CM interface and automation for MLPerf training and wrap existing MLCube tasks | ||
|
||
## Prepare tutorials | ||
|
||
* Sync on the tutorial about CM internals and scripts | ||
* Sync on the tutorial for SCC'24 | ||
|
||
## Common paper | ||
|
||
* Start preparing a common paper about CM on GitHub | ||
|
||
## Collect feedback from companies | ||
|
||
* There were various discussions with MLCommons companies about using CM for reproducibility. | ||
We need to collect and aggregate all the feedback in one place. | ||
|
||
## Next generation of CM | ||
|
||
Grigori started testing some ideas and prototyping the next generation of CM, CM4MLOps and CM4MLPerf | ||
bsaed on 3 years of using CM to modularize and automate MLPerf and will share notes in the future dev meetings. | ||
|
||
## Sync with MLCommons | ||
|
||
* Prepare official CM page - should we do it with the MLPerf in v4.1 release? | ||
* Prepare Press-release about CM with MLPerf inf v4.1 release? | ||
* Where to host CM developments and discussions within MLCommons? | ||
* Infra WG? | ||
* Create a new *official* taskforce or WG on automation and reproducibility? | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
TBD |