
Idea: new PyPI classifiers and packaging everything up with pip as a standard way of sharing architectures #217

SamuelMarks opened this issue Apr 18, 2022 · 5 comments


SamuelMarks commented Apr 18, 2022

What is your opinion on this idea, which I originally posted almost 3 years ago? keras-team/keras#15762

There are a huge number of new statistical, machine-learning and artificial intelligence solutions being released every month.

Most are open-source and written in a popular Python framework like TensorFlow, JAX, or PyTorch.

In order to 'guarantee' you are using the best solution [for given metric(s)] for your dataset, some way of automatically adding these new statistical, machine-learning, and artificial intelligence solutions to your automated pipeline needs to be created.

(additionally: useful for testing your new optimiser, loss function, &etc. across a zoo of datasets)

Ditto for transfer learning models. A related problem is automatically putting ensemble networks together. Something like:

import keras

import some_broke_arch  # pip install some_broke_arch
import other_neat_arch  # pip install other_neat_arch
import horrible_v_arch  # builtin to keras

# Each package exposes its component through a standard, agreed-upon interface:
model   = some_broke_arch.get_arch(   **standard_arch_params  )
metrics = other_neat_arch.get_metrics(**standard_metric_params)
loss    = horrible_v_arch.get_loss(   **standard_loss_params  )

model.compile(loss=loss, optimizer=keras.optimizers.RMSprop(), metrics=metrics)
model.summary()
# &etc.

In summary, I am petitioning for standard ways of:

0. exposing algorithms for consumption (see the interface sketch after this list);

1. combining algorithms;

2. comparing algorithms.

To that end, I would recommend encouraging the PyPI folk to add a few new classifiers, and that a bunch of us trawl through GitHub every month sending PRs to random repositories (those associated with academic papers), linking them up with CI/CD so that they become installable with pip install and searchable by classifier on PyPI.

Related, my open-source multi-ML meta-framework:

  • uses builtin ast and inspect modules to traverse the module, class, and function hierarchy for 10 popular open-source ML/AI frameworks;

  • will enable experimentation with entire 'search-space' of all these ML frameworks (every transfer learning model, optimiser, loss function, &etc.)

[…] and, with a standard way of sharing architectures, will be able to expand the 'search-space' with community-contributed solutions.


IMHO there are a number of advantages to using existing approaches to finding and installing components of machine-learning models (and ensemble-able models).

Would appreciate your perspective (@bhack referenced your project)

@arjunsuresh

Thank you @SamuelMarks for your idea. It aligns with what we would like to achieve with CM (CK2).

gfursin commented Apr 20, 2022

Hi @SamuelMarks. Thank you for your notes - very interesting and indeed related to our project, as mentioned by @arjunsuresh! We plan to have a prototype of a portable ML pipeline using our new CK2 (CM) framework within a few weeks. Would you be interested in checking it out and discussing your ideas at some point? We would be glad to get your feedback! Thanks!

@SamuelMarks

Great to hear.

Sure thing, just @ tag me when ready.

PS: At some point I'll also finish my own multi-ML meta-framework (I've been building it with the aforementioned ast module, in cdd-python), which should also benefit greatly from a deployment of this [meta] architecture. When it's ready I'll probably release it under CC0.

gfursin commented Sep 19, 2022

Hi again @SamuelMarks.
We have released the next generation of the CK framework (CM), and we are now creating a new open workgroup in MLCommons to simplify the MLPerf inference benchmark and make it easier to plug in any real-world model, dataset, framework, compiler, and hardware. Please feel free to join us at https://github.com/mlcommons/ck/blob/master/docs/mlperf-education-workgroup.md - I think your experience is very relevant and your feedback would be much appreciated!
Thanks!

@SamuelMarks

@gfursin Great, I replied to another thing you tagged me in. I'll try to make one of your meetings to discuss further. My Python compiler library (which I'm using to generate my multi-ML meta-framework and to contribute strong types to major frameworks, including TensorFlow) is about to gain some new features and fixes for old whitespace-related bugs. Watch this space!

In terms of the subject of this thread, what do you think about the PyPI-centric solution? Should we start a mailing-list thread or something with them? Petition Google to ask them for the new classifiers?

I think my multi-ML meta-framework needs to finish its Proof-of-Concept phase before proceeding. Unless you have other ideas?
