Come up with a recommended way to render/store tutorials #40
Comments
I'd want something like

flowchart TD
    A[push to PR] --> B[run tests]
    A --> C[build tutorials on GH actions]
    C --> D[trigger RTD build]
|
I believe @michalk8 has something like this running for squidpy. @adamgayoso and I have also talked about this a few times. I think one point here is that some tutorials should opt out of being built. We often have tutorials involving large computations or data and realistically don't want to spend the CI resources or time on every commit running them. |
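One way the opt-out could be sketched, assuming a flag in the notebook-level metadata: the key name `skip_ci_execution` and both helper functions are hypothetical and not part of any existing template. Expensive tutorials would carry the flag and CI would only execute the rest.

```python
import json
from pathlib import Path

# Hypothetical metadata key; a tutorial sets it to true to opt out of CI builds.
SKIP_KEY = "skip_ci_execution"


def should_execute(notebook_path):
    """Return True unless the notebook's top-level metadata opts out."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    return not nb.get("metadata", {}).get(SKIP_KEY, False)


def notebooks_to_execute(root):
    """List all notebooks below `root` that did not opt out, in sorted order."""
    return sorted(
        path
        for path in Path(root).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts and should_execute(path)
    )
```

A CI job could then loop over `notebooks_to_execute("docs/tutorials")` (the path is an assumption) and leave the flagged notebooks to be rendered from a manually executed copy.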
@michalk8, can you elaborate on how the squidpy setup works? And maybe point me to a PR where the tutorials were actually built? I can only find PRs where that particular action was skipped. |
Inspired by the pre-commit CI, the following approach could work:
|
I think I'll give this approach a try in one of my repos. |
MyST-nb might be a good alternative to nbsphinx. One benefit is that it supports caching through jupyter-cache. It could be nice to restore that cache when running notebooks through GitHub Actions. |
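For reference, a minimal `conf.py` fragment for the cached execution mode might look like the sketch below. The option names are the ones used by recent MyST-NB releases (older releases used different names), and the cache path, exclude pattern, and timeout are placeholder assumptions.

```python
# Sphinx conf.py fragment (sketch): execute notebooks via MyST-NB with caching.
extensions = ["myst_nb"]

# "cache" re-executes a notebook only when its code cells have changed.
nb_execution_mode = "cache"

# Assumed location; this is the directory a CI cache step would save and restore.
nb_execution_cache_path = ".jupyter_cache"

# Placeholder pattern for tutorials that are too expensive to run on every commit.
nb_execution_excludepatterns = ["tutorials/large_*.ipynb"]

# Per-cell timeout in seconds (placeholder value).
nb_execution_timeout = 600
```

Restoring `nb_execution_cache_path` between workflow runs would be what lets unchanged notebooks skip execution on GitHub Actions.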
My idea was along the lines
|
Isaac wants submodules back in scanpy, see scverse/scanpy#2636. I think we should plan the best approach here first (or approaches: it might be that different projects have different needs).
No, unless you have some excellent arguments for it that you haven’t shared yet.
Bespoke markdown dialects are a bad bad thing. The same applies to not storing notebooks in an .ipynb format. All the advantages of having tooling go out of the window in both cases.
I’ll need some excellent arguments to be convinced that storing notebooks in any other format than .ipynb might be a good idea in some niche case. I’m very happy that I don’t do R anymore, since Rmd instead of .ipynb cost the whole ecosystem so much potential synergy.
*Y’all like MyST, I’m not going to convince you to go back on it, some “I told you so”s notwithstanding. Maybe in a few years. Maybe by then some actual extensible, widely supported Markdown dialect will have emerged and MyST will migrate to it. Then it’d actually be a quarter-step up from rST instead of a downgrade.
So your plan, but taking my veto into account:
I like it. It supports linking to specific versions of notebooks from specific versions of the documentation while making sure that CI works. |
agreed. Might need a temporary solution in scanpy that could be reverting to the submodule until we have something everyone is happy with.
The main argument for me is that
I think there are two formats that are particularly interesting and have excellent editor support. I also don't think jupytext with >6k stars is a niche tool.
But it's not the hill I am going to die on, I can live with an ipynb with stripped out outputs. The main selling point is that tutorials without output (in whatever format) can be kept in the same repo as the main documentation. |
As said there, if someone wants that, they can do it; I don’t think it has a significant enough advantage to spend effort on it personally 😄
I’m not convinced about text formats. I think my main problem with this can be boiled down to
The authors of .ipynb had valid arguments why they chose JSON instead of text, which are still relevant.
Good argument. There are multiple extensions to do it (e.g.), but of course they aren’t seamlessly integrated in GitHub. The question is how much it matters that things aren’t perfectly seamless here.
I never had problems with this, nbdime is great at this. Granted, you need to always do it locally instead of being able to click the GitHub UI button most of the time.
Still one language-specific editor’s proprietary format that hasn’t garnered tool support outside of the R world. Nope. I predicted then that Rmd would become big enough in the R world that it would prevent the scientific landscape from growing together and creating interoperability with tools that work on the same format. It made me sad. I was right. |
Happy to go with stripped ipynb if you prefer. Technically it would be easy to support both at some point, as it's probably just a jupytext call in step 1. |
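To make "stripped ipynb" concrete, here is a minimal sketch using only the standard library and assuming nbformat 4 JSON. The function name is hypothetical, and dedicated tools such as nbstripout handle more cases (e.g. cell metadata).

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict (nbformat 4) with all outputs removed."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results and plots
            cell["execution_count"] = None  # drop the In[n] counter
    return stripped


def strip_file(path):
    """Rewrite the notebook at `path` in place without outputs."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(strip_outputs(notebook), handle, indent=1, ensure_ascii=False)
        handle.write("\n")
```

Only the source cells remain in the committed file, so the repository stays small and diffs show code and text changes rather than output changes.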
@flying-sheep, maybe an even easier solution than writing a sphinx plugin is to use DVC. It's basically "git for data", similar to git lfs, but with a bunch of supported backends (e.g. S3 bucket). I tried it at work for some data analysis projects and it's pretty neat. Basically
And on readthedocs, one would just add |
What would that look like from a workflow perspective? I think the ideal workflow (without going a completely different route than GitHub) would be:
Of course, adding 1-2 simple steps like that tool wouldn’t be the end of the world, but do we need to? Or in other words, what’s missing from what we do in https://github.com/scverse/scverse-tutorials that this enables? |
If committing the notebooks to the main scanpy repo is an option, I'm all for it, but I thought Isaac didn't want that. Having this separate scanpy-tutorials repo that is integrated into the main repo as a submodule is a weird construct:
With DVC (or a sphinx-plugin as described above), you could have the tutorials versioned in the scanpy repo directly, without inflating its size. |
I’m not a fan of the submodule approach either. I think they should just be moved back into the scanpy repo. With @ivirshup you said it was decided to go that route and seemed unwilling to go back on that decision. Can you link to where that’s discussed? I’d like to see what submodules bring to the table. |
I think that wherever it is computationally feasible, tutorials should be run as a CI check, and the documentation should use the version rendered by the CI to avoid failing or outdated tutorials.
With the current template it is already possible to have tutorials built by nbsphinx on readthedocs.org, but it would be nice to have something that scales better. Also it needs to be documented somewhere how to enable this behavior.
.ipynb files when not using CI builds?)
See also #19.
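As a rough illustration of running tutorials as a CI check, the sketch below executes each notebook with the `jupyter nbconvert` command line and fails if any of them errors. The function names, output directory, and timeout are assumptions, not an existing setup.

```python
import subprocess
import sys


def nbconvert_command(notebook, output_dir, timeout=600):
    """Build the command that executes one notebook and writes the rendered copy."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        f"--ExecutePreprocessor.timeout={timeout}",
        "--output-dir", str(output_dir),
        str(notebook),
    ]


def run_tutorials(notebooks, output_dir="rendered"):
    """Execute every notebook; return the list of notebooks that failed."""
    failed = []
    for notebook in notebooks:
        result = subprocess.run(nbconvert_command(notebook, output_dir))
        if result.returncode != 0:
            failed.append(str(notebook))
    return failed


def main(notebooks):
    """Entry point for a CI step: a non-zero exit code marks the check as failed."""
    failed = run_tutorials(notebooks)
    if failed:
        print("Failed tutorials:", ", ".join(failed), file=sys.stderr)
    return 1 if failed else 0
```

The executed copies written to `output_dir` are the ones the documentation build would pick up, so the rendered tutorials always correspond to a passing run.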