2018 meeting notes
Keith and Mike talked about coding requirements, and Mike will update the strawman rules based on some comments. Also looked through the MARBL documentation for CESM 2.0 and cleaned up quite a bit of it.
Gokhan and Mariana shared timeline for CESM 2.1 -- we were worried about MARBL development stalling, but CESM 2.1 release window is pretty short (mid- or late-September). Also talked about MARBL documentation for CESM 2.0, and Mike will work with Alice B to get that up and running.
To-do:
- clean up developer's guide prior to CESM 2.0 release
- break up generic "code clean-up" issues into discrete individual tickets
- Look into TravisCI to enforce code consistency
Mike will send everyone with read access to NCAR/MARBL an email informing them of the move to public development and the marbl-ecosys/MARBL repository. Email will go out May 23, and the move to the marbl-ecosys organization will hopefully happen on May 29.
Maintaining a ChangeLog for the development tags is not necessary - developers should be able to rely on git diff to see what changed between two tags and git log to see the first tag that contains a specific commit. Moving forward, the metadata Mike was putting in the "release notes" for development tags will go in the commit message when merging a branch onto development. Release notes will only be used for tags on stable or on release branches.
The frequency of tags on development has not yet been determined. Perhaps our current "tag every pull request" method is too frequent? We will develop a rule-of-thumb to determine which pull requests should be tagged.
Lots of discussion on branching and tagging. We want a long-lived stable branch (default when cloning repository), and GCM-specific releases will also have branches that start from a commit on stable.
stable      | cesm2.0     | cesm2.1
------------|-------------|------------
v1.0.beta01 | cesm2.0_n00 |
v1.0.beta02 | cesm2.0_n01 |
v1.0.beta03 |             | cesm2.1_n00
            | cesm2.0_n02 |
v1.0.0      |             |
Things we need before going public:
- License
- Accurate documentation
Soon after going public:
- TravisCI for testing (python at least) as part of pull request
- Read the Docs for documentation (want to have multiple versions of documentation available)
We definitely want to use the marbl-ecosys organization rather than just making NCAR/MARBL public (so make NCAR/MARBL public and then transfer)
Discussion around the idea of going public for development -- would we still have separate development and release repositories? Consensus was no: have a single repository and come up with a branch / tag naming convention that (a) gives the user the latest release code by default, and (b) makes it clear when a user is switching to development code.
Also, an open question: do we want the latest release to be able to run with the same scientific configuration as old releases? E.g. should v2.0.0 be able to generate v1.0.0-esque solutions?
Discussed potential release paths for two different options:
- private development repo + public release repo [we have been talking about keeping development private because we are not comfortable with users relying on un-vetted science]
- public development and release (two ways to implement)
  - one repository (master branch for releases, and a dev branch for development)
  - two different repositories (separate development repository that is public, but releases would be mirrored in a separate repository without any of the development branches or tags)
We will need to figure this out soon (between May 11 code freeze and June 1 CESM release)
Fixing the POP%prod bug in marbl0.28.8 highlighted another issue: zooplankton (with a fixed P:C ratio) grazes on autotrophs that can have a much lower P:C ratio. When this happens, the zooplankton pulls more P out of the column to maintain its ratio, and this can occasionally drive POP%prod to negative values. There was a bug where POP%prod was reset to zero when POC%prod was negative, rather than forcing POP%prod to be non-negative, and fixing this bug started causing occasional mass-balance failures in Jint_Ptot. We discussed possible fixes to the newly uncovered bug, such as drawing P from somewhere else or dumping C back into the column. This is holding up the POP tagging process, and will hopefully be resolved soon.
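The difference between the old and corrected checks can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the MARBL Fortran; the function names and the numbers are made up for the example.

```python
def pop_prod_buggy(poc_prod, pop_prod):
    """Old check: POP production is zeroed only when POC production is negative."""
    return 0.0 if poc_prod < 0.0 else pop_prod


def pop_prod_fixed(pop_prod):
    """Corrected check: POP production itself is forced to be non-negative."""
    return max(pop_prod, 0.0)


# Illustrative numbers only: positive POC production, slightly negative POP production.
poc_prod, pop_prod = 2.0e-3, -1.0e-5

print(pop_prod_buggy(poc_prod, pop_prod))  # the negative value passes through untouched
print(pop_prod_fixed(pop_prod))            # the negative value is clipped to zero
```

Clipping changes the P assigned to POP without adjusting any other term, which would be consistent with the Jint_Ptot failures, though the notes do not spell out the mechanism.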
MARBL releases will be numbered X.Y.Z, where X is the science version number, Y is the API version number, and Z is a patch version number. There is no indication about whether a patch release is bit-for-bit with its predecessor or not.
MARBL development will add a fourth digit with the beta prefix, and the location of this digit indicates what tag is being developed:
- X.Y.Z.betaN1 is developing the patch release X.Y.Z
- X.Y.betaN1.N2 is developing the API release X.Y.0; all tags where N1 is the same share the same API
- X.betaN1.N2.N3 is developing the science release X.0.0; all tags where N1 and N2 are the same share the same API
We will have a very broad definition of API.
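As a rough illustration of the numbering scheme, the Python sketch below classifies a tag by where the beta field sits and reports which release it is working toward. The function name classify_tag and the optional leading "v" are assumptions made for the example; no such tool is described in these notes.

```python
import re

_NUM = r"(\d+)"
# One pattern per tag shape; the position of "beta" decides the kind of release.
_PATTERNS = [
    ("release", re.compile(rf"^{_NUM}\.{_NUM}\.{_NUM}$")),
    ("patch", re.compile(rf"^{_NUM}\.{_NUM}\.{_NUM}\.beta{_NUM}$")),
    ("api", re.compile(rf"^{_NUM}\.{_NUM}\.beta{_NUM}\.{_NUM}$")),
    ("science", re.compile(rf"^{_NUM}\.beta{_NUM}\.{_NUM}\.{_NUM}$")),
]


def classify_tag(tag):
    """Return (kind, target_release) where target_release is an (X, Y, Z) tuple."""
    if tag.startswith("v"):  # assumption: tolerate an optional leading "v"
        tag = tag[1:]
    for kind, pattern in _PATTERNS:
        match = pattern.match(tag)
        if match is None:
            continue
        nums = [int(n) for n in match.groups()]
        if kind in ("release", "patch"):
            return kind, (nums[0], nums[1], nums[2])
        if kind == "api":
            return kind, (nums[0], nums[1], 0)
        return kind, (nums[0], 0, 0)
    raise ValueError(f"unrecognized tag: {tag!r}")


for example in ("2.1.3", "2.1.3.beta4", "2.1.beta4.5", "2.beta1.2.3"):
    print(example, classify_tag(example))
```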
Release strategy: the development repository will maintain release branches for each CESM version (and other models that need to make the code public). Following a successful control run, this branch will also be pushed to the public repository.
We have decided not to use semantic versioning, as it does not seem appropriate for scientific software. Mike will work on a versioning scheme to use instead, and we will continue with the current v0.X.Y approach in development.
Most of the time was spent discussing the development / release plan for the code. We are hoping to continue to develop in a private repository, and only make [publicly available] release tags for updates that have been scientifically vetted - i.e. have been used in a control run. Mike will put together a more detailed proposal over the coming weeks.
We also talked about merging vs rebasing - current decision is to continue to merge branches back to the trunk, but use the --no-ff option so that git log --first-parent accurately tracks master. We do NOT recommend rebasing, as it effectively edits history. Rebasing would make it impossible to rerun a test using an old code base (unless we keep branches indefinitely) and also makes it harder to track when changes were actually made.
CESM 2 plan: in the next week, Mike will focus on the quick fix issues (#240, #241, #244, and #248) and Keith L will work on an fe_bioavail fix from Keith M. After the freeze, Mike will fix the ecosys_restart_file bug (#246) and Keith L will be responsible for #105, #218, and #211.
We also talked about how to make MARBL available for CESM 2.0; we definitely want POP to pull MARBL down from a git repository, but that opens up a can of worms:
- MARBL development is private, and our test suite doesn't have enough coverage for us to be comfortable making the entire development process public (Keith L and Mike would still like private forks from which to submit pull requests)
- The current implementation of manage_externals will produce an unhelpful error when a subversion repository has been replaced with a git repository
- We haven't clearly defined a workflow for developing MARBL once it has been made public, but our current workflow is not set up to support old versions of code; so we need to make sure that it is easy to provide updates to the version of MARBL that is used in CESM 2.0.0 (and hopefully also accept pull requests that are developed using the version of MARBL in CESM 2.0)
Talks at CESM Workshop:
- OMWG talk geared towards changes since CESM 1.2... how to enable CISO code, how to set MARBL parameters, and how to change BGC-related diagnostic output
- SEWG talk geared towards coding details... how POP's build scripts interact with MARBL's python scripts, how POP Fortran calls MARBL code. Emphasize MARBL independence (no reliance on build-namelist or other POP tools)
Maybe instead of another tutorial we could convince interested parties to help us when we reach the code clean-up / documentation step prior to MARBL 1.0.0 release?
Collaborating with CFP might be useful, but the real question is "is it worth the time?" If all we are doing is pulling marbl_logging_mod.F90 and marbl_utils_mod.F90 into their own software package and we don't envision much development happening there, then why spend the time? We can just let CFP copy those modules and hand-edit them.
Lastly, there are no longer open issues with no milestone.
Most of the discussion was looking several steps ahead in my work to create a test for set_surface_forcing() and set_interior_forcing(). Some key points:
- We should do a better job organizing the tests/ directory:
  - init-twice is a unit test, in the sense that it really tests marbl_instance%shutdown() to ensure we are deallocating everything that has been allocated.
  - Once set_forcing (which should be renamed compute_tendencies or something) is finished, we don't need a separate regression test for init (we can print the MARBL log in the full test and then changes in init will result in differences in standard out)
  - All the requested_* tests would be better labelled as examples instead of tests; would this require a completely different executable?
- We need to improve the build system to handle the need for netCDF
  - require netCDF for test directory (but maybe not for examples)?
  - require users to build via python scripts (add FROM_PYTHON=FALSE to Makefile, overwrite from python scripts, abort unless FROM_PYTHON == TRUE)
  - nc-config may not be consistent enough across machines to use it to get include / linking flags (nc-config --flibs does not give reasonable results when using homebrew to install on a Mac; other options did not play nicely on hobart)
  - If we rely on python for the build, could also use config files (read by python, passed to make) to store machine-specific settings (and also state of current build)
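A minimal sketch of the python side of that idea, assuming an INI-style config file with one section per machine: the script reads the machine's settings and turns them into variable overrides on the make command line, always adding FROM_PYTHON=TRUE. The function name make_command and the variable names FC and USE_NETCDF are hypothetical; only FROM_PYTHON comes from the discussion above. The sketch builds the command without running it.

```python
import configparser


def make_command(config_text, machine, target="all"):
    """Build a make command line that overrides Makefile variables for one machine."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep variable names case-sensitive (default lowercases them)
    parser.read_string(config_text)
    # Variables given on the command line take precedence over Makefile assignments.
    command = ["make", target, "FROM_PYTHON=TRUE"]
    command += [f"{name}={value}" for name, value in parser[machine].items()]
    return command


# Hypothetical machine-specific settings.
example_config = """
[hobart]
FC = gfortran
USE_NETCDF = TRUE
"""

print(make_command(example_config, "hobart"))
```

The returned list can be handed to subprocess.run once the Makefile has the matching FROM_PYTHON guard.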
Ahead of the MARBL v1 release, I should sort remaining MARBL 1.0.0 issues into "API changing" and "not API changing"; it may make sense to push some of the non-API-changing issues back until after the release and focus on finalizing the interface. Speaking of API changes, we should probably include both short name and long name in the forcing metadata type (instead of varname, which is currently what we provide to the GCM).
For the jupyter notebook generating forcing data sets, Matt recommended using xarray instead of netCDF4. Also, besides the southern ocean data point, a location in the eastern equatorial Pacific would be interesting. lat_in and lon_in should be lists, but each individual column should be written to its own netCDF file.
Jessica's issues will be addressed, and some in-ticket comments were made but no clear priority was decided. Currently the python code processing templated diagnostics in YAML assumes one template replacement per variable, so something like {auto}_graze_{zoo}_to_{zoo} would require a python overhaul. We also discussed the idea of having multiple POC pools, which would be a huge undertaking because of the effect it would have on routing.
For the set_forcing test, besides outputting tracer tendencies and surface forcing values, I'll also include diagnostics. For now I'll just output all computed diagnostics, though it might be nice in the future to have an optional argument to the test akin to --input-file to allow users to customize the diagnostic output.
Now that the low-hanging fruit has been dealt with, we talked about how to move on to the next big issue -- improving the functionality of marbl_domain_type. We decided that it made the most sense to do the following:
- Create a stand-alone test for set_surface_forcing and set_interior_forcing
- Use this new test to make sure we only loop from 1 to kmt (lots of loops currently run to km, which isn't necessary)
- Once all loops are the right size, we can work on the interface to allow GCMs where k = 1 is the bottom of the column rather than the surface.
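The last two steps can be pictured with a small Python sketch: loop over only the kmt active levels, and translate indices when the host model counts from the bottom. The helper names and the sample column are hypothetical, and the sketch assumes a bottom-first GCM stores its active levels in 1..kmt; the real work would happen in the Fortran interface.

```python
def active_levels(kmt):
    """1-based indices of the levels that hold ocean, like the Fortran loop 'do k = 1, kmt'."""
    return range(1, kmt + 1)


def gcm_level(k, kmt, k1_is_bottom=False):
    """GCM level index matching MARBL-style level k, where k = 1 is the surface."""
    return kmt - k + 1 if k1_is_bottom else k


km, kmt = 5, 3  # 5 levels allocated, only 3 of them active in this column

# Hypothetical GCM column stored bottom-first; entries past kmt are unused padding.
gcm_column = {1: 30.0, 2: 20.0, 3: 10.0, 4: 0.0, 5: 0.0}

# Visit only the active levels, surface first, regardless of how the GCM orders them.
surface_first = [gcm_column[gcm_level(k, kmt, k1_is_bottom=True)] for k in active_levels(kmt)]
print(surface_first)
```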
More talk on #1 (aborting if mass balance fails). This would have saved Jessica a lot of time when she stumbled upon bug #224; current plan would be to introduce an assert_near_zero() function. Keith would like to hard-code in the tolerance, which will require looking at current POP output to determine how big each of the mass balance terms can be. This is preferable to determining tolerance on the fly, because the latter might mask some changes we would like to see. (Also, we can't necessarily use the 100m integral as a basis for determining the tolerance because if the column is <100 m deep then the mass balance = the 100 m integral).
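A Python sketch of what a hard-coded-tolerance check might look like is given below. The real assert_near_zero() would be Fortran inside MARBL; the argument list and the tolerance value here are placeholders, since the notes say the actual tolerances would come from inspecting POP output.

```python
# Placeholder tolerance: not a value taken from POP output.
MASS_BALANCE_TOLERANCE = {"Jint_Ptot": 1.0e-9}


def assert_near_zero(label, value, tolerances=MASS_BALANCE_TOLERANCE):
    """Raise if |value| exceeds the hard-coded tolerance registered for label."""
    tolerance = tolerances[label]
    if abs(value) > tolerance:
        raise RuntimeError(
            f"mass balance check failed: {label} = {value:.3e} "
            f"(tolerance {tolerance:.3e})"
        )


assert_near_zero("Jint_Ptot", 3.0e-12)  # within tolerance, so nothing happens
```

Because the tolerance is fixed ahead of time rather than derived from the column being checked, a change that inflates the imbalance cannot quietly raise its own threshold.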
Testing the diagnostics with Jessica's size-structured set-up highlighted a bug with how logical settings are handled - e.g. length of tracer_restore_vars depends on whether ciso_on = ".true." but some users would set ciso_on = "T" instead. (Solution is to convert all acceptable Fortran logical values to either ".true." or ".false." so the comparison is always valid.) We also discussed combining POP and MARBL diagnostics into a single file (ecosys_diagnostics).
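The normalization step could look something like the Python sketch below. The function name and the exact set of accepted spellings are assumptions for illustration; the notes only say that all acceptable Fortran logical values should map to ".true." or ".false.".

```python
# Assumed set of spellings to accept; matching is case-insensitive.
_TRUE_VALUES = {".true.", ".t.", "true", "t"}
_FALSE_VALUES = {".false.", ".f.", "false", "f"}


def normalize_logical(value):
    """Map a Fortran-style logical string to the canonical '.true.' or '.false.'."""
    key = value.strip().lower()
    if key in _TRUE_VALUES:
        return ".true."
    if key in _FALSE_VALUES:
        return ".false."
    raise ValueError(f"not a recognized Fortran logical: {value!r}")


# After normalization, a comparison against ".true." behaves the same for every spelling.
for user_setting in ("T", ".TRUE.", ".true."):
    print(user_setting, normalize_logical(user_setting) == ".true.")
```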
My plan once the diagnostics work is on the trunk will be to go back and address several tickets that seemingly have quick fixes; this should coincide nicely with everyone else being at Ocean Sciences.
The majority of time in this meeting was spent digging into the weeds of the YAML / python setup for defining diagnostics. Lots of good suggestions, such as better implementation of templating (things like ((autotroph_sname)) for autotroph short name) and not introducing an additional level of dictionaries just to handle per-autotroph and per-tracer diagnostics.
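To make the templating idea concrete, here is a small Python sketch that expands a ((key)) placeholder into one diagnostic name per value. The function name, the diagnostic name, and the list of autotroph short names are illustrative assumptions, not the actual MARBL script or YAML contents.

```python
def expand_template(name, key, values):
    """Return one name per value with '((key))' replaced, or [name] if no placeholder."""
    placeholder = f"(({key}))"
    if placeholder not in name:
        return [name]
    return [name.replace(placeholder, value) for value in values]


autotroph_snames = ["sp", "diat", "diaz"]  # illustrative short names

print(expand_template("photoC_((autotroph_sname))", "autotroph_sname", autotroph_snames))
print(expand_template("O2_PRODUCTION", "autotroph_sname", autotroph_snames))
```

Handling the placeholder at the name level like this is one way to avoid an extra layer of dictionaries for per-autotroph diagnostics.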
PGI continues to introduce compiler bugs that prevent MARBL from running (which is actually an improvement from the days where PGI compiler bugs prevented MARBL from building). Need to figure out if dropping PGI support is okay with CESM / E3SM, or if some nominal testing needs to continue as PGI updates its compiler.
Walking through the updates to default_settings.yaml (to include tracer short-names instead of just total count) raised some internal inconsistencies -- marbl_interface_private_types.F90 refers to tracers in all lower-case (po4 and caco3) while marbl_init_mod.F90 actually sets up the tracer metadata using proper atomic case (PO4 and CaCO3). The YAML file will use the marbl_init_mod.F90 case, and hopefully we can auto-generate both marbl_interface_private_types.F90 and marbl_init_mod.F90 based on the YAML so that they will be consistent as well.
We prioritized current open tickets to determine what we want in the CESM2.0 sub-milestone of the MARBL 1.0.0 milestone.
Discussion on how to fix the bug in interior restoring, which should be on master by the next meeting. This fix will definitely go into CESM 2.0, but it does not need to be in the next beta tag... so the fix will get merged onto marbl_dev immediately but not pushed to the trunk just yet.
Not much MARBL-specific conversation, just walked through NCAR/manage_externals and how it will be used in CESM tags following cesm2_0_beta08 (and also in POP starting in whatever trunk tag goes into cesm2_0_beta09)