2017 meeting notes
Most of the meeting was spent talking about how to improve how MARBL communicates with the GCM regarding what diagnostics will be provided and what to do with them. The MARBL tool `MARBL_generate_diagnostics_file.py` should not just recommend a frequency, but also what operator the GCM should apply over the time period (average, minimum, maximum, instantaneous). This will change the format of the diagnostic file to
# DIAGNOSTIC_NAME : frequency1_operator1, frequency2_operator2, ..., frequencyN_operatorN
We also discussed ways to implement this change in POP, with two goals:
- No need for tavg id variables like
    integer (int_kind) :: tavg_ECOSYS_IFRAC_2 ! ice fraction duplicate
    integer (int_kind) :: tavg_ECOSYS_XKW_2 ! xkw duplicate
    integer (int_kind) :: tavg_DpCO2_2 ! delta pco2 duplicate
    integer (int_kind) :: tavg_DIC_GAS_FLUX_2 ! dic flux duplicate
  Instead, add a `num_occurrences` dimension to `tavg_ids_interior_forcing(:)` and `tavg_ids_surface_forcing(:)`, which already have a `num_diags` dimension.
- In addition to the `tavg_contents` file, the `MARBL_diags_to_tavg.py` script should create another text file containing a list of all variables being accumulated and what operator to use in the accumulation. The format would be something like
    # DIAGNOSTIC_NAME : operator1, operator2, ..., operatorN
  and for each diagnostic, POP would track the number of occurrences (somewhere between 0 and `max_num_occurrences = size(tavg_ids_surface_forcing, dim=2)`). So both the define and accumulate loops would be
    do i=1,size(diags, dim=1)
      do j=1,tavg_num_occurrence_surface_forcing(i)
        [define or accumulate tavg_ids_surface_forcing(i,j)]
      end do
    end do
  and similar for `interior_forcing`; note that the `define_tavg_forcing()` call would start using the `tavg_method` argument rather than defaulting to `tavg_method_avg`.
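A minimal sketch of how a script might read the proposed `NAME : frequency_operator, ...` format. The function names are hypothetical, the frequency labels are whatever appears before the last underscore, and the operator list is taken from the four operators named above; none of this is the actual MARBL tool.

```python
# Hypothetical parser for the proposed diagnostics file format.
KNOWN_OPERATORS = ("average", "minimum", "maximum", "instantaneous")


def parse_diagnostic_line(line):
    """Return (name, [(frequency, operator), ...]), or None for blanks/comments."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    name, _, requests = text.partition(":")
    pairs = []
    for request in requests.split(","):
        request = request.strip()
        if not request:
            continue
        # Split on the last underscore so the operator is the final token.
        frequency, _, operator = request.rpartition("_")
        if operator not in KNOWN_OPERATORS or not frequency:
            raise ValueError("unrecognized request %r for %s" % (request, name.strip()))
        pairs.append((frequency, operator))
    return name.strip(), pairs


def parse_diagnostics(lines):
    """Build {diagnostic name: [(frequency, operator), ...]} from file lines."""
    diags = {}
    for line in lines:
        parsed = parse_diagnostic_line(line)
        if parsed is not None:
            name, pairs = parsed
            diags[name] = pairs
    return diags
```

The length of each list is the per-diagnostic occurrence count that POP would need to track.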
I also raised the question about what to do with the `ALT_CO2` diagnostics -- having `MARBL_diags_to_tavg.py` ignore any diagnostic containing `ALT_CO2` in its short name unless `LMARBL_TAVG_ALT_CO2 = True` is fine. Lastly, I will move `MARBL_wrappers/` out of `cime_config/`; `MARBL_diags_to_tavg.py` and `MARBL_wrappers/` (the former isn't a wrapper and therefore doesn't belong in the latter) will be in `$POPROOT/MARBL_scripts/`.
Lots of talk about what issues to tackle (and when)... I'll keep the MARBL 1.0.0 project up to date - there are a few tickets in the CESM2 category that we don't want to let slip through the cracks. After updating the diagnostics, I'll tackle some of the "unsorted" tickets that strike me as "quick fixes" and then update `marbl_domain_type`.
We will meet next week, but that will be the last meeting of 2017 (CGD's holiday party is the 19th and then the 26th is the day after Christmas).
Resolved several POP issues via a mix of trunk tags, `marbl_dev` tags, and branch merges. I should have a new POP tag using MARBL scripts to generate `marbl_in` by the end of the week. My next task will be to use MARBL scripts to generate the `tavg_contents` file.
Jessica has a set of scripts (written in R) to let her explore size-structured PFT classes. If we make these part of the MARBL repository, it probably makes sense to convert them to python or a single Jupyter notebook.
Issues with cheyenne slowed us down a bit, but lots of POP issues need to be addressed. I'm getting closer to stripping MARBL configuration out of the CESM XML environment files; the current hang-up is in reading POP's `config_cache.xml` file in `buildnml` to make sure we haven't changed `MARBL_NT`.
It makes sense to add `yaml_to_json.py` to the `MARBL_tools` package (and have it use the python class in that package). Open question about where the output JSON file should go, though -- one possibility is to look to see where CIME puts Fortran autogenerated by `genf90`, and name a directory in a similar manner.
The temporary solution of POP using `put_settings()` to make sure `init_bury_coeff_opt` is set correctly is okay, but the flag controlling it should be named `lmarbl_bury_coeff_vars_in_restfile`. Keith points out that we want to keep `init_bury_coeff_opt` in MARBL even after MARBL is computing its own running means, because a user may want to re-initialize the running mean.
Next step for `marbl_dev_levy` will be getting `MARBL_NT` directly from MARBL instead of `OCN_TRACER_MODULES_OPT`.
Demoed the new POP `buildnml` script, which calls a MARBL python script to create `marbl_in` instead of relying on `build-namelist`; lots of suggestions for cleaning up the MARBL scripts:
- Instead of input file, refer to `marbl_in` as a MARBL settings file
  - `MARBL_input_file.py` -> `MARBL_generate_settings_file.py`
  - `MARBL_parameter_values.py` -> `MARBL_settings_file_class.py`
  - `lib_dir` -> `MARBL_settings_class_dir`
- Instead of `marbl/tools/generate_input_file`, have all python in `marbl/MARBL_tools` and use `__init__.py` to allow code like
    import MARBL_tools
    MARBL_tools.gen_settings_file()
- Move `default_values.yaml` -> `marbl/src/default_settings_values.yaml`
- `marbl.parameters_input` -> `user_settings_marbl`
Also, some design decisions to keep thinking about / ask CSEG about:
- Can we have users put `user_settings_marbl` in `$CASEROOT`?
- Where should MARBL `SourceMods` go? Currently in `SourceMods/src.pop`, but maybe `SourceMods/src.pop/marbl` or `SourceMods/src.marbl`?
- Need to make sure that `sys.path.insert(0,SourceModsDir)` does what we want -- namely, that if an updated version of `MARBL_settings_file_class.py` is in SourceMods, that gets imported rather than the one from `MARBL_tools`
And a bug we noticed looking through the code: MARBL probably does not work with multi-instance; we need to make `marbl_nml_filename` a namelist parameter in `ecosys_driver.F90` and make sure it looks for `marbl_in_####` if `ninst > 1`.
Keith asked about generating the settings file via a class method instead of having a separate script, but I don't like that idea because I think we want separate scripts to act on the class (especially once we are ready to auto-generate the Fortran code).
It sounds like using YAML will soon be blessed by the powers that be (aka CIME devs), so we'll move forward with YAML for generating the input file. Worst case scenario, we end up writing a tool to convert our YAML file into XML.
After POP is up and running with YAML-based input file generation, we will move `grazing` into `zooplankton_type` (so `zooplankton(:)%grazing(:)` instead of `grazing(:,:)`). Also, we discussed how it doesn't make sense to call `grazing%construct` from `marbl_settings_define_PFT_count`; it should be called from a separate subroutine that is called between `marbl_settings_define_PFT_count` and `marbl_settings_set_defaults_PFT_derived_type` (maybe `marbl_settings_construct_PFT_derived_types`?). And really, it will be the `zooplankton(:)` constructor (which will call the `grazing` constructor in turn).
For POP use: introduce `marbl.parameters_input` in `$POPROOT/input_templates` (empty file save for comments about using it to change MARBL parameters). If the file is not present in `SourceMods/src.pop`, use the [empty] file in `input_templates/`.
Next priority (after input file work is complete) will be having MARBL provide a list of diagnostics. This will likely require converting `ocn.ecosys.tavg.csh` to Python; we want to use YAML to define all possible diagnostics. It will be a simple dictionary, with `diags[diag_name] = requested_freq`; `requested_freq` should be one of `none, low, medium, high` (or a list if it should appear in multiple streams). Care will be needed for tracer-based diagnostics, because most of those will be computed by the GCM itself rather than being passed by MARBL. So something like
diagnostics.yaml
----------------
_nonliving_tracers :
  - DIC
  - NO3
  - NH4
_per_tracer :
  default_tracer :
    surface_flux : none
    virtual_flux : none
  ALK :
    surface_flux : medium
    virtual_flux : medium
ECOSYS_ATM_PRESS : medium
ECOSYS_IFRAC :
  - medium
  - high
ECOSYS_XKW :
  - medium
  - high
SCHMIDT_O2 : medium
SCHMIDT_CO2 : medium
Possibly with more keys in the `_per_tracer` subdictionary.
The python script to parse this dictionary should also allow a text file override (similar to the parameter input file), maybe just
DIAGNAME requested_freq
Like with the parameters, `input_templates/marbl.diagnostics_input` could be empty but copied to SourceMods and edited. Tracer-specific changes would be harder to implement in this manner.
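A rough sketch of how the dictionary and the text override could be combined. It uses a Python dict with the same shape a YAML loader would return; the `<TRACER>_<kind>` naming for per-tracer diagnostics, the function names, and adding `ALK` to the tracer list (so the per-tracer override is exercised) are all assumptions for illustration.

```python
# Dict mirroring the YAML sketch in these notes (shape only; not the real file).
DIAGNOSTICS = {
    "_nonliving_tracers": ["DIC", "NO3", "NH4", "ALK"],  # ALK added for illustration
    "_per_tracer": {
        "default_tracer": {"surface_flux": "none", "virtual_flux": "none"},
        "ALK": {"surface_flux": "medium", "virtual_flux": "medium"},
    },
    "ECOSYS_ATM_PRESS": "medium",
    "ECOSYS_IFRAC": ["medium", "high"],
    "SCHMIDT_O2": "medium",
}
VALID_FREQS = ("none", "low", "medium", "high")


def as_list(freq):
    """Normalize a single frequency or a list of frequencies to a list."""
    return list(freq) if isinstance(freq, (list, tuple)) else [freq]


def expand_diagnostics(spec):
    """Flatten the spec into {diagnostic name: [requested frequencies]}."""
    diags = {}
    per_tracer = spec.get("_per_tracer", {})
    defaults = per_tracer.get("default_tracer", {})
    for tracer in spec.get("_nonliving_tracers", []):
        merged = dict(defaults)
        merged.update(per_tracer.get(tracer, {}))  # tracer-specific values win
        for kind, freq in merged.items():
            diags["%s_%s" % (tracer, kind)] = as_list(freq)  # hypothetical naming
    for name, freq in spec.items():
        if not name.startswith("_"):  # leading "_" marks YAML-only keys
            diags[name] = as_list(freq)
    return diags


def apply_overrides(diags, lines):
    """Apply 'DIAGNAME requested_freq' lines on top of the defaults."""
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):  # comment character is an assumption
            continue
        name, freq = text.split()
        if freq not in VALID_FREQS:
            raise ValueError("bad frequency %r for %s" % (freq, name))
        diags[name] = [freq]
    return diags
```

The one-value-per-line override cannot express "all tracers" or a multi-stream list, which is one way to see why tracer-specific changes are harder in this format.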
Not much beyond what is listed in the agenda, but we agreed that I should pause work on the input file name generation tool for a day to move `iron_frac_in_dust` and `iron_frac_in_bc` from MARBL to POP (was originally on Keith's list ahead of splitting the dust flux into fine and coarse components).
Keith pointed out that the YAML file contains a mix of python logicals (keys that look like "`ciso_on = True`") and YAML logicals (`default_value : true`, `append_to_config_keywords : true`, etc)... what we want is mostly Fortran logicals (`ciso_on = .true.`, `default_value : .true.`), to be used when referring to a value that eventually ends up in Fortran code, with some YAML thrown in (`append_to_config_keywords : true`) for the rest. When we start to support input files in the python, we also will need to support a range of acceptable Fortran: e.g. `.true.`, `True`, and `T`.
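One possible way to accept the range of logical spellings on input while always writing the canonical Fortran form. The notes only list `.true.`, `True`, and `T`; the case-insensitive matching, the `.t.` form, and the matching false spellings are assumptions.

```python
# Accepted spellings, compared case-insensitively (set membership is an assumption).
TRUE_STRINGS = {".true.", "true", ".t.", "t"}
FALSE_STRINGS = {".false.", "false", ".f.", "f"}


def parse_fortran_logical(text):
    """Convert a user-supplied logical string to a Python bool."""
    token = text.strip().lower()
    if token in TRUE_STRINGS:
        return True
    if token in FALSE_STRINGS:
        return False
    raise ValueError("%r is not a recognized logical value" % text)


def to_fortran_logical(value):
    """Write a Python bool in the canonical Fortran form."""
    return ".true." if value else ".false."
```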
As seen in the `append_to_config_keywords : true` example, I need to be more consistent with the leading `_`; rule of thumb is "if it does not correspond to something in the MARBL Fortran, it is YAML specific and therefore needs to start with `_`".
To support different tracer counts, I will add a `get_tracer_count` routine to the class that will parse
_array_size : *NT
_array_size_increment :
  variable_PtoC = .false. : -3
  ciso_on = .true. : 14
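A sketch of what `get_tracer_count` might do with the `_array_size_increment` entries: start from the base size and add each increment whose condition matches the current settings. The base count is passed in because the value behind the `*NT` alias is not given in these notes, and the condition parsing only handles the `name = .true./.false.` form shown.

```python
# Increments copied from the YAML fragment in these notes.
INCREMENTS = {"variable_PtoC = .false.": -3, "ciso_on = .true.": 14}


def parse_condition(condition):
    """Split 'name = .true.' into (name, bool)."""
    key, _, value = condition.partition("=")
    return key.strip(), value.strip().lower() == ".true."


def get_tracer_count(base_count, increments, settings):
    """Apply every increment whose condition holds for the given settings."""
    count = base_count
    for condition, delta in increments.items():
        key, required = parse_condition(condition)
        if settings[key] == required:
            count += delta
    return count
```

For example, with a hypothetical base of 30 tracers, `variable_PtoC` off and `ciso_on` on would give 30 - 3 + 14.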
Also, when it comes time to add this tool to POP, we will remove `POPROOT/source/marbl` (which is just `MARBLROOT/src`) and add the full `MARBLROOT` to POP's SVN externals. So `POPROOT/marbl` will contain the full MARBL checkout; also POP will add `MARBLROOT` to `env_build.xml` and then the `buildcpp`, `buildnml`, and `buildexe` scripts can check `SourceMods` before falling back to `MARBLROOT` when looking for MARBL code and utilities.
Matt pointed out that following the input file generation, the next big task for MARBL will be to provide the GCM with a list of diagnostics being provided. POP, for example, should use MARBL output to decide what goes into `tavg_contents`.
More discussion on how to get resolution-dependent defaults into the YAML. Going to make `default_value` a dictionary if there are multiple possible defaults, and then pass a list (or dictionary?) of keys. Luckily, variables seem to depend either on resolution OR on the value of a previous variable, rather than on a combination of the two; the logic to figure out the default when there are multiple keys to match is far tougher.
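A minimal sketch of the lookup, assuming `default_value` is either a plain value or a dictionary with a `"default"` entry plus condition keys, and that the caller passes the list of keys that currently apply. The key strings and values in the example are invented; the real YAML layout was still undecided.

```python
def resolve_default(default_value, active_keys):
    """Pick the default for a variable given the keys that currently apply."""
    if not isinstance(default_value, dict):
        return default_value  # single default, nothing to resolve
    matches = [key for key in active_keys if key in default_value]
    if len(matches) > 1:
        # The hard case mentioned above: more than one key matches.
        raise ValueError("ambiguous default, matched %s" % ", ".join(matches))
    if matches:
        return default_value[matches[0]]
    return default_value["default"]


# Hypothetical entry whose default depends on resolution only.
EXAMPLE = {"default": 1.0, "resolution == coarse": 2.0}
```

Raising on multiple matches keeps the simple case working while making the tougher combined-key case fail loudly instead of silently picking one.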
Keith will update POP to require the `abio` and `ciso` modules to use the same `d14c` forcing when both modules are being run, but will NOT impose a similar restriction on atmospheric CO2 when `abio` and `ecosys` are both used.
Compiling and running a Fortran executable to generate CPP macros and the MARBL namelist is not a reasonable approach: some machines require different compiler flags to run executables on a login node rather than on the compute nodes, and then we would not be able to run the MARBL executable when the namelist is regenerated during a continuation run. Other smaller issues presented themselves too, but the big one alone prevents us from even prototyping this workflow.
Instead we will look at auto-generating Fortran code that contains the default values provided by an XML (or JSON?) file. The Fortran code would be generated by a python script, and a different python script would also generate marbl_in.
This still leaves us needing a tool to generate the tracer count, but for now we will focus on the namelist. One possibility for the tracer count would be to use the auto-generation tool for tracer initialization as well, so we'll keep that in mind as we move forward.
In more detail-oriented discussion, we talked about how to generate `marbl_in`; one possibility is to have `build-namelist` do it, but rather than rely on `namelist_definitions.xml` we could prepend something to MARBL variables in `user_nl_pop`. For example:
! POP variables are entered in the usual way
ltavg_ignore_extra_streams = .true.
n_tavg_streams = 1
tavg_freq_opt = 'nday'
tavg_file_freq_opt = 'nday'
lecosys_tavg_all = .true.
! MARBL variables have a "MARBL: " prefix
MARBL: autotroph_cnt = 4
MARBL: ciso_on = .true.
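Separating the two kinds of entries would be straightforward. The sketch below is one hypothetical way to do it in Python (the notes had `build-namelist` in mind at this point); it treats lines starting with `!` as comments, as in the example.

```python
MARBL_PREFIX = "MARBL:"


def split_user_nl(lines):
    """Return (pop_lines, marbl_lines) with the MARBL prefix stripped."""
    pop_lines, marbl_lines = [], []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("!"):
            continue  # skip blanks and comments
        if text.startswith(MARBL_PREFIX):
            marbl_lines.append(text[len(MARBL_PREFIX):].strip())
        else:
            pop_lines.append(text)
    return pop_lines, marbl_lines
```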
I showed a brief demo of the generic MARBL interface, and we talked about the best way to distribute it. For now we will leave it in its own repository, but after the folks at UCI have a chance to expand on it, we will look at bringing it into NCAR/MARBL. We would need to rethink the top-level directory structure, possibly adding a `drivers` directory and moving `driver_src/` and `driver_exe/` out of the `tests` directory.
A significant amount of time was spent discussing how to get MARBL's input file generation out of POP's build namelist. As a first pass, POP's buildcpp script will build MARBL's stand-alone driver; this can be run for two tasks:
- `buildcpp`: Determine tracer count (I'll add a new test that just outputs `ECOSYS_BASE_NT`, `CISO_NT`, and `MARBL_NT`)
- `buildnml`: Write out the MARBL input file (using `gen_inputfile` and passing `user_nl_marbl` to overwrite defaults).
We would need to be aware of MARBL files in `SourceMods/src.pop` when we build the driver, and it would need to be rebuilt in `buildnml` if files are modified after `buildcpp` is run.
We also found a bug in how I hard-coded `marbl_in` as the MARBL namelist file name; this needs to be compatible with multi-instance runs, so `marbl_nml_filename` should be in `&ecosys_driver_nml` rather than a Fortran parameter.
Mariana would like to sit down with Jim E and some of the MARBL team to discuss this approach in more detail.
Most of the meeting was spent discussing the best way to initialize PFTs. We will introduce a new MARBL parameter named `PFT_defaults`; a value of `'user-specified'` will force the user to set all PFT-related variables; the default value will be `'CESM2'`, which will provide the 3 autotrophs, 1 zooplankton, and 3 grazing prey classes that are currently the default.
We will also rename `grazer_prey_cnt` to `max_grazer_prey_cnt` as that is a more descriptive name -- if we have 2 zooplankton, one of which grazes on three different biomass aggregates and another that only grazes on two, then we need `max_grazer_prey_cnt = 3`; in the future, we may consider `grazer_prey_cnt(zooplankton_cnt)`.
We will have some CESM-specific comments in MARBL, but only temporarily -- not all of the MARBL parameters that can be changed via `put_setting()` calls are available in the POP `build-namelist` script, and we want to make it clear to CESM users which parameters need to be changed in the Fortran code as opposed to which default values are over-written by `put_setting()` calls from POP. After MARBL has its own tool to generate input files, the comments will change to alert users to variables that should be changed via input file.
Our plan for CESM 2.0 is to continue to develop at our natural pace and see where MARBL stands at the code freeze - we already have several more features in than we expected thanks to delays in the CESM finalization. Keith is working on a new initial condition file; `ecosys_jan_IC_gx1v6_20161123.nc` is an intermediate file useful for some testing but is not needed in place of `ecosys_jan_IC_gx1v6_20150108.nc` given that it will not be the final version of the file and updating from the old version will require new baselines for all our tests.
Lots of progress was made towards #189. Most notable is that POP's build-namelist tool generates a separate `marbl_in` file that MARBL parses without Fortran's namelist functions. This is a great step towards having MARBL generate its own input file, which will be tackled after #189 is accepted.
I have a fork of the repository holding the MARBL documentation, and I will keep it up to date with my code changes so that the documentation can be updated once #189 is pulled to `master`.
Most of the meeting was spent discussing initialization; the current plan is to have `marbl_instance%user_settings` replace both `%configuration` and `%parameters`; the `%put()` call would store the keyword - value pairs in a linked list until after each round of `%add_var()` calls; at that point, variables that have been added would be updated (and the linked list entry would be deleted), while `put()` calls for variables that haven't been added yet would be kept (and tried again after the next round of `%add_var()`). If the linked list is not empty at the end of `init()` then it indicates a `put()` call did not match any of the parameters and MARBL will abort with an error.
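The deferred-`put()` logic is easier to see in a few lines of Python than in Fortran. This is only an illustration of the control flow: a Python list stands in for the linked list, and the class and method names (`UserSettings`, `apply_pending`, `finalize`) are hypothetical.

```python
class UserSettings:
    """Toy model of put() calls that are held until the variable exists."""

    def __init__(self):
        self._vars = {}
        self._pending = []  # stands in for the linked list of put() requests

    def put(self, name, value):
        """Record a request; it is not applied until the variable is added."""
        self._pending.append((name, value))

    def add_var(self, name, default):
        """Define a variable with its default value."""
        self._vars[name] = default

    def apply_pending(self):
        """Run after each round of add_var(): apply matches, keep the rest."""
        still_pending = []
        for name, value in self._pending:
            if name in self._vars:
                self._vars[name] = value
            else:
                still_pending.append((name, value))
        self._pending = still_pending

    def finalize(self):
        """End of init(): any leftover request never matched a variable."""
        if self._pending:
            names = ", ".join(name for name, _ in self._pending)
            raise KeyError("put() did not match any variable: " + names)
        return dict(self._vars)
```

A `put()` for a variable defined in a later round simply survives the earlier `apply_pending()` calls, which is what removes the need for the GCM to call `put()` between initialization phases.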
We also talked about using a single monolithic `init()` instead of breaking the calls into different phases. One reason for doing this is that the different phases were introduced just so that GCMs had the opportunity to call `put()` in between phases, but that will no longer be necessary. Also, we will add some error checking to make sure different initialization routines are called in the right order (for example, we can not initialize `tracer_restore_vars` until we have constructed `marbl_tracer_indices`): this will likely mimic POP's initialization error checking, where we register a string with the subroutine name at the beginning of each phase and check to see if routines that are depended on have been run by looking for that name in the registry.
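The registry idea can be sketched as follows. This is not POP's implementation, just the same pattern in Python; the class name and the routine names used in the usage note are placeholders.

```python
class InitRegistry:
    """Track which initialization routines have run, by name."""

    def __init__(self):
        self._done = set()

    def register(self, name, requires=()):
        """Call at the start of a routine; fail if a prerequisite has not run."""
        missing = [dep for dep in requires if dep not in self._done]
        if missing:
            raise RuntimeError(
                "%s called before %s" % (name, ", ".join(missing)))
        self._done.add(name)

    def has_run(self, name):
        """Report whether a routine has already registered itself."""
        return name in self._done
```

For the example above, a routine initializing `tracer_restore_vars` would call something like `registry.register("init_tracer_restore_vars", requires=["construct_tracer_indices"])` and abort if the tracer indices had not been constructed yet.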
In the future, we can perhaps abandon the MARBL namelist altogether -- POP's `build-namelist` tool could write a separate file (`marbl_in`?) that contains a generic format like
var1_name var1_type var1_value
var2_name var2_type var2_value
...
Which would turn into the default format of MARBL's `gen_parameters` tool. The POP driver would then read this file and call `marbl_instance%settings%put()` and we could strip all namelist support out of MARBL.
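The driver-side loop would be in Fortran, but the read-and-`put()` flow looks roughly like this Python sketch. The type names (`integer`, `real`, `logical`, `string`) are assumptions, since the notes only say `var_type`, and `put` is any callable taking a name and a value.

```python
def convert(type_name, text):
    """Convert the value column according to the (assumed) type column."""
    if type_name == "integer":
        return int(text)
    if type_name == "real":
        return float(text)  # note: Fortran forms like 1.0d0 would need extra handling
    if type_name == "logical":
        return text.strip().lower() in (".true.", "true", "t")
    if type_name == "string":
        return text.strip().strip("'\"")
    raise ValueError("unknown type %r" % type_name)


def read_settings(lines, put):
    """Parse 'name type value' lines and hand each pair to put(name, value)."""
    for line in lines:
        text = line.strip()
        if not text:
            continue
        # Split into at most three fields so string values may contain spaces.
        name, type_name, value = text.split(None, 2)
        put(name, convert(type_name, value))
```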
We talked about what the next steps in MARBL development should be. After bringing runtime configurability of PFT counts into MARBL (currently in progress), it will probably be a good time to work on MARBL setting up its own namelist, and it would be useful to have better control over diagnostic output before bringing abio tracers into MARBL. So the recommended path forward is
- Finish runtime-configurable PFT
- MARBL building its own namelist
- More flexibility in `marbl_domain_type`
- Better control over what diagnostics are returned to GCM
For the namelist generation, we compiled a list of options the namelist generation tool should support
- `--output-file-format`: default will be Fortran namelist, but also support MPAS and MOM parameter file formats
- `--default-file`: XML file containing general defaults
- `--non-default-files`: a way to have MARBL provide multiple XML files that build on each other (e.g. turning CISO on)
- `--user-specified-file`: a way for the user to specify changes from the default settings
- `--user-specified-file-format`: XML file? something like CESM's `user_nl` text files? Something MPAS or MOM specific? etc etc.
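The option list maps directly onto `argparse`. The sketch below is one way to declare it; the choice strings for `--output-file-format` and the decision to make `--default-file` required are assumptions, and `--user-specified-file-format` is left as a free-form string because its values were still an open question.

```python
import argparse


def build_parser():
    """Declare the options listed above (value choices are placeholders)."""
    parser = argparse.ArgumentParser(
        description="Generate a MARBL namelist / parameter file (sketch)")
    parser.add_argument("--output-file-format", default="namelist",
                        choices=["namelist", "MPAS", "MOM"],
                        help="format of the generated file")
    parser.add_argument("--default-file", required=True,
                        help="XML file containing general defaults")
    parser.add_argument("--non-default-files", nargs="*", default=[],
                        help="additional XML files applied in order")
    parser.add_argument("--user-specified-file", default=None,
                        help="file with user changes from the defaults")
    parser.add_argument("--user-specified-file-format", default=None,
                        help="format of the user file (still undecided)")
    return parser
```

Using `nargs="*"` keeps the order in which the non-default files are given, which matters if they build on each other.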
For better control of diagnostics, what if MARBL did not allocate `field_2d` or `field_3d` but instead left that up to the GCM? Instead of `compute_now` we could just look to see if memory was allocated. (Maybe make them pointers, then check if associated?) This led to the follow-up question about whether there are other parts of the interface that could be updated in this manner (state, forcings, fluxes, tendencies, etc). Conclusion was that it probably doesn't make sense to determine which tracer tendencies are returned in this manner (it's easier to think of tracers in natural groupings than as individual quantities).