Injectable Configuration
Some of the messiest and most complex parts of DSpace revolve around managing non-code state information that configures, controls, or otherwise affects the runtime system. A monolithic single config file has given way to modular files, and new mechanisms (such as 'build.properties') have been added to an already complex system that filters, interpolates, and overrides values from properties files and pom files.
But throughout all these changes, one core characteristic remains the same: all configuration management is part of, and inextricably bound up with, the build process. On the face of it, this makes some sense, since the configuration files are part of the code distribution, and need to be deployed along with the jar artifacts, etc. But in other respects, the model doesn't altogether hold up:
- configuration values generally do not affect the built artifacts themselves (i.e. produce different binaries)
- the build process is needlessly repeated for each configuration just to pass the values through to deployment
- to change any simple configuration value in a runtime system, one must rebuild/redeploy
- there are many configuration values (passwords, API keys, etc.) one would not want to manage alongside code
There are many other cases in which the build-based model is not optimal.
MDS attempts to address many of these issues with a new configuration system. It does not replace traditional configuration management, but complements it in the cases where that makes sense. The system is very simple: for any configuration property (i.e. a name/value pair defined in 'kernel.cfg' or any module '*.cfg' file), one can declare and inject an environment variable that will supersede the value read from the config file at system startup. Injection here simply means that the environment variable is in scope (visible) for the MDS process in question, which may be a command-line tool, a web application, etc. MDS has built-in support to make managing these environment variables easy and seamless. In this model, changing one or more configuration values requires only an application restart, not a rebuild or redeploy. Since the properties declared this way are typically specific to a 'site' (the hostname of the server, for example), we often call them site variables, although the mechanism works for any configuration property.
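The precedence rule can be illustrated with a small shell sketch. This is not how MDS itself resolves configuration; 'resolve_property' is a hypothetical helper, and it assumes a simple 'name = value' line format in the config file. It takes the environment variable name explicitly, prints that variable's value if it is exported, and otherwise falls back to the value in the file.

```shell
# Sketch only: not how MDS itself resolves configuration.
# Usage: resolve_property VAR_NAME PROPERTY CFG_FILE
resolve_property() {
    if value=$(printenv "$1"); then
        # The variable is exported in this process, so it wins.
        printf '%s\n' "$value"
    else
        # Otherwise fall back to the first matching 'name = value' line.
        awk -F= -v key="$2" '
            index($0, "=") > 0 {
                k = $1
                gsub(/^[ \t]+|[ \t]+$/, "", k)
                if (k == key) {
                    v = substr($0, index($0, "=") + 1)
                    gsub(/^[ \t]+|[ \t]+$/, "", v)
                    print v
                    exit
                }
            }
        ' "$3"
    fi
}
```

Because 'printenv' only sees exported variables, a variable that is assigned but not exported would not override the file value in this sketch, which mirrors the requirement that the variable be visible to the process.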
Use site environment variables to store values that should be kept out of regular source control (passwords, API keys, etc.) and/or that are particular to a site or server instance. In this way, all the sensitive or variable details of a deployment can be managed in one or more files kept apart from the application's source control. Note that this means you should create a different environment file for each deployment context: perhaps one for testing ('test.env'), one for production ('prod.env'), and one for each server ('server1.env', etc.). Use descriptive names to help you keep track. To change a value, simply update the environment file and restart the app.
MDS comes with a file called 'setenv.sh' in the 'bin' directory of the kernel module. This is where one declares all the site variables one wants to use. After a deployment, but before starting the application, one would typically copy over it a file containing the values appropriate for that deployment environment (our 'prod.env', for example). When a 'setenv.sh' file is present, MDS automatically loads its contents into the shell used to launch Java (in the 'dspace' shell command); you do not need to do anything else to make the variables visible. Similarly, to inject the variables into Tomcat, just copy the 'setenv.sh' file to 'CATALINA_HOME/bin'. Tomcat will likewise pick up these variables automatically and make them visible to any webapps it runs.
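As a rough sketch of that copy step, the helper below installs a chosen environment file as 'setenv.sh' in a target 'bin' directory. The function name 'install_site_env' and the example paths in the comments are assumptions made for illustration; they are not part of MDS.

```shell
# Hypothetical helper: install a site environment file as setenv.sh.
# Usage: install_site_env ENV_FILE TARGET_BIN_DIR
install_site_env() {
    env_file=$1
    bin_dir=$2
    if [ ! -f "$env_file" ]; then
        echo "environment file not found: $env_file" >&2
        return 1
    fi
    if [ ! -d "$bin_dir" ]; then
        echo "target directory not found: $bin_dir" >&2
        return 1
    fi
    # Overwrite any existing setenv.sh with the chosen environment file.
    cp "$env_file" "$bin_dir/setenv.sh"
}

# Example calls (paths are placeholders for your own layout):
#   install_site_env prod.env /path/to/kernel/bin
#   install_site_env prod.env "$CATALINA_HOME/bin"
```

Restart the application (or Tomcat) afterwards so that the new values are read.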
There is a simple syntactic relationship between environment variable names and configuration property names. For example, suppose you want to inject a new name for your repository (the DSpace property name is 'dspace.name'; in MDS it is called 'site.name') that will override the value in 'dspace.cfg'. You would add these lines to 'setenv.sh':
MDS_SITE_NAME="My shiny new repository"
export MDS_SITE_NAME
In general, for any non-modular property 'foo.bar', the variable name would be 'MDS_FOO_BAR'. A similar rule applies to modular properties: for the 'foo.bar' property in module 'baz', the variable name would be 'MDS_MOD_BAZ_FOO_BAR'. These name transformations serve two purposes: first, they make the variable names conform to standard practice, and second, the 'MDS_' prefix prevents collisions with other environment variables. A few other guidelines when adding your own variables:
- Use uppercase, underscore-separated variable names
- Do not put whitespace around the '=' between the variable name and the value, e.g. NAME=val
- Put double quotes around value strings that have internal spaces or special characters, e.g. NAME="my value"
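The naming rule can be written as a small shell function. 'mds_var_name' is a hypothetical helper, not something shipped with MDS, and it assumes the transformation is exactly what is described above: dots become underscores, letters are uppercased, and the 'MDS_' or 'MDS_MOD_<module>_' prefix is added.

```shell
# Hypothetical helper: derive the site variable name for a property.
# Usage: mds_var_name PROPERTY [MODULE]
mds_var_name() {
    prop=$1
    module=${2:-}
    # Replace dots with underscores, then uppercase.
    name=$(printf '%s' "$prop" | tr '.' '_' | tr '[:lower:]' '[:upper:]')
    if [ -n "$module" ]; then
        mod=$(printf '%s' "$module" | tr '.' '_' | tr '[:lower:]' '[:upper:]')
        printf 'MDS_MOD_%s_%s\n' "$mod" "$name"
    else
        printf 'MDS_%s\n' "$name"
    fi
}

mds_var_name site.name      # prints MDS_SITE_NAME
mds_var_name foo.bar baz    # prints MDS_MOD_BAZ_FOO_BAR
```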
There is another use of site variables besides the property 'override' described above. You may also use site (environment) variables within value strings otherwise defined in standard configuration. This can be called the 'interpolation' use as opposed to the 'override' use. For example, there may be a config property for database connection parameters like:
db.url = jdbc:postgresql://${DB_PORT_5432_TCP_ADDR}:${DB_PORT_5432_TCP_PORT}/dspace
where a pair of site variables would inject the necessary values. This syntax makes it very easy to use the same build/deploy configuration for both test and production: the difference is just the database connected to at runtime.
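To make the interpolation concrete, here is a minimal shell sketch of the substitution. It is an illustration only, not the MDS implementation: 'interpolate' is a hypothetical function, the address and port values are made up, and unset variables are assumed to expand to nothing.

```shell
# Hypothetical helper: replace each ${NAME} in a string with the value of
# the exported environment variable NAME (unset variables become empty).
interpolate() {
    open='${'
    close='}'
    rest=$1
    out=
    while :; do
        case $rest in
            *"$open"*"$close"*)
                out=$out${rest%%"$open"*}    # keep text before the first ${
                rest=${rest#*"$open"}        # drop everything through ${
                name=${rest%%"$close"*}      # variable name up to the next }
                rest=${rest#*"$close"}       # drop everything through }
                out=$out$(printenv "$name" || true)
                ;;
            *)
                break
                ;;
        esac
    done
    printf '%s\n' "$out$rest"
}

# Example values, invented for this sketch:
export DB_PORT_5432_TCP_ADDR=172.17.0.5
export DB_PORT_5432_TCP_PORT=5432
interpolate 'jdbc:postgresql://${DB_PORT_5432_TCP_ADDR}:${DB_PORT_5432_TCP_PORT}/dspace'
# prints jdbc:postgresql://172.17.0.5:5432/dspace
```

Pointing the same configuration at a different database is then just a matter of exporting different values for the two variables before the application starts.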