Modules
Although the 'm' in mds originally signified modernized, it can with almost equal justification mean modular. This page explains what mds modules are, how to create them, and how the mds system handles them.
In the most general functional terms, an mds (DSpace) instance is a set of two or more modules, one of which must be a special module known as the 'kernel'. The kernel contains the content API and an implementation of that API that can produce concrete objects in the content model, e.g. Items, Bitstreams, Communities. It does not, however, contain any application code that uses the implementation. In other words, the kernel is essentially a library that application code can use to produce real repositories. For a real instance to exist, therefore, at least one other module containing application code must be present.
There is no limit to the number of modules an instance may contain and, importantly, modules may be added to an existing instance at any time after it has been created, provided they are compatible with it. Modules in an instance may also be updated. In addition to the core APIs mentioned, the kernel module contains the tools and code that perform these module operations (additions, upgrades, etc.), so it may also be thought of as a bootstrap system (one that can build itself up). As such, the kernel must be the first module installed when creating a DSpace instance.
In technical terms, a module is any maven project that obeys certain conventions and restrictions and follows certain practices. Briefly, these are:
- Module projects must produce one of the natural maven artifacts, viz. a jar or a war (or ear, etc), not a pom project.
- Module projects must use maven artifact IDs that begin with 'dsm-' (DSpace module). This helps the system distinguish them from other projects, and apply special rules to them.
- Module projects must link to and employ a common maven assembly descriptor, which guarantees that the published version (a zip archive containing the module) is uniform. This descriptor is published as an ordinary maven project that can be retrieved from a maven repository.
- Module projects must invoke the maven dependency plugin in a prescribed manner to produce a stored list of dependencies that will be used during module installation and upgrade.
- Module projects must place additional resources in standardized locations, but otherwise use regular maven conventions. (See detailed description below in Module Layout).
- Module projects are strongly encouraged to include the source code (in the standard maven 'src' directory) to permit local customization. Modules without source are considered 'locked', but may still be installed and upgraded. (This feature is intended to accommodate vended, or other controlled-source modules).
Beyond this, there are no particular restrictions on modules. They may use any package names, third party jars, etc. There is, of course, no guarantee that any given version of any module will be compatible with any given instance: this determination is made at installation time, not module build time.
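Two of the conventions above (the 'dsm-' artifact ID prefix and the ban on pom-only projects) are simple enough to check mechanically. The following is only a minimal sketch of such a check; the class and method names are hypothetical and are not part of mds, and the real system may apply these rules differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of mds: checks two of the listed conventions.
final class ModuleConventions {
    // Prefix stated in the conventions above.
    static final String MODULE_PREFIX = "dsm-";

    /** Returns the convention violations found; an empty list means both checks pass. */
    static List<String> check(String artifactId, String packaging) {
        List<String> problems = new ArrayList<>();
        if (artifactId == null || !artifactId.startsWith(MODULE_PREFIX)) {
            problems.add("artifactId must begin with '" + MODULE_PREFIX + "'");
        }
        // Maven treats an omitted <packaging> as jar, so null is accepted here.
        if ("pom".equals(packaging)) {
            problems.add("packaging must be a concrete artifact (jar, war, ...), not pom");
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(check("dsm-example", "jar"));
        System.out.println(check("example", "pom"));
    }
}
```

The remaining conventions (assembly descriptor, dependency plugin, resource locations) concern the build configuration and are not covered by this sketch.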
As mentioned above, modules must organize their content in a very specific way. The basic layout is:
    [bin/]
    [conf/]
        modules/
        emails/
    [db/]
    deps.txt
    lib/
    pom.xml
    [reg/]
    src/
        main/
            java/
            webapp/
In addition to the basic files maven requires (a pom.xml and a src directory), a file called 'deps.txt', generated automatically by the maven dependency plugin, will be present. The directory structure is further expanded to include several optional directories (shown in brackets in the layout), whose names and contents are:
bin : contains OS-specific shell or other executables. Since DSpace encourages the use of the 'script launcher' for most such command-line tools, this directory should rarely be needed.
conf : contains configuration files used by ConfigurationManager/Service. They should obey the same rules as DSpace; in particular, almost all configuration properties files should reside under 'conf/modules'. Other configuration data, like email templates, follow the same rules.
db : contains DDL files (SQL scripts) needed by the module. Since the DDL for the instance is contained in the kernel module, this directory will usually be unused.
reg : contains load files for DSpace registries (these are the files that formerly resided in 'config/registries').
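To make the layout concrete, here is a minimal sketch that inspects an unpacked module directory. It assumes that the unbracketed entries in the layout (deps.txt, lib, pom.xml) are always present and that a module without 'src' is a 'locked' one; the class and method names are hypothetical and do not come from mds.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of mds: inspects an unpacked module directory.
final class ModuleLayout {
    // Assumption: entries shown without brackets in the layout are always present.
    static final List<String> REQUIRED = Arrays.asList("deps.txt", "lib", "pom.xml");
    // Entries shown in brackets are optional directories.
    static final List<String> OPTIONAL = Arrays.asList("bin", "conf", "db", "reg");

    /** Names of required entries that do not exist under the module root. */
    static List<String> missingRequired(Path moduleRoot) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!Files.exists(moduleRoot.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Names of optional directories that are present under the module root. */
    static List<String> presentOptional(Path moduleRoot) {
        List<String> present = new ArrayList<>();
        for (String name : OPTIONAL) {
            if (Files.isDirectory(moduleRoot.resolve(name))) {
                present.add(name);
            }
        }
        return present;
    }

    /** A module distributed without a 'src' directory is what the text calls 'locked'. */
    static boolean isLocked(Path moduleRoot) {
        return !Files.isDirectory(moduleRoot.resolve("src"));
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway module skeleton so the sketch runs on its own.
        Path root = Files.createTempDirectory("dsm-example");
        Files.createFile(root.resolve("pom.xml"));
        Files.createFile(root.resolve("deps.txt"));
        Files.createDirectory(root.resolve("lib"));
        Files.createDirectories(root.resolve("conf").resolve("modules"));

        System.out.println("missing required: " + missingRequired(root));
        System.out.println("optional present: " + presentOptional(root));
        System.out.println("locked: " + isLocked(root));
    }
}
```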
Modules can best be seen from two distinct vantage points: that of the producer/author, and that of the consumer/installer. A rough lifecycle:
A module producer begins with an ordinary maven project (just a pom.xml and src files). This project is enhanced by the addition of the shared assembly (or an equivalent) descriptor (bound to the package phase), and the dependency plugin call (bound to the same phase). When:
mvn package
is run, the assembly process will create an archive (.zip) consisting of the built artifact (jar or war), all required dependencies placed in the 'lib' directory, the 'deps.txt' file, and any special files (conf, reg, etc.), together with the src files and the pom itself. This zip archive (found in the producer's maven 'target' directory) becomes a distribution package for consumers of the module. They simply need to download/acquire the zip archive and unzip it, and they will see the directory structure described above.
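A consumer (or producer) who wants to confirm what a module archive contains before unpacking it can list its top-level entries. The sketch below uses only the standard java.util.zip API; the class name is hypothetical, and whether the real archive wraps its contents in a single top-level directory is not stated here, so the output for an actual module may differ from the layout shown earlier.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of mds: summarizes a module zip archive.
final class ModuleArchive {
    /** Returns the sorted top-level names in the archive; directories end with '/'. */
    static Set<String> topLevelEntries(Path zip) throws IOException {
        Set<String> names = new TreeSet<>();
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                int slash = name.indexOf('/');
                // Keep only the first path segment of each entry.
                names.add(slash < 0 ? name : name.substring(0, slash + 1));
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ModuleArchive <module.zip>");
            return;
        }
        for (String name : topLevelEntries(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```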
The module consumer obtains the module package (zip archive), and installs it into a local DSpace instance. The exact sequence of installation steps will depend on several factors (e.g. whether the consumer wishes to customize the code), but in the simplest (and typical) case, all the consumer will have to do is supply local configuration values. This usually means editing one or more properties files in the 'conf' directory of the module. Once the module is configured, the installation process itself is managed by kernel module tools. Specifically, the tools are available in the 'script-launcher' ('dspace') application, so that the steps are reduced to:
edit conf/modules/themodule.cfg
./dspace install themodule
If a subsequent change in a module needs to be made (e.g. a configuration value), then the consumer edits the version of the module file that was downloaded, and just applies it:
edit conf/modules/themodule.cfg
./dspace update themodule
Note that in these simple cases, mds handles all installation details: neither maven nor ant is required to be present on the consumer's system. If, however, the consumer wishes to modify the module code, they simply make the changes and rebuild:
edit src/main/java/org/dspace/themodule/Foo.java
mvn package
./dspace install themodule
since the module as distributed in the zip archive is a valid and complete maven project.
On the consumer side, it is important to understand how distributed module files relate to resources in the deployed instance. MDS, like DSpace, distinguishes three locations related to an instance: the installation directory, the source directory, and the webapp deployment directory. DSpace documentation tags the first two as '[dspace]' (installation) and '[dspace-source]' (source). We will roughly follow that convention, but call the source directory '[stage]', the installation directory '[install]', and the webapp location or locations '[deploy]'. This change in emphasis is intended to convey that the staging directory need not be a compilation source (it is one only if customization is needed).