mmm-marshall

The module io.github.mmm.marshall (artifactId mmm-marshall) provides the API to marshall (serialize) and unmarshall (deserialize) data from/to structured formats such as JSON, XML, YAML, or even ProtoBuf (gRPC) without implementing your mapping code for multiple formats.

Problem

When you want to do (un)marshalling with Java EE standards the current state is:

JSON-P to map to/from JSON.
JAXB to map to/from XML.
no support for other formats like YAML, gRPC / ProtoBuf, Avro, etc.

As a result you implement your marshalling and unmarshalling code for a single format. If you want to support a different format, you have to start from scratch. Further, the reference implementation for JSON-P is lacking obvious features. E.g. the indentation is hardcoded as you can see here. You can also not configure to include or omit null values. Also, there is no build in support to skip a value and doing so requires an awful lot of code (in case the value might contain nested arrays and/or objects).

Solution

So instead we define a universal API for (un)marshalling. You can write your mapping code once and then you can benefit from all formats supported by this project or even add custom formats as plugin yourself.

Features

Simple but powerful API to marshall and unmarshall your data in a format agnostic way.
Configurable indentation
Configuration to include or omit null values
Works in JVM as well as in the browser using TeaVM or as cloud-native binary using GraalVM.
You can even use mmm-bean to get the entire marshalling and unmarshalling for free with many other cool features.
You can further use mmm-rpc to implement client and/or server for RCP communication with minimum code but maximum benefits (support for different formats, sync/async/reactive support, etc.).

We provide the following implementations:

human readable text formats:
- mmm-marshall-json (native implementation for JSON with flexibilty)
- mmm-marhsall-jsonp (implementation for JSON based on JSON-P)
- mmm-marshall-stax (implementation for XML based on StAX)
- mmm-marshall-tvm-xml (implementation for XML using TeaVM and XML API of the browser)
- mmm-marshall-yaml (implementation for YAML with flexibility)
- mmm-marshall-snakeyaml (implementation for YAML based on SnakeYaml)
highly efficient binary formats:
- mmm-marshall-protobuf (implementation for ProtoBuf/gRPC)

Usage

Maven Dependency:

<dependency>
  <groupId>io.github.m-m-m</groupId>
  <artifactId>mmm-marshall</artifactId>
  <!-- <version>${mmmVersion}</version> -->
</dependency>

Gradle Dependency:

implementation 'io.github.m-m-m:mmm-marshall:${mmmVersion}'

For ${mmmVersion} please fill in the latest version that you can find in the badge above.

Module Dependency:

  requires transitive io.github.mmm.marshall;

Example

To get started we use a very simple example based on a stupid POJO that every vanilla Java developer will immediately understand. Please note that we provide build-in marshalling with mmm-property and mmm-bean. If you decide to use this, you get all this and many other advanced features for free. However, here comes our stupid example:

public class Person implements StructuredIdMappingObject {
  private String name;
  private int age;
  public String getName() { return this.name; }
  public void setName(String name) { this.name = name; }
  public int getAge() { return this.age; }
  public void setAge(int age) { this.age = age; }
  public StructuredIdMapping defineIdMapping() {
    return StructuredIdMapping.of("name", "age"); // only needed for optimal support of binary formats such as gRPC
  }
}

Now, we can marshall a Person like this:

public void marshall(StructuredWriter writer, Person person) {
  writer.writeStartObject(person);
  writer.writeName("name", 1);
  writer.writeValue(person.getName());
  writer.writeName("age", 2);
  writer.writeValue(person.getAge());
  writer.writeEnd();
}

Now we run the following code:

Person person = new Person();
person.setName("John Doe");
person.satAge(42);
StringBuilder sb = new StringBuilder();
StructuredWriter writer = StandardFormat.json().writer(sb);
marshall(writer, person);
String json = sb.toString();
System.out.println(json);

This will print the following output:

{
  "name": "John Doe",
  "age": 42
}

The interesting fact is that you can exchange StandardFormat.json() with something else to get a different format without changing your implementation of marshal. So you can also use StandardFormat.xml() to produce XML or you can generate YAML or even gRPC/ProtoBuf.

To unmarshall a Person you can do something like this:

public void unmarshall(StructuredReader reader, Person person) {
  // for better design and reuse you would typically keep these 3 lines outside of this method
  reader.require(START_OBJECT);
  boolean start = readStartObject(person);
  assert start;

  while (!reader.readEnd()) {
    if (reader.isName("name")) {
      person.setName(reader.readValueAsString());
    } else if (reader.isName("age")) {
      person.setAge(reader.readValueAsInteger());
    } else {
      // ignore unknown property for compatibility
      reader.readName();
      reader.skipValue();
      // we have dynamic properties support in mmm-bean
      // even much better than gRPC generated unknownFields
    }
  }
}

Now you could use it like this:

Person person = new Person();
StructuredReader reader = StandardFormat.json().reader(json);
unmarshall(reader, person);
System.out.println(person.getName() + ":" + person.getAge());

Schemas

Formats like JSON, YAML, or XML are generic and can be used without a schema. These formats are human readable therefore transparent, widely adopted, and fully inter-operable. However, looking at efficiency and performance these formats are rather poor: XML is full of redundancy due to closing tags. But even JSON and partially YAML are not following DRY principle and cause lots of waste and overhead especially for arrays of homogeneous objects:

[
{"longPropertyName":"value1","someNumber":1},
{"longPropertyName":"value2","someNumber":2},
{"longPropertyName":"value3","someNumber":3},
{"longPropertyName":"value4","someNumber":42}
]

Here already CSV would be more efficient:

"longPropertyName";"someNumber"
"value1";1
"value2";2
"value3";3
"value4";42

For advanced performance there are optimized formats that are binary and typically use some schema as metadata shared by the service provider and the consumer. The most fundamental form of such a schema is a mapping from property names to unique IDs and vice versa for an object. So instead of encoding "longPropertyName", you agree to map this property to e.g. the ID 1 and "someNumber" would have Id 2. Already for JSON you can quickly see the benefit in size of the payload. However, with binary formats, you can encode the information much more efficient. This is exactly the purpose of such binary and schema based formats. A disadvantage of such binary formats is that they are not human readable what makes it harder to debug for developers. As with mmm-marshall you get support for all formats for free, you can simply do your own service communication in binary formats for optimum performance. However, if you want to debug something, you can get the same data also as JSON what can also be a good choice for communication between different parties that can not agree on a binary format for arbitrary reasons and prefer the simplicity, transparency, and inter-oparability of JSON. Also a browser may prefer to get the data as JSON what is the natural language of browser technology.

ID mapping

If you want to (also) support binary formats, you need to somehow provide a schema for your data. Therefore you need to consider the following aspects:

We have decided to abstract as much as possible from these technical implications in the API of mmm-marshall. So if you develop mapping code, you simply read or write property names as String.
However, under the hood we then need to map names to IDs and vice versa what happens via the interface StructuredIdMapping.
The only impact is that the methods writeStartObject and readStartObject take a StructuredIdMappingObject as argument what is typically the object to write or read. The interface allows the object to provide its custom StructuredIdMapping. In the example above we have shown how to implement this to get fully portable and optimal results (see defineIdMapping). However, this leads to extra maintenance effort and therefore we give you flexible alternatives.
You may also pass an instance of StructuredIdMapping directly instead of the object to write or read. This can be especially helpful if you need two different marshallings for different representations of the same object.
When creating an according binary structured format, you can also provide your own implementation of StructuredIdMappingProvider with the configuration. It will receive the passed StructuredIdMappingObject instances and returns the according StructuredIdMapping. Here you may also read a .proto or .avro file derived from the type of the object to map and return an according StructuredIdMapping. Also you could even create a global ID mapping by collecting all property names of your entire data-model and allow your marshalling code to passes null as StructuredIdMappingObject.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.adoc

README.adoc

mmm-marshall

Problem

Solution

Features

Usage

Example

Schemas

ID mapping

Files

README.adoc

Latest commit

History

README.adoc

File metadata and controls

mmm-marshall

Problem

Solution

Features

Usage

Example

Schemas

ID mapping