OpenAPI References Architecture #643

darrelmiller (Collaborator) · 2021-10-31T15:25:32Z

From day 1, a design goal of this tooling has been to be able to efficiently process OpenAPI descriptions of almost any size. One way that users of OpenAPI keep down the size of their documents is via the use of reusable components and references to those components.

When loading an OpenAPI description into a DOM, there needs to be some processing done on the references in the document that point to components. There are a few approaches that have been considered:

Proxy object

The minimalist approach would be to represent each reference as an object that points to the reused component object. The downside is that client code has to handle references differently from inline objects. The reference objects would also need to share a common interface with the real object so that they could live in its place. Every client interaction with a referenceable object would have to start with a check to see whether it is real or a reference, and then different code paths would be needed to access that object. Reflecting back on our choice not to take this path, it might have been possible to improve the client code by using something like an IProxy with a value property that returned either the inline object or the referenced object. It would have made for ugly code like doc.paths["/"].value["get"].parameters["id"].value.schema.value.type. However, there are definitely benefits to that approach.
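To make the trade-off concrete, here is a minimal Python sketch of the proxy idea. It is not code from the library: `Proxy`, `Schema`, and the `ref` string are hypothetical names chosen for illustration, and the real IProxy (had it been built) could have looked quite different.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Schema:
    type: str


class Proxy(Generic[T]):
    """Hypothetical wrapper holding either an inline object or a pointer to a shared component."""

    def __init__(self, inline: Optional[T] = None, target: Optional[T] = None,
                 ref: Optional[str] = None):
        self._inline = inline
        self._target = target
        self.ref = ref  # e.g. "#/components/schemas/Id"; None for inline objects

    @property
    def is_reference(self) -> bool:
        return self.ref is not None

    @property
    def value(self) -> Optional[T]:
        # Callers always go through .value, so they never branch on inline vs. referenced.
        return self._target if self.is_reference else self._inline


shared = Schema(type="string")
inline_param = Proxy(inline=Schema(type="integer"))
ref_param = Proxy(target=shared, ref="#/components/schemas/Id")

print(inline_param.value.type)  # integer
print(ref_param.value.type)     # string
```

The cost is visible in the last two lines: every hop through a referenceable object needs an extra `.value`. The benefit is that the reference survives as its own object, separate from the component it points to.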

Inlining

Another approach would be to just clone all referenced objects and insert them into their target context. That would make the developer experience for reading the model simple. However, it could significantly increase the memory footprint for some documents, and it would make edits very complex whenever a re-used component was changed, because every copy would have to be updated.
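A rough sketch of what inlining means, using plain dictionaries rather than any real model. The `inline_refs` helper is hypothetical, and it assumes components do not themselves contain references, so it does not expand nested or cyclic references.

```python
import copy

components = {"Pet": {"type": "object", "properties": {"name": {"type": "string"}}}}


def inline_refs(node, components):
    """Return a copy of node with each {"$ref": ".../X"} replaced by a deep copy of component X."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if ref is not None:
            name = ref.rsplit("/", 1)[-1]
            return copy.deepcopy(components[name])
        return {key: inline_refs(value, components) for key, value in node.items()}
    if isinstance(node, list):
        return [inline_refs(item, components) for item in node]
    return node


paths = {
    "/pets": {"get": {"schema": {"$ref": "#/components/schemas/Pet"}}},
    "/pets/{id}": {"get": {"schema": {"$ref": "#/components/schemas/Pet"}}},
}

inlined = inline_refs(paths, components)
a = inlined["/pets"]["get"]["schema"]
b = inlined["/pets/{id}"]["get"]["schema"]

# Editing one copy does not touch the other copy or the original component.
a["properties"]["name"]["type"] = "integer"
```

After the edit, `a` and `b` disagree about what a Pet is. That is the editing problem in miniature, and each use site also carries its own full copy in memory.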

Replacing the temporary reference object

The approach that is actually implemented in the library involves replacing a temporary reference object with the actual reused object in a second pass after parsing. It produces a cleaner experience where the consumer of the DOM doesn't need to care about whether the description used a reference or an inline object. It is optimized for the read scenario. When the document is first read in using an OpenApiReader, an OpenApiReference object is created for every $ref.

After reading the entire document, the resulting DOM is walked for unresolved references. Each reference object is replaced by the real object defined in the components section. The referenced component object knows that it is something that is referenced, so it can be rendered out either as just a $ref or as a full object. The writers need to maintain some context to know whether they are rendering components or just references to components. Inlining references only requires tweaking this writer behaviour so that it does not write out components, but renders all references to a component as full objects.
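The two passes and the writer context can be sketched roughly as follows. This is an illustration under assumptions, not the library's implementation: `Reference`, `Schema`, `resolve`, and `write` are hypothetical stand-ins, and the document is reduced to a dictionary of operations and components.

```python
class Reference:
    """Temporary placeholder the reader creates for each $ref (hypothetical)."""

    def __init__(self, ref_id):
        self.ref_id = ref_id


class Schema:
    def __init__(self, type_, ref_id=None):
        self.type = type_
        self.ref_id = ref_id  # set when this object is a reusable component


def resolve(doc):
    """Second pass: swap each placeholder for the shared component instance."""
    for op in doc["operations"].values():
        schema = op["schema"]
        if isinstance(schema, Reference):
            op["schema"] = doc["components"][schema.ref_id]


def write_schema(schema, in_components, inline_refs=False):
    # Writer context decides: outside components a referenced object becomes a $ref,
    # inside components (or when inlining) it is rendered in full.
    if schema.ref_id is not None and not in_components and not inline_refs:
        return {"$ref": f"#/components/schemas/{schema.ref_id}"}
    return {"type": schema.type}


def write(doc, inline_refs=False):
    out = {
        "operations": {
            name: {"schema": write_schema(op["schema"], False, inline_refs)}
            for name, op in doc["operations"].items()
        }
    }
    if not inline_refs:  # inlining simply skips the components section
        out["components"] = {
            name: write_schema(schema, True) for name, schema in doc["components"].items()
        }
    return out


doc = {
    "components": {"Pet": Schema("object", ref_id="Pet")},
    "operations": {
        "getPet": {"schema": Reference("Pet")},
        "ping": {"schema": Schema("string")},
    },
}
resolve(doc)
```

After `resolve(doc)`, the `getPet` operation holds the very same instance as the components section, and only the writer's `in_components` flag distinguishes the two renderings.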

This implementation has been one of the reasons I have not been a fan of updating the specification to explicitly allow references to re-use objects that are not components. It would require knowing that an object contains a reference to itself in order to know to render it fully.

The advantage of this approach is that it has the lowest memory footprint and it makes references mostly invisible to consumers of the DOM. This is especially appealing for documents with external references.

There have been challenges, though. The code that determines whether to render a reference or a complete object was tricky to get right. It gets really tricky when components themselves contain references, and we still have one very weird scenario where a component is just a reference. That doesn't happen much in single-document descriptions, but it is not uncommon for external references to be used to create components that the rest of the document then references locally.

While this approach does optimize for reading, it also makes creating references somewhat confusing. When building a DOM, users often think they need to create an instance of an OpenApiReference to reference a component, when in fact all they need to do is re-use the component instance and ensure that it is marked as a resolved reference with a valid Id value. Creating a DOM in memory involves creating a resolved model. In theory you could build an unresolved DOM and then walk the references to resolve them, but that is unnecessary.
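As a small illustration of "just re-use the component instance", here is a sketch with a hypothetical `Schema` class whose `ref_id` field plays the role of the "marked as a resolved reference with a valid Id" flag. The names are made up for this example and are not the library's types.

```python
class Schema:
    def __init__(self, type_, ref_id=None):
        self.type = type_
        # Hypothetical marker: non-None means "render me as a $ref outside components".
        self.ref_id = ref_id


def render(schema, in_components=False):
    if schema.ref_id is not None and not in_components:
        return {"$ref": f"#/components/schemas/{schema.ref_id}"}
    return {"type": schema.type}


# Build the model already resolved: no separate reference object is created.
pet = Schema("object", ref_id="Pet")
doc = {
    "components": {"Pet": pet},
    "operations": {"getPet": {"schema": pet}, "listPets": {"schema": pet}},
}

rendered = {name: render(op["schema"]) for name, op in doc["operations"].items()}
```

Both operations point at the same `pet` object, and each still renders as a `$ref` because the instance carries its own id.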

The OpenAPIWorkspace object is now getting to the point where it is usable. There were issues because resolving references requires reading from files, which is an async operation, so the entire Read process had to be enabled for async, but that is now working. The challenge now is that when rendering a document that has had external references resolved, the references are not being rendered as external. Just as previously we needed context to know whether we were rendering an object that was in the components section, now we need to know whether the object is in the current document being rendered. If document A references a component in document B and we are rendering document A, then the object must be rendered as an external link. However, if we are rendering document B, then the same object should be rendered in full. The exception is when the writer is given a setting that says external references should be inlined, in which case the object should be rendered in full in document A as well. But what does that mean for a local reference in an external object?

It is possible to pass the document context down to the objects being rendered so that they can decide whether a reference needs to be rendered as external or not. However, that is going to require changing the signature of all the SerializeAs methods.
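The decision table described above might look something like this once a document context is available to the rendering code. `Component`, `WriterContext`, and the `home_document` field are assumptions made for this sketch. It also deliberately leaves the open question, a local reference inside an external object, unanswered.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    type: str
    home_document: str  # hypothetical: the document in which this component is defined


@dataclass
class WriterContext:
    current_document: str
    inline_external: bool = False


def render(component, ctx, in_components=False):
    """Render one occurrence of a component while writing ctx.current_document."""
    is_local = component.home_document == ctx.current_document
    if is_local and in_components:
        return {"type": component.type}  # definition site: full object
    if is_local:
        return {"$ref": f"#/components/schemas/{component.name}"}
    if ctx.inline_external:
        return {"type": component.type}  # external target inlined on request
    return {"$ref": f"{component.home_document}#/components/schemas/{component.name}"}


pet = Component(name="Pet", type="object", home_document="b.yaml")

from_a = render(pet, WriterContext("a.yaml"))
from_a_inlined = render(pet, WriterContext("a.yaml", inline_external=True))
in_b_components = render(pet, WriterContext("b.yaml"), in_components=True)
```

The point of the sketch is that the same object needs three different renderings, so whatever renders it has to receive something like `ctx`. That is why the SerializeAs signatures would have to change.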

Future problems

When we attempt to implement OpenAPI 3.1 we are going to run into a new problem. In OpenAPI 3.1 references allow summary and description properties to be changed for a reference. For OpenApiSchema references you can set any properties on the reference. With the current design, we don't have a place to store per-reference data.
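A tiny sketch of why the current design has nowhere to put per-reference data. The `Parameter` class here is hypothetical. The only thing it borrows from the design described above is that, after resolution, every use site holds the same component instance.

```python
class Parameter:
    def __init__(self, name, description, ref_id=None):
        self.name = name
        self.description = description
        self.ref_id = ref_id


shared = Parameter("id", "Resource identifier", ref_id="id")

# After resolution both operations hold the same instance; no per-use-site object survives.
get_pet = {"parameters": [shared]}
get_order = {"parameters": [shared]}

# Trying to record a description override for one $ref only...
get_pet["parameters"][0].description = "Pet identifier"

# ...changes what every other use site (and the component itself) reports.
print(get_order["parameters"][0].description)  # Pet identifier
```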

The question is, should we consider a re-design? Would the proxy solution solve our per-reference data issue? How bad would the .Value experience be? What would the impact be on the "creating a model" experience? Does the proxy model solve our "current" vs "external" document problem? I believe it does, because the proxy object would be in the current document and it would be flagged as external upon reading.
Would the proxy object make the ResolveReferences pass go away because the dereferencing could happen lazily? I suspect we would run into async issues again.

And the bigger question: how do we move forward? 1.3.0 is not the place to do this kind of major change. Can we get 1.3.0 out with a partial Workspaces feature, or do we cut it completely?

peteraritchie · 2021-11-15T20:57:56Z

I agree that some major changes would be required just to support external references. I've been fiddling with that for a while, and realized that in the current architecture it would have to happen in the YAML serializer. This isn't too unexpected, but the serializer has little context about where it is in the document when processing most nodes. I'd argue a YAML serializer shouldn't have knowledge of OpenAPI-specific syntax or semantics anyway.

The way I've been viewing the data in my mind is a graph of nodes. From the standpoint of "reading" the spec, there is one view or one way of traversing the nodes (hiding references). A serializer would have another view of the nodes, where it would care about the references to either inline them or serialize them to another file.

2 replies

darrelmiller Nov 25, 2021
Collaborator Author

My challenge is that I want the DOM to be able to round-trip a related set of documents. To do that I need to preserve all the references. When someone is editing the DOM they need to be editing the pre-resolved DOM. However, for many read scenarios the consumer wants the "effective" object. I had previously considered that we should try to make the distinction fairly transparent, as $ref was really the only "effective" resolution we did. However, pushing path parameters into Operations and allowing servers defined at the operation level to override globally defined servers are two other "resolution mechanisms" that would be useful to implement.

My thinking is evolving to the point where we should create an EffectiveObject() method on every DOM object and do all the resolution in there. This would get rid of the reference resolution step, because resolution would only happen as required. External reference resolution would still need to happen, or we would run into async read issues for external references.
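For the two non-$ref "resolution mechanisms" mentioned above, an on-demand effective view could be sketched like this. It is only an illustration: `effective_operation`, `PathItem`, and `Operation` are hypothetical names, parameters are plain dictionaries, and path-level servers are ignored for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Operation:
    parameters: List[dict] = field(default_factory=list)
    servers: List[dict] = field(default_factory=list)


@dataclass
class PathItem:
    parameters: List[dict]
    operations: Dict[str, Operation]


def effective_operation(doc_servers, path_item, method):
    """Compute the 'effective' view on demand; the stored (round-trippable) model is untouched."""
    op = path_item.operations[method]
    # Operation-level parameters win over path-level ones with the same (name, in) pair.
    merged = {(p["name"], p["in"]): p for p in path_item.parameters}
    merged.update({(p["name"], p["in"]): p for p in op.parameters})
    # Operation-level servers, when present, override the document-level servers.
    return Operation(parameters=list(merged.values()), servers=op.servers or doc_servers)


doc_servers = [{"url": "https://api.example.com"}]
item = PathItem(
    parameters=[{"name": "id", "in": "path"}],
    operations={
        "get": Operation(parameters=[{"name": "limit", "in": "query"}]),
        "put": Operation(servers=[{"url": "https://write.example.com"}]),
    },
)

eff_get = effective_operation(doc_servers, item, "get")
eff_put = effective_operation(doc_servers, item, "put")
```

The stored `item` keeps exactly what was written in the description, which is what round-tripping needs, while `eff_get` and `eff_put` are throwaway views for readers.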

peteraritchie Nov 26, 2021

What's the why, the vision, of this API?
Round-tripping is an excellent attribute of the API.
Read, write; there's one or more things in between.
What will be done between Read and Write that will realize that vision?

darrelmiller (Collaborator, Author) · 2022-07-07T23:08:12Z

As an update, 1.3 has released with partial support for external references. Version 2.0 of the library will most likely be needed to add support for OpenAPI 3.1.

The current idea to enable support for 3.1 and address "proxy references" is to implement derived model objects for each object that supports references. Properties of the objects will become virtual, and accessing a property of the reference model will read either the inline value or the corresponding property of the reference target object. Some experimental code is here: https://github.com/microsoft/OpenAPI.NET/tree/dm/ProxyReference
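In Python terms, where every property can be overridden, the derived-object idea might be sketched as below. This is not taken from the experimental branch: `Parameter` and `ParameterReference` are hypothetical, and only `description` is treated as overridable on the reference.

```python
class Parameter:
    def __init__(self, name=None, description=None):
        self._name = name
        self._description = description

    @property
    def name(self):
        return self._name

    @property
    def description(self):
        return self._description


class ParameterReference(Parameter):
    """Hypothetical derived type: usable wherever a Parameter is, but backed by a target."""

    def __init__(self, target, ref, description=None):
        super().__init__(description=description)
        self._target = target
        self.ref = ref

    @property
    def name(self):
        return self._target.name  # always read through to the target

    @property
    def description(self):
        # An inline value on the reference wins; otherwise fall back to the target.
        if self._description is not None:
            return self._description
        return self._target.description


target = Parameter(name="id", description="Resource identifier")
ref = ParameterReference(target, "#/components/parameters/id", description="Pet identifier")
plain_ref = ParameterReference(target, "#/components/parameters/id")

print(ref.name, "|", ref.description)  # id | Pet identifier
print(plain_ref.description)           # Resource identifier
```

Compared with an explicit proxy, consumers keep writing `param.description` with no `.value` hop, and each reference still has its own object on which per-reference data can live.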
