Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it possible to serve multiple graphql schemas on a node #91

Open
gausby opened this issue Sep 18, 2017 · 8 comments
Open

Make it possible to serve multiple graphql schemas on a node #91

gausby opened this issue Sep 18, 2017 · 8 comments

Comments

@gausby
Copy link
Contributor

gausby commented Sep 18, 2017

Currently it is only possible to serve one GraphQL schema on an Erlang node. It would be useful to be able to run multiple GraphQL schemas on a single node for a couple of reasons:

  • In a HTTP context different schemas could be served from different endpoints
  • In a test context tests could be run in parallel

This would require all the internal lookups to be name-spaced.

@kalta
Copy link

kalta commented Nov 6, 2017

I'm very interested in this possibility.
How is it going? I can help if it is not already being worked on.

Without a deep analysis, I guess it would be a matter of allowing multiple instances of graphql_schema, and updating all related functions to carry the name of the schema instance (an atom, that would be the same as the registered module in my mind).

@jlouis
Copy link
Owner

jlouis commented Nov 6, 2017

We have not started this yet.

Keep one graphql_schema. Note that it maintains an ETS table, and inserts into this table only happens once under bootup, so it isn't performance criticial. ETS lookup happens outside the gen_server. The way this table should be altered is that a key is not Entity but rather {Namespace, Entity}. Default a namespace default which is the one which is used if no other namespace is defined.

  • Alter the graphql_schema to insert under a namespace {default, Entity}
  • Make sure that every lookup uses the default namespace. This should be pushed as a milestone before continuing.
  • Allow the Ctx context to contain a namespace => Namespace binding. Use this in lookup. This might require some code to propagate the Ctx into the right functions as well.

Other things which requires work here:

  • The Schema parser must know to which namespace it should attach the parsed schema.
  • The Schema contains default built-in types and coercers which are injected as part of initialization. This requires a step for the Namespace as well. Note that it could be beneficial if the namespaces could run with different coercers (think: upgrade scenarios).
  • Validation of the schema needs to take the namespace into account.
  • Validation of GraphQL query documents need to take namespace into account.
  • You must handle introspection as well for the namespace. We currently just grab everything in the ETS tables, so some kind of filtering is necessary. Don't fret too much about lookup speed for now.
  • The ?ENUMS ETS table must be handled correctly.

The graphql_schema currently maintains two tables. One is the schema (called ?OBJECTS). The other is a 2ndary index on enum values (called ?ENUMS) mapping to their respective types (to speed up lookup). This must also be maintained. We'd also like an ?INTERFACES schema here which can speed up type checking by quite a lot.

I'd suggest we might want to start by introducing the ?INTERFACES 2ndary index (i.e., a bag type ETS table) first, optimize ?ENUMS, and invert the mapping of values in #enum_type{} before embarking on this task. It doesn't currently have an issue, but it is going to mess in the same region of the code and getting it in before is a somewhat simpler task. Then we can start adding this on top.

If you need, I can write that thing up as an issue to be solved as well, so we can get a PR going for it.

@kalta
Copy link

kalta commented Nov 6, 2017

Thank you for the directions.

I started a first attempt but the needed changes are massive... even on the yrl parser I'm not familiar with...

...no sure if this is the right way,

@jlouis
Copy link
Owner

jlouis commented Nov 7, 2017

Yeah, that approach ain't gonna work in the long run. But it is not in vain in that it spotlights a number of things we want to do to the code base.

I think the right approach is to first do some refactoring runs over the parts that needs change. Then propose those as PRs and get them into the code base. Once that is done, you can probably add namespaces in a fairly small patch in the end because the code has been "phase shifted" into the right direction.

Patch considerations

The main problem with the patch is that it doesn't use Ctx but rather adds a new parameter to all functions. Ctx is a context or environment in compiler-speak. It is a read-only section, which is lexically scoped: you may alter the context for every call to a sub-function, but it is not updated as a State would in a gen_server.

In the system, Ctx is a map of type #{ atom() => term() }. So you would like to bind Ctx#{ namespace => Namespace } at the top and then send the Ctx around in the different phases of the compiler. Whenever you have a schema lookup, you can then do foo(#{ namespace := Namespace } = Ctx, Path, ...) -> lookup(Namespace, Ty) ... and so on. This avoids having to add a new parameter to all the functions[0].

Now, as written above, there are situations where Ctx is not pushed into a subfunction call. This should probably be done where necessary.

However: Before embarking on the above, there are at least three areas of the code which will need refactoring. This should be done first in PRs before attempting the namespace changes in order to cut down such a PR to a manageable size. The idea is that what we cannot reach in one commit/PR, we can reach over several.

  • Remove the lookup in the .yrl grammar.

Rationale: The grammar file does a lookup on the schema. This is bad form and a hack. It builds a tight coupling between the schema system and the parser where they could have been isolated. Rather than doing a lookup on a type Ty, store {scalar, Ty} in the abstract syntax tree. Then when graphql_elaborate runs, handle {scalar, Ty} by looking it up in the schema and replace it with the #scalar_type{} value from the schema. The rest of the system is unchanged. This avoids the need to have a Namespace propagated into the yecc (.yrl) grammar altogether.

  • Rename the badly worded name Context in the graphql_elaborate system.

Rationale: I recently needed to know in which "context" a directive is used. I used the variable name Context for that, but the name is a bad one. It conflicts with Ctx. Rename the variable to Location which is a synonym from the spec. It simplifies a later patch where Ctx needs to be pushed down into the elaborator (for the above lookup maps:get(namespace, Ctx)).

  • Alter the graphql_schema:lookup/1 to call into graphql_schema:lookup/2.

Rationale: Prepare the schema system for multiple namespaces. Define

-module(grpahql_schema).

....

lookup(Ty) -> lookup(Ty, default). % Forward to the namespace variant

lookup(Ty, Namespace) ->
    % new code here

such that it works with the old code, but allow the graphql_schema system to support multiple namespaces before using it in the rest of the system. It is likely that you will have to do some work here to pass tests:

  • We need some tests for the namespacing construction, isolated as unit tests to the schema system.
  • Introspection is likely to need alterations in order to pass tests.
  • Schema parsing is likely to need alterations too.

Other considerations

We may want to think about the API up front. In the tutorial, we have this section

https://github.com/shopgun/graphql-erlang-tutorial/blob/0acb50661242c8cbd47cf9396ccfb99df0b648c4/apps/sw_web/src/sw_web_graphql_handler.erl#L127-L130

In the code. Clearly, the system will call with a context Ctx#{ namespace => star_wars } or something such if you don't want to use the default namespace here. We may want to think about how the proposal alters the documentation for the Cowboy Handler as well (see this section: https://shopgun.github.io/graphql-erlang-tutorial/#cowboy-handler). Being up-front about the API change might help us with understanding how to change the code (If we can't document it easily, the change isn't that good as a first benchmark).

Final remarks

Feel free to ask more questions if something is unclear. But from what I could skim out of your patch, this is probably going to provide you with a path that is easier to tread.

In short: split it into multiple PRs and handle each section by itself, while keeping the system without functional changes. Once every refactoring is in, make the "real" patch which should now be easy to write since everything is phased correctly.

[0] Another way of saying it: the Ctx provides a Reader monad for the computation.

@goertzenator
Copy link
Contributor

goertzenator commented Apr 12, 2018

Instead of introducing namespaces, could this app be modified to run multiple instances of itself? One approach would be:

  • Remove globally named ets tables, tuck the table ids into graphql_schema genserver state.
  • Adjust graphql:xxx() API to accept a graphql_schema genserver pid (or atom).
  • Existing graphql:xxx() API can be preserved by having a default graphql_schema genserver.

This might be less intrusive than namespaces, and also gives the ability to start and stop different graphql endpoints at will.

@jlouis
Copy link
Owner

jlouis commented Apr 13, 2018

I think your approach would work too, and I'll happily take that approach as well. Essentially, you could just run one gen_server per schema in the supervisor tree and it should work out. Perhaps we'd have to stuff the target Pid into the Ctx and thread that around a bit, but that is probably all that is needed for making this work.

goertzenator added a commit to goertzenator/graphql-erlang that referenced this issue Apr 13, 2018
- Multiple endpoints may now run independently each with their own schema process.
- Global context has been replaced with `endpoint_context` type containing schema PID and ets tables.
- New API in `graphql` with `ep_` prefix (ex `ep_execute()`).  These take an `endpoint_context`.
- New API in `graphql` with `p_` prefix (ex `p_execute()`).  These take a schema PID/atom.  This is most useful when you have a named schema process that can be
referred to by its name atom.
- Existing API wraps the `ep_` API and uses the named schema process `graphql_default_endpoint`
- `endpoint_context` is passed inside `Ctx` maps.  The variable `EP` is added when `endpoint_context` is needed but no `Ctx` is available.
- ct tests pass
- addresses case jlouis#91
@Premwoik
Copy link

Hi! This discussion was quite long ago, but I have some questions. Anything new on this topic? Do you still want to have support for multiple schemas? Pull request #160 looks promising, but I see it was abandoned. @jlouis Do the latest changes in the repo make it easier to add such functionality?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants