Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Routing based on multiple keys #690

Open
winterec opened this issue Jun 30, 2024 · 0 comments
Open

Routing based on multiple keys #690

winterec opened this issue Jun 30, 2024 · 0 comments

Comments

@winterec
Copy link

Describe the bug
I'm tinkering with a fully horizontally distributed application using Snowflake IDs described by Instagram: https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c

In my application there is no interaction between accounts, so when an account it first created it is assigned to a logical shard, and all future dependent objects for that account will also be located in that shard.

I've been able to make some very small patches to SPQR to extract the shard from the snowflake ID, and to add my snowflake hash function to the parser grammar. Overall it's working very well and it's able to intelligently route SELECTs to a single shard when given an ID, but also route to all shards when no shard is provided.

Multiple Keys

I'm adding multiple distribution keys on the same table, but in my use case they will always all refer to the same shard. These generally seem to resolve to a MultiMatchState in the code and error out. When I've patched out the errors those queries are working in my use case.

There are some notes in the code that the multi-shard routing is a TODO, but I would like to suggest that perhaps a first version of this could be to route to the single shard when the multiple calls to the hash function all resolve to the same single shard.

I can give this a go myself if that would be useful?

Insert statements

I'm having issues with queries of the INSERT INTO table (a, b, c) VALUES ($1, $2, $3); type, where both b and c are distribution keys. It seems that this is not supported. But when I skip past that error for debugging I'm seeing an additional issue where an empty input seems to be provided to the hash function.

Syntax errors in psql built-in queries.

Lastly, some psql commands like \dt seem to be failing. The work when run directly against the database but throw a syntax error when run through spqr-router.

example:

postgres=# \dt
ERROR:  client processing error: error processing query 'SELECT n.nspname as "Schema",
  c.relname as "Name",
  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 't' THEN 'TOAST table' WHEN 'f' THEN 'foreign table' WHEN 'p' THEN 'partitioned table' WHEN 'I' THEN 'partitioned index' END as "Type",
  pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
     LEFT JOIN pg_catalog.pg_am am ON am.oid = c.relam
WHERE c.relkind IN ('r','p','')
      AND n.nspname <> 'pg_catalog'
      AND n.nspname !~ '^pg_toast'
      AND n.nspname <> 'information_schema'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;': syntax error on pos 363, tx status IDLE

Additional context
Currently working off latest master, 214dc2d
Postgresql 16.3 both server and client

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant