Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing the Env Prefix separator to be configurable #122

Open
brendanodwyer opened this issue Sep 17, 2024 · 7 comments
Open

Changing the Env Prefix separator to be configurable #122

brendanodwyer opened this issue Sep 17, 2024 · 7 comments

Comments

@brendanodwyer
Copy link

Is your feature request related to a problem? Please describe.
It's not related to a problem but more of a design decision. Currently the tool enforces double underscores for separating environment prefixes and roles names.

This enforces the database names and roles to follow a specific naming standard to be able to use the tool, which is awesome by the way, but a lot of environments don't follow this standard.

Describe the solution you'd like
I would like to have an optional command line flag, something like --identifier-separator or --env-separator that we could supply that would use a different default separator.

Our environment and previous environments I have worked in that use snowflake have almost always used a single underscore for separation, so it would be awesome to have support for this.

Describe alternatives you've considered
Forking the codebase entirely and adding in the changes, but this seems like overkill. This seems like a configuration detail that could be opted into by development teams that would make the tool more flexible and get more adoption.

@littleK0i
Copy link
Owner

littleK0i commented Sep 17, 2024

Single underscore might work in isolated and controlled environment, but it does not work at large scale with more complex names. Let me show you.

For example, let's say we have database ALICE with schema BOB_CLOE. But we also have database ALICE_BOB with schema CLOE. If we use single underscore, schema roles in both cases will be named ALICE_BOB_CLOE_S_ROLE, which causes naming clash.

With env_prefix it is the same. For example, let's say we have databases REVENUECAT and OLD_REVENUECAT. If we use env_prefix OLD with single underscore, it will cause naming clash if we have environment with prefix and environment without prefix on the same account. Currently it is possible to mix & match.

Another problem is parsing of identifiers. Double undrescore can be used to split complex identifier into multiple parts and parse these identifiers reliably. With only single underscore parsing is not possible.

Now the question is ... can we use some other characters? In theory, we could use $ dollar sign. Snowflake allows us to use a single dollar sign for identifiers without quotes. For example OLD$REVENUECAT. It seems to work initially, but it creates other problems:

  • only one "dollar" can be used per identifier, if identifier already had a dollar, adding one more is not possible;
  • some other tools might break when encountering $ in names, but underscore is pretty much guaranteed to work, since it is very common;

As far as I know, any other delimiter would force users to use double-quotes around identifiers, which ruins usability.

I've tested various approaches to this problem, including configurable delimiter, but it simply did not work well enough. At this moment I am not aware of any other truly reliable options for multi-part identifiers. But I am happy to dicuss it and adapt if community finds a better option eventually.

@littleK0i
Copy link
Owner

littleK0i commented Sep 18, 2024

I did some extra testing for $.

Interestingly enough, it seems Snowflake adjusted the rules a bit. Now it is possible to use multiple $ dollar-signs in identifiers without double-quotes. But not all systems are updated to take this into account. Most notably, it breaks code highlight in Snowflake Web GUI.

image

When all the small issues are sorted, $ could become a pretty good env prefix separator eventually.

@brendanodwyer
Copy link
Author

I definitely understand the parsing of identifiers argument, that wasn't something that I had taken into consideration.

I can see how in a larger environment that there could be name clashes where you have similar names for different parts of the identifier, especially in the case of a mono-repo.

In the use case i'm envisioning for my team, we would be using the tool to manage databases independently of each other instead of having a mono-repo with everything in it, so the name overlaps don't seem like that large of an issue for us, additionally we currently try to avoid using the names of developers when we create things, preferring ID's instead of names.

We have been looking for a DCM tool for our snowflake environment, and this tool seems to be one of the better options out there. Our biggest challenge is that we have a lot of existing pieces that we have to consider and the double underscore identifier requirement is really the only limiting factor for us adopting this tool.

@littleK0i
Copy link
Owner

Could you tell a little bit more about specific use case?

Is it like this:

DEV1_ALICE_DB
DEV1_BOB_DB
DEV2_ALICE_DB
DEV2_BOB_DB

Or is it closer to this:

DEV1_SOMETHING
DEV2_COMPLETELY_DIFFERENT

Maybe there is no need for env_prefix, and natural database names can be used instead. The process which applies config can collect folders from individual develoeprs or other stake holders, merge everything together and apply in one go.

It might be a better option due to:

  • Ability to set hooks and automatic validation;
  • Fully controlled deployment process;
  • Centralized history of changes;
  • Ability to create cross-database grants, if necessary;

Env prefix is mainly for debugging and short-lived disposable dev environments. It is primarily intended to prevent name clashing.

@brendanodwyer
Copy link
Author

Apologies for the delay in response.

So what we want is to have the same database name across environments, but prefixed with the environment name.

We use names like this in our environment:

DEV_DATALAKE_RAW
UAT_DATALAKE_RAW
PROD_DATALAKE_RAW

DEV_DATALAKE
UAT_DATALAKE
PROD_DATALAKE

Almost all of our databases follow this naming scheme where we have an environment prefix in the front.

@littleK0i
Copy link
Owner

littleK0i commented Oct 7, 2024

@brendanodwyer , I'll try to add this option in the next major update.

Current protections around env_prefix can be removed. All prefix building logic can go to app instead of config. The only big question is maintaining backwards compatibility for people who built some custom automation around prefixes and Config object.

@littleK0i
Copy link
Owner

littleK0i commented Oct 13, 2024

@brendanodwyer , please check version 0.34.0 or higher.

I've added CLI option --env-prefix-separator, which accepts one of three pre-defined choices:

  • __ - double underscore, default
  • _ - single underscore
  • $ - dollar sign

Alternatively, ENV variable SNOWFLAKE_ENV_PREFIX_SEPARATOR might be used instead of CLI argument.

I suspect there is some risk in using single underscore, but SnowDDL ownership checks should protect from most obvious problems.

Please give it a try and let me know if it suits your needs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants