Commit 2201955

Merge branch 'main' into hectorcast-db/automatic-test-trigger
hectorcast-db authored Oct 24, 2024
2 parents 68c67a2 + 03659b6 commit 2201955
Showing 18 changed files with 643 additions and 47 deletions.
2 changes: 1 addition & 1 deletion docs/guides/unity-catalog.md
@@ -262,7 +262,7 @@ resource "aws_iam_policy" "external_data_access" {
resource "aws_iam_role" "external_data_access" {
name = local.uc_iam_role
-   assume_role_policy  = data.aws_iam_policy_document.this.json
+   assume_role_policy  = data.databricks_aws_unity_catalog_assume_role_policy.this.json
managed_policy_arns = [aws_iam_policy.external_data_access.arn]
tags = merge(var.tags, {
Name = "${local.prefix}-unity-catalog external access IAM role"
38 changes: 28 additions & 10 deletions docs/resources/alert.md
@@ -13,15 +13,15 @@ resource "databricks_directory" "shared_dir" {
}
# This will be replaced with new databricks_query resource
resource "databricks_sql_query" "this" {
data_source_id = databricks_sql_endpoint.example.data_source_id
name = "My Query Name"
query = "SELECT 42 as value"
parent = "folders/${databricks_directory.shared_dir.object_id}"
resource "databricks_query" "this" {
warehouse_id = databricks_sql_endpoint.example.id
display_name = "My Query Name"
query_text = "SELECT 42 as value"
parent_path = databricks_directory.shared_dir.path
}
resource "databricks_alert" "alert" {
-   query_id     = databricks_sql_query.this.id
+   query_id     = databricks_query.this.id
display_name = "TF new alert"
parent_path = databricks_directory.shared_dir.path
condition {
@@ -77,7 +77,11 @@ In addition to all the arguments above, the following attributes are exported:

## Migrating from `databricks_sql_alert` resource

- Under the hood, the new resource uses the same data as the `databricks_sql_alert`, but is exposed via a different API. This means that we can migrate existing alerts without recreating them. This operation is done in few steps:
+ Under the hood, the new resource uses the same data as the `databricks_sql_alert`, but is exposed via a different API. This means that we can migrate existing alerts without recreating them.

+ -> It's also recommended to migrate to the `databricks_query` resource - see [databricks_query](query.md) for more details.

+ This operation is done in a few steps:

* Record the ID of existing `databricks_sql_alert`, for example, by executing the `terraform state show databricks_sql_alert.alert` command.
* Create the code for the new implementation by performing the following changes:
@@ -109,7 +113,7 @@ we'll have a new resource defined as:

```hcl
resource "databricks_alert" "alert" {
-   query_id     = databricks_sql_query.this.id
+   query_id     = databricks_query.this.id
display_name = "My Alert"
parent_path = databricks_directory.shared_dir.path
condition {
@@ -179,6 +183,20 @@ resource "databricks_permissions" "alert_usage" {
}
```

## Access Control

[databricks_permissions](permissions.md#sql-alert-usage) can control which groups or individual users can *Manage*, *Edit*, *Run* or *View* individual alerts.

```hcl
resource "databricks_permissions" "alert_usage" {
sql_alert_id = databricks_alert.alert.id
access_control {
group_name = "users"
permission_level = "CAN_RUN"
}
}
```

## Import

This resource can be imported using alert ID:
@@ -191,6 +209,6 @@

terraform import databricks_alert.this <alert-id>

The following resources are often used in the same context:

- * [databricks_sql_query](sql_query.md) to manage Databricks SQL [Queries](https://docs.databricks.com/sql/user/queries/index.html).
- * [databricks_sql_endpoint](sql_endpoint.md) to manage Databricks SQL [Endpoints](https://docs.databricks.com/sql/admin/sql-endpoints.html).
+ * [databricks_query](query.md) to manage [Databricks SQL Queries](https://docs.databricks.com/sql/user/queries/index.html).
+ * [databricks_sql_endpoint](sql_endpoint.md) to manage [Databricks SQL Endpoints](https://docs.databricks.com/sql/admin/sql-endpoints.html).
* [databricks_directory](directory.md) to manage directories in [Databricks Workspace](https://docs.databricks.com/workspace/workspace-objects.html).
2 changes: 1 addition & 1 deletion docs/resources/custom_app_integration.md
@@ -15,7 +15,7 @@ resource "databricks_custom_app_integration" "this" {
redirect_urls = ["https://example.com"]
scopes = ["all-apis"]
token_access_policy {
-     access_token_ttl_in_minutes = %s
+     access_token_ttl_in_minutes = 15
refresh_token_ttl_in_minutes = 30
}
}
5 changes: 2 additions & 3 deletions docs/resources/job.md
@@ -224,14 +224,14 @@ One of the `query`, `dashboard` or `alert` needs to be provided.

* `warehouse_id` - (Required) ID of the [databricks_sql_endpoint](sql_endpoint.md) that will be used to execute the task. Only Serverless & Pro warehouses are supported right now.
* `parameters` - (Optional) (Map) parameters to be used for each run of this task. The SQL alert task does not support custom parameters.
- * `query` - (Optional) block consisting of single string field: `query_id` - identifier of the Databricks SQL Query ([databricks_sql_query](sql_query.md)).
+ * `query` - (Optional) block consisting of single string field: `query_id` - identifier of the Databricks Query ([databricks_query](query.md)).
* `dashboard` - (Optional) block consisting of following fields:
* `dashboard_id` - (Required) (String) identifier of the Databricks SQL Dashboard [databricks_sql_dashboard](sql_dashboard.md).
* `subscriptions` - (Optional) a list of subscription blocks consisting out of one of the required fields: `user_name` for user emails or `destination_id` - for Alert destination's identifier.
* `custom_subject` - (Optional) string specifying a custom subject of email sent.
* `pause_subscriptions` - (Optional) flag that specifies if subscriptions are paused or not.
* `alert` - (Optional) block consisting of following fields:
- * `alert_id` - (Required) (String) identifier of the Databricks SQL Alert.
+ * `alert_id` - (Required) (String) identifier of the Databricks Alert ([databricks_alert](alert.md)).
* `subscriptions` - (Optional) a list of subscription blocks consisting out of one of the required fields: `user_name` for user emails or `destination_id` - for Alert destination's identifier.
* `pause_subscriptions` - (Optional) flag that specifies if subscriptions are paused or not.
* `file` - (Optional) block consisting of single string fields:
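
As an illustration of how these `sql_task` fields fit together, here is a minimal, hedged sketch of a job with a single SQL query task; the job, task, and referenced resource names are illustrative only:

```hcl
resource "databricks_job" "sql_report" {
  name = "Daily SQL report"

  task {
    task_key = "run_query"

    sql_task {
      # warehouse that executes the query (Serverless or Pro)
      warehouse_id = databricks_sql_endpoint.example.id

      # reference the query by its ID
      query {
        query_id = databricks_query.this.id
      }
    }
  }
}
```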
@@ -372,7 +372,6 @@ This block describes the queue settings of the job:
* `periodic` - (Optional) configuration block to define a trigger for Periodic Triggers consisting of the following attributes:
* `interval` - (Required) Specifies the interval at which the job should run. This value is required.
* `unit` - (Required) Options are {"DAYS", "HOURS", "WEEKS"}.

* `file_arrival` - (Optional) configuration block to define a trigger for [File Arrival events](https://learn.microsoft.com/en-us/azure/databricks/workflows/jobs/file-arrival-triggers) consisting of following attributes:
* `url` - (Required) URL to be monitored for file arrivals. The path must point to the root or a subpath of the external location. Please note that the URL must have a trailing slash character (`/`).
* `min_time_between_triggers_seconds` - (Optional) If set, the trigger starts a run only after the specified amount of time passed since the last time the trigger fired. The minimum allowed value is 60 seconds.
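
For context, a hedged sketch of wiring a periodic trigger into a job (the interval and names are illustrative; task definitions are omitted for brevity):

```hcl
resource "databricks_job" "scheduled_job" {
  name = "Periodically triggered job"

  # task definitions omitted for brevity

  trigger {
    # run the job every 4 hours
    periodic {
      interval = 4
      unit     = "HOURS"
    }
  }
}
```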
3 changes: 2 additions & 1 deletion docs/resources/pipeline.md
@@ -80,7 +80,8 @@ The following arguments are supported:
* `photon` - A flag indicating whether to use Photon engine. The default value is `false`.
* `serverless` - An optional flag indicating if serverless compute should be used for this DLT pipeline. Requires `catalog` to be set, as it could be used only with Unity Catalog.
* `catalog` - The name of catalog in Unity Catalog. *Change of this parameter forces recreation of the pipeline.* (Conflicts with `storage`).
- * `target` - The name of a database (in either the Hive metastore or in a UC catalog) for persisting pipeline output data. Configuring the target setting allows you to view and query the pipeline output data from the Databricks UI.
+ * `target` - (Optional, String, Conflicts with `schema`) The name of a database (in either the Hive metastore or in a UC catalog) for persisting pipeline output data. Configuring the target setting allows you to view and query the pipeline output data from the Databricks UI.
+ * `schema` - (Optional, String, Conflicts with `target`) The default schema (database) where tables are read from or published to. The presence of this attribute implies that the pipeline is in direct publishing mode.
* `edition` - optional name of the [product edition](https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-concepts.html#editions). Supported values are: `CORE`, `PRO`, `ADVANCED` (default). Not required when `serverless` is set to `true`.
* `channel` - optional name of the release channel for Spark version used by DLT pipeline. Supported values are: `CURRENT` (default) and `PREVIEW`.
* `budget_policy_id` - optional string specifying ID of the budget policy for this DLT pipeline.
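
A hedged sketch of a Unity Catalog DLT pipeline that uses the new `schema` attribute (direct publishing mode); the catalog, schema, and notebook references are illustrative:

```hcl
resource "databricks_pipeline" "this" {
  name       = "Example DLT pipeline"
  catalog    = "main"     # Unity Catalog catalog; conflicts with `storage`
  schema     = "dlt_demo" # direct publishing mode; conflicts with `target`
  serverless = true

  library {
    notebook {
      path = databricks_notebook.dlt_demo.path
    }
  }
}
```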
195 changes: 195 additions & 0 deletions docs/resources/query.md
@@ -0,0 +1,195 @@
---
subcategory: "Databricks SQL"
---
# databricks_query Resource

This resource allows you to manage [Databricks SQL Queries](https://docs.databricks.com/en/sql/user/queries/index.html). It supersedes the [databricks_sql_query](sql_query.md) resource - see the migration guide below for more details.

## Example Usage

```hcl
resource "databricks_directory" "shared_dir" {
path = "/Shared/Queries"
}
resource "databricks_query" "this" {
warehouse_id = databricks_sql_endpoint.example.id
display_name = "My Query Name"
query_text = "SELECT 42 as value"
parent_path = databricks_directory.shared_dir.path
}
```

## Argument Reference

The following arguments are available:

* `query_text` - (Required, String) Text of SQL query.
* `display_name` - (Required, String) Name of the query.
* `warehouse_id` - (Required, String) ID of a SQL warehouse which will be used to execute this query.
* `parent_path` - (Optional, String) The path to a workspace folder containing the query. The default is the user's home folder. If changed, the query will be recreated.
* `owner_user_name` - (Optional, String) Query owner's username.
* `apply_auto_limit` - (Optional, Boolean) Whether to apply a 1000 row limit to the query result.
* `catalog` - (Optional, String) Name of the catalog where this query will be executed.
* `schema` - (Optional, String) Name of the schema where this query will be executed.
* `description` - (Optional, String) General description that conveys additional information about this query such as usage notes.
* `run_as_mode` - (Optional, String) Sets the "Run as" role for the object.
* `tags` - (Optional, List of strings) Tags that will be added to the query.
* `parameter` - (Optional, Block) Query parameter definition. Consists of following attributes (one of `*_value` is required):
* `name` - (Required, String) Literal parameter marker that appears between double curly braces in the query text.
* `title` - (Optional, String) Text displayed in the user-facing parameter widget in the UI.
* `text_value` - (Block) Text parameter value. Consists of following attributes:
* `value` - (Required, String) - actual text value.
* `numeric_value` - (Block) Numeric parameter value. Consists of following attributes:
* `value` - (Required, Double) - actual numeric value.
* `date_value` - (Block) Date query parameter value. Consists of following attributes (Can only specify one of `dynamic_date_value` or `date_value`):
* `date_value` - (String) Manually specified date-time value
* `dynamic_date_value` - (String) Dynamic date-time value based on current date-time. Possible values are `NOW`, `YESTERDAY`.
* `precision` - (Optional, String) Date-time precision to format the value into when the query is run. Possible values are `DAY_PRECISION`, `MINUTE_PRECISION`, `SECOND_PRECISION`. Defaults to `DAY_PRECISION` (`YYYY-MM-DD`).
* `date_range_value` - (Block) Date-range query parameter value. Consists of following attributes (Can only specify one of `dynamic_date_range_value` or `date_range_value`):
* `date_range_value` - (Block) Manually specified date-time range value. Consists of the following attributes:
* `start` (Required, String) - begin of the date range.
* `end` (Required, String) - end of the date range.
* `dynamic_date_range_value` - (String) Dynamic date-time range value based on current date-time. Possible values are `TODAY`, `YESTERDAY`, `THIS_WEEK`, `THIS_MONTH`, `THIS_YEAR`, `LAST_WEEK`, `LAST_MONTH`, `LAST_YEAR`, `LAST_HOUR`, `LAST_8_HOURS`, `LAST_24_HOURS`, `LAST_7_DAYS`, `LAST_14_DAYS`, `LAST_30_DAYS`, `LAST_60_DAYS`, `LAST_90_DAYS`, `LAST_12_MONTHS`.
* `start_day_of_week` - (Optional, Int) Specifies the day that starts the week.
* `precision` - (Optional, String) Date-time precision to format the value into when the query is run. Possible values are `DAY_PRECISION`, `MINUTE_PRECISION`, `SECOND_PRECISION`. Defaults to `DAY_PRECISION` (`YYYY-MM-DD`).
* `enum_value` - (Block) Dropdown parameter value. Consists of following attributes:
* `enum_options` - (String) List of valid query parameter values, newline delimited.
* `values` - (Array of strings) List of selected query parameter values.
* `multi_values_options` - (Optional, Block) If specified, allows multiple values to be selected for this parameter. Consists of following attributes:
* `prefix` - (Optional, String) Character that prefixes each selected parameter value.
* `separator` - (Optional, String) Character that separates each selected parameter value. Defaults to a comma.
* `suffix` - (Optional, String) Character that suffixes each selected parameter value.
* `query_backed_value` - (Block) Query-based dropdown parameter value. Consists of following attributes:
* `query_id` - (Required, String) ID of the query that provides the parameter values.
* `values` - (Array of strings) List of selected query parameter values.
* `multi_values_options` - (Optional, Block) If specified, allows multiple values to be selected for this parameter. Consists of following attributes:
* `prefix` - (Optional, String) Character that prefixes each selected parameter value.
* `separator` - (Optional, String) Character that separates each selected parameter value. Defaults to a comma.
* `suffix` - (Optional, String) Character that suffixes each selected parameter value.
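
To illustrate the `parameter` block, here is a hedged sketch of a query with a dropdown (`enum_value`) parameter that allows multiple selections; the query text and option values are illustrative:

```hcl
resource "databricks_query" "filtered" {
  warehouse_id = databricks_sql_endpoint.example.id
  display_name = "Orders by status"
  query_text   = "SELECT * FROM orders WHERE status IN ({{ status }})"
  parent_path  = databricks_directory.shared_dir.path

  parameter {
    name  = "status"
    title = "Order status"

    enum_value {
      # newline-delimited list of allowed values
      enum_options = "open\nshipped\ncancelled"
      values       = ["open", "shipped"]

      # quote each selected value and join with commas for the IN (...) clause
      multi_values_options {
        prefix    = "'"
        suffix    = "'"
        separator = ","
      }
    }
  }
}
```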

## Attribute Reference

In addition to all the arguments above, the following attributes are exported:

* `id` - unique ID of the created Query.
* `lifecycle_state` - The workspace state of the query. Used for tracking trashed status. (Possible values are `ACTIVE` or `TRASHED`).
* `last_modifier_user_name` - Username of the user who last saved changes to this query.
* `create_time` - The timestamp string indicating when the query was created.
* `update_time` - The timestamp string indicating when the query was updated.

## Migrating from `databricks_sql_query` resource

Under the hood, the new resource uses the same data as the `databricks_sql_query`, but it is exposed via a different API. This means that we can migrate existing queries without recreating them. This operation is done in a few steps:

* Record the ID of existing `databricks_sql_query`, for example, by executing the `terraform state show databricks_sql_query.query` command.
* Create the code for the new implementation by performing the following changes:
* the `name` attribute is now named `display_name`
* the `parent` attribute (if it exists) is renamed to `parent_path`, and its value should be converted from `folders/object_id` to the actual path.
* Blocks that specify values in the `parameter` block were renamed (see above).

For example, if we have the original `databricks_sql_query` defined as:

```hcl
resource "databricks_sql_query" "query" {
data_source_id = databricks_sql_endpoint.example.data_source_id
query = "select 42 as value"
name = "My Query"
parent = "folders/${databricks_directory.shared_dir.object_id}"
parameter {
name = "p1"
title = "Title for p1"
text {
value = "default"
}
}
}
```

we'll have a new resource defined as:

```hcl
resource "databricks_query" "query" {
warehouse_id = databricks_sql_endpoint.example.id
query_text = "select 42 as value"
display_name = "My Query"
parent_path = databricks_directory.shared_dir.path
parameter {
name = "p1"
title = "Title for p1"
text_value {
value = "default"
}
}
}
```

### For Terraform version >= 1.7.0

Terraform 1.7 introduced the [removed](https://developer.hashicorp.com/terraform/language/resources/syntax#removing-resources) block in addition to the [import](https://developer.hashicorp.com/terraform/language/import) block introduced in Terraform 1.5. Together they make import and removal of resources easier, avoiding manual execution of `terraform import` and `terraform state rm` commands.

With Terraform 1.7+, the migration looks like the following:

* Remove the old query definition and replace it with the new one.
* Adjust references, such as in `databricks_permissions`.
* Add `import` and `removed` blocks like this:

```hcl
import {
to = databricks_query.query
id = "<query-id>"
}
removed {
from = databricks_sql_query.query
lifecycle {
destroy = false
}
}
```

* Run the `terraform plan` command to check possible changes, such as value type change, etc.
* Run the `terraform apply` command to apply changes.
* Remove the `import` and `removed` blocks from the code.

### For Terraform version < 1.7.0

* Remove the old query definition and replace it with the new one.
* Remove the old resource from the state with the `terraform state rm databricks_sql_query.query` command.
* Import new resource with the `terraform import databricks_query.query <query-id>` command.
* Adjust references, such as in `databricks_permissions`.
* Run the `terraform plan` command to check possible changes, such as value type change, etc.
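
Combined into shell commands, the two state operations from these steps look like this (`<query-id>` is the ID recorded earlier):

```bash
# drop the old resource from state without destroying the underlying query
terraform state rm databricks_sql_query.query

# adopt the existing query under the new resource type
terraform import databricks_query.query <query-id>
```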

## Access Control

[databricks_permissions](permissions.md#sql-query-usage) can control which groups or individual users can *Manage*, *Edit*, *Run* or *View* individual queries.

```hcl
resource "databricks_permissions" "query_usage" {
sql_query_id = databricks_query.query.id
access_control {
group_name = "users"
permission_level = "CAN_RUN"
}
}
```

## Import

This resource can be imported using query ID:

```bash
terraform import databricks_query.this <query-id>
```

## Related Resources

The following resources are often used in the same context:

* [databricks_alert](alert.md) to manage [Databricks SQL Alerts](https://docs.databricks.com/en/sql/user/alerts/index.html).
* [databricks_sql_endpoint](sql_endpoint.md) to manage [Databricks SQL Endpoints](https://docs.databricks.com/sql/admin/sql-endpoints.html).
* [databricks_directory](directory.md) to manage directories in [Databricks Workspace](https://docs.databricks.com/workspace/workspace-objects.html).