Refactor python config handling #830

Merged: 41 commits, Oct 22, 2024
Changes from 38 commits
Commits
6a17257
Add support for Serverless jobs / refactor api usage (#706)
benc-db Jun 24, 2024
3cf4aba
Merge branch 'main' into 1.9.latest
benc-db Jun 25, 2024
4795063
Cleanup test warnings (#713)
benc-db Jun 25, 2024
bcab2a4
Merge branch 'main' into 1.9.latest
benc-db Jul 1, 2024
855bb5e
Fix dbt seed command error when seed file is partially defined in the…
kass-artur Jul 8, 2024
17433f3
Readd external type (#728)
benc-db Jul 9, 2024
8e88384
Upgrade to PySQL 3.2.0 (#729)
benc-db Jul 9, 2024
3577f85
Merge branch 'main' into 1.9.latest
benc-db Jul 26, 2024
708cf60
Merge branch 'main' into 1.9.latest
benc-db Aug 12, 2024
a35918b
Extend Merge Capabilities (#739)
mi-volodin Aug 19, 2024
a2b8ce9
Merge branch 'main' into 1.9.latest
benc-db Aug 20, 2024
b8486d1
Forward porting latest 1.8 changes into 1.9 branch (#788)
benc-db Sep 13, 2024
98177fe
Upgrade PySql to 3.4.0 (#790)
benc-db Sep 13, 2024
092296b
Add custom constraint option (#792)
roydobbe Sep 16, 2024
dcbeb0c
Merge branch 'main' into 1.9.latest
benc-db Sep 19, 2024
60b487d
Merge branch 'main' into 1.9.latest
benc-db Sep 25, 2024
41c164e
Behavior: Get column info from information_schema Part I (#808)
benc-db Sep 27, 2024
3412461
Simple Iceberg support (#815)
benc-db Oct 3, 2024
103f1b1
Merge branch 'main' into 1.9.latest
benc-db Oct 4, 2024
7e6b450
fix merge issue
benc-db Oct 4, 2024
74c7862
Merge branch 'main' into 1.9.latest
benc-db Oct 10, 2024
0e821b0
Draft: #756 - implement python workflow submissions (#762)
kdazzle Oct 10, 2024
00dd9f8
Behavior for external path (#823)
benc-db Oct 11, 2024
d0378d2
Implement microbatch incremental strategy (#825)
benc-db Oct 15, 2024
c93e376
wip
benc-db Oct 16, 2024
8e2c4a9
Merge branch 'main' into refactor_python_config
benc-db Oct 16, 2024
48ea076
wip
benc-db Oct 16, 2024
e16cbd9
try refactoring
benc-db Oct 17, 2024
8af88aa
fix some things
benc-db Oct 17, 2024
7ade86c
is it fixed?
benc-db Oct 17, 2024
8e8f1ac
fix last test failure
benc-db Oct 17, 2024
6e24da9
rename config
benc-db Oct 17, 2024
30622fe
first pydantic refactor
benc-db Oct 18, 2024
327fa16
remove unsafe assert
benc-db Oct 18, 2024
8864f4e
another fix attempt
benc-db Oct 18, 2024
96d0033
add factory methods, doc
benc-db Oct 18, 2024
d608847
adding unit tests
benc-db Oct 21, 2024
7872676
changelog
benc-db Oct 21, 2024
a4d8800
Merge branch 'main' into refactor_python_config
benc-db Oct 22, 2024
8c2fff8
address Jacky's comments
benc-db Oct 22, 2024
6d226be
additional nits addressed
benc-db Oct 22, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -20,6 +20,7 @@

### Under the Hood

- Significant refactoring and increased testing of python_submissions ([830](https://github.com/databricks/dbt-databricks/pull/830))
- Fix places where we were not properly closing cursors, and other test warnings ([713](https://github.com/databricks/dbt-databricks/pull/713))
- Upgrade databricks-sql-connector dependency to 3.4.0 ([790](https://github.com/databricks/dbt-databricks/pull/790))

57 changes: 57 additions & 0 deletions dbt/adapters/databricks/python_models/python_config.py
@@ -0,0 +1,57 @@
from typing import Any, Dict, List, Optional
import uuid
from pydantic import BaseModel, Field


DEFAULT_TIMEOUT = 60 * 60 * 24


class PythonJobConfig(BaseModel):
    """Pydantic model for config found in python_job_config."""

    name: Optional[str] = None
    grants: Dict[str, List[Dict[str, str]]] = Field(exclude=True, default_factory=dict)
    existing_job_id: str = Field("", exclude=True)
    post_hook_tasks: List[Dict[str, Any]] = Field(exclude=True, default_factory=list)
    additional_task_settings: Dict[str, Any] = Field(exclude=True, default_factory=dict)

    class Config:
        extra = "allow"


class PythonModelConfig(BaseModel):
    """
    Pydantic model for a Python model configuration.
    Includes some job-specific settings that are not yet part of PythonJobConfig.
    """

    user_folder_for_python: bool = False
    timeout: int = Field(DEFAULT_TIMEOUT, gt=0)
    job_cluster_config: Dict[str, Any] = Field(default_factory=dict)
    access_control_list: List[Dict[str, str]] = Field(default_factory=list)
    packages: List[str] = Field(default_factory=list)
    index_url: Optional[str] = None
    additional_libs: List[Dict[str, Any]] = Field(default_factory=list)
    python_job_config: Optional[PythonJobConfig] = None
    cluster_id: Optional[str] = None
    http_path: Optional[str] = None
    create_notebook: bool = False


class ParsedPythonModel(BaseModel):
    """Pydantic model for a Python model parsed from a dbt manifest"""

    catalog: str = Field("hive_metastore", alias="database")

    # Schema is a reserved name in Pydantic
    schema_: str = Field("default", alias="schema")

    identifier: str = Field(alias="alias")
    config: PythonModelConfig

    @property
    def run_name(self) -> str:
        return f"{self.catalog}-{self.schema_}-{self.identifier}-{uuid.uuid4()}"

    class Config:
        allow_population_by_field_name = True
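
For readers following the diff, the key-to-field mapping in `ParsedPythonModel` can be illustrated with a dependency-free stand-in. This is only a sketch: `ParsedModelSketch` and `from_manifest` are hypothetical names that are not part of this PR, and it uses a plain dataclass instead of pydantic, so it performs no validation. It mirrors the aliases (`database`, `schema`, `alias`), the defaults (`hive_metastore`, `default`), and the `run_name` format shown above.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ParsedModelSketch:
    """Hypothetical plain-Python stand-in for ParsedPythonModel (no validation)."""

    identifier: str
    catalog: str = "hive_metastore"
    schema_: str = "default"

    @classmethod
    def from_manifest(cls, node: Dict[str, Any]) -> "ParsedModelSketch":
        # Manifest keys map to field names the same way the pydantic aliases do:
        # "database" -> catalog, "schema" -> schema_, "alias" -> identifier.
        return cls(
            identifier=node["alias"],  # required, as in the pydantic model
            catalog=node.get("database", "hive_metastore"),
            schema_=node.get("schema", "default"),
        )

    @property
    def run_name(self) -> str:
        # A fresh UUID is generated on every access, so each call yields a new name.
        return f"{self.catalog}-{self.schema_}-{self.identifier}-{uuid.uuid4()}"


model = ParsedModelSketch.from_manifest({"alias": "my_model", "schema": "analytics"})
print(model.run_name)
```

Because `run_name` is a property that calls `uuid.uuid4()` each time, two reads of it on the same object differ; code that needs a stable name should read it once and keep the result.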