-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Arun sqc interop #5
base: Feature_Branch_for_QC
Are you sure you want to change the base?
Conversation
6594669
to
741b6d0
Compare
.github/workflows/ci.yml
Outdated
@@ -697,6 +697,16 @@ jobs: | |||
- run: python -m pytest modin/tests/pandas/dataframe/test_reduce.py | |||
- run: python -m pytest modin/tests/pandas/dataframe/test_udf.py | |||
- run: python -m pytest modin/tests/pandas/dataframe/test_window.py | |||
- run: python -m pytest modin/tests/pandas/interop/test_binary.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- run: python -m pytest modin/tests/pandas/interop/test_binary.py | |
- run: python -m pytest modin/tests/native_df_mode/pandas/interop/test_binary.py |
I envision we could implement a native df mode for polars so let's change the test directory stucture a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
""" | ||
Module contains ``QueryCompilerTypeCaster`` class. | ||
|
||
``QueryCompilerTypeCaster`` is used for va. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extend the docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -585,7 +586,6 @@ class NativeQueryCompiler(BaseQueryCompiler): | |||
_shape_hint: Optional[str] | |||
|
|||
def __init__(self, pandas_frame, shape_hint: Optional[str] = None): | |||
assert NativeDataframeMode.get() == "Pandas" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now constructor of NativeQueryCompiler can be also called when NativeDataframeMode.get() is default, Irrespective of NativeDataframeMode when we call this constructor in operations where the left operand is a NativeQueryCompiler we construct a new query compiler of type NativeQueryCompiler
Fn = TypeVar("Fn", bound=Any) | ||
|
||
|
||
class QueryCompilerTypeCaster: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class QueryCompilerTypeCaster: | |
class QueryCompilerCaster: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
return arguments | ||
|
||
|
||
def apply_argument_casting(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def apply_argument_casting(): | |
def apply_argument_cast(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
A decorator function. | ||
""" | ||
|
||
def decorator(obj: Fn) -> Fn: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably use functools.wrap(obj) to keep properties of a function correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now added functools to cast_args the function returned from this decorator, from what I understand functools wrapping is required only for the wrapper method returned from the decorator and the decorator by itself does not need functools.wraps()
elif isinstance(obj, staticmethod): | ||
return staticmethod(decorator(obj.__func__)) | ||
|
||
def cast_args(*args: Tuple, **kwargs: Dict) -> Any: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -2993,9 +2993,8 @@ def _create_or_update_from_compiler( | |||
DataFrame or None | |||
None if update was done, ``DataFrame`` otherwise. | |||
""" | |||
assert ( | |||
isinstance(new_query_compiler, type(self._query_compiler)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new_query_compiler can also be an instance of a different query compiler(Nativequerycompiler or PandasQueryCompiler),
Eg during operations like insert in the _create_or_update_from_compiler the constructor gets called directly eg ,
Thus it would be normal for the cases where constructor may return a new query compiler for a case where user changes the query compiler mode between creating data_frame and insert operation
current_qc = None | ||
if isinstance(args[0], BaseQueryCompiler): | ||
current_qc = args[0] | ||
|
||
if current_qc: | ||
args = cast_nested_args_to_current_qc_type(args, current_qc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
current_qc = None | |
if isinstance(args[0], BaseQueryCompiler): | |
current_qc = args[0] | |
if current_qc: | |
args = cast_nested_args_to_current_qc_type(args, current_qc) | |
if isinstance(args[0], BaseQueryCompiler): | |
current_qc = args[0] | |
args = cast_nested_args_to_current_qc_type(args, current_qc) |
What about kwargs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
def decorator(obj: Fn) -> Fn: | ||
"""Decorate function or class to cast all arguments that are query comilers to the current query compiler""" | ||
if isinstance(obj, type): | ||
seen: Dict[Any, Any] = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this dict? Is there a case when we may not get into except KeyError
branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was to handle cases where for same function different aliases exists in basequerycompiler, This is no longer required and no cases in both the query compilers that would go to except so removing this.
Signed-off-by: Wei Chen <wei@runllm.com>
* FEAT-modin-project#7331: Initial Polars API This commit adds a polars namespace to Modin, and the DataFrame and Series objects and their respective APIs. This doesn't include error handling and is still missing several polars features: * LazyFrame * Expressions * String, Temporal, Struct, and other Series accessors * Several parameters * Operators that we don't have query compiler methods for * e.g. sin, cos, tan, etc. Those will be handled in a future PR. Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Lint Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * flake8 Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * isort Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * headers Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * forgot one Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Add test Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * header Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * isort Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Add to CI Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * fix name Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Update modin/polars/base.py Co-authored-by: Mahesh Vashishtha <mahesh.vashishtha@snowflake.com> * address comments Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * polars 1 Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Update for polars 1.x and fix some hacks Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Remove hax Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Black Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Address comments Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Lint Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> * Address comment Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> --------- Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com> Co-authored-by: Devin Petersohn <devin.petersohn@snowflake.com> Co-authored-by: Mahesh Vashishtha <mahesh.vashishtha@snowflake.com>
…er (modin-project#7348) Signed-off-by: Jonathan Shi <jhshi07@gmail.com>
…odin-project#7352) Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
…Dataset. (modin-project#7353) Signed-off-by: sfc-gh-mvashishtha <mahesh.vashishtha@snowflake.com>
…n-project#7354) Signed-off-by: sfc-gh-mvashishtha <mahesh.vashishtha@snowflake.com>
…er (modin-project#7356) Signed-off-by: arunjose696 <arunjose696@gmail.com>
…odin-project#7358) Signed-off-by: Devin Petersohn <devin.petersohn@snowflake.com>
f788a20
to
68a51ef
Compare
…amic partitioning (modin-project#7369) Signed-off-by: Kirill Suvorov <kirill.suvorov@intel.com> Co-authored-by: Anatoly Myachev <anatoliimyachev@mail.com>
…ervice, pin to 5.0.13 (modin-project#7374) Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>
Co-authored-by: Igoshev, Iaroslav <iaroslav.igoshev@intel.com> Signed-off-by: arunjose696 <arunjose696@gmail.com>
…ply` (modin-project#7338) Signed-off-by: Kirill Suvorov <kirill.suvorov@intel.com>
Signed-off-by: arunjose696 <arunjose696@gmail.com>
Signed-off-by: arunjose696 <arunjose696@gmail.com>
Signed-off-by: Igoshev, Iaroslav <iaroslav.igoshev@intel.com>
…rialized_dtypes to query compiler layer as in the code in multiple places the methods of private _modin_frame were used
Signed-off-by: Igoshev, Iaroslav <iaroslav.igoshev@intel.com>
…fferent query compilers
Co-authored-by: Iaroslav Igoshev <Poolliver868@mail.ru>
cd70589
to
9af55f2
Compare
Signed-off-by: arunjose696 <arunjose696@gmail.com>
9af55f2
to
80ce01c
Compare
Co-authored-by: Iaroslav Igoshev <Poolliver868@mail.ru> Co-authored-by: Anatoly Myachev <anatoliimyachev@mail.com>
94e8e48
to
af0a7ba
Compare
Signed-off-by: arunjose696 <arunjose696@gmail.com>
04b7f60
to
03d137c
Compare
Signed-off-by: arunjose696 <arunjose696@gmail.com>
03d137c
to
c1b0942
Compare
Signed-off-by: arunjose696 <arunjose696@gmail.com>
136bf1e
to
f488872
Compare
What do these changes do?
Adds interoperablity between different query compiler by introducing a decorator function to do type casting of all query compiler arguments to the type of left operand
result=qc_native.op(pandas_qc) # This would result in result being of type native_qc
result=pandas_qc.op(qc_native) # This would result in result being of type pandas_qc
The tests for interoperablity have been added by modifying tests in modin/tests/pandas/dataframe/ folder such that operations would be performed across dataframes with different query compilers.
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst
is up-to-date