Expose isUnquotedPathCharacter for validation #375
Commits on Sep 18, 2023
-
Add support for CAST(DECIMAL as VARCHAR) (facebookincubator#6210)
Summary: To improve performance, the raw string buffer is pre-allocated and written to directly instead of creating intermediate strings. `std::to_chars` converts integers into a character string by successively filling the range. Rather than calculating the precise buffer size from intermediate strings, we pre-allocate a sufficient buffer based on an estimate from the decimal precision and scale, and set the precise size after all characters are written.

The previous implementation used `DecimalUtil::toString`, which produced many intermediate strings during conversion and was also called to calculate the string buffer size. The optimized implementation uses `std::to_chars` to convert integers to strings and avoids all intermediate strings.

As the benchmarks below show, the optimized implementation is about 3.8x to 4.6x faster than the previous one.

| Cast from decimal to varchar benchmark | cast##cast_short_decimal | cast##cast_long_decimal |
| -- | -- | -- |
| previous (DecimalUtil::toString) | 45.43ms | 132.09ms |
| optimized (std::to_chars) | 9.87ms | 35.00ms |

Pull Request resolved: facebookincubator#6210
Reviewed By: xiaoxmeng
Differential Revision: D49315826
Pulled By: mbasmanova
fbshipit-source-id: 1f419aa9edcb080752c3bed567d390cc7a461cce
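The approach described in this summary can be illustrated with a minimal sketch. This is not the Velox code: `decimalToVarchar` and its signature are hypothetical, it handles only values that fit in 64 bits, and it assumes the unscaled value really fits the stated precision. It pre-allocates an upper-bound buffer from precision and scale, writes digits in place with `std::to_chars`, and trims to the exact size at the end.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Velox API): formats an unscaled decimal value.
// Assumes |unscaled| has at most `precision` digits.
std::string decimalToVarchar(int64_t unscaled, int precision, int scale) {
  assert(scale >= 0 && scale <= precision && precision <= 18);

  // Size estimate: sign + leading "0" + decimal point + `precision` digits.
  std::string out(static_cast<std::size_t>(precision) + 3, '\0');
  char* pos = out.data();
  char* const end = out.data() + out.size();

  // Work on the magnitude as an unsigned value.
  uint64_t magnitude = static_cast<uint64_t>(unscaled);
  if (unscaled < 0) {
    *pos++ = '-';
    magnitude = 0 - magnitude;
  }

  uint64_t pow10 = 1;
  for (int i = 0; i < scale; ++i) {
    pow10 *= 10;
  }

  // Integer part, written straight into the output buffer.
  pos = std::to_chars(pos, end, magnitude / pow10).ptr;

  if (scale > 0) {
    *pos++ = '.';
    const uint64_t fraction = magnitude % pow10;
    // Count the digits of the fraction so it can be zero-padded on the left.
    int fractionDigits = 1;
    for (uint64_t v = fraction; v >= 10; v /= 10) {
      ++fractionDigits;
    }
    for (int i = fractionDigits; i < scale; ++i) {
      *pos++ = '0';
    }
    pos = std::to_chars(pos, end, fraction).ptr;
  }

  // Set the precise size once everything has been written.
  out.resize(static_cast<std::size_t>(pos - out.data()));
  return out;
}
```

Under these assumptions, no temporary `std::string` is created: the only allocation is the output buffer itself.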
Commit 4d9c0ea
-
Fix the linking problem of simdjson (facebookincubator#6565)
Summary: When velox was used as a third-party library and `SIMDJsonExtractor` was used, it failed when running json function tests. We found that `-DSIMDJSON_THREADS_ENABLED=1` was not configured when generating libvelox_functions_json.a. We fix it by changing "simdjson" to "simdjson::simdjson" in target_link_libraries. Fixes facebookincubator#6564 Pull Request resolved: facebookincubator#6565 Reviewed By: Yuhta Differential Revision: D49285542 Pulled By: kgpai fbshipit-source-id: f9bc093b278288a2a73bbb289bb91b5dd7061097
Commit d7a6875
-
Fix potential crash in DwrfReader::updateColumnNamesFromTableSchema(). (facebookincubator#6599)
Summary: Pull Request resolved: facebookincubator#6599 When the type kinds are not equal and one of them is non-primitive, we would crash accessing a null type pointer after a dynamic cast. The fix is to bail out of descending the type tree whenever the type kinds differ. The bug crept in when we replaced throw() by log(1) in the type checking code. Reviewed By: Yuhta Differential Revision: D49338549 fbshipit-source-id: 987f1df62016f68d7796f40c0aedfcd1becf5f1e
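The failure mode and the fix can be sketched with a toy type tree. Every name below (`Kind`, `Type`, `ScalarType`, `RowType`, `updateColumnNames`) is a hypothetical stand-in, not the Velox or DWRF reader API. The point is only that the kind check must come before the `dynamic_cast`, so the walk never dereferences a null cast result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified type tree used only for illustration.
enum class Kind { kBigint, kVarchar, kRow };

struct Type {
  virtual ~Type() = default;
  virtual Kind kind() const = 0;
};

struct ScalarType : Type {
  explicit ScalarType(Kind k) : kind_(k) {}
  Kind kind() const override { return kind_; }
  Kind kind_;
};

struct RowType : Type {
  Kind kind() const override { return Kind::kRow; }
  std::vector<std::string> names;               // one name per child
  std::vector<std::shared_ptr<Type>> children;  // same length as names
};

// Copies field names from the table schema onto the file schema.
void updateColumnNames(Type& fileType, const Type& tableType) {
  // Bail out as soon as the kinds differ: casting below would yield nullptr
  // for whichever side is not a row.
  if (fileType.kind() != tableType.kind()) {
    return;
  }
  if (fileType.kind() != Kind::kRow) {
    return;  // primitive types have no children to rename
  }
  auto* fileRow = dynamic_cast<RowType*>(&fileType);
  auto* tableRow = dynamic_cast<const RowType*>(&tableType);
  assert(fileRow != nullptr && tableRow != nullptr);

  const std::size_t n =
      std::min(fileRow->children.size(), tableRow->children.size());
  for (std::size_t i = 0; i < n; ++i) {
    fileRow->names[i] = tableRow->names[i];
    updateColumnNames(*fileRow->children[i], *tableRow->children[i]);
  }
}
```

In this sketch a mismatch stops the descent for that subtree only, while sibling columns are still processed.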
Commit 15ca5f2
-
pass down the scan table schema type to parquet column reader (facebookincubator#6404)
Summary: pass down the scan table schema to parquet column reader Details: currently, the `requestedType`, which is available in [ParquetColumnReader.cpp](https://github.com/facebookincubator/velox/blob/517e3e3a0c8308c96ca068444dfeee37204f7773/velox/dwio/parquet/reader/ParquetColumnReader.cpp#L37C60-L37C68), is set based on the schema present in the parquet file (file data type) instead of the scan table schema. The issue occurs when the expected output of the TableScan differs from the schema of the parquet file. Spark's data format for some types differs from Parquet's format. Similar to schema evolution, when the type differs, Spark performs an implicit conversion. The conversions that Spark performs can be seen in [ParquetVectorUpdaterFactory.java](https://github.com/apache/spark/blob/6ca45c52b7416e7b3520dc902cb24f060c7c72dd/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetVectorUpdaterFactory.java#L67C3-L185C6). This PR fixes the issue by passing `requestedType`, set from the scan table schema data type, to the parquet column reader. It is a follow-up to facebookincubator#5786 that addresses the issue according to Yuhta's comments; see facebookincubator#5786 for the detailed context. This update is one of the modifications necessary for issue facebookincubator#5770. Pull Request resolved: facebookincubator#6404 Reviewed By: pedroerp Differential Revision: D49330580 Pulled By: Yuhta fbshipit-source-id: bd56bda6efd708691ee35b5b66d5ba9536df525f
Commit b803a26
-
Fix Lead/Lag window function for int64 offset (facebookincubator#6463)
Summary: Fixes facebookincubator#6417 Pull Request resolved: facebookincubator#6463 Reviewed By: amitkdutta Differential Revision: D49371431 Pulled By: mbasmanova fbshipit-source-id: 8956b04abe608bfcb76b0a3b49cefd0689284bb2
Commit aeeab89
-
Add ability to serialize input vector and sql if expression evaluation crashes (facebookincubator#6402)
Summary: Pull Request resolved: facebookincubator#6402 This adds an experimental flag 'experimental_velox_save_input_on_fatal_signal' that, when set to true, serializes the input vector data and all the SQL expressions in the ExprSet that is currently executing whenever a fatal signal is encountered. Enabling this flag makes the signal handler async-signal-unsafe, so it should only be used for debugging purposes. Reviewed By: kgpai Differential Revision: D48891649 fbshipit-source-id: 47722d726c76a8602cf436c1840d2a0d720e2c35
Commit 8e4b1cb
-
Return reference to shared_ptr in INTERVAL_DAY_TIME(), INTERVAL_YEAR_MONTH() and DATE() to avoid copying (facebookincubator#6615)
Summary: Pull Request resolved: facebookincubator#6615 This removes unnecessary copying in INTERVAL_DAY_TIME(), INTERVAL_YEAR_MONTH() and DATE() calls, which returned a copy of a constant shared_ptr, making them very expensive. Reviewed By: Yuhta, bikramSingh91 Differential Revision: D49347369 fbshipit-source-id: 6930970d9f2807347b16065fc224d7a7f5f57b69
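The cost difference comes from the reference count: returning a `std::shared_ptr` by value creates a new owner on every call (an atomic increment and a later decrement), while returning a const reference to a long-lived instance does not. The sketch below shows the two shapes side by side. `DateType`, `dateByValue`, `dateByRef` and `demoRefCounts` are hypothetical names for illustration, not the Velox declarations.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a singleton type class.
struct DateType {};

// By value: each call copies the shared_ptr and bumps the reference count.
std::shared_ptr<const DateType> dateByValue() {
  static const auto instance = std::make_shared<const DateType>();
  return instance;
}

// By const reference: callers see the same static shared_ptr, no copy made.
const std::shared_ptr<const DateType>& dateByRef() {
  static const auto instance = std::make_shared<const DateType>();
  return instance;
}

// Small demonstration of the difference in ownership.
void demoRefCounts() {
  auto copy = dateByValue();      // a second owner now exists
  assert(copy.use_count() == 2);

  const auto& ref = dateByRef();  // still only the static owner
  assert(ref.use_count() == 1);
  assert(&ref == &dateByRef());   // every call yields the same object
}
```

Returning a reference is safe here only because the pointed-to `shared_ptr` has static storage duration and outlives every caller.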
Commit cccc8ad
Commits on Sep 19, 2023
-
Add S3 Filesink (facebookincubator#6309)
Summary: Pull Request resolved: facebookincubator#6309 Reviewed By: xiaoxmeng Differential Revision: D49394977 Pulled By: pedroerp fbshipit-source-id: ba5fa3dda474505093d7d9d2f00aaa8c3d2d7e81
Commit 4addc9a
-
Commit bd9ab9c