Expose isUnquotedPathCharacter for validation #375

Closed
wants to merge 9 commits into from

Commits on Sep 18, 2023

  1. Add support for CAST(DECIMAL as VARCHAR) (facebookincubator#6210)

    Summary:
    To improve the performance, instead of creating intermediate strings, the
    raw string buffer is pre-allocated to be written directly. Function `std::to_chars`
    is used to convert integers into a character string by successively filling
    the range. On buffer allocation, instead of calculating the precise size
    from intermediate strings, we pre-allocate sufficient buffer based on an
    estimation with decimal precision and scale, and set the precise size after all
    strings are written.
    
    An alternative implementation used `DecimalUtil::toString`, which produced many
    intermediate strings during conversion and was also called to calculate the
    string buffer size. The optimized implementation uses `std::to_chars` to convert
    integers to strings, avoids all intermediate strings, and estimates the string
    buffer size from the decimal precision and scale. As the benchmarks below show,
    the optimized implementation is about 4.6x faster for short decimals and about
    3.8x faster for long decimals.
    
    Cast from decimal to varchar benchmark | cast##cast_short_decimal | cast##cast_long_decimal
    -- | -- | --
    previous (DecimalUtil::toString) | 45.43ms | 132.09ms
    optimized (std::to_chars) | 9.87ms | 35.00ms
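
The approach described in this summary can be sketched roughly as follows. This is a minimal illustration, not the Velox implementation: `decimalToString` and `powerOfTen` are hypothetical names, only short decimals (precision up to 18) are handled, and the output format (a leading `0` before the decimal point) is an assumption. The buffer is sized from precision and scale up front, digits are written in place with `std::to_chars`, and the string is trimmed to the exact size at the end.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: 10^n for 0 <= n <= 18.
inline uint64_t powerOfTen(int n) {
  uint64_t result = 1;
  for (int i = 0; i < n; ++i) {
    result *= 10;
  }
  return result;
}

// Hypothetical sketch: formats an unscaled short decimal as text.
std::string decimalToString(int64_t unscaled, int precision, int scale) {
  assert(precision >= 1 && precision <= 18);
  assert(scale >= 0 && scale <= precision);

  // Use the unsigned magnitude so negation cannot overflow.
  const uint64_t magnitude = unscaled < 0
      ? 0 - static_cast<uint64_t>(unscaled)
      : static_cast<uint64_t>(unscaled);
  assert(magnitude < powerOfTen(precision));

  // Estimated upper bound: sign + leading zero + decimal point + digits.
  std::string out(static_cast<size_t>(precision) + 3, '\0');
  char* pos = out.data();
  char* const end = out.data() + out.size();

  if (unscaled < 0) {
    *pos++ = '-';
  }
  if (scale == 0) {
    pos = std::to_chars(pos, end, magnitude).ptr;
  } else {
    const uint64_t divisor = powerOfTen(scale);
    pos = std::to_chars(pos, end, magnitude / divisor).ptr;
    *pos++ = '.';
    // Write the fraction, then left-pad it with zeros up to `scale` digits.
    char* const fractionEnd = std::to_chars(pos, end, magnitude % divisor).ptr;
    const size_t written = static_cast<size_t>(fractionEnd - pos);
    const size_t padding = static_cast<size_t>(scale) - written;
    if (padding > 0) {
      std::memmove(pos + padding, pos, written);
      std::memset(pos, '0', padding);
    }
    pos += scale;
  }
  // Set the precise size once everything has been written.
  out.resize(static_cast<size_t>(pos - out.data()));
  return out;
}
```

Under these assumptions, `decimalToString(12345, 5, 2)` yields `123.45`. Only the returned string is allocated, and no temporary strings are created.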
    
    Pull Request resolved: facebookincubator#6210
    
    Reviewed By: xiaoxmeng
    
    Differential Revision: D49315826
    
    Pulled By: mbasmanova
    
    fbshipit-source-id: 1f419aa9edcb080752c3bed567d390cc7a461cce
    rui-mo authored and facebook-github-bot committed Sep 18, 2023
    4d9c0ea
  2. Fix the linking problem of simdjson (facebookincubator#6565)

    Summary:
    When velox was used as a third-party library and `SIMDJsonExtractor` was used, it failed when running json function tests. We found that `-DSIMDJSON_THREADS_ENABLED=1` was not configured when generating libvelox_functions_json.a. We fix it by changing "simdjson" to "simdjson::simdjson" in target_link_libraries.
    
    Fixes facebookincubator#6564
    
    Pull Request resolved: facebookincubator#6565
    
    Reviewed By: Yuhta
    
    Differential Revision: D49285542
    
    Pulled By: kgpai
    
    fbshipit-source-id: f9bc093b278288a2a73bbb289bb91b5dd7061097
    xiaodouchen authored and facebook-github-bot committed Sep 18, 2023
    d7a6875
  3. Fix potential crash in DwrfReader::updateColumnNamesFromTableSchema(). (facebookincubator#6599)
    
    Summary:
    Pull Request resolved: facebookincubator#6599
    
    When the type kinds are not equal and one of the types is non-primitive, we
    would crash by accessing a null type pointer after the dynamic cast.
    The fix is to stop descending the type tree whenever the type kinds differ.
    The bug crept in when we replaced throw() with log(1) in the type checking code.
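
The failure mode and the fix can be illustrated with a small stand-alone sketch. The types below (`Kind`, `Type`, `RowType`) and the function `applyTableNames` are hypothetical stand-ins rather than Velox classes. The point is only that the kind comparison comes before any dynamic cast, so a mismatch returns early instead of dereferencing a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class Kind { kInteger, kVarchar, kRow };

// Hypothetical minimal type tree; not the Velox type system.
struct Type {
  explicit Type(Kind k) : kind(k) {}
  virtual ~Type() = default;
  Kind kind;
};

using TypePtr = std::shared_ptr<const Type>;

struct RowType : Type {
  RowType(std::vector<std::string> n, std::vector<TypePtr> c)
      : Type(Kind::kRow), names(std::move(n)), children(std::move(c)) {}
  std::vector<std::string> names;
  std::vector<TypePtr> children;
};

// Returns a copy of fileType whose column names come from tableType.
TypePtr applyTableNames(const TypePtr& fileType, const TypePtr& tableType) {
  assert(fileType && tableType);
  // Bail out when the kinds differ: casting tableType to RowType here
  // would produce a null pointer.
  if (fileType->kind != tableType->kind) {
    return fileType;
  }
  if (fileType->kind != Kind::kRow) {
    return fileType; // Primitive types have no names to update.
  }
  // Both kinds are kRow, so both casts succeed.
  auto fileRow = std::dynamic_pointer_cast<const RowType>(fileType);
  auto tableRow = std::dynamic_pointer_cast<const RowType>(tableType);

  std::vector<std::string> names;
  std::vector<TypePtr> children;
  for (std::size_t i = 0; i < fileRow->children.size(); ++i) {
    if (i < tableRow->children.size()) {
      names.push_back(tableRow->names[i]);
      children.push_back(
          applyTableNames(fileRow->children[i], tableRow->children[i]));
    } else {
      // Columns missing from the table schema keep their file names.
      names.push_back(fileRow->names[i]);
      children.push_back(fileRow->children[i]);
    }
  }
  return std::make_shared<RowType>(std::move(names), std::move(children));
}
```

In this sketch a file column of row type paired with a table column of varchar type is returned unchanged, and the mismatch is left for later type checking to report.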
    
    Reviewed By: Yuhta
    
    Differential Revision: D49338549
    
    fbshipit-source-id: 987f1df62016f68d7796f40c0aedfcd1becf5f1e
    Sergey Pershin authored and facebook-github-bot committed Sep 18, 2023
    15ca5f2
  4. pass down the scan table schema type to parquet column reader (facebookincubator#6404)
    
    Summary:
    pass down the scan table schema to parquet column reader
    
    Details:
    
    Currently, the requestedType, which is available in [ParquetColumnReader.cpp](https://github.com/facebookincubator/velox/blob/517e3e3a0c8308c96ca068444dfeee37204f7773/velox/dwio/parquet/reader/ParquetColumnReader.cpp#L37C60-L37C68), is set based on the schema present in the parquet file (the file data type) instead of the scan table schema.
    
    The issue occurs when the expected output of the TableScan differs from the schema of the parquet file. Spark's data format for some types differs from Parquet's format. Similar to schema evolution, when the type differs, Spark performs an implicit conversion. The conversions that Spark performs can be seen in [ParquetVectorUpdaterFactory.java](https://github.com/apache/spark/blob/6ca45c52b7416e7b3520dc902cb24f060c7c72dd/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetVectorUpdaterFactory.java#L67C3-L185C6).
    
    This PR fixes the issue by passing the scan table schema data type to the parquet column reader as `requestedType`.
    
    It is a follow-up to facebookincubator#5786 and addresses the issue as suggested in Yuhta's comments there.
    
    See facebookincubator#5786 for the detailed context.
    
    This update is one of the modifications necessary for issue facebookincubator#5770.
    
    Pull Request resolved: facebookincubator#6404
    
    Reviewed By: pedroerp
    
    Differential Revision: D49330580
    
    Pulled By: Yuhta
    
    fbshipit-source-id: bd56bda6efd708691ee35b5b66d5ba9536df525f
    Yangyang Gao authored and facebook-github-bot committed Sep 18, 2023
    b803a26
  5. Fix Lead/Lag window function for int64 offset (facebookincubator#6463)

    Summary:
    Fixes facebookincubator#6417
    
    Pull Request resolved: facebookincubator#6463
    
    Reviewed By: amitkdutta
    
    Differential Revision: D49371431
    
    Pulled By: mbasmanova
    
    fbshipit-source-id: 8956b04abe608bfcb76b0a3b49cefd0689284bb2
    xumingming authored and facebook-github-bot committed Sep 18, 2023
    aeeab89
  6. Add ability to serialize input vector and sql if expression evaluation crashes (facebookincubator#6402)
    
    Summary:
    Pull Request resolved: facebookincubator#6402
    
    This adds an experimental flag,
    'experimental_velox_save_input_on_fatal_signal', that, when set to
    true, serializes the input vector data and all the SQL expressions
    in the ExprSet that is currently executing whenever a fatal signal
    is encountered. Enabling this flag makes the signal handler
    async-signal-unsafe, so it should only be used for debugging purposes.
    
    Reviewed By: kgpai
    
    Differential Revision: D48891649
    
    fbshipit-source-id: 47722d726c76a8602cf436c1840d2a0d720e2c35
    Bikramjeet Vig authored and facebook-github-bot committed Sep 18, 2023
    8e4b1cb
  7. Return reference to shared_ptr in INTERVAL_DAY_TIME(), INTERVAL_YEAR_MONTH() and DATE() to avoid copying (facebookincubator#6615)
    
    Summary:
    Pull Request resolved: facebookincubator#6615
    
    This removes unnecessary copying in INTERVAL_DAY_TIME(), INTERVAL_YEAR_MONTH() and DATE() calls, which previously returned a copy of a constant shared_ptr, making each call very expensive.
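
The difference between the two return styles can be sketched as below. `DateType`, `dateByValue` and `dateByRef` are hypothetical names used for illustration and do not reproduce the Velox declarations. Returning by value copies the `shared_ptr`, which updates the shared reference count on every call. Returning a const reference to the static instance avoids that.

```cpp
#include <memory>

// Hypothetical singleton-like type used only for this illustration.
struct DateType {};

// Returns by value: every call copies the shared_ptr, which increments the
// reference count (and decrements it again when the copy is destroyed).
std::shared_ptr<const DateType> dateByValue() {
  static const std::shared_ptr<const DateType> instance =
      std::make_shared<DateType>();
  return instance;
}

// Returns by const reference: callers see the same static shared_ptr and the
// reference count is left untouched.
const std::shared_ptr<const DateType>& dateByRef() {
  static const std::shared_ptr<const DateType> instance =
      std::make_shared<DateType>();
  return instance;
}
```

The reference stays valid for the lifetime of the program because the instance is a function-local static. A caller that needs shared ownership can still copy the result explicitly.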
    
    Reviewed By: Yuhta, bikramSingh91
    
    Differential Revision: D49347369
    
    fbshipit-source-id: 6930970d9f2807347b16065fc224d7a7f5f57b69
    Artem Gelun authored and facebook-github-bot committed Sep 18, 2023
    cccc8ad

Commits on Sep 19, 2023

  1. Add S3 Filesink (facebookincubator#6309)

    Summary: Pull Request resolved: facebookincubator#6309
    
    Reviewed By: xiaoxmeng
    
    Differential Revision: D49394977
    
    Pulled By: pedroerp
    
    fbshipit-source-id: ba5fa3dda474505093d7d9d2f00aaa8c3d2d7e81
    majetideepak authored and facebook-github-bot committed Sep 19, 2023
    4addc9a
  2. bd9ab9c