Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump stablehlo to openxla/stablehlo@fd52182f76cadb82f2064fe5fc49a4fb4347a826 #2821

Merged
merged 7 commits into from
Jan 31, 2024

Conversation

sjain-stanford
Copy link
Member

@sjain-stanford sjain-stanford commented Jan 28, 2024

With the recent LLVM integrate and changes from llvm/llvm-project#78260, we hit this build error in Stablehlo (which is quite old).

external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1020:14: error: no member named 'startRootUpdate' in 'mlir::PatternRewriter'
    rewriter.startRootUpdate(op);
    ~~~~~~~~ ^
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1026:16: error: no member named 'finalizeRootUpdate' in 'mlir::PatternRewriter'
      rewriter.finalizeRootUpdate(op);
      ~~~~~~~~ ^
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1029:16: error: no member named 'cancelRootUpdate' in 'mlir::PatternRewriter'
      rewriter.cancelRootUpdate(op);
      ~~~~~~~~ ^
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1108:14: error: no member named 'updateRootInPlace' in 'mlir::PatternRewriter'
    rewriter.updateRootInPlace(op->getParentOp(), [&]() { return; });
    ~~~~~~~~ ^
4 errors generated.
Target @torch-mlir//:torch-mlir-opt failed to build

I'm still puzzled as to how this didn't fail with the CMake merge gating CI (do we not test Stablehlo builds/tests?). In any case, bumping our submodule to openxla/stablehlo#1918 fixes it.

It exposes a new failing lit test in TorchToStablehlo though, that I have looped stablehlo developers into (here).

bazel run @torch-mlir//test/Conversion:TorchToStablehlo/scatter.mlir.test 

...external/torch-mlir/test/Conversion/TorchToStablehlo/scatter.mlir
within split at <stdin>:1 offset :33:8: error: unexpected error: Expects non-empty reduction block for type inference                                                                               
  %0 = torch.aten.scatter.src %arg0, %int0, %arg1, %arg2 : !torch.vtensor<[?,?],si64>, !torch.int, !torch.vtensor<[?,?],si64>, !torch.vtensor<[?,?],si64> -> !torch.vtensor<[?,?],si64>             
       ^                                                                                                                                                                                            
LLVM ERROR: Failed to infer result type(s).               

Bazel CI: https://github.com/sjain-stanford/torch-mlir/actions/runs/7732673480/job/21083102228

@sjain-stanford sjain-stanford changed the title Bump stablehlo to https://github.com/openxla/stablehlo/commit/20255865ba299ed67bdf6267478c8477aef7a60d Bump stablehlo to openxla/stablehlo@20255865ba299ed67bdf6267478c8477aef7a60d Jan 28, 2024
@stellaraccident
Copy link
Collaborator

With the recent LLVM integrate and changes from llvm/llvm-project#78260, we hit this build error in Stablehlo (which is quite old).

external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1020:14: error: no member named 'startRootUpdate' in 'mlir::PatternRewriter'
    rewriter.startRootUpdate(op);
    ~~~~~~~~ ^
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1026:16: error: no member named 'finalizeRootUpdate' in 'mlir::PatternRewriter'
      rewriter.finalizeRootUpdate(op);
      ~~~~~~~~ ^
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1029:16: error: no member named 'cancelRootUpdate' in 'mlir::PatternRewriter'
      rewriter.cancelRootUpdate(op);
      ~~~~~~~~ ^
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1108:14: error: no member named 'updateRootInPlace' in 'mlir::PatternRewriter'
    rewriter.updateRootInPlace(op->getParentOp(), [&]() { return; });
    ~~~~~~~~ ^
4 errors generated.
Target @torch-mlir//:torch-mlir-opt failed to build

I'm still puzzled as to how this didn't fail with the CMake merge gating CI (do we not test Stablehlo builds/tests?). In any case, bumping our submodule to openxla/stablehlo#1918 fixes it.

It exposes a new failing lit test in TorchToStablehlo though, that I have looped stablehlo developers into (here).

bazel run @torch-mlir//test/Conversion:TorchToStablehlo/scatter.mlir.test 

...external/torch-mlir/test/Conversion/TorchToStablehlo/scatter.mlir
within split at <stdin>:1 offset :33:8: error: unexpected error: Expects non-empty reduction block for type inference                                                                               
  %0 = torch.aten.scatter.src %arg0, %int0, %arg1, %arg2 : !torch.vtensor<[?,?],si64>, !torch.int, !torch.vtensor<[?,?],si64>, !torch.vtensor<[?,?],si64> -> !torch.vtensor<[?,?],si64>             
       ^                                                                                                                                                                                            
LLVM ERROR: Failed to infer result type(s).               

Bazel CI: https://github.com/sjain-stanford/torch-mlir/actions/runs/7687378424/job/20947325658

I often found it hard to figure out what the old CI was testing. Looks like the new one gets it, so focused there.

@sdasgup3
Copy link

The stablehlo side of fix for this particular issue is going to be made at openxla/stablehlo#1965.
This would also need some changes in torch-mlir side. e.g. openxla/stablehlo#1962 (comment).
Also, I can see that there are other instances of create<stablehlo::ReduceOp> (an example would be here) which should be updated to allow using the new builder introduced in openxla/stablehlo#1965

stablehlo patch

stablehlo patches
@sjain-stanford
Copy link
Member Author

A lot more conversions needed an update, I think this is all of it: a0ad9a8. The bazel build works, I have one lit test that needs an update.

The CMake build now fails with setup_sanitizers. Probably related to openxla/stablehlo#1959 (cc: @fzakaria , @mlevesquedion, @sdasgup3 ). I see the default for STABLEHLO_ENABLE_SANITIZER being OFF, but this code probably needs to be enclosed in an if-endif:

include(SetupSanitizers)
setup_sanitizers()

Error:

  -- Building StableHLO as an external LLVM project
  -- Building with -fPIC
  CMake Error at /_work/torch-mlir/torch-mlir/externals/stablehlo/CMakeLists.txt:147 (include):
    include could not find requested file:
  
      SetupSanitizers
  
  
  CMake Error at /_work/torch-mlir/torch-mlir/externals/stablehlo/CMakeLists.txt:148 (setup_sanitizers):
    Unknown CMake command "setup_sanitizers".
  
  
  -- Configuring incomplete, errors occurred!

@sjain-stanford sjain-stanford changed the title Bump stablehlo to openxla/stablehlo@20255865ba299ed67bdf6267478c8477aef7a60d Bump stablehlo to openxla/stablehlo@fd52182f76cadb82f2064fe5fc49a4fb4347a826 Jan 31, 2024
@sjain-stanford
Copy link
Member Author

Picked up patch: openxla/stablehlo#1972

@sjain-stanford sjain-stanford marked this pull request as ready for review January 31, 2024 21:33
Copy link
Collaborator

@stellaraccident stellaraccident left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the community service.

@stellaraccident stellaraccident merged commit 8a17c98 into llvm:main Jan 31, 2024
3 checks passed
@sjain-stanford sjain-stanford deleted the sambhav/bazel_fix branch February 1, 2024 02:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants