
iree-codegen-iree-comprehensive-bufferize generates memrefs with dynamic offset #847

Closed · makslevental opened this issue Oct 16, 2024 · 7 comments
Labels: bug (Something isn't working)

makslevental (Collaborator) commented Oct 16, 2024

#845 is blocked because, at that commit of IREE, iree-codegen-iree-comprehensive-bufferize generates memrefs with dynamic offsets, and we hit an error here.

@MaheshRavishankar any clue what changed recently that might produce this behavior? Perhaps @pashu123 can give a hint (I'm seeing recent changes in git-blame...).

cc @jtuyls @yzhang93 @newling @Abhishek-Varma

The failing snippet follows. What stands out to me as odd (and a likely clue) is that each hal.interface.binding.subspan is now followed by a memref.assume_alignment on a memref type with a dynamic offset:

func.func @mm_in_bf16_out_f32_dispatch_0_matmul_64x64x64_bf16xbf16xf32() attributes {translation_info = #iree_codegen.translation_info<Custom>} {
  %c0 = arith.constant 0 : index
  %cst = arith.constant 0.000000e+00 : f32
  %alloc = memref.alloc() : memref<1x1x8x4x8x4xbf16, 2 : i32>
  %alloc_0 = memref.alloc() : memref<1x1x4x8x4x8xbf16, 2 : i32>
  %alloc_1 = memref.alloc() : memref<1x2x32x32xbf16, 1 : i32>
  %alloc_2 = memref.alloc() : memref<2x1x32x32xbf16, 1 : i32>
  %alloc_3 = memref.alloc() : memref<2x2x8x8x4x4xf32, 2 : i32>
  %alloc_4 = memref.alloc() : memref<2x2x32x32xf32, 1 : i32>
  %0:3 = util.assume.int 
      %c0<umin = 0, umax = 0>, 
      %c0<umin = 0, umax = 0>, 
      %c0<umin = 0, umax = 0>
    : index, index, index
  %1 = hal.interface.binding.subspan layout(<bindings = [#hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, Indirect>], flags = Indirect>) binding(0) alignment(64) offset(%0#0) flags("ReadOnly|Indirect") : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
  memref.assume_alignment %1, 1 : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
  %2 = hal.interface.binding.subspan layout(<bindings = [#hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, Indirect>], flags = Indirect>) binding(1) alignment(64) offset(%0#1) flags("ReadOnly|Indirect") : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
  memref.assume_alignment %2, 1 : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
  %3 = hal.interface.binding.subspan layout(<bindings = [#hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, Indirect>], flags = Indirect>) binding(2) alignment(64) offset(%0#2) flags(Indirect) : memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
  memref.assume_alignment %3, 1 : memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
  scf.forall (%arg0, %arg1) = (0, 0) to (64, 64) step (64, 64) {
    %subview = memref.subview %1[%arg0, 0] [64, 64] [1, 1] : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> to memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
    %subview_5 = memref.subview %2[0, %arg1] [64, 64] [1, 1] : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> to memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
    %subview_6 = memref.subview %3[%arg0, %arg1] [64, 64] [1, 1] : memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> to memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
    %subview_7 = memref.subview %subview[0, 0] [64, 32] [1, 1] : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> to memref<64x32xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
    iree_linalg_ext.pack %subview_7 inner_dims_pos = [0, 1] inner_tiles = [32, 32] into %alloc_2 : (memref<64x32xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> memref<2x1x32x32xbf16, 1 : i32>)
    %subview_8 = memref.subview %subview_5[0, 0] [32, 64] [1, 1] : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> to memref<32x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
    iree_linalg_ext.pack %subview_8 outer_dims_perm = [0, 1] inner_dims_pos = [0, 1] inner_tiles = [32, 32] into %alloc_1 : (memref<32x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> memref<1x2x32x32xbf16, 1 : i32>)
    scf.forall (%arg2, %arg3) in (2, 2) {
      %subview_12 = memref.subview %alloc_2[%arg2, 0, 0, 0] [1, 1, 32, 32] [1, 1, 1, 1] : memref<2x1x32x32xbf16, 1 : i32> to memref<1x1x32x32xbf16, strided<[1024, 1024, 32, 1], offset: ?>, 1 : i32>
      iree_linalg_ext.pack %subview_12 outer_dims_perm = [0, 1, 3, 2] inner_dims_pos = [2, 3] inner_tiles = [4, 8] into %alloc_0 : (memref<1x1x32x32xbf16, strided<[1024, 1024, 32, 1], offset: ?>, 1 : i32> memref<1x1x4x8x4x8xbf16, 2 : i32>)
      %subview_13 = memref.subview %alloc_1[0, %arg3, 0, 0] [1, 1, 32, 32] [1, 1, 1, 1] : memref<1x2x32x32xbf16, 1 : i32> to memref<1x1x32x32xbf16, strided<[2048, 1024, 32, 1], offset: ?>, 1 : i32>
      iree_linalg_ext.pack %subview_13 outer_dims_perm = [0, 1, 3, 2] inner_dims_pos = [2, 3] inner_tiles = [8, 4] into %alloc : (memref<1x1x32x32xbf16, strided<[2048, 1024, 32, 1], offset: ?>, 1 : i32> memref<1x1x8x4x8x4xbf16, 2 : i32>)
      %subview_14 = memref.subview %alloc_3[%arg2, %arg3, 0, 0, 0, 0] [1, 1, 8, 8, 4, 4] [1, 1, 1, 1, 1, 1] : memref<2x2x8x8x4x4xf32, 2 : i32> to memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>
      linalg.fill ins(%cst : f32) outs(%subview_14 : memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>)
      linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5, d6, d7, d8) -> (d0, d2, d5, d3, d6, d8)>, affine_map<(d0, d1, d2, d3, d4, d5, d6, d7, d8) -> (d2, d1, d4, d5, d8, d7)>, affine_map<(d0, d1, d2, d3, d4, d5, d6, d7, d8) -> (d0, d1, d4, d3, d6, d7)>], iterator_types = ["parallel", "parallel", "reduction", "parallel", "parallel", "reduction", "parallel", "parallel", "reduction"]} ins(%alloc_0, %alloc : memref<1x1x4x8x4x8xbf16, 2 : i32>, memref<1x1x8x4x8x4xbf16, 2 : i32>) outs(%subview_14 : memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>) attrs =  {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[64, 64], [0, 0, 1], [1, 1, 0, 0, 0, 0]]>, packing_config = #amdaie.packing_config<packing_config = [{packedSizes = [32, 32, 32], transposePackIndices = [1], unpackEmpty = [false], innerPerm = [[1, 0]], outerPerm = [[0, 1]]}, {packedSizes = [0, 0, 0, 4, 4, 8], transposePackIndices = [0, 1, 2], unpackEmpty = [false, false, true], innerPerm = [[0, 1], [1, 0], [0, 1]], outerPerm = [[0, 1, 3, 2], [0, 1, 3, 2], [0, 1, 3, 2]]}]>} {
      ^bb0(%in: bf16, %in_16: bf16, %out: f32):
        %4 = arith.extf %in : bf16 to f32
        %5 = arith.extf %in_16 : bf16 to f32
        %6 = arith.mulf %4, %5 : f32
        %7 = arith.addf %out, %6 : f32
        linalg.yield %7 : f32
      }
      %subview_15 = memref.subview %alloc_3[%arg2, %arg3, 0, 0, 0, 0] [1, 1, 8, 8, 4, 4] [1, 1, 1, 1, 1, 1] : memref<2x2x8x8x4x4xf32, 2 : i32> to memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>
      linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d1, d2, d3, d4, d5)>, affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d1, d2, d3, d4, d5)>], iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel"]} ins(%subview_14 : memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>) outs(%subview_15 : memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>) {
      ^bb0(%in: f32, %out: f32):
        linalg.yield %in : f32
      }
    } {mapping = [#gpu.thread<y>, #gpu.thread<x>]}
    %subview_9 = memref.subview %subview[0, 32] [64, 32] [1, 1] : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> to memref<64x32xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
    iree_linalg_ext.pack %subview_9 inner_dims_pos = [0, 1] inner_tiles = [32, 32] into %alloc_2 : (memref<64x32xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> memref<2x1x32x32xbf16, 1 : i32>)
    %subview_10 = memref.subview %subview_5[32, 0] [32, 64] [1, 1] : memref<64x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> to memref<32x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
    iree_linalg_ext.pack %subview_10 outer_dims_perm = [0, 1] inner_dims_pos = [0, 1] inner_tiles = [32, 32] into %alloc_1 : (memref<32x64xbf16, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> memref<1x2x32x32xbf16, 1 : i32>)
    scf.forall (%arg2, %arg3) in (2, 2) {
      %subview_12 = memref.subview %alloc_2[%arg2, 0, 0, 0] [1, 1, 32, 32] [1, 1, 1, 1] : memref<2x1x32x32xbf16, 1 : i32> to memref<1x1x32x32xbf16, strided<[1024, 1024, 32, 1], offset: ?>, 1 : i32>
      iree_linalg_ext.pack %subview_12 outer_dims_perm = [0, 1, 3, 2] inner_dims_pos = [2, 3] inner_tiles = [4, 8] into %alloc_0 : (memref<1x1x32x32xbf16, strided<[1024, 1024, 32, 1], offset: ?>, 1 : i32> memref<1x1x4x8x4x8xbf16, 2 : i32>)
      %subview_13 = memref.subview %alloc_1[0, %arg3, 0, 0] [1, 1, 32, 32] [1, 1, 1, 1] : memref<1x2x32x32xbf16, 1 : i32> to memref<1x1x32x32xbf16, strided<[2048, 1024, 32, 1], offset: ?>, 1 : i32>
      iree_linalg_ext.pack %subview_13 outer_dims_perm = [0, 1, 3, 2] inner_dims_pos = [2, 3] inner_tiles = [8, 4] into %alloc : (memref<1x1x32x32xbf16, strided<[2048, 1024, 32, 1], offset: ?>, 1 : i32> memref<1x1x8x4x8x4xbf16, 2 : i32>)
      %subview_14 = memref.subview %alloc_3[%arg2, %arg3, 0, 0, 0, 0] [1, 1, 8, 8, 4, 4] [1, 1, 1, 1, 1, 1] : memref<2x2x8x8x4x4xf32, 2 : i32> to memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>
      linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5, d6, d7, d8) -> (d0, d2, d5, d3, d6, d8)>, affine_map<(d0, d1, d2, d3, d4, d5, d6, d7, d8) -> (d2, d1, d4, d5, d8, d7)>, affine_map<(d0, d1, d2, d3, d4, d5, d6, d7, d8) -> (d0, d1, d4, d3, d6, d7)>], iterator_types = ["parallel", "parallel", "reduction", "parallel", "parallel", "reduction", "parallel", "parallel", "reduction"]} ins(%alloc_0, %alloc : memref<1x1x4x8x4x8xbf16, 2 : i32>, memref<1x1x8x4x8x4xbf16, 2 : i32>) outs(%subview_14 : memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>) attrs =  {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[64, 64], [0, 0, 1], [1, 1, 0, 0, 0, 0]]>, packing_config = #amdaie.packing_config<packing_config = [{packedSizes = [32, 32, 32], transposePackIndices = [1], unpackEmpty = [false], innerPerm = [[1, 0]], outerPerm = [[0, 1]]}, {packedSizes = [0, 0, 0, 4, 4, 8], transposePackIndices = [0, 1, 2], unpackEmpty = [false, false, true], innerPerm = [[0, 1], [1, 0], [0, 1]], outerPerm = [[0, 1, 3, 2], [0, 1, 3, 2], [0, 1, 3, 2]]}]>} {
      ^bb0(%in: bf16, %in_18: bf16, %out: f32):
        %4 = arith.extf %in : bf16 to f32
        %5 = arith.extf %in_18 : bf16 to f32
        %6 = arith.mulf %4, %5 : f32
        %7 = arith.addf %out, %6 : f32
        linalg.yield %7 : f32
      }
      %subview_15 = memref.subview %alloc_4[%arg2, %arg3, 0, 0] [1, 1, 32, 32] [1, 1, 1, 1] : memref<2x2x32x32xf32, 1 : i32> to memref<1x1x32x32xf32, strided<[2048, 1024, 32, 1], offset: ?>, 1 : i32>
      iree_linalg_ext.unpack %subview_14 outer_dims_perm = [0, 1, 3, 2] inner_dims_pos = [2, 3] inner_tiles = [4, 4] into %subview_15 : (memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32> memref<1x1x32x32xf32, strided<[2048, 1024, 32, 1], offset: ?>, 1 : i32>)
      %subview_16 = memref.subview %alloc_3[%arg2, %arg3, 0, 0, 0, 0] [1, 1, 8, 8, 4, 4] [1, 1, 1, 1, 1, 1] : memref<2x2x8x8x4x4xf32, 2 : i32> to memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>
      linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d1, d2, d3, d4, d5)>, affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d1, d2, d3, d4, d5)>], iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel"]} ins(%subview_14 : memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>) outs(%subview_16 : memref<1x1x8x8x4x4xf32, strided<[2048, 1024, 128, 16, 4, 1], offset: ?>, 2 : i32>) {
      ^bb0(%in: f32, %out: f32):
        linalg.yield %in : f32
      }
      %subview_17 = memref.subview %alloc_4[%arg2, %arg3, 0, 0] [1, 1, 32, 32] [1, 1, 1, 1] : memref<2x2x32x32xf32, 1 : i32> to memref<1x1x32x32xf32, strided<[2048, 1024, 32, 1], offset: ?>, 1 : i32>
      linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>, affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%subview_15 : memref<1x1x32x32xf32, strided<[2048, 1024, 32, 1], offset: ?>, 1 : i32>) outs(%subview_17 : memref<1x1x32x32xf32, strided<[2048, 1024, 32, 1], offset: ?>, 1 : i32>) {
      ^bb0(%in: f32, %out: f32):
        linalg.yield %in : f32
      }
    } {mapping = [#gpu.thread<y>, #gpu.thread<x>]}
    iree_linalg_ext.unpack %alloc_4 inner_dims_pos = [0, 1] inner_tiles = [32, 32] into %subview_6 : (memref<2x2x32x32xf32, 1 : i32> memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>)
    %subview_11 = memref.subview %3[%arg0, %arg1] [64, 64] [1, 1] : memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>> to memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>
    linalg.generic {indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>], iterator_types = ["parallel", "parallel"]} ins(%subview_6 : memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>) outs(%subview_11 : memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>) {
    ^bb0(%in: f32, %out: f32):
      linalg.yield %in : f32
    }
  } {mapping = [#gpu.block<y>, #gpu.block<x>]}
  linalg.generic {indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>], iterator_types = ["parallel", "parallel"]} ins(%3 : memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>) outs(%3 : memref<64x64xf32, strided<[64, 1], offset: ?>, #hal.descriptor_type<storage_buffer>>) {
  ^bb0(%in: f32, %out: f32):
    linalg.yield %in : f32
  }
  memref.dealloc %alloc_4 : memref<2x2x32x32xf32, 1 : i32>
  memref.dealloc %alloc_3 : memref<2x2x8x8x4x4xf32, 2 : i32>
  memref.dealloc %alloc_2 : memref<2x1x32x32xbf16, 1 : i32>
  memref.dealloc %alloc_1 : memref<1x2x32x32xbf16, 1 : i32>
  memref.dealloc %alloc_0 : memref<1x1x4x8x4x8xbf16, 2 : i32>
  memref.dealloc %alloc : memref<1x1x8x4x8x4xbf16, 2 : i32>
  return
}
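For context on the `strided<[64, 1], offset: ?>` types above: a strided memref layout maps an index tuple to a linear address as offset + Σ iₖ·strideₖ, and `offset: ?` means the base offset is a runtime SSA value (here fed by the subspan's offset(%0#N) operand) rather than a constant baked into the type. A minimal, hypothetical Python sketch of that address computation (illustrative only, not IREE code):

```python
def linearize(indices, strides, offset):
    # Element address in a memref with layout
    # strided<[s0, s1, ...], offset: off>:  off + sum(i_k * s_k)
    assert len(indices) == len(strides)
    return offset + sum(i * s for i, s in zip(indices, strides))

# memref<64x64xbf16, strided<[64, 1], offset: ?>>: the offset arrives at
# runtime; the util.assume.int facts above say it is always 0 here.
runtime_offset = 0
print(linearize((3, 5), (64, 1), runtime_offset))  # 3*64 + 5*1 = 197
```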
makslevental self-assigned this Oct 16, 2024
makslevental added the bug (Something isn't working) label Oct 16, 2024
makslevental changed the title from "iree-codegen-iree-comprehensive-bufferize generates memref.subview with dynamic size" to "iree-codegen-iree-comprehensive-bufferize generates memrefs with dynamic offset" Oct 16, 2024
pashu123 commented

I've made a change to duplicate empty tensor ops here: https://github.com/iree-org/iree/blob/05bbcf1385146d075829cd940a52bf06961614d0/compiler/src/iree/compiler/Codegen/Common/IREEComprehensiveBufferizePass.cpp#L177. Since we are not using destination-passing style as a preprocessing step for distribute-using-for-all, we had to make that decision. If your pipeline uses the convert-to-destination-passing-style pass, it shouldn't make a difference. @MaheshRavishankar, do you think the error might be caused by this change?

yzhang93 (Contributor) commented Oct 17, 2024

> I've made a change to duplicate empty tensor ops here: https://github.com/iree-org/iree/blob/05bbcf1385146d075829cd940a52bf06961614d0/compiler/src/iree/compiler/Codegen/Common/IREEComprehensiveBufferizePass.cpp#L177. Since we are not using destination-passing style as a preprocessing step for distribute-using-for-all, we had to make that decision. If your pipeline uses the convert-to-destination-passing-style pass, it shouldn't make a difference. @MaheshRavishankar, do you think the error might be caused by this change?

No, I don't think the error is caused by your change.

The reason is, as @makslevental mentioned, the following:

%0:3 = util.assume.int 
      %c0<umin = 0, umax = 0>, 
      %c0<umin = 0, umax = 0>, 
      %c0<umin = 0, umax = 0>
    : index, index, index
  %1 = hal.interface.binding.subspan layout(<bindings = [#hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, Indirect>], flags = Indirect>) binding(0) alignment(64) offset(%0#0) flags("ReadOnly|Indirect") : !flow.dispatch.tensor<readonly:tensor<128x128xi32>>
  %2 = hal.interface.binding.subspan layout(<bindings = [#hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, Indirect>], flags = Indirect>) binding(1) alignment(64) offset(%0#1) flags("ReadOnly|Indirect") : !flow.dispatch.tensor<readonly:tensor<128x128xi32>>
  %3 = hal.interface.binding.subspan layout(<bindings = [#hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, "ReadOnly|Indirect">, #hal.pipeline.binding<storage_buffer, Indirect>], flags = Indirect>) binding(2) alignment(64) offset(%0#2) flags(Indirect) : !flow.dispatch.tensor<writeonly:tensor<128x128xi32>>

After bufferization, this produces memref.assume_alignment ops on memrefs with a dynamic offset.

I don't know how to get rid of the dynamic offsets, but if we remove this check for now, we can proceed without problems.
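To illustrate why a check on the type can fail even though the offset is provably zero (a hypothetical Python model, not IREE's actual check): the util.assume.int facts (umin = umax = 0) hold at the value level, but the memref type still says `offset: ?`, and a check requiring a statically-known offset inspects only the type:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StridedLayout:
    # Model of an MLIR strided layout attribute; offset=None stands for
    # the dynamic marker "offset: ?".
    strides: tuple
    offset: Optional[int]

def has_static_zero_offset(layout: StridedLayout) -> bool:
    # A type-level check like the one the issue trips over: it cannot
    # see value-range facts such as util.assume.int's umin=0/umax=0.
    return layout.offset == 0

static_t = StridedLayout((64, 1), 0)      # strided<[64, 1], offset: 0>
dynamic_t = StridedLayout((64, 1), None)  # strided<[64, 1], offset: ?>

print(has_static_zero_offset(static_t))   # True
print(has_static_zero_offset(dynamic_t))  # False -> the reported error
```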

MaheshRavishankar (Collaborator) commented

yzhang93 (Contributor) commented

I think Stella's optimization PRs from yesterday solved the problem; my local build with the new IREE bump works. I'll update the branch later after fixing some other conflicts.

makslevental (Collaborator, Author) commented

> I think Stella's optimization PRs from yesterday solved the problem; my local build with the new IREE bump works. I'll update the branch later after fixing some other conflicts.

that's like two wrongs make a right lol. cool.

MaheshRavishankar (Collaborator) commented

> that's like two wrongs make a right lol. cool.

Hey maybe this is two rights!!

makslevental (Collaborator, Author) commented

Fixed by #845
