Develop upstream sync 230731 #2169

weihanmines · 2023-08-01T01:05:43Z

No description provided.

…in global or local view. If the attribute is set on a CallOp, then verification logic converts the program's arguments and results from local view to global view to verify that local view shape + sharding is equivalent to the expected global view shape. PiperOrigin-RevId: 551222813

This CL adds patterns to fold Transpose and FC and convert them into a BMM, as below: FC(lhs, Transpose(rhs)) -> BMM(lhs, rhs, false, false). The right thing to do in this pattern would be to apply it only if keep_num_dims==True, because if the output rank is less than the input rank, it means `keep_num_dims` has reduced the output. But checking for rank will improve the coverage. This pattern will now work PiperOrigin-RevId: 551297769
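The equivalence behind this fold can be checked numerically. The sketch below assumes that FC computes `input x weights^T` (weights laid out as `[out_features, in_features]`) and that the two `false` flags on BMM mean neither operand is adjointed. The helpers `transpose`, `matmul`, `fully_connected`, and `batch_matmul` are hypothetical plain-Python stand-ins, not the TFLite ops, and only the single-batch 2-D case is covered.

```python
def transpose(m):
    # Swap rows and columns of a 2-D list.
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    # Plain row-by-column matrix product of two 2-D lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fully_connected(x, weights):
    # Assumed FC semantics: weights are [out, in], result is x @ weights^T.
    return matmul(x, transpose(weights))


def batch_matmul(x, y, adj_x=False, adj_y=False):
    # Assumed BMM semantics: optionally transpose each operand, then multiply.
    if adj_x:
        x = transpose(x)
    if adj_y:
        y = transpose(y)
    return matmul(x, y)


lhs = [[1, 2, 3], [4, 5, 6]]    # shape 2x3
rhs = [[1, 0], [0, 1], [2, 2]]  # shape 3x2

# FC applied to the transposed rhs undoes the transpose internally,
# so it matches a BMM on the untransposed rhs with both flags false.
fc_result = fully_connected(lhs, transpose(rhs))
bmm_result = batch_matmul(lhs, rhs, False, False)
```

Under these assumptions the two results agree because the explicit Transpose and the implicit transpose inside FC cancel.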

…ac compiler error Apparently ssize_t is only a long sometimes (at least 32-bit), instead of long long (at least 64-bit). I don't have a mac so I can't repro the failing build, but hopefully this fixes it based on the error message. PiperOrigin-RevId: 551376003

…fixes two things: 1. When compiling and executing the program, set the option properly for multi-partition. 2. Use a constant launch_id for TF to align with the previous non-PJRT implementation. PiperOrigin-RevId: 552577435

…c in tf.constant according to auto dtype conversion semantics. WeakTensor is created if it satisfies both of the following conditions: 1. tf.constant is called with no dtype arg specified. 2. Input is a nested Python type. PiperOrigin-RevId: 552634845

daniel-lang and others added 30 commits July 26, 2023 13:16

[TFLite] Fix FlatBuffers package name in installed CMake files

f9d4231

Similar to tensorflow#58677, the capitalization of FlatBuffers needs to match. Otherwise using TFLite via find_package() will fail to find FlatBuffers.

Merge pull request tensorflow#61365 from Intel-tensorflow:bhavanis/on…

e92261f

…ednn-3.0-final PiperOrigin-RevId: 551186750

Split ir_emission_utils.

f95e8b0

Create reduction_utils for utils related to reduction codegen. Move some functions to gpu_fusible. PiperOrigin-RevId: 551187475

Register stablehlo dialect for tf_tfl_translate

5b2c3dd

PiperOrigin-RevId: 551196576

[XLA:GPU] Make float normalization for convolutions ignore other type…

6d990a8

…s of HLOs. PiperOrigin-RevId: 5512129

Avoid nullptr as row offsets to cusparseCreateCsr

9720b40

As of CUDA 12.2 additional input validation allows NULL for the row offsets only when rows=0.

Add python and numpy headers to the local_config_python folder in the…

a034b3d

… wheel. PiperOrigin-RevId: 551224665

Integrate LLVM at llvm/llvm-project@365d6eb1f7d8

daa9a34

Updates LLVM usage to match [365d6eb1f7d8](llvm/llvm-project@365d6eb1f7d8) PiperOrigin-RevId: 551229328

Update sample_stable_delegate for promotion of experimental/accelerat…

f178576

…ion/configuration out of experimental. PiperOrigin-RevId: 551235514

Increase the memory limit for the dtensor GPU test.

0ef963c

PiperOrigin-RevId: 551238120

Remove trigraph

0321ee1

Remove trigraph

cc5aa34

Remove unnecessary 'const' from pass-by-value function parameters.

7ba36c1

Also fix typo in SetAllowBufferHandleOutput comment: false->true. Also fix #include order to match style guide. PiperOrigin-RevId: 551247708

Internal visibility change only.

d9ec8c5

PiperOrigin-RevId: 551261650

special allocations' aggregated metrics need to consider memory color.

cf4afb6

PiperOrigin-RevId: 551275563

Deprecate instruction name; it was changed over a year ago.

7f8be6e

PiperOrigin-RevId: 551276374

Add macros for working with TF_Status in C++ code

0cc2c30

`TF_STATUS_ASSIGN_OR_RETURN` and `TF_STATUS_RETURN_IF_ERROR` PiperOrigin-RevId: 551278625

Correct the device assignment for tf._XlaCompile

3874ea2

PiperOrigin-RevId: 551290442

#tf-data-service Graduate "data_transfer" experiment.

69541bb

PiperOrigin-RevId: 551295403

Merge pull request tensorflow#60026 from milpuz01:node_rewrite_to_mkl…

2d04d5c

…_heuristics PiperOrigin-RevId: 551297374

Merge pull request tensorflow#61176 from tensorflow:pjpratik-patch-6

f7e9b91

PiperOrigin-RevId: 551313932

Remove one use of inlining in XlaCallModule shape refinement.

f9e2045

To improve debuggability, we want the shape refinement to make as few changes as possible to the module. In this change we remove one use of inlining. PiperOrigin-RevId: 551325242

Merge consecutive Pad operators

fe33928

PiperOrigin-RevId: 551347216

update internal files for release

1c26b1c

PiperOrigin-RevId: 551353292

Added a workaround for broadcast.

6021733

PiperOrigin-RevId: 551401683

[PJRT] Add PjRtDevice::PoisonExecution.

54eeb2a

PiperOrigin-RevId: 551408554

[IFRT] Update ShardingParam to also support scalars.

333bb69

PiperOrigin-RevId: 551410772

ezhulenev and others added 27 commits July 31, 2023 11:26

[xla:gpu][iree] Add support for compiled ops with multiple kernels + …

43f97d5

…memcpy API call hlo.sort operation compiled to a memcpy + a sequence of device kernel launches PiperOrigin-RevId: 552539521

Change CHECK to explicit error.

2a0b819

PiperOrigin-RevId: 552543030

[xla:runtime] Fix a bug in encoding memrefs with dynamic offset

39c23eb

Use memref descriptor to get offset if we do not know it at compile time. PiperOrigin-RevId: 552554429

Merge pull request tensorflow#60898 from psunn:matmul_psunn

ea3b03a

PiperOrigin-RevId: 552555470

Cleanup for XLA Outside Compilation in Cloud TPU VMs

3cdfbc6

PiperOrigin-RevId: 552562345

[XLA:TPU] HLO flattening fix for SPMD graphs that have outfeed.

c265e85

PiperOrigin-RevId: 552565337

[XLA] Make PatternMatchMergeSharding and PatternMatchUnmergeSharding …

691b916

…symmetric. PiperOrigin-RevId: 552566803

[TF:PJRT] Allow DEVICE_GPU to use PJRT for XlaCompileOnDemand op.

08a1842

PiperOrigin-RevId: 552568564

Add sample stable delegate code for nested control flow support.

915d0fb

A total of three new ops are added: Mul, Equal, and While. The control flow op works for one float32 input only. PiperOrigin-RevId: 552571044

Add predicate-based wait method to tsl::condition_variable.

fbe8e76

This is needed by the DUCC FFT library in order to use `tsl::condition_variable` as a direct replacement for `std::condition_variable`. PiperOrigin-RevId: 552595622
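For readers unfamiliar with predicate-based waits: in C++, `std::condition_variable::wait(lock, pred)` keeps waiting until `pred()` returns true, which guards against spurious wakeups. The sketch below shows the same idea with Python's `threading.Condition.wait_for`. It is an analogy only and makes no claim about how `tsl::condition_variable` is implemented; the names `ready`, `producer`, and `got` are made up for the example.

```python
import threading

ready = []                    # shared state guarded by the condition's lock
cond = threading.Condition()


def producer():
    # Publish one item under the lock and wake any waiter.
    with cond:
        ready.append("item")
        cond.notify_all()


worker = threading.Thread(target=producer)

with cond:
    worker.start()
    # wait_for releases the lock while sleeping and re-checks the
    # predicate after every wakeup, like `while not pred(): cond.wait()`.
    got = cond.wait_for(lambda: len(ready) > 0, timeout=5.0)

worker.join()
```

The predicate form removes the need for callers to write the re-check loop by hand, which is what makes a drop-in replacement possible for code written against that interface.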

[XLA] Add device_memory_size option to ExecutableBuildOptions.

9a389b7

- Add an option to provide XLA the device memory limit to use
- Plumb that to HloModuleConfig through different objects

PiperOrigin-RevId: 552596103

[xla:gpu] Add readme + style recommendation for experimental backend

872f84d

PiperOrigin-RevId: 552605248

Adds saved model default input support in TF.

bfc143e

PiperOrigin-RevId: 552615923

Remove reference to deprecated "long" type.

82077d5

PiperOrigin-RevId: 552619308

#tf-data Promote "file_locality_v2" experiment to job-level.

04465a6

PiperOrigin-RevId: 552619840

[xla] Make Send a control predecessor of Recv-done in the generated S…

bc68c98

…end-Recv sequence. This is to prevent the latency hiding scheduler from interleaving two Send-Recv sequences. PiperOrigin-RevId: 552621536

Fixed conflict in graph_execution_options wrapper.

cd1cdc8

PiperOrigin-RevId: 552621643

Remove setting up GCS in FindAndLoadTpuLibrary.

3c4627f

InitializeCreateGcsFileSystemFnPtr is a temporary fix and it is no longer needed. PiperOrigin-RevId: 552624923

[XLA:GPU] Update intercept check of DUS and Copy in LiveRangeRegion A…

73a54a8

…nalysis PiperOrigin-RevId: 552625083

Merge pull request tensorflow#61428 from elfringham:fix_xla_lit

0eed700

PiperOrigin-RevId: 552626000

Update Rendezvous API to not depend on :tf_status

b769fc4

PiperOrigin-RevId: 552631765

Add a device count cache to CudaPlatform.

67dbc78

This removes some unnecessary `cuDeviceGetCount()` calls when custom ops are used. PiperOrigin-RevId: 552634342
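The general idea of caching a device count can be sketched as follows. This is an illustration of the caching pattern only, not the CudaPlatform code: `query_device_count_from_driver` is a hypothetical stand-in for the driver call, and the value `4` is made up.

```python
import functools

driver_calls = 0  # counts how often the stand-in "driver" is queried


def query_device_count_from_driver():
    # Hypothetical stand-in for an expensive driver query.
    global driver_calls
    driver_calls += 1
    return 4  # made-up device count for the example


@functools.lru_cache(maxsize=None)
def cached_device_count():
    # The first call reaches the driver; later calls return the cached value.
    return query_device_count_from_driver()


counts = [cached_device_count() for _ in range(3)]
```

A cache like this assumes the device count does not change during the lifetime of the process.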

Creates an optimization to fuse transpose and reshape into batch_mat…

9d0fea2

…mul. PiperOrigin-RevId: 552636662

When sharding propagation does not return an input sharding for opera…

d1280be

…nd 0 of a gather, assume that the sharding of that operand does not matter. PiperOrigin-RevId: 552637713

Merge remote-tracking branch 'upstream/master' into develop-upstream-…

c7616ae

…sync-230731

weihanmines requested review from i-chaochen and jayfurmanek August 1, 2023 01:05

weihanmines closed this Aug 1, 2023
