
Develop upstream sync 230731 #2170

Merged
merged 453 commits into develop-upstream on Aug 7, 2023

Conversation

weihanmines

No description provided.

tensorflower-gardener and others added 30 commits July 26, 2023 09:08
…in global or local view.

If the attribute is set on a CallOp, then the verification logic converts the program's arguments and results from local view to global view to verify that the local-view shape plus sharding is equivalent to the expected global-view shape.

PiperOrigin-RevId: 551222813
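As a hedged illustration of that check (all names below are hypothetical, not the actual verifier code), the local-to-global conversion amounts to scaling each dimension by the number of shards along it:

```python
# Hypothetical sketch of the local-view -> global-view expansion described
# above; not the actual verifier implementation.
def local_to_global_shape(local_shape, shards_per_dim):
    """Expand a per-device (local view) shape to the whole-program (global
    view) shape, given how many shards each dimension is split into."""
    assert len(local_shape) == len(shards_per_dim)
    return [dim * shards for dim, shards in zip(local_shape, shards_per_dim)]

# A [4, 8] local shard split 2 ways on dim 0 corresponds to an [8, 8] global tensor.
assert local_to_global_shape([4, 8], [2, 1]) == [8, 8]
```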
Updates LLVM usage to match
[365d6eb1f7d8](llvm/llvm-project@365d6eb1f7d8)

PiperOrigin-RevId: 551229328
…ion/configuration out of experimental.

PiperOrigin-RevId: 551235514
Also fix typo in SetAllowBufferHandleOutput comment: false->true.
Also fix #include order to match style guide.

PiperOrigin-RevId: 551247708
PiperOrigin-RevId: 551261650
`TF_STATUS_ASSIGN_OR_RETURN` and `TF_STATUS_RETURN_IF_ERROR`

PiperOrigin-RevId: 551278625
…_heuristics

PiperOrigin-RevId: 551297374
This CL adds patterns to fold a Transpose feeding an FC into a BMM, like below:

FC(lhs, Transpose(rhs)) -> BMM(lhs, rhs, false, false)

Strictly, the right thing would be to apply the pattern only when `keep_num_dims == true`, because an output rank lower than the input rank means `keep_num_dims` has reduced the output. Checking the rank directly, however, improves coverage, so the pattern now works in those cases as well.

PiperOrigin-RevId: 551297769
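As a sanity check of the fold, assuming TFLite FC semantics of output = input × filterᵀ with no bias or fused activation, a minimal NumPy sketch shows the equivalence:

```python
import numpy as np

def fc(inputs, filters):
    # Simplified stand-in for TFLite FullyConnected: inputs @ filters.T,
    # ignoring bias and fused activation.
    return inputs @ filters.T

rng = np.random.default_rng(0)
lhs = rng.standard_normal((3, 4))
rhs = rng.standard_normal((4, 5))

# FC(lhs, Transpose(rhs)) == lhs @ (rhs.T).T == lhs @ rhs == BMM(lhs, rhs, false, false)
np.testing.assert_allclose(fc(lhs, rhs.T), lhs @ rhs)
```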
To improve debuggability, we want the shape refinement to make as few changes as possible to the module. In this change we remove one use of inlining.

PiperOrigin-RevId: 551325242
PiperOrigin-RevId: 551347216
PiperOrigin-RevId: 551353292
…ac compiler error

Apparently ssize_t is sometimes only a long (at least 32-bit) instead of a long long (at least 64-bit). I don't have a Mac, so I can't reproduce the failing build, but hopefully this fixes it based on the error message.

PiperOrigin-RevId: 551376003
PiperOrigin-RevId: 551401683
PiperOrigin-RevId: 551408554
The BFS algorithm didn't have a visited set and therefore had a complexity of O(N*E).

PiperOrigin-RevId: 551414282
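For context, without a visited set BFS can re-enqueue a node once per incoming edge; tracking visited nodes restores the usual O(N + E) bound. A generic sketch of the fixed traversal (not the actual code in this change):

```python
from collections import deque

def bfs(adjacency, start):
    """BFS with a visited set: each node is enqueued at most once,
    so the traversal is O(N + E) rather than O(N * E)."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in adjacency[node]:
            if neighbor not in visited:  # the check a visited-set-less BFS is missing
                visited.add(neighbor)
                queue.append(neighbor)
    return order
```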
…sion pattern.

This change implements a conversion pattern that converts stablehlo.convolution to tfl.conv_2d.
This is a minimal version that converts quantized `stablehlo.convolution` under certain assumptions, such as the filter having the format `[0, 1, i, o]`.

PiperOrigin-RevId: 551419638
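For reference, tfl.conv_2d takes its filter in [o, h, w, i] order, so a `[0, 1, i, o]` (HWIO) stablehlo filter has to be permuted during conversion. A hedged NumPy sketch of that permutation, assuming those two layouts:

```python
import numpy as np

# Assumed layouts: stablehlo filter [h, w, i, o] (the "[0, 1, i, o]" format
# above), tfl.conv_2d filter [o, h, w, i].
hwio_filter = np.zeros((3, 3, 8, 16))  # h=3, w=3, in_channels=8, out_channels=16
ohwi_filter = np.transpose(hwio_filter, (3, 0, 1, 2))
assert ohwi_filter.shape == (16, 3, 3, 8)
```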
…ll on SplitShardingDimension.

PiperOrigin-RevId: 551438695
This is in preparation for another change improving the state of copies
in while loops.

PiperOrigin-RevId: 551451818
bixia1 and others added 12 commits July 31, 2023 16:21
…end-Recv sequence.

This is to prevent the latency-hiding scheduler from interleaving two Send-Recv sequences.

PiperOrigin-RevId: 552621536
InitializeCreateGcsFileSystemFnPtr was a temporary fix and is no longer needed.

PiperOrigin-RevId: 552624923
This removes some unnecessary `cuDeviceGetCount()` calls when custom ops are used.

PiperOrigin-RevId: 552634342
…c in tf.constant according to auto dtype conversion semantics.

WeakTensor is created if it satisfies both of the following conditions:
1. tf.constant is called with no dtype arg specified.
2. Input is a nested Python type.

PiperOrigin-RevId: 552634845
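A hedged sketch of that two-condition check (`should_return_weak_tensor` and its helper are hypothetical names, not the actual implementation):

```python
def should_return_weak_tensor(value, dtype=None):
    # WeakTensor only when (1) no explicit dtype was passed and
    # (2) the input is a nested Python type (scalars or lists/tuples of them).
    def is_nested_python_type(v):
        if isinstance(v, (int, float, complex)):
            return True
        if isinstance(v, (list, tuple)):
            return all(is_nested_python_type(item) for item in v)
        return False
    return dtype is None and is_nested_python_type(value)

assert should_return_weak_tensor([1, 2, 3])                     # no dtype, Python input
assert not should_return_weak_tensor([1, 2, 3], dtype="int64")  # explicit dtype
```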
…nd 0 of a gather, assume that the sharding of that operand does not matter.

PiperOrigin-RevId: 552637713
@i-chaochen

Since I covered this here https://github.com/openxla/xla/pull/4603/files#diff-fc02eb6aea06ad0d72011e9a64da28e8c6fae000a9e0cab9b15b40b4e914af4aR956-R958

could you try enabling the triton-softmax flag to have a go, please?
da2cefb#diff-04cb485c1774fda54f8346ece0ade9efbcd813235768905b234d596c22660cf6R170

We need to see whether any other unit tests are affected by that flag.

@i-chaochen left a comment


Since the gpu.graph PR is merged, you should enable xla_graph_level = 1

@weihanmines
Author

Since the gpu.graph PR is merged, you should enable xla_graph_level = 1

Turned it on in the latest commit.

@weihanmines
Author

Since I covered this here https://github.com/openxla/xla/pull/4603/files#diff-fc02eb6aea06ad0d72011e9a64da28e8c6fae000a9e0cab9b15b40b4e914af4aR956-R958

could you try enabling the triton-softmax flag to have a go, please? da2cefb#diff-04cb485c1774fda54f8346ece0ade9efbcd813235768905b234d596c22660cf6R170

We need to see whether any other unit tests are affected by that flag.

Turned it on in the latest commit.

@weihanmines
Author

Jenkins: retest Ubuntu-GPU-single please


This was added in tensorflow@56f261b; we need a ticket to track this test.

Author


Sure.

@weihanmines merged commit 7a70b16 into develop-upstream on Aug 7, 2023
1 check passed