Use shadow buffers for scanning out and GPU blit engine to copy content before present #82

Open
wants to merge 1 commit into
base: celadon/u/mr0/master
from

Conversation

phreer
Contributor

@phreer phreer commented Sep 9, 2024

No description provided.

@feijiang1 feijiang1 left a comment

great implementation
@HaihongxLi @manxiaoliang please review together

utils/intel_blit.cpp (outdated)
char device_path[32];
if (fd >= 0)
return fd;
for (int i = 0; i < 64; ++i) {
@feijiang1 feijiang1 Sep 9, 2024

64 seems too big; normally we don't have that many devices.
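
A minimal sketch of what a tighter bound could look like. It is not the code from this PR: the `/dev/dri/renderD128`-style path is an assumption based on the usual Linux DRM render-node naming, and `kMaxRenderNodes`, `RenderNodePath`, and `OpenFirstRenderNode` are hypothetical names. The sketch also does not check that the opened node belongs to an Intel GPU, which real code would likely need to do.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cstdio>
#include <string>

// Assumption: render nodes use the usual DRM naming, renderD128, renderD129, ...
constexpr int kFirstRenderMinor = 128;
// Hypothetical bound; choose whatever covers the GPUs expected on the platform.
constexpr int kMaxRenderNodes = 8;

// Builds the device path for the index-th render node.
std::string RenderNodePath(int index) {
  char path[32];
  std::snprintf(path, sizeof(path), "/dev/dri/renderD%d",
                kFirstRenderMinor + index);
  return path;
}

// Returns an fd for the first render node that opens, or -1 if none does.
int OpenFirstRenderNode(int max_nodes = kMaxRenderNodes) {
  for (int i = 0; i < max_nodes; ++i) {
    int fd = open(RenderNodePath(i).c_str(), O_RDWR | O_CLOEXEC);
    if (fd >= 0)
      return fd;
  }
  return -1;
}
```

Keeping the bound in a named constant makes it easy to raise later if a platform with more devices shows up.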

hwc2_device/HwcLayer.cpp (outdated)
@sysopenci

Improper Commit Message
Tracked on not found in commit message,
make sure Tracked-On: Jira-ticket is present.

@sysopenci

Improper Commit Message
Valid Commit Message
Improper Jira Status,
jira status not in ['In Progress','Implemented']

@phreer
Contributor Author

phreer commented Oct 11, 2024

Performance is much better now with the XY_FAST_COPY_BLT instruction, so I think we can merge this as long as it doesn't cause any regressions.

@HaihongxLi @manxiaoliang @feijiang1 @xzhan34 Please review it.

@xzhan34 xzhan34 left a comment

LGTM

utils/intel_blit.cpp (outdated)
@feijiang1 feijiang1 left a comment

LGTM

@xzhan34 xzhan34 left a comment

LGTM

For best performance, the scan-out buffers that surfaceflinger uses for
composition must reside in GPU local memory. Importing these buffers
into virtio-GPU, however, migrates them from local memory to system
memory, which significantly degrades performance. To avoid migrating
these client-composited buffers, allocate a shadow buffer for each of
them and import the shadow buffers into virtio-GPU for scan-out
instead. Right before the atomic commit, use the GPU blit engine to
copy the content into the shadow buffer.
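
The flow described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `Buffer`, `ShadowBufferCache`, and `PrepareForScanout` are hypothetical names, plain host memory stands in for GPU buffers, and `std::copy` stands in for the blit-engine copy.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a GPU buffer; a real one would wrap a GEM handle.
struct Buffer {
  std::vector<uint8_t> data;
};

class ShadowBufferCache {
 public:
  // Returns the shadow for src_id, allocating it on first use or size change.
  Buffer &GetShadow(uint32_t src_id, size_t size) {
    Buffer &shadow = shadows_[src_id];
    if (shadow.data.size() != size)
      shadow.data.assign(size, 0);
    return shadow;
  }

  // Called right before the atomic commit: refresh the shadow from the
  // source and return it as the buffer to scan out.
  // std::copy is a placeholder for the GPU blit.
  Buffer &PrepareForScanout(uint32_t src_id, const Buffer &src) {
    Buffer &shadow = GetShadow(src_id, src.data.size());
    std::copy(src.data.begin(), src.data.end(), shadow.data.begin());
    return shadow;
  }

  size_t size() const { return shadows_.size(); }

 private:
  std::unordered_map<uint32_t, Buffer> shadows_;
};
```

The point of the cache is that each source buffer gets exactly one shadow, so the import into virtio-GPU happens once per shadow while the copy happens on every frame.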

Use shadow buffers only when the virtio-GPU ALLOW_P2P feature is not
present and a dGPU exists.
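
As a sketch, the condition reduces to a single predicate. The struct and field names here are hypothetical; how the PR actually queries the virtio-GPU feature and detects the dGPU is not shown in this description.

```cpp
// Hypothetical capability summary; the field names are illustrative only.
struct DisplayCaps {
  bool virtio_gpu_allow_p2p;  // virtio-GPU advertises ALLOW_P2P
  bool has_dgpu;              // a discrete GPU is present
};

// Shadow buffers are only worthwhile when importing would force a migration:
// no P2P support on the virtio-GPU side, and a dGPU with local memory.
bool ShouldUseShadowBuffers(const DisplayCaps &caps) {
  return !caps.virtio_gpu_allow_p2p && caps.has_dgpu;
}
```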

There are several GPU instructions to blit memory:
- XY_FAST_COPY_BLT (BSpec: 47982),
- XY_SRC_COPY_BLT (BSpec: 48002),
- XY_BLOCK_COPY_BLT (BSpec: 3678).

By experiment, XY_FAST_COPY_BLT is much faster than the other two instructions.

Tracked-On: OAM-124182
Signed-off-by: Weifeng Liu <weifeng.liu@intel.com>
Labels
Engineering Build Not Started, Pending Developer Approval, Pending PR Review, Valid commit message
4 participants