feat(devmanual): DB clusters and read/write split #11492

Merged (1 commit) on Feb 2, 2024

@ChristophWurst ChristophWurst commented Feb 1, 2024

@juliusknorr juliusknorr left a comment

Nice writeup 💪

Two small comments inline, but looks good in general

2. **Avoid the read operation**. If the code allows it, avoid the read operation altogether. You should know what was just written. If you need the auto increment ID, use the database's *last insert ID* feature. Proceed with this data, pass it to event listeners, etc. This approach guarantees consistency, too, but also improves overall performance.
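
A minimal sketch of the "avoid the read" pattern, written in Python with the standard `sqlite3` module purely for illustration. It is not Nextcloud's API; the `notes` table and the `create_note` helper are hypothetical names. The point is that the caller builds the result from the values it just wrote plus the last insert ID, so no follow-up `SELECT` can hit a lagging replica.

```python
import sqlite3


def create_note(conn: sqlite3.Connection, title: str) -> dict:
    """Insert a row and return its data without reading it back."""
    cur = conn.execute("INSERT INTO notes (title) VALUES (?)", (title,))
    conn.commit()
    # lastrowid is the auto-increment ID assigned to this insert.
    # Everything else is already known to the caller.
    return {"id": cur.lastrowid, "title": title}


# Hypothetical schema, in-memory database for the example only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT NOT NULL)"
)
note = create_note(conn, "hello")
```

The returned dictionary can be passed on to event listeners or other callers in place of a freshly queried row.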

.. note::
   Nextcloud can help you identify read after write without the need to set up a cluster for your development environment. If you change the loglevel to 0 (debug), dirty reads will trigger a log entry. Monitor the log when testing your code.

Maybe add a comment that the log is based on the table, so there might be false positives


There are two patterns to avoid the "dirty" read:

1. **Wrap the write+read operation in a transaction**. Nextcloud's read/write split, as well as other database cluster load balancers, ensures that all queries of a transaction go to one single database node of a cluster. That ensures that written data is instantly available to be read back. This approach guarantees consistency, but puts additional load on the primary node because it has to execute the read operation.
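
A minimal sketch of the transaction pattern, again in Python with the standard `sqlite3` module and not Nextcloud's API. The `notes` table and the `create_and_read_back` helper are hypothetical. A single SQLite database has no replicas, so this only shows the shape of the code: the write and the read-back sit inside one transaction, which is the unit a read/write split would pin to one node.

```python
import sqlite3


def create_and_read_back(conn: sqlite3.Connection, title: str) -> tuple:
    """Write a row and read it back inside one transaction."""
    # "with conn" commits on success and rolls back if an exception is raised.
    # The transaction itself starts implicitly at the INSERT.
    with conn:
        cur = conn.execute("INSERT INTO notes (title) VALUES (?)", (title,))
        row = conn.execute(
            "SELECT id, title FROM notes WHERE id = ?", (cur.lastrowid,)
        ).fetchone()
    return row


# Hypothetical schema, in-memory database for the example only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT NOT NULL)"
)
row = create_and_read_back(conn, "hello")
```

Only the write and the dependent read are placed inside the `with` block, so the transaction stays as small as the code allows.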

Should we mention that depending on where the queries happen this might not be an option, as usually you'd want to keep the transaction time as low as possible?

Signed-off-by: Christoph Wurst <christoph@winzerhof-wurst.at>
@ChristophWurst ChristophWurst force-pushed the feat/devmanual/performance-database-cluster branch from 90e8900 to d47461a Compare February 2, 2024 11:21
@ChristophWurst ChristophWurst merged commit 0b9e227 into master Feb 2, 2024
11 checks passed
@ChristophWurst ChristophWurst deleted the feat/devmanual/performance-database-cluster branch February 2, 2024 11:36
Successfully merging this pull request may close these issues.

Primary/Secondary choosing querybuilder