trino is slow at retrieving data out of it #464

Open
zeddit opened this issue Aug 28, 2023 · 0 comments
zeddit commented Aug 28, 2023

Affected version

No response

Current and expected behavior

I am using Trino as middleware to read data out of my backend database. The data path looks like this: db <-> trino <-> client (Trino CLI or the Python client).

I found that Trino is slow when data is read out of it. More specifically, reading data through Trino is significantly slower than reading it from the backend database directly.
Trino tops out at about 10-20 MB/s, while my database can serve 100 MB/s per connection.

I don't think Trino should be the bottleneck in the data pipeline; otherwise it slows down everything that reads through it.

Possible solution

Is there some configuration or debugging method that would let me find the underlying bottleneck and learn how to fix it?

I used the approach below to determine that Trino is what limits the data stream.

I used the Trino memory connector to help with the diagnosis.

-- Step 1: copy data from my database into Trino memory. This exercises the path that reads data from the backend database into the Trino nodes.
-- Observed throughput: about 100 MB/s (more precisely, 80-150 MB/s).
CREATE TABLE memory.default.sf100_lineitem AS SELECT * FROM xdb.default.sf100_lineitem LIMIT 10000000;

-- Step 2: read the data from Trino out to the client. I used the Trino CLI for this test.
-- Observed throughput: only about 10 MB/s. My network is faster than 10 Gb/s, so the network is not the limit.
SELECT * FROM memory.default.sf100_lineitem;
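
To put a number on the client-side read path with the same yardstick for every source, a small timing harness may help. The following is a minimal sketch, not a validated benchmark: `measure_fetch_throughput` is a hypothetical helper name, it assumes the client exposes a standard Python DB-API cursor (the Trino Python client's `trino.dbapi` module is expected to, but that is not exercised here), and the byte count is only a rough estimate from the text length of each value. SQLite is used purely as a stand-in so the sketch runs without a cluster.

```python
import sqlite3
import time


def measure_fetch_throughput(cursor, batch_size=10_000):
    """Drain an already-executed DB-API cursor and report rough throughput.

    Bytes are approximated from the UTF-8 text length of each value, so the
    MB/s figure is only meaningful when comparing runs of this same function.
    The size accounting costs CPU time and slightly understates throughput.
    """
    rows = 0
    approx_bytes = 0
    start = time.perf_counter()
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break  # cursor is exhausted
        rows += len(batch)
        for row in batch:
            approx_bytes += sum(len(str(value).encode("utf-8")) for value in row)
    elapsed = time.perf_counter() - start
    return {
        "rows": rows,
        "approx_bytes": approx_bytes,
        "seconds": elapsed,
        "mb_per_s": (approx_bytes / 1_000_000) / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in data source (assumption): replace with a cursor from the
    # Trino client or from the backend database driver to compare paths.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, comment TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        ((i, "x" * 50) for i in range(100_000)),
    )
    cur = conn.cursor()
    cur.execute("SELECT * FROM t")
    print(measure_fetch_throughput(cur))
```

Running the same function against a cursor on the backend database and a cursor on Trino would show whether the gap also appears outside the Trino CLI, which helps separate client-side rendering cost from server-side result delivery.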

Additional context

No response

Environment

No response

Would you like to work on fixing this bug?

yes
