-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix the data pull problem when use alluxio as fs. #23815
base: master
Are you sure you want to change the base?
Conversation
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: tanyajun.
|
Could you add a test for this problem ? |
String filePath = fileStatus.getPath(); | ||
String fileName = filePath.substring(filePath.lastIndexOf('/') + 1); | ||
Location location = pathLocation.appendSuffix("/" + fileName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change may lead to failure of some other cases and some tests are not able to pass. But the bug does exist. Still need to find a way to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be a more reasonable solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following tests failed to pass with this change, maybe take a look on these tests? https://github.com/trinodb/trino/actions/runs/11380756207/job/31660701721?pr=23815
Error: Tests run: 24, Failures: 3, Errors: 0, Skipped: 4, Time elapsed: 62.74 s <<< FAILURE! -- in io.trino.filesystem.alluxio.TestAlluxioFileSystem
2024-10-17T02:10:31.425-0600 INFO main io.trino.testing.containers.junit.ReportLeakedContainers ReportLeakedContainers disabled
Error: io.trino.filesystem.alluxio.TestAlluxioFileSystem.testPaths -- Time elapsed: 0.051 s <<< FAILURE!
org.opentest4j.AssertionFailedError:
expected: 35L
but was: 34L
at io.trino.filesystem.AbstractTestTrinoFileSystem.listPath(AbstractTestTrinoFileSystem.java:1370)
at io.trino.filesystem.AbstractTestTrinoFileSystem.testPathHierarchical(AbstractTestTrinoFileSystem.java:763)
at io.trino.filesystem.AbstractTestTrinoFileSystem.testPaths(AbstractTestTrinoFileSystem.java:739)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1458)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2034)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:189)
Error: io.trino.filesystem.alluxio.TestAlluxioFileSystem.testListFiles -- Time elapsed: 0.171 s <<< FAILURE!
org.opentest4j.AssertionFailedError:
expected: 46L
but was: 59L
at io.trino.filesystem.AbstractTestTrinoFileSystem.listPath(AbstractTestTrinoFileSystem.java:1370)
at io.trino.filesystem.AbstractTestTrinoFileSystem.testListFiles(AbstractTestTrinoFileSystem.java:1070)
at io.trino.filesystem.AbstractTestTrinoFileSystem.testListFiles(AbstractTestTrinoFileSystem.java:1001)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1458)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2034)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:189)
Error: io.trino.filesystem.alluxio.TestAlluxioFileSystem.testListLexicographicalOrder -- Time elapsed: 1.436 s <<< FAILURE!
org.opentest4j.AssertionFailedError:
expected: 44L
but was: 43L
at io.trino.filesystem.AbstractTestTrinoFileSystem.listPath(AbstractTestTrinoFileSystem.java:1370)
at io.trino.filesystem.AbstractTestTrinoFileSystem.testListLexicographicalOrder(AbstractTestTrinoFileSystem.java:1207)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1458)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2034)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:189)
[INFO]
[INFO] Results:
[INFO]
Error: Failures:
Error: TestAlluxioFileSystem>AbstractTestTrinoFileSystem.testListFiles:1001->AbstractTestTrinoFileSystem.testListFiles:1070->AbstractTestTrinoFileSystem.listPath:1370
expected: 46L
but was: 59L
Error: TestAlluxioFileSystem>AbstractTestTrinoFileSystem.testListLexicographicalOrder:1207->AbstractTestTrinoFileSystem.listPath:1370
expected: 44L
but was: 43L
Error: TestAlluxioFileSystem>AbstractTestTrinoFileSystem.testPaths:739->AbstractTestTrinoFileSystem.testPathHierarchical:763->AbstractTestTrinoFileSystem.listPath:1370
expected: 35L
but was: 34L
[INFO]
Error: Tests run: 27, Failures: 3, Errors: 0, Skipped: 4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@awkwardd Add the following extractSchema
method to AlluxioUtils
, and update the next()
method in AlluxioFileIterator
can pass the test successfully. Please help update the code in this PR.
public static Location extractSchema(Location location)
{
String path = location.toString();
if (path.startsWith("alluxio://")) {
int index = path.indexOf("/", "alluxio://".length());
if (index != -1) {
String schema = path.substring(0, index);
return Location.of(schema + "/");
}
else {
return location;
}
}
return location;
}
public FileEntry next()
throws IOException
{
if (!hasNext()) {
return null;
}
URIStatus fileStatus = files.next();
String path = fileStatus.getPath();
if (path.startsWith("/")) {
path = path.substring(1);
}
Location schema = extractSchema(dirLocation);
Location fileLocation = schema.appendSuffix(path);
return new FileEntry(
fileLocation,
fileStatus.getLength(),
Instant.ofEpochMilli(fileStatus.getLastModificationTimeMs()),
Optional.empty());
}
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: tanyajun.
|
@awkwardd you may also need to sign the CLA https://trino.io/development/process#contribution-process |
I have signed the cla file and snet it last night. |
@cla-bot check |
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: tanyajun.
|
The cla-bot has been summoned, and re-checked this pull request! |
@awkwardd Please take a look at #23815 (comment) |
Description
When use alluxio as hive fs,cant get data from alluxio.
I finaly found that the AlluxioIterator.java didn't generate correct file location as alluxio://192...../dir/filename but alluxio:///dir/filename, so I modify the code.It works for me.Fix the issue#23778