Skip to content

Commit

Permalink
Undo optimization
Browse files Browse the repository at this point in the history
In Hive coercion tests some test cases produce binary arrays that contain \0.

Previously we were writing Slices representing UTF-8 String by calling .toStringUtf8()
which passed the bytes to the String constructor that would truncate the value at \0.

When directly writing Slice to a JSON output, \0 isn't interpreted as String terminator,
got escaped and appears in the output without truncation.

It seems more correct but breaks previous behaviour.
  • Loading branch information
wendigo committed Nov 1, 2024
1 parent 59c3bef commit fee08fb
Showing 1 changed file with 2 additions and 4 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -263,8 +263,7 @@ public void encode(JsonGenerator generator, ConnectorSession session, Block bloc
return;
}
Slice slice = VARCHAR.getSlice(block, position);
// Optimization: avoid conversion from Slice to String and String to bytes when writing UTF-8 strings
generator.writeUTF8String(slice.byteArray(), slice.byteArrayOffset(), slice.length());
generator.writeString(slice.toStringUtf8());
}
}

Expand All @@ -287,8 +286,7 @@ public void encode(JsonGenerator generator, ConnectorSession session, Block bloc
return;
}
Slice slice = padSpaces(VARCHAR.getSlice(block, position), length);
// Optimization: avoid conversion from Slice to String and String to bytes when writing UTF-8 strings
generator.writeUTF8String(slice.byteArray(), slice.byteArrayOffset(), slice.length());
generator.writeString(slice.toStringUtf8());
}
}

Expand Down

0 comments on commit fee08fb

Please sign in to comment.