General cleanups.
1. Fix a few typos/grammar nits in README.md.
2. Fix broken javadoc target in Gradle.
3. Bsdiff: Add checks to long-running algorithms to properly handle
thread interrupts (thanks, admo@google.com)
4. Bsdiff: Performance improvements (thanks, Samuel Huang)
andrewhayden committed Oct 5, 2016
1 parent c312b62 commit b6093c5
Showing 29 changed files with 358 additions and 698 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -46,7 +46,7 @@ By design, **File-by-File patches are uncompressed**. This allows freedom in cho
> *Note: Archive-patcher does not currently handle 'zip64' archives (archives supporting more than 65,535 files or containing files larger than 4GB in size).*
# How It Works
Archive-patcher **transforms** archives into a **delta-friendly space** to generate and apply a delta. This transformation involves uncompressing the compressed content the has changed, while leaving everything else alone. The patch applier then recompresses the content that has changed to create a perfect binary copy of the original input file. In v1, bsdiff is the delta algorithm used within the delta-friendly space. Much more information on this subject is available in the [Appendix](#appendix).
Archive-patcher **transforms** archives into a **delta-friendly space** to generate and apply a delta. This transformation involves uncompressing the compressed content that has changed, while leaving everything else alone. The patch applier then recompresses the content that has changed to create a perfect binary copy of the original input file. In v1, bsdiff is the delta algorithm used within the delta-friendly space. Much more information on this subject is available in the [Appendix](#appendix).
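
A minimal sketch of the pipeline this paragraph describes, with hypothetical helper names standing in for the real implementation (the actual work is done by the generator classes touched later in this commit):

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;

// Conceptual outline only; helper names are hypothetical.
final class PipelineSketch {
  void generatePatch(File oldArchive, File newArchive, OutputStream patchOut)
      throws IOException, InterruptedException {
    // 1. Uncompress only the entries whose compressed content changed, leaving
    //    everything else alone ("delta-friendly" old and new blobs).
    File deltaFriendlyOld = uncompressChangedEntries(oldArchive);
    File deltaFriendlyNew = uncompressChangedEntries(newArchive);

    // 2. Run bsdiff between the two delta-friendly blobs.
    byte[] delta = bsdiff(deltaFriendlyOld, deltaFriendlyNew);

    // 3. Write the delta to the patch stream. The real patch also records the
    //    recompression metadata the applier needs to rebuild a byte-identical
    //    copy of the new archive.
    patchOut.write(delta);
  }

  // Hypothetical helpers, stubbed so the sketch compiles.
  private File uncompressChangedEntries(File archive) { throw new UnsupportedOperationException(); }
  private byte[] bsdiff(File oldBlob, File newBlob) { throw new UnsupportedOperationException(); }
}
```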

Diagrams and examples follow. In these examples we will use an old archive and a new archive, each containing 3 files: foo.txt, bar.xml, and baz.lib:

@@ -68,7 +68,7 @@ File-by-File v1: Patch Generation Overview
Delta-Friendly Delta-Friendly
Old Archive Old Blob New Blob New Archive
---------------- ---------------- ---------------- ----------------
| foo.txt | | foo.txt | | foo.txt | | foo.txt |
| version 1 | | version 1 | | version 2 | | version 2 |
| (compressed) | |(uncompressed)| |(uncompressed)| | (compressed) |
@@ -263,7 +263,7 @@ The number of these entries is determined by the "Num new archive recompression

* Entries must be ordered in ascending order by offset. This allows the output from the delta apply process (which creates the delta-friendly new blob) to be piped to an intelligent partially-compressing stream that is seeded with the knowledge of which ranges to recompress and the settings to use for each. This avoids the need to write the delta-friendly new blob to persistent storage, an important optimization.
* Entries must not overlap (for sanity)
* Areas of the new archive that are not included in any recompression op will be copied through from the delta-friendly old blob without modification. These represent arbitrary data that should **not** be compressed, such as zip structural components or blocks of data that are stored without compression in the new archive.
* Areas of the new archive that are not included in any recompression op will be copied through from the delta-friendly new blob without modification. These represent arbitrary data that should **not** be compressed, such as zip structural components or blocks of data that are stored without compression in the new archive.
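
A rough sketch of how an applier could exploit the ascending-offset, non-overlapping guarantees in the bullets above to stream the delta-friendly new blob in a single pass without a temp file. The `Op` type and helper methods are illustrative assumptions, not the library's real API:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

final class RecompressionPlanSketch {
  // Assumed shape of one recompression op: a range of the delta-friendly new blob
  // to recompress, plus the deflate settings to re-apply.
  record Op(long offset, long length, int level, int strategy, boolean nowrap) {}

  // Single forward pass: because ops are sorted by offset and never overlap,
  // the delta-friendly new blob can be consumed as a stream.
  static void apply(InputStream deltaFriendlyNew, OutputStream newArchive, List<Op> ops)
      throws IOException {
    long pos = 0;
    for (Op op : ops) {
      pos += copy(deltaFriendlyNew, newArchive, op.offset() - pos); // pass-through bytes
      pos += recompress(deltaFriendlyNew, newArchive, op);          // recompressed range
    }
    copy(deltaFriendlyNew, newArchive, Long.MAX_VALUE);             // trailing pass-through bytes
  }

  private static long copy(InputStream in, OutputStream out, long n) throws IOException {
    long copied = 0;
    int b;
    while (copied < n && (b = in.read()) != -1) {
      out.write(b);
      copied++;
    }
    return copied;
  }

  private static long recompress(InputStream in, OutputStream out, Op op) throws IOException {
    // Real code would wrap 'out' in a Deflater configured from 'op'; elided in this sketch.
    return copy(in, out, op.length());
  }
}
```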

```
|------------------------------------------------------|
@@ -92,11 +92,12 @@ public PatchExplainer(Compressor compressor, DeltaGenerator deltaGenerator) {
* @param recommendationModifier optionally, a {@link RecommendationModifier} to use during patch
* planning. If null, a normal patch is generated.
* @return a list of the explanations for each entry that would be
* @throws IOException
* @throws IOException if unable to read data
* @throws InterruptedException if any thread interrupts this thread
*/
public List<EntryExplanation> explainPatch(
File oldFile, File newFile, RecommendationModifier recommendationModifier)
throws IOException {
throws IOException, InterruptedException {
List<EntryExplanation> result = new ArrayList<>();

// Isolate entries that are only found in the new archive.
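
With the new checked exception, callers of `explainPatch` must now handle or propagate `InterruptedException`. A hedged usage sketch, assuming `DeflateCompressor` from the shared package as the `Compressor` implementation and relying on the documented behavior that a null `RecommendationModifier` yields a normal patch; imports for `PatchExplainer` and `EntryExplanation` are omitted because their package is not visible in this diff:

```java
import com.google.archivepatcher.generator.bsdiff.BsDiffDeltaGenerator;
import com.google.archivepatcher.shared.DeflateCompressor; // assumed home of the Compressor impl
import java.io.File;
import java.io.IOException;
import java.util.List;

class ExplainPatchSketch {
  static void explain(File oldArchive, File newArchive) throws IOException, InterruptedException {
    PatchExplainer explainer =
        new PatchExplainer(new DeflateCompressor(), new BsDiffDeltaGenerator());
    // A null RecommendationModifier means "explain a normal patch" (per the javadoc above).
    List<EntryExplanation> explanations = explainer.explainPatch(oldArchive, newArchive, null);
    System.out.println(explanations.size() + " entries explained");
  }
}
```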
@@ -192,10 +192,9 @@ public long getEstimatedResourceConstrainedSize() {
}

/**
* Writes a JSON representation of the data to the specified {@link PrintWriter}.
* The data has the following form:
* <code>
* <br>&lbrace;
* Writes a JSON representation of the data to the specified {@link PrintWriter}. The data has the
* following form: <code>
* <br>{
* <br>&nbsp;&nbsp;estimatedNewSize = &lt;number&gt;,
* <br>&nbsp;&nbsp;estimatedChangedSize = &lt;number&gt;,
* <br>&nbsp;&nbsp;explainedAsNew = [
@@ -207,15 +206,15 @@ public long getEstimatedResourceConstrainedSize() {
* <br>&nbsp;&nbsp;explainedAsUnchangedOrFree = [
* <br>&nbsp;&nbsp;&nbsp;&nbsp;&lt;entry_list&gt;
* <br>&nbsp;&nbsp;]
* <br>&rbrace;
* </code>
* <br>Where <code>&lt;entry_list&gt;</code> is a list of zero or more entries of the following
* form:
* <br>}
* </code> <br>
* Where <code>&lt;entry_list&gt;</code> is a list of zero or more entries of the following form:
* <code>
* <br>&lbrace; path: '&lt;path_string&gt;', isNew: &lt;true|false&gt;,
* <br>{ path: '&lt;path_string&gt;', isNew: &lt;true|false&gt;,
* reasonIncluded: &lt;undefined|'&lt;reason_string'&gt;, compressedSizeInPatch: &lt;number&gt;
* &rbrace;
* }
* </code>
*
* @param writer the writer to write the JSON to
*/
public void writeJson(PrintWriter writer) {
@@ -158,7 +158,7 @@ public void tearDown() {
}

@Test
public void testExplainPatch_CompressedBytesIdentical() throws IOException {
public void testExplainPatch_CompressedBytesIdentical() throws Exception {
byte[] bytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_LEVEL_6));
save(bytes, oldFile);
save(bytes, newFile);
@@ -172,7 +172,7 @@ public void testExplainPatch_CompressedBytesIdentical() throws IOException {
}

@Test
public void testExplainPatch_CompressedBytesChanged_UncompressedUnchanged() throws IOException {
public void testExplainPatch_CompressedBytesChanged_UncompressedUnchanged() throws Exception {
byte[] oldBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_LEVEL_6));
byte[] newBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_LEVEL_9));
save(oldBytes, oldFile);
@@ -189,7 +189,7 @@ public void testExplainPatch_CompressedBytesChanged_UncompressedUnchanged() thro
}

@Test
public void testExplainPatch_CompressedBytesChanged_UncompressedChanged() throws IOException {
public void testExplainPatch_CompressedBytesChanged_UncompressedChanged() throws Exception {
byte[] oldBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_LEVEL_6));
byte[] newBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A2_LEVEL_9));
save(oldBytes, oldFile);
@@ -215,7 +215,7 @@ public void testExplainPatch_CompressedBytesChanged_UncompressedChanged() throws

@Test
public void testExplainPatch_CompressedBytesChanged_UncompressedChanged_Limited()
throws IOException {
throws Exception {
// Just like above, but this time with a TotalRecompressionLimit that changes the result.
TotalRecompressionLimiter limiter = new TotalRecompressionLimiter(1); // 1 byte limit!
byte[] oldBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_LEVEL_6));
@@ -244,7 +244,7 @@ public void testExplainPatch_CompressedBytesChanged_UncompressedChanged_Limited(
}

@Test
public void testExplainPatch_BothEntriesUncompressed_BytesUnchanged() throws IOException {
public void testExplainPatch_BothEntriesUncompressed_BytesUnchanged() throws Exception {
byte[] oldBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_STORED));
byte[] newBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_STORED));
save(oldBytes, oldFile);
@@ -261,7 +261,7 @@ public void testExplainPatch_BothEntriesUncompressed_BytesUnchanged() throws IOE
}

@Test
public void testExplainPatch_BothEntriesUncompressed_BytesChanged() throws IOException {
public void testExplainPatch_BothEntriesUncompressed_BytesChanged() throws Exception {
byte[] oldBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_STORED));
byte[] newBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A2_STORED));
save(oldBytes, oldFile);
@@ -285,7 +285,7 @@ public void testExplainPatch_BothEntriesUncompressed_BytesChanged() throws IOExc
}

@Test
public void testExplainPatch_CompressedChangedToUncompressed() throws IOException {
public void testExplainPatch_CompressedChangedToUncompressed() throws Exception {
byte[] oldBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_LEVEL_9));
byte[] newBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_STORED));
save(oldBytes, oldFile);
@@ -308,7 +308,7 @@ public void testExplainPatch_CompressedChangedToUncompressed() throws IOExceptio
}

@Test
public void testExplainPatch_UncompressedChangedToCompressed() throws IOException {
public void testExplainPatch_UncompressedChangedToCompressed() throws Exception {
byte[] oldBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_STORED));
byte[] newBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_LEVEL_6));
save(oldBytes, oldFile);
@@ -331,7 +331,7 @@ public void testExplainPatch_UncompressedChangedToCompressed() throws IOExceptio
}

@Test
public void testExplainPatch_Unsuitable() throws IOException {
public void testExplainPatch_Unsuitable() throws Exception {
byte[] oldBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_STORED));
byte[] newBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_LEVEL_6));
save(oldBytes, oldFile);
@@ -365,7 +365,7 @@ public void testExplainPatch_Unsuitable() throws IOException {
}

@Test
public void testExplainPatch_NewFile() throws IOException {
public void testExplainPatch_NewFile() throws Exception {
byte[] oldBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_A1_LEVEL_6));
byte[] newBytes = UnitTestZipArchive.makeTestZip(Collections.singletonList(ENTRY_B_LEVEL_6));
save(oldBytes, oldFile);
@@ -19,7 +19,6 @@
import com.google.archivepatcher.shared.MultiViewInputStreamFactory;
import com.google.archivepatcher.shared.RandomAccessFileInputStream;
import com.google.archivepatcher.shared.RandomAccessFileInputStreamFactory;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
@@ -108,16 +107,18 @@ public List<DivinationResult> divineDeflateParameters(File archiveFile) throws I
}
return results;
}

/**
* Returns an unmodifiable map whose keys are deflate strategies and whose values are the levels
* that make sense to try with the corresponding strategy, in the recommended testing order.
*
* <ul>
* <li>For strategy 0, levels 1 through 9 (inclusive) are included.</li>
* <li>For strategy 0, levels 1 through 9 (inclusive) are included.
* <li>For strategy 1, levels 4 through 9 (inclusive) are included. Levels 1, 2 and 3 are
* excluded because they behave the same under strategy 0.</li>
* excluded because they behave the same under strategy 0.
* <li>For strategy 2, only level 1 is included because the level is ignored under strategy 2.
* </li>
* </ul>
*
* @return such a mapping
*/
protected Map<Integer, List<Integer>> getLevelsByStrategy() {
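
To make the shape of the mapping described in that javadoc concrete, here is a sketch of the table it documents (strategy to levels worth probing); the real method may construct it differently:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LevelsByStrategySketch {
  static Map<Integer, List<Integer>> levelsByStrategy() {
    Map<Integer, List<Integer>> levels = new HashMap<>();
    levels.put(0, List.of(1, 2, 3, 4, 5, 6, 7, 8, 9)); // every level matters under strategy 0
    levels.put(1, List.of(4, 5, 6, 7, 8, 9));          // 1-3 behave as they do under strategy 0
    levels.put(2, List.of(1));                         // level is ignored under strategy 2
    return Collections.unmodifiableMap(levels);
  }
}
```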
@@ -29,7 +29,9 @@ public interface DeltaGenerator {
* @param newBlob the new blob
* @param deltaOut the stream to write the delta to
* @throws IOException in the event of an I/O error reading the input files or writing to the
* delta output stream
* delta output stream
* @throws InterruptedException if any thread has interrupted the current thread
*/
public void generateDelta(File oldBlob, File newBlob, OutputStream deltaOut) throws IOException;
public void generateDelta(File oldBlob, File newBlob, OutputStream deltaOut)
throws IOException, InterruptedException;
}
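
Because `generateDelta` is now declared to throw `InterruptedException`, callers should either propagate it or restore the interrupt flag. A usage sketch with the `BsDiffDeltaGenerator` implementation updated later in this commit:

```java
import com.google.archivepatcher.generator.DeltaGenerator;
import com.google.archivepatcher.generator.bsdiff.BsDiffDeltaGenerator;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class DeltaGeneratorUsageSketch {
  static void writeDelta(File oldBlob, File newBlob, File deltaFile) throws IOException {
    DeltaGenerator generator = new BsDiffDeltaGenerator();
    try (OutputStream deltaOut = new BufferedOutputStream(new FileOutputStream(deltaFile))) {
      generator.generateDelta(oldBlob, newBlob, deltaOut);
    } catch (InterruptedException e) {
      // Restore the interrupt flag rather than swallowing the cancellation.
      Thread.currentThread().interrupt();
    }
  }
}
```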
Original file line number Diff line number Diff line change
@@ -50,19 +50,21 @@ public FileByFileV1DeltaGenerator(RecommendationModifier recommendationModifier)
}

/**
* Generate a V1 patch for the specified input files and write the patch to the specified
* {@link OutputStream}. The written patch is <em>raw</em>, i.e. it has not been compressed.
* Compression should almost always be applied to the patch, either right in the specified
* {@link OutputStream} or in a post-processing step, prior to transmitting the patch to the
* patch applier.
* Generate a V1 patch for the specified input files and write the patch to the specified {@link
* OutputStream}. The written patch is <em>raw</em>, i.e. it has not been compressed. Compression
* should almost always be applied to the patch, either right in the specified {@link
* OutputStream} or in a post-processing step, prior to transmitting the patch to the patch
* applier.
*
* @param oldFile the original old file to read (will not be modified)
* @param newFile the original new file to read (will not be modified)
* @param patchOut the stream to write the patch to
* @throws IOException if unable to complete the operation due to an I/O error
* @throws InterruptedException if any thread has interrupted the current thread
*/
@Override
public void generateDelta(File oldFile, File newFile, OutputStream patchOut)
throws IOException {
throws IOException, InterruptedException {
try (TempFileHolder deltaFriendlyOldFile = new TempFileHolder();
TempFileHolder deltaFriendlyNewFile = new TempFileHolder();
TempFileHolder deltaFile = new TempFileHolder();
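
The javadoc's advice to compress the raw patch can be followed by wrapping the output stream before handing it to the generator. A sketch assuming the no-argument `FileByFileV1DeltaGenerator` constructor and a caller-supplied output file:

```java
import com.google.archivepatcher.generator.FileByFileV1DeltaGenerator;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

final class GeneratePatchSketch {
  static void generateCompressedPatch(File oldArchive, File newArchive, File patchFile)
      throws IOException, InterruptedException {
    try (OutputStream patchOut =
        new DeflaterOutputStream(
            new BufferedOutputStream(new FileOutputStream(patchFile)),
            new Deflater(Deflater.BEST_COMPRESSION, /* nowrap= */ true))) {
      // The generator writes a raw patch; the DeflaterOutputStream wrapper
      // compresses it in-line, as the javadoc recommends.
      new FileByFileV1DeltaGenerator().generateDelta(oldArchive, newArchive, patchOut);
    }
  }
}
```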
@@ -17,7 +17,6 @@
import com.google.archivepatcher.shared.JreDeflateParameters;
import com.google.archivepatcher.shared.PatchConstants;
import com.google.archivepatcher.shared.TypedRange;

import java.io.BufferedInputStream;
import java.io.DataOutputStream;
import java.io.File;
@@ -52,9 +51,15 @@ public class PatchWriter {

/**
* Creates a new patch writer.
* @param plan
* @param deltaFriendlyOldFileSize
* @param deltaFile
*
* @param plan the patch plan
* @param deltaFriendlyOldFileSize the expected size of the delta-friendly old file, provided as a
* convenience for the patch <strong>applier</strong> to reserve space on the filesystem for
* applying the patch
* @param deltaFriendlyNewFileSize the expected size of the delta-friendly new file, provided for
* forward compatibility
* @param deltaFile the delta that transforms the old delta-friendly file into the new
* delta-friendly file
*/
public PatchWriter(
PreDiffPlan plan,
@@ -143,6 +143,7 @@ private PreDiffExecutor(
/**
* Prepare resources for diffing and returns the completed plan.
*
* @return the plan
* @throws IOException if unable to complete the operation due to an I/O error
*/
public PreDiffPlan prepareForDiffing() throws IOException {
@@ -122,9 +122,14 @@ static Match searchForMatch(
final int pivot = oldDataRangeStartA + (rangeLength / 2);
groupArray.seekToIntAligned(pivot);
final int groupArrayPivot = groupArray.readInt();
final int compareLength =
Math.min((int) oldData.length() - groupArrayPivot, (int) newData.length() - newStart);
if (BsUtil.memcmp(oldData, groupArrayPivot, newData, newStart, compareLength) < 0) {
if (BsUtil.lexicographicalCompare(
oldData,
groupArrayPivot,
(int) oldData.length() - groupArrayPivot,
newData,
newStart,
(int) newData.length() - newStart)
< 0) {
return searchForMatch(groupArray, oldData, newData, newStart, pivot, oldDataRangeStartB);
}
return searchForMatch(groupArray, oldData, newData, newStart, oldDataRangeStartA, pivot);
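
For reference, lexicographical comparison over two explicit-length byte ranges behaves roughly as below. This is a plain-array sketch; the real `BsUtil.lexicographicalCompare` operates on `RandomAccessObject` inputs and its exact byte-sign handling may differ:

```java
final class LexicographicalCompareSketch {
  // Compare oldData[oldStart, oldStart + oldLength) with newData[newStart, newStart + newLength):
  // the first differing byte decides; if one range is a prefix of the other, the shorter is smaller.
  static int compare(
      byte[] oldData, int oldStart, int oldLength,
      byte[] newData, int newStart, int newLength) {
    int sharedLength = Math.min(oldLength, newLength);
    for (int i = 0; i < sharedLength; i++) {
      int difference = (oldData[oldStart + i] & 0xff) - (newData[newStart + i] & 0xff);
      if (difference != 0) {
        return difference;
      }
    }
    return oldLength - newLength;
  }
}
```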
@@ -15,7 +15,6 @@
package com.google.archivepatcher.generator.bsdiff;

import com.google.archivepatcher.generator.DeltaGenerator;

import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
@@ -31,7 +30,8 @@ public class BsDiffDeltaGenerator implements DeltaGenerator {
private static final int MATCH_LENGTH_BYTES = 16;

@Override
public void generateDelta(File oldBlob, File newBlob, OutputStream deltaOut) throws IOException {
public void generateDelta(File oldBlob, File newBlob, OutputStream deltaOut)
throws IOException, InterruptedException {
BsDiffPatchWriter.generatePatch(oldBlob, newBlob, deltaOut, MATCH_LENGTH_BYTES);
}
}
@@ -77,7 +77,7 @@ class BsDiffMatcher implements Matcher {
}

@Override
public Matcher.NextMatch next() throws IOException {
public Matcher.NextMatch next() throws IOException, InterruptedException {
RandomAccessObject oldData = mOldData;
RandomAccessObject newData = mNewData;

@@ -97,6 +97,9 @@ public Matcher.NextMatch next() throws IOException {
int matchesCacheSize = 0;

while (mNewPos < newData.length()) {
if (Thread.interrupted()) {
throw new InterruptedException();
}
BsDiff.Match match =
BsDiff.searchForMatch(mGroupArray, oldData, newData, mNewPos, 0, (int) oldData.length());
mOldPos = match.start;
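
The check added above is the standard cooperative-cancellation pattern for long-running, non-blocking loops (point 3 of the commit message); a generic sketch:

```java
final class InterruptibleLoopSketch {
  void run() throws InterruptedException {
    while (hasMoreWork()) {
      // Poll the interrupt flag each iteration; Thread.interrupted() clears it, and the
      // InterruptedException lets callers abort a long-running diff cleanly.
      if (Thread.interrupted()) {
        throw new InterruptedException();
      }
      doOneUnitOfWork();
    }
  }

  private boolean hasMoreWork() { return false; } // stub for the sketch
  private void doOneUnitOfWork() {}               // stub for the sketch
}
```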