Movatterモバイル変換


[0]ホーム

URL:


Fossil

Fossil Delta Format
Login

1.0 Overview

Fossil achieves efficient storage and low-bandwidth synchronizationthrough the use of delta-compression. Instead of storingor transmitting the complete content of an artifact, Fossil stores ortransmits only the changes relative to a related artifact.

This document describes the delta-encoding format used by Fossil.The intended audience is developers working on eitherFossil itself, or on tools compatible withFossil. Understanding of this document isnot required forordinary users of Fossil. This document is an implementation detail.

This document only describes the delta file format. Aseparate document describes one possiblealgorithm for generating deltas in this format.

1.1 Sample Software And Analysis Tools

The core routines used to generate and apply deltas in this formatare contained in thedelta.c source file. Interfacelogic, including "test-*" commands to directly manipulate deltas arecontained in thedeltacmd.c source file. SQL functionsto create, apply, and analyze deltas are implemented by code in thedeltafunc.c source file.

The following command-line tools are available to create and applydeltas and to test the delta logic:

When running thefossil sql command to get aninteractive SQL session connected to the repository, the followingadditional SQL functions are provided:

2.0 Structure

HeaderSegmentsTrailer
    leftmargin = 0.1    box height 50% "Header"    box same "Segments"    box same "Trailer"

A delta consists of three parts, a "header", a "trailer", and a"segment-list" between them.

Both header and trailer provide information about the targethelping the decoder, and the segment-list describes how the target canbe constructed from the original.

2.1 Header

Size"\n"
    leftmargin = 0.1    box height 50% "Size"    box same "\"\\n\""

The header consists of a single number followed by a newlinecharacter (ASCII 0x0a). The number is the length of the target inbytes.

This means that, given a delta, the decoder can compute the size ofthe target (and allocate any necessary memory based on that) by simplyreading the first line of the delta and decoding the number foundthere. In other words, before it has to decode everything else.

2.2 Trailer

Checksum";"
    leftmargin = 0.1    box height 50% "Checksum"    box same "\";\""

The trailer consists of a single number followed by a semicolon (ASCII0x3b). This number is a checksum of the target and can be used by adecoder to verify that the delta applied correctly, reconstructing thetarget from the original.

The checksum is computed by treating the target as a series of32-bit integer numbers (MSB first), and summing these up, modulo2^32-1. A target whose length is not a multiple of 4 is padded with0-bytes (ASCII 0x00) at the end.

By putting this information at the end of the delta a decoder hasit available immediately after the target has been reconstructedfully.

2.3 Segment-List

***Insert LiteralCopy RangeLength":"Bytes
    leftmargin = 0.1    PART1: [        B1: box height 50% width 15% ""        B2: box same ""        B3: box same ""            "***"            box height 50% width 15% ""        I1: line down 50% from B2 .s            arrow right until even with B3.e            box "Insert Literal" height 50%            line down 75% from I1 .s            arrow right until even with B3.e            box "Copy Range" height 50%    ]    down    PART2: [        ""        box "Length" height 50%        right        box "\":\"" same        box "Bytes" same    ] with .nw at previous.sw

The segment-list of a delta describes how to create the target fromthe original by a combination of inserting literal byte-sequences andcopying ranges of bytes from the original. This is where thecompression takes place, by encoding the large common parts oforiginal and target in small copy instructions.

The target is constructed from beginning to end, with the datagenerated by each instruction appended after the data of all previousinstructions, with no gaps.

2.3.1 Insert Literal

A literal is specified by two elements, the size of the literal inbytes, and the bytes of the literal itself.

Length":"Bytes
    leftmargin = 0.1    box "Length" height 50%    box "\":\"" same    box "Bytes" same

The length is written first, followed by a colon character (ASCII0x3a), followed by the bytes of the literal.

2.3.2 Copy Range

A range to copy is specified by two numbers, the offset of thefirst byte in the original to copy, and the size of the range, inbytes. The size zero is special, its usage indicates that the rangeextends to the end of the original.

Length"@"Offset","
    leftmargin = 0.1    box "Length" height 50%    box "\"@\"" same    box "Offset" same    box "\",\"" same

The length is written first, followed by an "at" character (ASCII0x40), then the offset, followed by a comma (ASCII 0x2c).

3.0 Encoding of integers

The format currently handles only 32 bit integer numbers. They arewritten base-64 encoded, MSB first, and without leading"0"-characters, except if they are significant (i.e. 0 => "0").

The base-64 encoding uses one character for each 6 bits ofthe integer to be encoded. The encoding characters are:

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~

The least significant 6 bits of the integer are encoded bythe first character, followed bythe next 6 bits, and so on until all non-zero bits of the integerare encoded. The minimum number of encoding characters is used.Note that for integers less than 10, the base-64 coding is aASCII decimal rendering of the number itself.

4.0 Examples

4.1 Integer encoding

ValueEncoding
00
62461Xb
-11014387702zMM3E

4.2 Delta encoding

An example of a delta using the specified encoding is:

1Xb4E@0,2:thFN@4C,6:scenda1B@Jd,6:scenda5x@Kt,6:pieces79@Qt,F: Example: eskil~E@Y0,2zMM3E;

This can be taken apart into the following parts:

WhatEncodingMeaningDetails
Header1XbSize 6246
S-List4E@0,Copy 270 @ 0
 2:thLiteral 2 'th'
 FN@4C,Copy 983 @ 268
 6:scendaLiteral 6 'scenda'
 1B@Jd,Copy 75 @ 1256
 6:scendaLiteral 6 'scenda'
 5x@Kt,Copy 380 @ 1336
 6:piecesLiteral 6 'pieces'
 79@Qt,Copy 457 @ 1720
 F: Example: eskilLiteral 15 ' Example: eskil'
 ~E@Y0,Copy 4046 @ 2176
Trailer2zMM3EChecksum -1101438770

The unified diff behind the above delta is

bluepeak:(761) ~/Projects/Tcl/Fossil/Devel/devel > diff -u ../DELTA/old ../DELTA/new--- ../DELTA/old        2007-08-23 21:14:40.000000000 -0700+++ ../DELTA/new        2007-08-23 21:14:33.000000000 -0700@@ -5,7 +5,7 @@  *  If the server does not have write permission on the database     file, or on the directory containing the database file (and-    it is thus unable to update database because it cannot create+    it is thus unable to update the database because it cannot create     a rollback journal) then it currently fails silently on a push.     It needs to return a helpful error.@@ -27,8 +27,8 @@  *  Additional information displayed for the "vinfo" page:      +  All leaves of this version that are not included in the-        descendant list.  With date, user, comment, and hyperlink.-        Leaves in the descendant table should be marked as such.+        descendant list.  With date, user, comment, and hyperlink.+        Leaves in the descendant table should be marked as such.         See the compute_leaves() function to see how to find all         leaves.      +  Add file diff links to the file change list.@@ -37,7 +37,7 @@  *  The /xfer handler (for push, pull, and clone) does not do     delta compression.  This results in excess bandwidth usage.-    There are some code in xfer.c that are sketches of ideas on+    There are some pieces in xfer.c that are sketches of ideas on     how to do delta compression, but nothing has been implemented.  *  Enhancements to the diff and tkdiff commands in the cli.@@ -45,7 +45,7 @@     single file.  Allow diffs against any two arbitrary versions,     not just diffs against the current check-out.  Allow     configuration options to replace tkdiff with some other-    visual differ of the users choice.+    visual differ of the users choice. Example: eskil.  *  Ticketing interface (expand this bullet)

Notes


[8]ページ先頭

©2009-2025 Movatter.jp