mirror of
https://github.com/kovidgoyal/kitty
synced 2026-06-06 09:15:57 +02:00
Document the rsync wire formats
This commit is contained in:
@@ -342,7 +342,7 @@ Repeated transfer of large files that have only changed a little between
|
||||
the receiving and sending side can be sped up significantly by transmitting
|
||||
binary deltas of only the changed portions. This protocol has built-in support
|
||||
for doing that. This support uses the `rsync algorithm
|
||||
<https://github.com/librsync/librsync>`__. In this algorithm first the
|
||||
<https://rsync.samba.org/tech_report/tech_report.html>`__. In this algorithm, first the
|
||||
receiving side sends a file signature that contains hashes of blocks
|
||||
in the file. Then the sending side sends only those blocks that have changed.
|
||||
The receiving side applies these deltas to the file to update it till it matches
|
||||
@@ -409,8 +409,80 @@ The client then uses this delta to update the file.
|
||||
The format of signatures and deltas
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
These come from `librsync <https://github.com/librsync/librsync>`__. If this
|
||||
specification gains wider adoption, these formats should be documented here.
|
||||
In what follows, all integers must be encoded in little-endian format,
|
||||
regardless of the architecture of the machines involved. The XXH3 hash family
|
||||
refers to `the xxHash algorithm
|
||||
<https://github.com/Cyan4973/xxHash/blob/dev/doc/xxhash_spec.md>`__.
|
||||
|
||||
A signature first has a 12 byte header of the form:
|
||||
|
||||
.. code::
|
||||
|
||||
uint16 version
|
||||
uint16 checksum_type
|
||||
uint16 strong_hash_type
|
||||
uint16 weak_hash_type
|
||||
uint32 block_size
|
||||
|
||||
These fields define the parameters to the rsync algorithm. Allowed values are
|
||||
currently all zero except for ``block_size``, which is usually the square root
|
||||
of the file size, but implementations are free to use any algorithm they like
|
||||
to arrive at the block size.
|
||||
|
||||
``checksum_type`` must be ``0`` which indicates using the XXH3-128 bit hash
|
||||
to verify file integrity after transmission.
|
||||
|
||||
``strong_hash_type`` must be ``0`` which indicates using the XXH3-64 bit hash
|
||||
to identify blocks.
|
||||
|
||||
``weak_hash_type`` must be ``0`` which indicates using the `rsync rolling
|
||||
checksum hash <https://rsync.samba.org/tech_report/node3.html>`__ to identify
|
||||
blocks, weakly.
|
||||
|
||||
After the header comes the list of block signatures. The number of blocks is
|
||||
unknown allowing for streaming, the transfer protocol takes care of indicating
|
||||
end-of-stream via an ``action=end_data`` packet. Each signature in the list is of the form:
|
||||
|
||||
.. code::
|
||||
|
||||
uint64 index
|
||||
uint32 weak_hash
|
||||
uint64 strong_hash
|
||||
|
||||
Here, ``index`` is the zero-based block number. ``weak_hash`` is the weak, but easy
|
||||
to calculate hash of the block and strong hash is a stronger hash of the block
|
||||
that is very unlikely to collide.
|
||||
|
||||
The algorithms used for these hashes are specified by the signature header
|
||||
above. Given the ``block_size`` from the header and ``index`` the position of a
|
||||
block in the file is: ``index * block_size``.
|
||||
|
||||
Once the sending side receives the signature, it calculates a *delta* based on
|
||||
the actual file contents and transmits that delta to the receiving side. The delta
|
||||
is of the form of a list of *operations*. An operation is a single byte
|
||||
denoting the operation type followed by variable length data depending on the
|
||||
type. The types of operations are:
|
||||
|
||||
``Block (type=0)``
|
||||
Followed by an 8 byte ``uint64`` that is the block index. It means copy the
|
||||
specified block from the existing file to the output, unmodified.
|
||||
|
||||
``Data (type=1)``
|
||||
Followed by a 4 byte ``uint32`` that is the size of the payload and then the
|
||||
payload itself. The payload must be written to the output.
|
||||
|
||||
``Hash (type=2)``
|
||||
Followed by a 2 byte ``uint16`` specifying the size of the hash checksum and
|
||||
then the checksum itself. The checksum of the output file must match this
|
||||
checksum. The algorithm used to calculate the checksum is specified in the
|
||||
signature header.
|
||||
|
||||
``BlockRange (type=3)``
|
||||
Followed by an 8 byte ``uint64`` that is the starting block index and then
|
||||
a 4 byte ``uint32`` (``N``) that is the number of additional blocks. Works just
|
||||
like ``Block`` above, except that after copying the block an additional (``N``) more
|
||||
blocks must be copied.
|
||||
|
||||
|
||||
Compression
|
||||
--------------
|
||||
|
||||
Reference in New Issue
Block a user