Document the fastpatch format in a patch file

Fast patch is the patch generated by one pass makepatch / takepatch It’s just moving a subset of the weave in a way that uses the code flow in makepatch and takepatch.

Changes to the patch format

Bump the version number from 1.3 to 1.4

Have only one diffs section per file, after all the delta headers.

Keep the general layout in ver 1.3

== file
delta header +lines -lines
\n
diffs
\n
tag header (to show that tags have empty patches)
\n
\n
delta header +lines -lines
\n
diffs
\n
== anotherfile or sfio or patch checksum
...

But slightly different:

== file
delta header +lines -lines =lines	<- HERE (add =lines)
\n					<- HERE (no diffs)
\n
tag header (to show that tags have empty patches)
\n
\n
delta header +lines -lines =lines	<- HERE (add =lines)
\n
fastpatch diffs				<- HERE (fastpatch)
\n
== anotherfile or sfio or patch checksum
...

New format — only one diffs and it occurs after the last header in a given file section, even if that last header is a tag header.

That is, there is no longer a relationship between a delta header and the diff section. The fastpatch will always be in the last diff slot, which could be near a tag header.

Note: the last diffs can be blank if no data is being sent (such as tags or an exclude or empty merge node).

Also: delta header now includes same field: delta header +added -deleted =same

This is because delta_body used to compute same, which needs a serialmap for that delta. The fastpatch is weaving N deltas, and would need to handle N maps. It could, but that hits a scaling problem: ser_t is 32bit, and linux has 250k serials, so that’s one meg per array. A dump of 5k csets would need 5G for serialmap arrays.

Instead, the same is shipped over from the other repo, then verified locally during the check stage — after the file has been written with all the ported over =same entries. That way, we keep integrity but also get the efficiency of moving the precomputed value in the patch.

New interleaved diff format

If there is a diffs in the last block, it will start with "F\n" and end with "K<chksum> <line-count>\n"

Then a mix of: [DEIN]serial line >data

Example:

F I1 23 >here’s a line >here’s another line D2 25 >and even more E2 25 E1 25 K2235 127

The serial number in the diffs is from the order of the delta and tag headers in the patch. The first delta listed is serial 1. The serials don’t appear anywhere in the header.

The line numbers are data as shown by:

for files:	bk annotate -R..<tiprev> file
for cset:	bk annotate -R`bk repogca -a`..<tiprev> ChangeSet
same as:	bk annotate -R<list of sent revs> ChangeSet

Note: repogca -a does not work in a bugfix repo, only in a 5.0 based one. Likewise, the range code handling multiple gca nodes is only a 5.0 thing.

Where means the tip of the set of deltas being sent. The fastpatch doesn’t require there to be a single tip, but that’s what happens in reality, as well, as to simulate with annotate -R. Our range code doesn’t have a way to specify multiple tips.

Since the patch most likely is shipping the tip in the patch, then the line numbers most of the time would be from all data in the file. This is different than a get, as annotate -R shows deleted lines and get doesn’t.

The checksum is all those lines, and the final line count is those lines as well.

The changeset file doesn’t do the transitive closure to root graph coloring. It just uses the lines sent. The reason for this is the patch would stop working when chaining (or the half version of it: csetshrink) comes online.

Weaving the diff

The SCCS weave has two relationships for placing of commands and blocks of data: before or after an existing text line. That’s straightforward enough. However with this weave, we’re adding in older lines, and having it looked like they were woven oldest to newest. This adds two more functions to movement. When moving before, stop if we get to the desired data token, but also stop if we get to a D command with a higher serial in an active region. This is because the D commands stack up before a data token in serial order:

^AD 12 ^AD 14 ^AD 17 some line that many people deleted.

If we are weaving a D 15, then we stop when we hit D 17.

The other additional move is after. Not only do we move after the desired token, but we see if there are any I or E commands in the weave body that have a larger serial, and skip those. If it is an I command, then we skip over the data block until we find the corresponding E. Then we resume the check. Any D will stop the movement, as D commands associate before the next token. Also any smaller I or E will stop the search.

put new stuff from serial 12 after this token ^AI 17 ^AD 18 skip right over that D because it is an I-E block being ignored ^AE 18 and the rest of the block ^AE 17 stop before here because this is the next line

Read ahead one line

The way the weave walk is setup is to keep a line buffer w→buf, always set to the next line. Before the process is started, a line is read into the buffer. Now the code will look at that line, and if it wants to move past it, will print it out to the new sfile, and then read the next line into the w→buf. If the weave walker decides it wants to stop before the line in the buf, it exits and goes on its merry way, leaving w→buf to be read next time a weave walk is done.

Perf on the takepatch side

Between extractPatch() and sccs_fastWeave() is a pipeline that has been compressed to doing one sccs_init while the file is in the repo, and (if there are any changes to the sfile) one write of the sfile to the RESYNC directory.

Note: that sccs_init is now done with file checksum (no INIT_NOCKSUM) which is different and shouldn’t matter as the whole sfile will be read anyway (unless there is no change to the sfile) so it just means another pass through the file cache.

That sounds efficient (and it is), but the old code has been mostly left in place, (s→iloc and friends) so there are still many seemingly redundant layers between that one read and one write. Fixing it won’t help perf much and will add risk, so saving that for a later day. Also the cset_insert was left N^2 and also leaves undo bubbles in the serials. A merge sort has been prototyped which both is not N^2 and takes out the bubbles, but it didn’t add much perf in the mysql repo, and has no perf for nfs, so it’s left for later.

Note: this one read / one write works also works for old version changeset files, so even for old stuff, there’s a little perf boost of one init and one write.

Another perf feature saved for later is to change takepatch to process patches directly from from the stream. Don’t save the patch to PENDING, and don’t pre-scan the patch to validate the checksum. Just process the incoming patch directly overlapping the file IO with the network IO. Then if the checksum is found to be wrong, delete the RESYNC directory and return an error.