BitKeeper Nested Programmers Manual

Overview

This is an overview of the nested data structures and its usage on various BitKeeper operations. It is intended for the programmer and as such it delves into details about the usage of the various fields and the algorithms implemented with them. This description is intended to be more of a guide than an accurate description of how it all works. Remember, the ultimate documentation is the code itself.

Data Structures

We’ll start with a brief overview of nested, and the data structures that support it.

A nested collection is a group of repositories, with the 'product' serving as a container for zero or more 'components'. This is reflected in the code in two different data structures in nested.h. The nested structure, which represents the product’s knowledge of the components, and the comp structure which represents knowledge about each individual component.

In addition to the above, BitKeeper has the concept of 'aliases' or named sets of components. These don’t have a data structure associated with them, they behave more like 'tags'. The idea is that there is an API for tagging the individual comp structures with information about whether they are selected by the given alias (or set of aliases) or not.

Each product has a HERE file, which specifies the list of aliases that are supposed to be here. BitKeeper tries very hard to maintain the synchronicity between what is actually here and the contents of the HERE files. You can see the HERE file by running the bk here command or by just running cat BitKeeper/log/HERE.

We’ll take a look first at the nested data structure and give a brief definition of each of the fields and how they are used.

Product

A BitKeeper product is a repository marked with an empty file BitKeeper/log/PRODUCT. The command bk product tells you whether the repository you’re at is a product or not.

Most operations in a nested collection start by getting a nested structure through a call to nested_init(). This function has many modes of operation, some of which are described at the end of this section. But first, let’s take a look at the nested structure, and a brief description of what the fields are for:

{{{#!cplusplus numbers=off struct nested { char here; // the contents of the here file char comps; // addlines of pointers to components char *oldtip; // tip before revs (new tip for undo) char *tip; // newest cset in revs sccs *cset; // cache of cset file project *proj; hash *aliasdb; // lazy init’d aliasdb hash *compdb; // lazy init rk lookup of &n→comp[i] comp *product; // pointer into comps to the product // bits u32 alias:1; // components tagged with c→alias u32 pending:1; // include pending component state u32 freecset:1; // do a sccs_free(cset) in nested_free() u32 fix_idDB:1; // don’t complain about idDB problems }; }}}

here

This is just a lines array of the contents of the HERE file. As a side effect of loading the HERE file, duplicates in it are removed, the PRODUCT special alias is inserted if it was missing, and the capitalization of the PRODUCT alias is fixed if it was wrong (it has to be all caps).

comps

This is a lines array of pointers to each of the components that comprise the nested collection. It is ''all'' of the components, not just the components that are present in this repository. The information comes straight from the product’s ChangeSet file. The product’s own comp structure is included either first or last, depending on whether NESTED_PRODUCTFIRST was passed to nested_init(), but it can always be accessed using the product field as described below.

oldtip

Only set if nested_init() is called with the revs argument. Without NESTED_PULL, this becomes the latest product cset key not in revs. With NESTED_PULL oldtip is the tip of the 'local' changes. Mostly for undo, but used also by push to assert that we are an update-only push. Note in undo, the oldtip is really going to be the new tip. See the section on nested_init().

tip

The revision at the tip of the coloring by revs, or just rev if that is how nested_init() was called. See the section on nested_init()

cset

This is just the sccs * to the product’s ChangeSet file. It is there in order to pass it around to other APIs as to avoid multiple sccs_init()'s. In pull this will be the RESYNC ChangeSet file after takepatch has added in the new remote csets.

proj

A pointer to the product’s proj structure.

aliasdb

The BitKeeper/etc/aliases file. Lazily initialized when needed. E.g. by the alias code.

compdb

Used internally by nested_findKey() to map component rootkey’s to the matching comp structure.

product

Since for many operations we can treat the product as just another component, we keep a comp structure for the product here for easy access. The same struct will be included in the n->comps array.

alias

When we need to see which components are included in an alias, we call the function aliasdb_tag() which will mark each component included in the alias with cp->alias, it will also tag the nested structure with alias to indicate that some components are tagged.

freecset

When we initialize a nested structure by calling nested_init(), we can optionally pass an sccs * for the product’s ChangeSet file. If we do not, nested_init() will sccs_init() the product’s ChangeSet file and keep a pointer to it in the cset field of the nested structure. If nested_init() did the sccs_init() it must free it (sccs_free()) when the nested structure is freed.

fix_idDB

Used internally in nested_init(). If nested_init() was told that it’s okay to fix the idDB, this fact is remembered here so that we can fix it later.

The main mechanism through which we obtain a nested structure is by calling the nested_init() function. This function is described next.

{{{#!cplusplus numbers=off nested *nested_init(sccs *cset, char *rev, char **revs, u32 flags); }}}

cset

Optional sccs * of the product’s ChangeSet file. If omitted (NULL) nested_init() will sccs_init() the product’s ChangeSet file and cache the sccs *. If provided, it will not be freed by nested_free() and ''must'' be freed by the caller.

rev

If present, nested_init() will return a nested structure as it existed at rev time. This parameter is mutually exclusive with revs, which is described below. It will be passed to sccs_findrev() so it can take any value sccs_findrev() can handle (e.g. "+" for the tip revision, etc).

revs

If present, a list of revisions (in a lines array) which will be colored and tagged (with the included field in the comp structure, see below). Mutually exclusive with rev. This field is mainly used by bk pull and bk undo which operate on regions of the graph. The region of the graph must have a single tip and when the region and it’s descendents are removed from the graph a single tip must remain. The nested_init() will fail if these properties are not true.

flags

Many tweaks to how nested_init() operates, these are described below.

{{{#!cplusplus numbers=off #define NESTED_PENDING 0x10000000 /* included pending comps / #define NESTED_PULL 0x20000000 / This is a pull / #define NESTED_PRODUCTFIRST 0x40000000 / c→p first in c→comps / #define NESTED_MARKPENDING 0x01000000 / set c→pending / #define NESTED_FIXIDCACHE 0x02000000 / no error for bad idDB */ }}}

NESTED_PENDING

When initializing a nested structure, include any new 'pending' components that have not yet been added to the product’s ChangeSet file. Also for existing components that have pending csets not included in the product, set c->pending. Mutually exclusive with either rev or revs since we are not looking at the nested collection as of any revision.

NESTED_PULL

Used by bk pull when processing the RESYNC/ChangeSet file to indicate that we have three regions that we care about, the 'common region', the 'remote region', and the 'local region'. In this case, oldtip will be set as the tip of the local side, tip as the tip of the remote side, and each of the components will be tagged according to which region they belong to. For instance, components that appear in the GCA region, will be tagged as 'not new', components that appear in the REMOTE region will be tagged as 'included' and components that appear in the LOCAL region will be tagged as 'with local changes'. These three tags are c->new, c->included, and c->localchanges correspondingly in the comp structure. This also changes the meaning of c->lowerkey in the comp structure to be the tipkey of the local repository for this component, while c->deltakey is the tip of the remote component.

NESTED_PRODUCTFIRST

Put the product at the beginning of the comps list. The default is to put it last.

NESTED_MARKPENDING

Whether to tag components that have pending changes with c->pending or not. The NESTED_PENDING mode above does this by default. The nested_populate() function requires that pending components be marked.

NESTED_FIXIDCACHE

Whether it is okay to fix the idDB cache for moved components or we should error out.

Caching of the nested struct

Whenever nested_init() is called with both the rev and revs parameters set to zero (meaning the tip of the product’s ChangeSet file), the information will be loaded from and written to a cache kept in BitKeeper/log/nested_init.cache.

Component

Information about components is kept in the comp structure. A list of the components is kept in the nested structure’s comps field.

{{{#!cplusplus numbers=off typedef struct { nested *n; // backpointer char *rootkey; // rootkey of the repo char *deltakey; // deltakey of repo as of rev char *lowerkey; // in pull, local tip // otherwise, gca tip char *path; // actual path: like GFILE, not DPN

void	*data;			// scratchpad for applications
	// bits
	u32	alias:1;		// in the latest alias
	u32	included:1;		// component modified in 'revs'
	u32	localchanges:1;		// component modified outside 'revs'
	u32	new:1;			// if set, the undo will remove this
	u32	present:1;		// if set, the repo is actually here
	u32	product:1;		// this is the product
	u32	remotePresent:1;	// scratch for remote present bit
	u32	pending:1;		// has pending csets not in product
} comp;
}}}
n

Back-pointer to the nested structure we belong t.

rootkey

Rootkey of this component.

deltakey

Tip key of this component, if revs were passed to nested_init() this is the tip of the region implied by revs. In the case of bk pull, it is the tip of this component in the remote side.

lowerkey

When pulling, it is the tip of the local region.