BitKeeper Nested Programmers Manual
Overview
This is an overview of the nested data structures and its usage on various BitKeeper operations. It is intended for the programmer and as such it delves into details about the usage of the various fields and the algorithms implemented with them. This description is intended to be more of a guide than an accurate description of how it all works. Remember, the ultimate documentation is the code itself.
Data Structures
We’ll start with a brief overview of nested, and the data structures that support it.
A nested collection is a group of repositories, with the 'product'
serving as a container for zero or more 'components'. This is
reflected in the code in two different data structures in
nested.h. The nested structure, which represents the product’s
knowledge of the components, and the comp structure which represents
knowledge about each individual component.
In addition to the above, BitKeeper has the concept of 'aliases' or
named sets of components. These don’t have a data structure associated
with them, they behave more like 'tags'. The idea is that there is
an API for tagging the individual comp structures with information
about whether they are selected by the given alias (or set of aliases)
or not.
Each product has a HERE file, which specifies the list of aliases
that are supposed to be here. BitKeeper tries very hard to maintain
the synchronicity between what is actually here and the contents of
the HERE files. You can see the HERE file by running the bk here
command or by just running cat BitKeeper/log/HERE.
We’ll take a look first at the nested data structure and give a brief definition of each of the fields and how they are used.
Product
A BitKeeper product is a repository marked with an empty file
BitKeeper/log/PRODUCT. The command bk product tells you whether
the repository you’re at is a product or not.
Most operations in a nested collection start by getting a nested
structure through a call to nested_init(). This function has many
modes of operation, some of which are described at the end of this
section. But first, let’s take a look at the nested structure, and a
brief description of what the fields are for:
{{{#!cplusplus numbers=off struct nested { char here; // the contents of the here file char comps; // addlines of pointers to components char *oldtip; // tip before revs (new tip for undo) char *tip; // newest cset in revs sccs *cset; // cache of cset file project *proj; hash *aliasdb; // lazy init’d aliasdb hash *compdb; // lazy init rk lookup of &n→comp[i] comp *product; // pointer into comps to the product // bits u32 alias:1; // components tagged with c→alias u32 pending:1; // include pending component state u32 freecset:1; // do a sccs_free(cset) in nested_free() u32 fix_idDB:1; // don’t complain about idDB problems }; }}}
- here
-
This is just a
linesarray of the contents of the HERE file. As a side effect of loading theHEREfile, duplicates in it are removed, thePRODUCTspecial alias is inserted if it was missing, and the capitalization of thePRODUCTalias is fixed if it was wrong (it has to be all caps). - comps
-
This is a
linesarray of pointers to each of the components that comprise the nested collection. It is ''all'' of the components, not just the components that are present in this repository. The information comes straight from the product’s ChangeSet file. The product’s owncompstructure is included either first or last, depending on whetherNESTED_PRODUCTFIRSTwas passed tonested_init(), but it can always be accessed using theproductfield as described below. - oldtip
-
Only set if
nested_init()is called with therevsargument. WithoutNESTED_PULL, this becomes the latest product cset key not inrevs. WithNESTED_PULLoldtip is the tip of the 'local' changes. Mostly forundo, but used also bypushto assert that we are an update-only push. Note in undo, the oldtip is really going to be the new tip. See the section onnested_init(). - tip
-
The revision at the tip of the coloring by
revs, or justrevif that is hownested_init()was called. See the section onnested_init() - cset
-
This is just the
sccs *to the product’s ChangeSet file. It is there in order to pass it around to other APIs as to avoid multiplesccs_init()'s. Inpullthis will be the RESYNC ChangeSet file aftertakepatchhas added in the new remote csets. - proj
-
A pointer to the product’s
projstructure. - aliasdb
-
The
BitKeeper/etc/aliasesfile. Lazily initialized when needed. E.g. by the alias code. - compdb
-
Used internally by
nested_findKey()to map component rootkey’s to the matchingcompstructure. - product
-
Since for many operations we can treat the product as just another component, we keep a
compstructure for the product here for easy access. The same struct will be included in then->compsarray. - alias
-
When we need to see which components are included in an alias, we call the function
aliasdb_tag()which will mark each component included in the alias withcp->alias, it will also tag thenestedstructure withaliasto indicate that some components are tagged. - freecset
-
When we initialize a
nestedstructure by callingnested_init(), we can optionally pass ansccs *for the product’s ChangeSet file. If we do not,nested_init()willsccs_init()the product’s ChangeSet file and keep a pointer to it in thecsetfield of thenestedstructure. Ifnested_init()did thesccs_init()it must free it (sccs_free()) when thenestedstructure is freed. - fix_idDB
-
Used internally in
nested_init(). Ifnested_init()was told that it’s okay to fix theidDB, this fact is remembered here so that we can fix it later.
The main mechanism through which we obtain a nested structure is by
calling the nested_init() function. This function is described next.
{{{#!cplusplus numbers=off nested *nested_init(sccs *cset, char *rev, char **revs, u32 flags); }}}
- cset
-
Optional
sccs *of the product’s ChangeSet file. If omitted (NULL)nested_init()willsccs_init()the product’s ChangeSet file and cache thesccs *. If provided, it will not be freed bynested_free()and ''must'' be freed by the caller. - rev
-
If present,
nested_init()will return anestedstructure as it existed atrevtime. This parameter is mutually exclusive withrevs, which is described below. It will be passed tosccs_findrev()so it can take any valuesccs_findrev()can handle (e.g. "+" for the tip revision, etc). - revs
-
If present, a list of revisions (in a
linesarray) which will be colored and tagged (with theincludedfield in thecompstructure, see below). Mutually exclusive withrev. This field is mainly used bybk pullandbk undowhich operate on regions of the graph. The region of the graph must have a single tip and when the region and it’s descendents are removed from the graph a single tip must remain. Thenested_init()will fail if these properties are not true. - flags
-
Many tweaks to how
nested_init()operates, these are described below.
{{{#!cplusplus numbers=off #define NESTED_PENDING 0x10000000 /* included pending comps / #define NESTED_PULL 0x20000000 / This is a pull / #define NESTED_PRODUCTFIRST 0x40000000 / c→p first in c→comps / #define NESTED_MARKPENDING 0x01000000 / set c→pending / #define NESTED_FIXIDCACHE 0x02000000 / no error for bad idDB */ }}}
- NESTED_PENDING
-
When initializing a
nestedstructure, include any new 'pending' components that have not yet been added to the product’s ChangeSet file. Also for existing components that have pending csets not included in the product, setc->pending. Mutually exclusive with eitherrevorrevssince we are not looking at the nested collection as of any revision. - NESTED_PULL
-
Used by
bk pullwhen processing theRESYNC/ChangeSetfile to indicate that we have three regions that we care about, the 'common region', the 'remote region', and the 'local region'. In this case,oldtipwill be set as the tip of the local side,tipas the tip of the remote side, and each of the components will be tagged according to which region they belong to. For instance, components that appear in the GCA region, will be tagged as 'not new', components that appear in the REMOTE region will be tagged as 'included' and components that appear in the LOCAL region will be tagged as 'with local changes'. These three tags arec->new,c->included, andc->localchangescorrespondingly in thecompstructure. This also changes the meaning ofc->lowerkeyin thecompstructure to be the tipkey of the local repository for this component, whilec->deltakeyis the tip of the remote component. - NESTED_PRODUCTFIRST
-
Put the product at the beginning of the
compslist. The default is to put it last. - NESTED_MARKPENDING
-
Whether to tag components that have pending changes with
c->pendingor not. TheNESTED_PENDINGmode above does this by default. Thenested_populate()function requires that pending components be marked. - NESTED_FIXIDCACHE
-
Whether it is okay to fix the idDB cache for moved components or we should error out.
Caching of the nested struct
Whenever nested_init() is called with both the rev and revs
parameters set to zero (meaning the tip of the product’s ChangeSet
file), the information will be loaded from and written to a cache
kept in BitKeeper/log/nested_init.cache.
Component
Information about components is kept in the comp structure. A list
of the components is kept in the nested structure’s comps field.
{{{#!cplusplus numbers=off typedef struct { nested *n; // backpointer char *rootkey; // rootkey of the repo char *deltakey; // deltakey of repo as of rev char *lowerkey; // in pull, local tip // otherwise, gca tip char *path; // actual path: like GFILE, not DPN
void *data; // scratchpad for applications
// bits u32 alias:1; // in the latest alias u32 included:1; // component modified in 'revs' u32 localchanges:1; // component modified outside 'revs' u32 new:1; // if set, the undo will remove this u32 present:1; // if set, the repo is actually here u32 product:1; // this is the product u32 remotePresent:1; // scratch for remote present bit u32 pending:1; // has pending csets not in product } comp; }}}
- n
-
Back-pointer to the
nestedstructure we belong t. - rootkey
-
Rootkey of this component.
- deltakey
-
Tip key of this component, if
revswere passed tonested_init()this is the tip of the region implied byrevs. In the case ofbk pull, it is the tip of this component in the remote side. - lowerkey
-
When pulling, it is the tip of the local region.