Thursday, 6 September 2012

Quadrature ICS

I have been following the (heated) debate around Git's usability that has been going on recently.

So I woke up this morning and thought, y'know, what the world really needs is another Version Control System.

:-)

Still just an idea, I present to you: Quadrature Integration Control System (QICS) - the anti-Git.

The basic ideas:


(1) Ruthless minimization of required user interaction.

This is both to reduce workload, and to allow non-technical users to work alongside developers. (This is a big deal for me:- I need to get work from my collaborators across the business into version control, so my automated scripts can pick up their work & data for integration into production systems)

You should only need to interact with the system to resolve a conflict), otherwise the system can operate quite happily in the background, listening for changes to a directory tree, making commits to the ICS as and when files change. (Unison, Wave)

The minimization of interaction should extend to installation and configuration. A cross platform stack with no runtime is desirable (Golang seems ideal for this), as well as a peer-to-peer network architecture, with optional central server.

Sophisticated users should have as much control as they want, up to and including provision of APIs for all aspects of behavior. This implies a post-hoc approach to logging, where the user (can) explain what the changes were after they happened, and decorate the log with those explanations.


(2) The target users are a very large, globally distributed, multi-disciplinary development team in a commercial organization. Think Google-scale development, but with a mixture of programmers and (intelligent non-programmer) domain experts. We will assume intermittent or poor network access.


(3) A convention-over-configuration, "opinionated software" approach, with a set of prescriptive and detailed conventions for repository organization, workflow, and branch management. By creating conventions, we can aid and support development and workflow automation efforts to further reduce burden and increase efficiency.


The focus will be on supporting continuous integration and resolving conflicts, rather than supporting branches and the management of concurrent, independent versions. The assumed model is one in which, by default, all work is integrated together in trunk as quickly as network connectivity permits.

The QICS way of thinking about the problem:


A stream of lots of little deltas flowing around a distributed multicast/pub-sub network - similar problem to distributed databases (CAP theorem applies) - take similar approach to some NoSQL DBs to solve the problem - offer only eventual (or asymptotic) consistency. In fancy language: "Present the user a locally coherent view of an asymptotically coherent whole."


Concepts:

Upstream = deltas coming from other collaborators to the local/client working copy.
Downstream = deltas going from the local/client working copy to other collaborators.
Local = the client machine.
Remote = the rest of the network of collaborators.
Local Working Copy = The set of files and directories that the user interacts with.
Delta buffer = storage area containing a set of deltas (or differences), equivalent to the notion of a repository.

To support operation in situations when network connection fails, we will need local delta-buffers, perhaps one local-downstream store, and one local-upstream store, as well as a mechanism to shuttle information from the local downstream store out to the remote network as well as in from from the remote network to the local upstream store.

The local client/user probably does not want the whole repository on his local machine, so he will probably need some mechanism to specify a region-of-interest in the repository (a set of directories) so he can pull in only pertinent changes. (Hence a Pub/Sub-like communications mechanism)

Locally, we will need a mechanism to monitor file changes, calculate the deltas and store them in the downstream store, as well as a mechanism to take deltas from the upstream store, resolve conflicts and apply them to the local working copy.

It would be good if we could detect moved/renamed files without explicit user interaction, so per-file checksums are probably a good idea.

Heuristic mechanisms are good enough.