Friday, 15 November 2013

Continuous Collaboration: Using Git to meld continuous integration with collaborative editing.

Written in response to:

I strongly agree with the fundamental point that the author is making.

However, there are nuances. A lot of this depends on the type of development that you are doing.

For example, most of my day-to-day work is done in very small increments: minor bug-fixes, incremental classifier performance improvements, parameter changes, and so on. Only rarely will I work on a feature that is so significant in its impact that the work-in-progress causes the branch to spend several days in a broken / non-working state. I also work in fairly small teams, so the rate of pushes to Gerrit is quite low: only around a dozen pushes per day. This means that integration is pretty easy, and that our CI server gives us value & helps with our quality gating. We can follow a single-branch development path with little to no pain, and because both our software and the division of labour in the team are fairly well organised, conflicts very seldom occur when merging (even when using suboptimal tools to perform the merges).

This state of affairs probably does not hold for all developers, but it holds for me, and for most of the people that I work with. As a result, we can happily work without feature branches (most of the time), and lean on the CI process to keep ourselves in sync & to measure the performance of our classifiers & other algorithms.

Now, don't get me wrong: I think that Git is great. I am the nominated Git expert in my team, and spend a lot of time helping other team members navigate the nuances of using Git with Gerrit. For most people, though, it is yet another tool to learn in an already over-complex development environment. Git gives us the flexibility to do what we need to in the environment that we have, but it is anything but effortless and transparent, which is what it really needs to be.

Software development is about developing software. Making systems that work. Not wrangling branches in Git.

My ideal tool would be the bastard son of Git and a real-time collaborative editor. My unit tests should be able to report when my local working copy is in a good state. Likewise, my unit tests should be able to report whether a merge or rebase has succeeded or failed. Why can I not then fully automate the process of integrating my work with that of my colleagues? Indeed, my work should be integrated & shared whenever the following two conditions are met: 1) my unit tests pass on my local working copy, and 2) my unit tests pass on the fully integrated copy. These are the same criteria that I would use when doing the process manually ... so why do it manually? Why not automate it?

Triggered by every save, the resulting process would create the appearance of an almost-real-time collaborative working environment, opening up the possibility for new forms of close collaboration and team-working that are simply not possible with current tools. A source file would be a shared document that updates almost in real time. (If it is only comments that are being edited, then there is no reason why the updating could not actually be in real time.)

This means that you could discuss a change with a colleague, IRC-style, in the comments of a source document, and make the change in the source file *at the same time*, keeping a record not only of the logic change, but also of the reasoning that led to it. (OK, this might cause too much noise, but with comment-folding, that might not matter too much.)
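To make the two conditions concrete, here is a rough sketch of what such a save-triggered integration step might look like. It is only an illustration under several assumptions, not a finished tool: the function names (`integrate_on_save`, `git_steps`, `run`) are invented for this example, the remote is assumed to be `origin`, the branch `master`, the tests are assumed to run via `python -m unittest discover`, and each save is assumed to have already been committed locally so that a rebase is possible. The push target uses Gerrit's `refs/for/<branch>` review convention.

```python
import subprocess
from typing import Callable, Sequence, Tuple


def run(cmd: Sequence[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(list(cmd)).returncode == 0


def integrate_on_save(
    tests_pass: Callable[[], bool],
    rebase: Callable[[], bool],
    abort_rebase: Callable[[], None],
    share: Callable[[], bool],
) -> str:
    """Share local work only if tests pass before and after integration.

    The four steps are passed in as callables (a hypothetical design
    choice) so that the decision logic can be exercised without Git.
    """
    # Condition 1: unit tests pass on the local working copy.
    if not tests_pass():
        return "local-tests-failed"
    # Integrate colleagues' work; back out cleanly if there is a conflict.
    if not rebase():
        abort_rebase()
        return "rebase-failed"
    # Condition 2: unit tests pass on the fully integrated copy.
    # On failure nothing is shared; the rebased copy is left for the
    # developer to fix.
    if not tests_pass():
        return "integrated-tests-failed"
    return "shared" if share() else "share-failed"


def git_steps(remote: str = "origin", branch: str = "master") -> Tuple[
    Callable[[], bool], Callable[[], bool], Callable[[], None], Callable[[], bool]
]:
    """Wire the steps to real Git commands (remote/branch are assumptions)."""

    def tests_pass() -> bool:
        # Assumed test command; substitute whatever the project uses.
        return run(["python", "-m", "unittest", "discover"])

    def rebase() -> bool:
        return run(["git", "fetch", remote]) and run(
            ["git", "rebase", f"{remote}/{branch}"]
        )

    def abort_rebase() -> None:
        run(["git", "rebase", "--abort"])

    def share() -> bool:
        # Gerrit review ref; a plain Git remote would push to the branch.
        return run(["git", "push", remote, f"HEAD:refs/for/{branch}"])

    return tests_pass, rebase, abort_rebase, share
```

An editor hook could then call `integrate_on_save(*git_steps())` after each save and surface the returned status string to the developer. Keeping the Git commands behind plain callables is just one way to slice it; the point is that the whole decision reduces to the same two test runs that one would otherwise perform by hand.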

Having said all of that, branches are still useful, as are commit messages, so we would still want something like Git to keep a record of significant changes, and to isolate incompatible works-in-progress in separate branches; but there is no reason why we cannot separate out the "integration" use case and the "collaboration" use case from the "version control" and "record keeping" use cases.