Monday 25 November 2013

Reasoning about state evolution using classes

In response to:

http://hadihariri.com/2013/11/24/refactoring-to-functionalwhy-class

Thinking about how application state evolves over time is difficult, and is responsible for all sorts of pernicious bugs. If you can avoid statefulness, you should ... but sometimes you just can't. In these cases, classes give us a way to limit and control how state can change, making it easier to reason about.

In other words, placing state in a private variable, and limiting the possible state transitions with a restricted set of public methods (public setters don't count) can dramatically limit the combinatorial explosion that makes reasoning about state evolution so difficult. To the extent that this explosion is limited, classes help us to think and reason about how the state of our application evolves over time, reducing the occurrence and severity of bugs in our system.
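
To make this concrete, here is a minimal sketch in Python (the Connection class and its states are invented for illustration, not taken from the article above):

    # State lives in a "private" attribute; the only way to change it is
    # through a small set of public methods, each of which enforces a
    # legal transition. Illegal transitions fail loudly.

    class Connection:
        """A connection that may only move CLOSED -> OPEN -> CLOSED."""

        def __init__(self):
            self._state = "CLOSED"  # private by convention; no public setter

        def open(self):
            if self._state != "CLOSED":
                raise RuntimeError("can only open a CLOSED connection")
            self._state = "OPEN"

        def close(self):
            if self._state != "OPEN":
                raise RuntimeError("can only close an OPEN connection")
            self._state = "CLOSED"

        @property
        def state(self):
            return self._state  # read-only view; no way to assign to it

With only two public methods, the set of reachable states and legal transitions is small enough to enumerate and test exhaustively; compare that to a bare public state variable, which any caller can set to anything at any time.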

The discussion of using classes as an OO domain-modelling language is a separate question, and (as the above article insinuates) is bedevilled with domain-model impedance mismatches, resulting in unsatisfactory levels of induced accidental complexity.

Friday 22 November 2013

20 percent together

20% time is a beguiling idea. Productivity is a function of passion; and nothing fuels passion like ownership. The problem with 20% time stems from the blurring of organisational focus and the diffusion of collective action that results from it. 

So ... the problem remains: how to harness the passion of self-ownership, and steer it so that it is directed in line with the driving force and focus of the entire company ... so that the group retains its focus, and the individual retains his (or her) sense of ownership.

I can well imagine that the solution to this strategic challenge is going to be idiosyncratic to the business or industry in question ... but for many types of numerical software development, there is a real need for good tools and automation, so why not make a grand bargain: give employees some (limited) freedom to choose what they work on, but mandate that it be in support of (well-advertised) organisational strategic objectives.

Friday 15 November 2013

Continuous Collaboration: Using Git to meld continuous integration with collaborative editing.

Written in response to:

http://www.grahamlea.com/2013/11/git-mercurial-anti-agile-continuous-integration/

I strongly agree with the fundamental point that the author is making.

However, there are nuances. A lot of this depends on the type of development that you are doing.

For example, most of my day-to-day work is done in very small increments. Minor bug-fixes, incremental classifier performance improvements, parameter changes, and so on. Only rarely will I work on a feature that is so significant in its impact that the work-in-progress causes the branch to spend several days in a broken / non-working state. I also work in fairly small teams, so the rate of pushes to Gerrit is quite low: only around a dozen pushes per day or so. This means that integration is pretty easy, and that our CI server gives us value & helps with our quality gating. We can follow a single-branch development path with little to no pain, and because both our software and the division of labour in the team are fairly well organised, conflicts very rarely occur when merging (even when using suboptimal tools to perform the merges).

This state of affairs probably does not hold for all developers, but it holds for me, and for most of the people that I work with. As a result, we can happily work without feature branches (most of the time), and lean on the CI process to keep ourselves in sync & to measure the performance of our classifiers & other algorithms.

Now, don't get me wrong, I think that Git is great. I am the nominated Git expert in my team, and spend a lot of time helping other team members navigate the nuances of using Git with Gerrit, but for most people it is yet another tool to learn in an already over-complex development environment. Git gives us the flexibility to do what we need to in the environment that we have; but it is anything but effortless and transparent, which is what it really needs to be.

Software development is about developing software. Making systems that work. Not wrangling branches in Git.

My ideal tool would be the bastard son of Git and a real-time collaborative editor. My unit tests should be able to report when my local working copy is in a good state. Likewise, my unit tests should be able to report whether a merge or rebase has succeeded or failed. Why can I not then fully automate the process of integrating my work with that of my colleagues? Indeed, my work should be integrated & shared whenever the following two conditions are met: 1) my unit tests pass on my local working copy, and 2) my unit tests pass on the fully integrated copy. These are the same criteria that I would use when doing the process manually ... so why do it manually? Why not automate it?

Triggered by every save, the resulting process would create the appearance of an almost-real-time collaborative working environment, opening up the possibility for new forms of close collaboration and team-working that are simply not possible with current tools. A source file would be a shared document that updates almost in real time. (If it is only comments that are being edited, then there is no reason why the updating could not actually be in real time.) This means that you could discuss a change with a colleague, IRC-style, in the comments of a source document, and make the change in the source file *at the same time*, keeping a record not only of the logic change, but also of the reasoning that led to it. (OK, this might cause too much noise, but with comment-folding, that might not matter too much.)
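
To sketch what that automation might look like (in Python, to be triggered by a file-watcher on every save; pytest as the test runner and origin/master as the shared line of development are assumptions for illustration, not a description of any real tool):

    import subprocess

    def run(*cmd):
        """Run a command; return True if it exits cleanly."""
        return subprocess.call(list(cmd)) == 0

    def tests_pass():
        return run("pytest")

    def try_integrate():
        # Condition 1: my unit tests pass on my local working copy.
        if not tests_pass():
            return False

        run("git", "add", "--all")
        run("git", "commit", "--message", "auto: save point")

        # Bring in colleagues' work; if the rebase fails, stay local.
        if not run("git", "pull", "--rebase", "origin", "master"):
            run("git", "rebase", "--abort")
            return False

        # Condition 2: my unit tests pass on the fully integrated copy.
        if not tests_pass():
            return False

        # Both conditions met: share the integrated work.
        return run("git", "push", "origin", "master")

A failure at any step simply leaves the work local, which is exactly what the manual process does; the difference is that the happy path no longer needs a human in the loop.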

Having said all of that, branches are still useful, as are commit messages, so we would still want something like Git to keep a record of significant changes, and to isolate incompatible works-in-progress in separate branches; but there is no reason why we cannot separate out the "integration" use case and the "collaboration" use case from the "version control" and "record keeping" use cases.

Wednesday 13 November 2013

Specification and Risk

When using a general-purpose library for a specific application, we generally only use a tiny subset of the functionality that the library provides. As a result, it is often reasonable to wrap that library in a simplified, special-purpose API, tailored to the specific needs of the application under development. This simplifies the interface; and by reducing the number of ways that it can be used, we also reduce the number of ways that it can go wrong, cutting the cognitive burden on the developer, simplifying the system, and lowering both cost and risk.

In this way, a restriction in what we expect the system to do results in a beneficial simplification: reduced costs and reduced risk.
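
As a minimal sketch of the pattern in Python (the scores schema and function names are invented for illustration): an application that only ever needs to store and look up one score per sample can hide the whole of sqlite3 behind two functions.

    import sqlite3

    _conn = sqlite3.connect("scores.db")
    _conn.execute(
        "CREATE TABLE IF NOT EXISTS scores (sample_id TEXT PRIMARY KEY, score REAL)")

    def store_score(sample_id, score):
        """The only way the application writes to the database."""
        with _conn:  # commits on success, rolls back on error
            _conn.execute(
                "INSERT OR REPLACE INTO scores VALUES (?, ?)", (sample_id, score))

    def fetch_score(sample_id):
        """The only way the application reads from the database."""
        row = _conn.execute(
            "SELECT score FROM scores WHERE sample_id = ?", (sample_id,)).fetchone()
        return None if row is None else row[0]

Callers see two functions and no SQL, cursors, connections, or transactions; the surface area that can be misused (and that must be tested) shrinks accordingly.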

It is possible to go too far with this, though, and there are two deleterious effects that immediately spring to mind:

Firstly, there is the obvious risk of a bug in the specification - that proposed restrictions may be inappropriate or incompatible with the primary task of the system, or with the political needs of various stakeholders.

Secondly, and more insidiously, there is the risk that excessive restrictions become positive specification items; moving from "the system does not have to handle X" to "the system must check for X and explicitly error in this way". Whilst this seems ridiculous, it is a surprisingly easy psychological trap to fall into, and, of course, it increases system complexity and cost rather than reducing it.

Consistently being able to strike the right balance is primarily a function of development organisation culture; and another reason why businesses need to pay attention to this critical (but frequently overlooked) aspect of their internal organisation.

Tuesday 12 November 2013

Qualitative changes in privacy induced by quantitative changes in social-communications network topology

How do we form opinions and make judgements of other people?

Can we understand this process in terms of the flow of information through a network/graph?

Can we use this model to understand the impact of changes in the network? (Both structural/topological changes in connectivity and quantitative changes to flow along existing graph edges).

Can we use this approach to predict what will happen as our privacy is increasingly eroded by technology?

I.e.: do we see some relationship between the structure of a social/communication graph and the notion of "privacy"? (Temporal aspects might also be important.)

If the graph changes, does the quality of "privacy" change in different ways, and how does that impact the way that we make judgements about other people, and the way that other people make judgements about us?

What does that mean for the nature of society going forwards, particularly as developments in personal digital technologies mean that increasing amounts of highly intimate, personal information are captured, stored, and disseminated in an increasingly uncontrolled (albeit sparsely distributed) manner?

The sparsity of the distribution might be important - in terms of the creation of novel/disruptive power/influence networks.
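
One (very crude) way to start making these questions computable is to model the communication network as a directed graph, treat a piece of personal information as something that flows along edges, and measure how much of the network it can reach. A Python sketch (the graph, the "exposure" measure, and all names are invented for illustration):

    from collections import deque

    def exposure(graph, person):
        """Fraction of the other nodes reachable from person, via breadth-first
        search. graph maps each node to the nodes it passes information on to.
        """
        seen = {person}
        queue = deque([person])
        while queue:
            for neighbour in graph.get(queue.popleft(), ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return (len(seen) - 1) / (len(graph) - 1)

    # A single new edge (one new data-sharing channel) can change exposure
    # discontinuously: a qualitative change induced by a quantitative one.
    village = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"]}
    print(exposure(village, "a"))  # 1/3: the information stays local
    village["b"].append("c")       # one new communication channel
    print(exposure(village, "a"))  # 1.0: the information reaches everyone

Richer versions would add per-edge transmission probabilities and a temporal dimension, which is where the structural-versus-quantitative distinction above could be made precise.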

Wednesday 6 November 2013

The software conspiracy: Maintaining a software-developer-biased imbalance in human-machine labour arbitrage.

Have you ever wondered why software engineering tools are so terrible?

Perhaps there is an implicit/unspoken conspiracy across our profession?

After all, we software developers are working away at our jobs to automate various economic activities, the inevitable result of which is to force workers in other industries out of their jobs and their livelihoods.

A claim can be made that technological developments create new, less rote and routine roles, with more intellectual challenge and greater responsibility -- but there is no real evidence that this outcome will necessarily always hold; indeed, there is some empirical evidence to suggest that this pattern is even now beginning to fail.

We are not stupid. Indeed, we are well aware of the effects of our actions on the welfare and security of those workers, so why should we bring the same calamity upon ourselves? Perhaps we should keep our software tools in their present primitive state, to ensure job security for ourselves just as we undermine it for others?