Friday 20 September 2013

Software & Hardware - How complexity and risk define engineering practice and culture.

Striking cultural differences exist between companies that specialise in software development and those that specialise in hardware development & manufacturing.

The key to understanding the divergence of these two cultures may be found in the differing approaches that each takes to risk and failure; driven both by a difference in the expected cost of failure and a difference in the cost associated with predicting failure.

One approach seeks to mitigate the impact of adverse events by emphasising flexibility and agility; the other approach seeks to minimise the chances that adverse events occur at all, by emphasising predictability and control.

In other words, do you design so you can fix your system quickly when it breaks (at the expense of having it break often), or do you design your system so that it very rarely breaks (but when it does, it is more expensive to fix)?

The answer to this question not only depends on how safety-critical the system is, but how complex it is too. The prediction-and-control approach rapidly becomes untenable when you have systems that reach a certain level of complexity -- the cost of accurately predicting when failures are going to occur rapidly becomes larger than the cost of the failure itself. As the complexity of the system under development increases, the activity looks less like development and more like research. The predictability of disciplined engineering falls apart in the face of sufficient complexity. Worse; complexity increases in a combinatorial manner, so we can very easily move from a predictable system to an unpredictable one with the addition of only a small number of innocuous looking components.

Most forms of mechanical engineering emphasise the second (predictive) approach, particularly for safety critical equipment, since the systems are simple (compared to a lot of software) and the costs of failure are high. On the other hand, a lot of software development emphasises the first (agile/reactive) approach, because the costs associated with failures are (normally) a lot less than the costs associated with development.

Of course, a lot of pejorative terms get mixed up in this, like: "sloppy engineering", and "cowboy developers", vs "expensive failures" and "moribund bureaucracy" but really, these approaches are just the result of the same cost/benefit analysis producing different answers given different input conditions.

Problems mainly arise when you use the wrong risk-management approach for the wrong application; or for the wrong *part* of the application. Things can get quite subtle quite quickly, and managers really need to be on top of their game to succeed.

One of the challenges in developing automotive ADAS systems is that a lot of the software is safety critical, and therefore very expensive to write, because of all of the (necessary) bureaucratic support that the OEMs require for traceability and accountability.

Equally, a lot of the advanced functionality for machine vision / Radar / Lidar signal processing is very advanced, and (unfortunately) has a lot of necessary complexity. As a result it is very very costly to develop when using the former approach; yet may be involved in safety-critical functions.

This is not by any means a solved problem, and very much requires detailed management on a case-by-case basis.

Certainly testing infrastructure becomes much more important as the sensor systems that we develop become more complex. (Disclaimer: my area of interest). Indeed, my experience indicates that for sophisticated sensor systems well over 80% of the effort (measured by both in hours of development & size of the code-base) is associated with test infrastructure, and less than 20% with the software that ends up in the vehicle.

Perhaps the word "test" is a misnomer here; since the role of this infrastructure is not so much to do V&V on the completed system, but to help to develop the system's requirements -- to do the "Data Science" and analytics that are needed to understand the operating environment well enough that you can correctly specify the behaviour of the application.

Monday 16 September 2013


Media 6 Degrees (the new owner of my former employer, EveryScreen Media), has changed it's name to "Dstillery".

This Fast Company article does a remarkably good job of explaining what the technology delivers. (Disclosure: I implemented the early mobile-phone IP-tracking algorithms when I worked at EveryScreen Media).

Whilst I do believe (from what I saw) that ESM/M6D/Dstillery take privacy very seriously; and will (continue to) behave in a responsible manner, I still feel a sense of unease when I think about the extent to which participants in the advertising industry are able to peer into people's personal lives.

To balance this (sort-of) criticism, I feel I should emphasise the fact that the advertising industry is, in general, a pile 'em high sell 'em cheap kind of affair, where advertising impressions are bought and sold by the million; where performance is measured in statistical terms; and where the idea of paying close attention to any one individual would be laughed off as a ludicrous waste of time.

However, it is possible that not all participants will feel that way, just as it is possible that not all participants will be as motivated to act responsibly as my former employers were.

There has been a lot of debate recently about NSA spying on what we read on-line and using our mobile phones to track where we go in the world. Well, you don't need anything like the NSA's multi-billion dollar budgets and mastery of (de)cryptography to do something that feels (like it could be) similarly invasive. (The advertising industry is an order of magnitude less creepy, but it is facilitated by the same social & technological developments, and is still heading in a very similar direction).

I think that this is something that we should think very carefully about, and just as we seek to find better ways to regulate our security services, so too we should (carefully; deliberately; deliberatively) seek to find ways to regulate the flow of personal information around our advertising ecosystem.


Edit: One thing that does bear mentioning -- the link up between M6D & ESM really is a smart move. From a data point of view, it is a marriage made in heaven (a bit of a no-brainer, actually); and I think that the insights that result from combining their respective data-streams will yield genuine benefits for their clients & the brands that they represent.

Wednesday 4 September 2013

Antisocial development

Human beings are social creatures; to the extent that our sanity can be undermined by isolation.

Much of our behaviour is guided and controlled by observing others around us. We instinctively imitate our peers, and follow normative standards of behaviour. As a result, I believe that the most effective learning occurs in a social context; with the presence of peers to shape and guide behaviours.

I also believe that learning is at the centre of what we do as software developers. Functioning software is seemingly more a by-product of learning than the converse.

It is therefore a great pity that modern development methods seem tailor-made to encourage developers to work alone; to minimize the need for contact with their peers and colleagues, and to reduce the need for social interaction.

For example, part of the appeal of the modern distributed version control system is that it allows individuals to work independently and alone, without requiring coordination or synchronisation.

It is possible that this has a rational basis; After all Fred Brooks' analysis in: "The Mythical Man-Month" suggests that optimally sized teams are composed of a single solitary individual, Conway's Law also seems to suggest that maximum modularity is achieved by distributed teams.

Perhaps developer solitude really is the global optimum towards which we, as an industry, are headed. However, this clearly neglects the important social aspects of human behaviour, as well as our need to learn as a group.

I wonder if we ever see a resurgence of tools that support and encourage face-to-face social interaction and learning; rather than obviate the need for it.