Monday 30 July 2012

Cost-control and Amortization in Software Engineering: The importance of the source repository.

This discussion is not finished, but it is getting late, so I am putting it out regardless. Please excuse the jumps; the argument is put together like a tree, working down from the leaves, through the branches, to the trunk. I will try to re-work it later to make it more linear and easier to read.


--


This is a discussion about controlling costs in Software Engineering. It is also a discussion about communication, bureaucracy, filing systems and human nature, but it starts with the debate about how best to organize one's source code repository; a debate dominated by the ascendance of the Distributed Version Control System (DVCS):

In the Distributed-vs-Centralized VCS debate I am generally agnostic, perhaps inclined to view DVCS systems slightly more favorably than their centralized counterparts.

I have used Svn and Hg and am happy with both. For a distributed team, working on a large code-base over low-bandwidth connections, Hg or Git have obvious advantages.

Many DVCS activists strongly advocate a fine-grained division of the code-base into multiple per-project repositories. In the context of a global, distributed development team, this approach is highly efficient as it allows developers to work independently with minimal need for synchronization or communication.

This is particularly striking when we remember just how difficult and time-consuming (i.e. hugely expensive) synchronization and communication can be if developers are working in different time-zones or only working part-time on a project.

However, this property is less relevant to a traditional, co-located and tightly integrated development team. In fact, I intend to demonstrate that splitting the repository in this situation has serious implications that need to be considered.

Before I do that, however, I need to describe a number of considerations that motivate and support my argument.

Amortization as the only effective mechanism for controlling development costs.

Firstly, developing and maintaining software is very expensive. The more complexity, the more cost. For a given problem, there is a limit to the amount of complexity that we can eliminate. There is also a limit to how much we can reduce developer salaries or outsource work before the complexity and cost imposed by the consequent deterioration in understanding and communication outweigh the savings.

The only other meaningful mechanism that we have to control costs is careful, meticulous amortization, through reuse across product lines, products & bespoke projects. I believe that reuse is fiendishly difficult to achieve, critically important, and requires a fundamental shift in thinking to tackle effectively. The difficulty in achieving reuse is supported by historical evidence: re-use is as rare as hen's teeth. Its importance, I hope, is self-evident, and the fundamental shift in thinking is required because we are working against human nature, against historical precedent, and against some unfortunate physics.

Much of this argument is a discussion about how to overcome these problems and facilitate the amortization of costs at a variety of different levels, and how a single, monolithic repository (or filing system) can support our efforts.


Reuse is difficult partly because of what software development is.

More than anything else, software development is the process of learning about a problem, exploring different solutions, and applying that learning to the development of a machine that embodies your understanding of the problem and its solution; software both defines a machine and describes a model of our understanding of the problem domain. The development of a piece of software is primarily a personal intellectual exercise undertaken by the individual developer, and only incidentally a group exercise undertaken by the team.

Reuse, by one person, of a software component written by another must, at some level, involve the transfer of some degree of understanding.


Communications bandwidth is insufficient.

The bandwidth offered by our limited senses (sight, hearing, smell etc.) is insignificant when held up against the tremendous expressive power of our imagination: the ability of our brain to bring together distant memories with present facts to build a sophisticated and detailed mental model. The bandwidth offered by our channels of communication is more paltry still; even the most powerful means of communication at our disposal, face-to-face conversation, is pathetic in comparison, and writing and documentation, although still necessary, are more contemptible yet.

Communicating the understanding built up in the process of doing development work is very difficult because the means that we have at our disposal are totally inadequate.

Reuse by one person of a software component that the same person wrote previously is orders of magnitude easier to achieve than reuse of a component written by somebody else. Facilitating reuse between individuals will require us to bolster both the available bandwidth and the available transfer time by all means and mechanisms possible. We will need to consider the developer's every waking moment as a potential communications opportunity, and every possible thing that is seen or touched as a potential communications channel.



Re-Think the approach.

One solution to the problem is to side-step communication. A developer is already an expert in, and thus tightly bound to, the software components that he has written; a simple mechanism for reuse, then, is to redeploy the original developer, along with his library of reusable components, on different projects.

As with the individual, so with the team. A team of individuals that has spent a long time together has a body of shared organizational knowledge; they have had opportunities to share experiences, have evolved a common vocabulary, culture and approach. Collectively, they are also tightly bound to the software components that they have written. A team, expert in a particular area, is a strategic asset to an organization, and the deployment of that asset also needs to be considered strategically, in some detail, by the executive.

(Make people's speciality part of their identity.)

Strategic planning needs to think in terms of capabilities and reuse of assets. The communication of the nature of those capabilities and assets needs to be a deliberate, planned activity.




Exploit pervasive communication mechanisms.

We need to think beyond the morning stand-up meeting; our current "managed" channels of communication are transient, but our eyes and ears are less fleetingly open. We need to take advantage of pervasive, persistent, out-of-band communication mechanisms. Not intranets and wiki pages that are seldom read, nor emails and meetings that arrive when the recipient is not maximally receptive. We need communications channels that are in the background, not intrusive, yet always on. Do not place the message into the environment; make the environment the message.

Make the office layout communicate reuse. Make the way the company is structured communicate reuse, make where people sit communicate reuse, and above all else, make the way that your files are organized communicate reuse. 




The repository is a great pervasive communication mechanism.

As developers, we need to use the source repository frequently. The structure of the repository dictates the structure of our work, and defines how components are to be reused.

The structure of the repository and the structure of the organization should be, by design, tightly bound together, as they so often are by accident. If the two are in harmony, the repository becomes much more useful. It provides a map that we can navigate by. It will help us to find other software components and, if components and people are tightly bound together, other people also. The repository is more than a place to store code; it is a place to store documentation and organizational knowledge, and to synchronize it amongst the members of the organization. Indeed, it is the place where we define the structure and the nature of the organization. It is the source not just of binaries, but of all organizational behavior.





TODO:

Financial reporting still drives everything - it needs to be in harmony also.

Trust is required. (maybe not)

Not the whole story.

Another problem: reuse is difficult because it puts the cart before the horse. How can we decide what component to reuse before we have understood the problem? How can we truly understand the problem before we have done the development work? If we are solving externally defined and uncorrelated problems over which we have no influence, then we can never reuse anything.

Sounds difficult? Lessons from the Spartans on the nature of city walls.

Back to repositories.

Limiting code reuse. Difficult to do "hot" library development.

One of the frequent arguments that gets trotted out in the Git/Hg-vs-SVN debate is that of merging.
The merge conflict argument is bogus. Proper organization and DRY prevent merge issues.

10,000 hours and too little time...

The 10,000 hour rule suggests that it takes 10,000 hours to master a discipline. That is over a year running continuously; almost two years at 16 hours per day, or over 3 years at a somewhat more reasonable 60 hours per week.
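
For the sceptical, the arithmetic is easy to check with a throwaway Python snippet (the figures below are rounded):

    # Rough check on the 10,000-hour figures.
    hours = 10000.0
    print(hours / (24 * 365))   # ~1.1 years, running continuously
    print(hours / 16 / 365)     # ~1.7 years at 16 hours per day
    print(hours / 60 / 52)      # ~3.2 years at 60 hours per week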

With millions of developers working away world-wide, an awful lot changes in technology in 3 years. New languages, new tools, new paradigms and approaches. We must learn not just one discipline, but many.

In this new world, none of us can be masters: all of us must be students. The faster paced the world, the more tools and disciplines to master, the more fleeting the experience and the shallower the knowledge.

Please remember, when you write for others, that you are invariably writing for the inexperienced student with too little time on his hands, no matter how worn and weathered the face.

Wednesday 25 July 2012

The Reality Gap

A widely acknowledged truism in the development of complex systems is that, for any given project, we observe a huge dynamic range in individual developer ability. (Largely determined by level-of-obsession with the problem).

A less frequently noted corollary of this is that a huge dynamic range in team ability is also observed.

Team performance is, of course, driven to a large extent by the ability and combination of constituent members; however, we can also identify tools, techniques, methodologies and modes of organization that can help to optimise a team's performance within these constraints.

As a technologist, and a borderline obsessive-compulsive, I am particularly interested in the intersection between tooling and team organization, and how the tools that we use can help us to impose order and discipline on our approach to problem solving.

I am also a realist, and acknowledge that we cannot all be 10X-ers all of the time. In fact, by definition, most of us will not be 10X-ers most of the time. There is a reality gap between our technical aspirations and the reality of high-probability mediocrity.

So, having taken our medicinal dose of humility; here is an interesting question:

What is the simplest tool (or set of tools) that one could use to reduce the risk of poor performance in a development team?



--

[Aside]

This question is motivated in part by the too-seldom-made observation that success is more often achieved by getting the management and engineering basics right than by resting all hopes on a single piece of sophisticated key IP.

On the other hand, it can be argued (http://williamtpayne.blogspot.com/2012/05/complexity-it-had-better-be-worth-it_03.html) that the opposite sentiment applies; that a lot of risk is not controllable, and that the only variable that can be optimized is the potential payoff.


The key parameter here is the degree to which risk may be controlled.

Friday 20 July 2012

Quantitative Investing in Startups


Considering the possibility of using quantitative techniques to invest in startups.
Quantitative investing is really quite hard, even when trying to analyse equities with decades of financial reports available. The world is complex and always changing, with no guarantee that historical conditions will prevail. Furthermore, the data that you have is generally extremely sparse relative to the dimensionality of the problem, requires lots of massaging, and is full of wrinkles that must be ironed out to normalize it for comparison (retrospective corrections, stock splits and other corporate actions, etc.).
The natural (and only) solution is to approach the problem with really strong priors, in the form of a set of principled, theory-driven models of the fundamentals (backed by good engineering and thorough data management).
Because the problem domain itself is so complex, and the data so sparse, the models themselves have to be simple, which means that most of the opportunities for innovation are to be found in the search for new and previously underexploited data sources. This is particularly true when looking at early-stage startups, as financial history is either absent or not particularly predictive.
Perhaps fortunately, the current state of the art in data exploitation is really quite poor, meaning that many opportunities exist to improve the state of the art.
So, what opportunities can we identify?
Well, organizations are composed of people. Different organizations have different personalities, and different cultures; sometimes the people in those organizations gel together and turn into a great and highly productive team, and sometimes they do not.
If you were able to develop a really good understanding of how people work together in teams, and how different personality types, personal circumstances, technical skills and work environments come together, you could build what could be a pretty strong factor based on staff surveys, psychometric profiles and whatever other behavioral data you can lay your hands on.

See also:  http://alistair.cockburn.us/Characterizing+people+as+non-linear%2c+first-order+components+in+software+development
(You could also use the same models to build a secondary business offering personal and organizational coaching… )
The cost of obtaining this data would be quite high, unfortunately, but there are other factors that could be attractive based simply on the ease with which large quantities of data may be collected.
For example, a statistical analysis of source code repositories and checkin histories might well yield insights into the ability of the organisation to respond to changing conditions, and to rapidly innovate.
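
As a very rough sketch of the kind of thing I have in mind (assuming a local Git clone and the standard git command-line tool; a real factor would obviously need to be far more sophisticated than this), even weekly commit counts are a starting point:

    # Sketch: weekly commit counts from a local Git clone.
    # Assumes the "git" executable is on the PATH.
    import collections
    import datetime
    import subprocess

    def weekly_commit_counts(repo_path):
        out = subprocess.check_output(
            ['git', 'log', '--pretty=format:%at'], cwd=repo_path)
        counts = collections.Counter()
        for line in out.decode('ascii', 'ignore').splitlines():
            year, week = datetime.datetime.utcfromtimestamp(int(line)).isocalendar()[:2]
            counts[(year, week)] += 1
        return counts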

Monday 16 July 2012

Libor, Malfeasance, Data and Complexity


Thomas Redman wrote about the Libor scandal, arguing that improvements to data quality will help to reduce malfeasance.

http://blogs.hbr.org/cs/2012/07/libors_real_scandal_bad_data.html?cm_sp=blog_flyout-_-cs-_-libors_real_scandal_bad_data

I agree with a lot of what he says, but I believe that good data, on its own, is not enough. We also need visualisation and data exploration tools to understand the data, simplicity to communicate that understanding, and an educated and intelligent workforce to comprehend the implications.

So, here is my response, emphasizing simplicity partly for rhetorical effect, but mostly because it is the most difficult thing to achieve:

A safe, crisis-free future requires both good data and simplicity.

We can build a world so complex and hidden that there is no obvious fraud, or we can build one so simple and transparent that there is obviously no fraud.

The former is a recipe for catastrophe, but the latter requires considerably more effort.
Obtaining trustworthy data is hard enough, but building simplicity is subtle, and difficult. There is a degree of complexity which is intrinsic to every problem, and which cannot be eliminated, so beware of false simplicity that merely pushes complexity somewhere else. Non-essential complexity may be banished, but essential complexity must be consumed.

Scholarship as part of Agility

I found another John Boyd fan; increasingly relevant in this age of information overload:

http://blogs.hbr.org/cs/2012/07/act_fast_not_first.html

What this article fails to get across, however, is the emphasis that the OODA methodology places on understanding one's opponent, on getting inside his head so that you can, by your actions, disrupt his thinking.

It is unusual, in civilian life, to have an opponent that you are trying to outmaneuver, so Boyd's model does not apply directly to most of the situations that we encounter in our professional lives, but some lessons can still be drawn:
  • We live in an unpredictable world.
  • Sometimes we need to change direction because the world changes around us.
  • When this happens, we need to do it quickly.
  • To react quickly, we need to be prepared.
  • Part of being prepared is having a deep understanding and insight into the problem domain.
To be effective, warriors must also be scholars, and this applies equally to those of us in less violent occupations. To deal with a confusing, ever-changing world, we must be scholars, trying to understand the factors that drive those changes.

Friday 13 July 2012

Elites and Meritocracy

I have just finished reading an interesting op-ed piece in the NYTimes on modern elites, arguing that meritocracies tend to develop corrupt, oligarchic tendencies.

http://www.nytimes.com/2012/07/13/opinion/brooks-why-our-elites-stink.html?_r=1&hp

A couple of thoughts speedily spring to mind.

Firstly, we have only limited ability to actually measure merit. As I have observed before, performance depends on a wide range of factors, most of them to do with situation and context, only a few to do with ability, attitude and aptitude.

Secondly, self-selection plays a strong role; as you get near to the top of an institution, certain personality types start to become more common, and man-management becomes harder. Self-serving decision making is by no means ubiquitous, but it is certainly more likely, and it becomes harder to trust the information being reported and the decisions being made.

This issue is really just a corollary of the Peter Principle writ large, and avoiding it is a non-trivial problem, but there are a couple of wrinkles worth exploring.



Different types of institution attract different personalities, so the flavor of the problem is different in different institutions, but it is particularly prevalent in institutions with a high public profile.

Within any given organization, the incidence and seriousness of the problem will tend to increase over time. This is partly a product of self-selecting individuals rising through the ranks, and partly a product of the spread of their ideas and culture. It is very hard to create a good, trustworthy culture, and very easy for a good culture to become cynical, jaded and self-serving. It might be worthwhile drawing the analogy between these memes and a pathogen, and thinking about the solution in terms of disease-control techniques.


Given that the incidence of the problem in an organization increases over the lifespan of the organization, increasing churn in the ecosystem, as new organizations and institutions replace old ones, should have the net effect of reducing these problems in the system as a whole.

This is another argument for an economy composed of many small, short-lived organisations and institutions, rather than a small number of large, long-lived organizations, as the damage done by the (relatively) small number of bad individuals is limited, and their malign influence quarantined and contained within their host organization.

Wednesday 11 July 2012

An organizational model for an SME engaged in the development of complex systems

A lot of software engineering is really about organization. This is particularly important if you are working with a complex system. The better your organizational skills, the easier it is to learn about the system, and the less you need to think about day-to-day.

Most of my experience has been with smallish cross-disciplinary development teams working on commercial projects. Here are some best practices that I have been developing over the past 6-8 years, and continue to develop and improve over time. Please be aware that these practices may not be applicable to the very largest enterprises, nor to noncommercial (open-source) or academic development.

First of all, it helps if the business thinks about how it organizes its filing system. Just because it is on a computer and searchable does not detract from the importance of a well-laid-out and well-thought-through document store. Putting documents in a document management system like SharePoint, or even using Google Docs or some such, is, on its own, simply not enough. It is not just about having a document management system, it is about how that document management system is organized.

The structure of your filing system really is central to how your business is organized and, indeed, to many other things besides. (More on that later.)

Now, I know what you are saying: "How incredibly dull", and indeed it is, but that does not mean that it is not important. In fact, we will find that if we put a little bit of up-front effort into designing and standardizing the business' filing system, we will be able to reap enormous benefits from automation later on.

Now, most filing systems are either intrinsically hierarchical, or (at the very least) support hierarchical structures. A hierarchy is a conventional, effective way of organizing things. By creating an explicit hierarchy we are creating a common navigational frame-of-reference for the organization. We do sacrifice some of the flexibility offered by flat organizational systems, but we gain much more for the business in the form of more effective communication and coordination. This is because everybody has the same frame of reference; the same mental model of how everything hangs together.


I am aware that this seems antiquated in comparison with the modern trend to "tag & search", but hierarchy still has its place, particularly in creating a common frame of reference.


So that we can persuade people that this stuff is important, and to make it sound like something new and interesting, we need an impressive and official sounding name. Let us try "Organizational Model", and see if that works.

Note that the organizational model is in principle "just" a collection of files and directories on a conventional computer file system, although in practice it may be so large that only parts of it are "active" and instantiated at any one time.

Since we are organizing things in a hierarchy, we need to decide what distinctions are drawn to the top of the hierarchy, and which ones are pushed down to the bottom of the hierarchy. Note that the obvious approach, that of mirroring the structure of the business in the structure of the filing system, is not necessarily the best idea, particularly since one of our goals is to facilitate interdisciplinary communication and avoid the formation of informational "silos".

We take strong cues from common software engineering practices over the years, since automation will be a big part of what we hope to gain from a disciplined approach to organization. In particular, we are influenced by common ways of working with Subversion (the version control system).

So, one distinction that we will make is between input and output: the source files that contain design information, and the destination files that contain completed reports, web-pages, executable programs, generated documentation and so on. We use the terms "src" & "dst" to denote source and destination. At this level, we also want to consider the third-party tools, libraries and other resources upon which our systems and processes depend. We use the term "env" to denote the environment upon which our systems depend.

Another distinction that we will make is between current work being carried out on a day-to-day basis and historical archives. We use the term "daily" to denote current work (similar to how the term "trunk" is used in Subversion, but more forceful in suggesting how it is to be used), as well as "weekly", "monthly" and "annual" to denote archives taken at those intervals. We will also want to keep records of important moments, like product releases, as well as what is currently "in production", for situations where that concept makes sense.

I prefer to place version-related distinctions at a higher level in the hierarchy than source-vs-destination distinctions, simply because I believe that more people make more switches between source and destination than between one version and another.


So, as a result, the most significant division is into what Subversion would call branches:

  • daily
  • archive
    • weekly
      • YYMMDD_foo
    • monthly
      • YYMMDD_foo
    • annual
      • YYMMDD_foo
  • release
    • stage - pre-release staging directory.
    • prod - current production system.
    • rollback - previous production system for quick emergency rollbacks.
    • YYMMDD_productA_v1.2 - Externally deployed or separately configuration-managed systems 
    • YYMMDD_productB_v3.4 - Externally deployed or separately configuration-managed systems
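
As an illustration of how the branch level gets used, taking a weekly archive snapshot is then nothing more than a copy from "daily" into "archive/weekly". A minimal sketch in Python (the root path and the "_foo" label are illustrative assumptions, not a prescribed convention):

    # Sketch: snapshot the "daily" branch into the weekly archive.
    # The root path and the label suffix are illustrative only.
    import datetime
    import os
    import shutil

    def take_weekly_snapshot(root, label='foo'):
        stamp = datetime.date.today().strftime('%y%m%d')      # YYMMDD
        src = os.path.join(root, 'daily')
        dst = os.path.join(root, 'archive', 'weekly', '%s_%s' % (stamp, label))
        shutil.copytree(src, dst)                              # fails if dst already exists
        return dst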

The next most significant division is into source vs derived documents:

src:-
  • Human-generated source documents.
dst:-
  • Automatically generated binaries, documentation, test results, performance, financial and management reports. (dst/bin; dst/doc; dst/doc/performance etc..)
env:-
  • Installed third-party dependencies (Python virtualenv etc...)
  • This could also include images of clean VMs to be picked up by the continuous integration process, or a reference to which EC2 images to use in deployment & testing.

The next most significant division is into units chosen to help manage development costs:

bespoke:-

  • One-off bespoke work for customers.
  • Cost not amortized.
  • Cost borne entirely by customer.
  • Ideally contains configuration information rather than software.

library:-

  • Re-usable soft components.
  • Cost amortized across multiple products, systems & projects.

systems:-

  • Re-usable components with a hardware component.
  • Ongoing costs amortized across multiple products, systems & projects.
  • Ideally contains configuration information rather than software.
  • Includes management and financial reporting systems 

products:-

  • Non-re-usable components.
  • Cost amortized over multiple sales.

research:-

  • Activities not directly related to product or internal system development.
  • Cost borne by enterprise.

sandbox:-

  • Test activities of little or no consequence.
  • Costs not tracked.

thirdparty:-

  • Links to third-party source-level dependencies.

The final division is into activities, to help identify the nature of the documents contained in that particular folder. This is a bit unusual in that it mixes software development with other business activities. The idea is that we can take advantage of automation (scripts etc...) to pull together technical reports on spend, ROI and so on in an integrated manner.

spec:-

  • Specification documents

doc:-

  • Technical documentation

test:-

  • Tests

config:-

  • Configuration files

sales:-

  • Sales estimates, leads & marketing collateral

support:-

  • Customer Feedback, support tickets etc...

financials:-

  • Cost estimates, budgets etc...
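
Since the layout is conventional, bootstrapping it can itself be automated. A minimal sketch (the directory names are taken from the description above; the exact nesting of units and activities under "src", and the helper's name and argument, are my own assumptions):

    # Sketch: create an empty organizational-model skeleton.
    import os

    BRANCHES   = ['daily', 'archive/weekly', 'archive/monthly', 'archive/annual',
                  'release/stage', 'release/prod', 'release/rollback']
    TOP_LEVEL  = ['src', 'dst', 'env']
    COST_UNITS = ['bespoke', 'library', 'systems', 'products',
                  'research', 'sandbox', 'thirdparty']
    ACTIVITIES = ['spec', 'doc', 'test', 'config', 'sales', 'support', 'financials']

    def build_skeleton(root):
        for branch in BRANCHES:
            os.makedirs(os.path.join(root, branch))
        for top in TOP_LEVEL:
            os.makedirs(os.path.join(root, 'daily', top))
        for unit in COST_UNITS:
            for activity in ACTIVITIES:
                os.makedirs(os.path.join(root, 'daily', 'src', unit, activity))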


Because the organizational model is fixed, we can automate lots of things, like financial & management reporting; perhaps "business-ops" as an equivalent to devops?
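
For example, a report that rolls spend up by amortization unit becomes a trivial directory walk. A sketch only: it assumes costs are kept as simple CSV files named "costs.csv", each with a "cost" column, under each unit's financials directory, none of which is prescribed above:

    # Sketch: roll spend up by amortization unit (bespoke, library, products, ...).
    # Assumes <src_root>/<unit>/financials/costs.csv files with a "cost" column.
    import csv
    import glob
    import os

    def spend_by_unit(src_root):
        totals = {}
        for path in glob.glob(os.path.join(src_root, '*', 'financials', 'costs.csv')):
            unit = path.split(os.sep)[-3]
            with open(path) as f:
                rows = csv.DictReader(f)
                totals[unit] = totals.get(unit, 0.0) + sum(float(r['cost']) for r in rows)
        return totals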

Security and access control are managed through the use of sub-repositories that are embedded below this level of the organisation.
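
In Mercurial, for example, this just means listing the restricted areas in an .hgsub file at the root of the main repository; the paths and URLs below are purely illustrative:

    daily/src/bespoke/customer_x = https://hg.example.com/customer_x
    daily/src/library/pricing = https://hg.example.com/pricing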

The organizational model seeks to accomplish its goals using out-of-band communication & organizational structure rather than management fiat. People tend to self-moderate & change to fit in with what is being done around them rather than seeking to make reasoned arguments for change.

The first goal is to facilitate automation by creating a conventional location for documents to be stored.

The second goal is to help manage the cost of systems development by explicitly organizing documents by how costs are amortized.

The third goal is to encourage "DRY" (Don't Repeat Yourself) systems design by encouraging reuse and facilitating development automation techniques (like code generation).

Monday 9 July 2012

Honesty.

This post (http://learntoduck.net/the-curse-of-bullshit) reminded me of some thoughts that occurred to me last year.


We have this set of expectations, this mental image of how we are supposed to behave; to get up at 5 every morning, be in work before 8; work through 'till 8 or 9 in the evening, all the time cheerful, aggressive, the day punctuated by the regular, next! next! stand 'em up! knock 'em down; solving a new problem every 20 minutes, never making a mistake, always presenting solutions, innovating, driving the project forwards.

Sometimes we can manage it; circumstances come together, the stars align, and everything works out fine. When it happens, great! Work becomes pleasure, and we slip into the flow-state for months at a time.

Alas, this is not and cannot be normality for all of us, all of the time.

The problem is this: we cannot bring ourselves to admit it. We demand of ourselves, and of other people, the image of perfection, which means that we hide behind the lie; because we must maintain the image of performing perfectly all of the time, we lose the ability to admit when we are not performing, and to deal with it effectively.


This is what humility is all about. It is not really a moral thing, it is deeply pragmatic. It is about giving yourself the freedom to analyse and understand your problems, and to do better.


Let us all be a bit more humble, let us admit, publicly, our flaws, and give each other the room to become better at what we do.

After all, if we demand perfection from ourselves and those around us, all we will get in return are lies.