Idle Conjectures in Search of Refutation: An organizational model for an SME engaged in the development of complex systems

A lot of software engineering is really about organization, and this matters most when you are working with a complex system. The better your organizational skills, the easier it is to learn about the system, and the less you need to think about from day to day.

Most of my experience has been with smallish cross-disciplinary development teams working on commercial projects. Here are some best practices that I have been developing over the past 6-8 years, and continue to develop and improve over time. Please be aware that these practices may not be applicable to the very largest enterprises, nor to noncommercial (open-source) or academic development.

First of all, it helps if the business thinks about how it organizes its filing system. The fact that it is on a computer and searchable does not detract from the importance of a well laid out and well thought through document store. Putting documents in a document management system like SharePoint, or even using Google Docs or some such is, on its own, simply not enough. It is not just about having a document management system, it is about how that document management system is organized.

The structure of your filing system really is central to how your business is organized and, indeed, to many other things. (More on that later.)

Now, I know what you are saying: "How incredibly dull", and indeed it is, but that does not mean that it is not important. In fact, we will find that if we put a little bit of up-front effort into designing and standardizing the business's filing system, we will be able to reap enormous benefits from automation later on.

Now, most filing systems are either intrinsically hierarchical, or (at the very least) support hierarchical structures. A hierarchy is a conventional, effective way of organizing things. By creating an explicit hierarchy we are creating a common navigational frame-of-reference for the organization. We do sacrifice some of the flexibility offered by flat organizational systems, but we gain much more for the business in the form of more effective communication and coordination. This is because everybody has the same frame of reference; the same mental model of how everything hangs together.

I am aware that this seems antiquated in comparison with the modern trend to "tag & search", but hierarchy still has its place, particularly in creating a common frame of reference.

So that we can persuade people that this stuff is important, and to make it sound like something new and interesting, we need an impressive and official sounding name. Let us try "Organizational Model", and see if that works.

Note that the organizational model is in principle "just" a collection of files and directories on a conventional computer file system, although in practice it may be so large that only parts of it are "active" and instantiated at any one time.

Since we are organizing things in a hierarchy, we need to decide what distinctions are drawn to the top of the hierarchy, and which ones are pushed down to the bottom of the hierarchy. Note that the obvious approach, that of mirroring the structure of the business in the structure of the filing system, is not necessarily the best idea, particularly since one of our goals is to facilitate interdisciplinary communication and avoid the formation of informational "silos".

We take strong cues from common software engineering practices over the years, since automation will be a big part of what we hope to gain from a disciplined approach to organization. In particular, we are influenced by common ways of working with Subversion (the version control system).

So, one distinction that we will make is between input and output: the source files that contain design information, and the destination files that contain completed reports, web-pages, executable programs, generated documentation and so on. We use the terms "src" & "dst" to denote source and destination. At this level, we also want to consider the third-party tools, libraries and other resources upon which our systems and processes depend. We use the term "env" to denote the environment upon which our systems depend.

Another distinction that we will make is between current work being carried out on a day-to-day basis and historical archives. We use the term "daily" to denote current work (similar to how the term "trunk" is used in Subversion, but more forceful in suggesting how it is to be used), and "weekly", "monthly" and "annual" to denote archives taken at those intervals. We will also want to keep records of important moments, such as product releases, as well as of what is currently "in-production", for situations where that concept makes sense.

I prefer to place version-related distinctions at a higher level in the hierarchy than source-vs-destination distinctions, simply because I believe that more people make more switches between source and destination than between one version and another.

So, as a result, the most significant division is into what Subversion would call branches:

daily
archive
    weekly
        YYMMDD_foo
    monthly
        YYMMDD_foo
    annual
        YYMMDD_foo
release
    stage - pre-release staging directory.
    prod - current production system.
    rollback - previous production system for quick emergency rollbacks.
    YYMMDD_productA_v1.2 - Externally deployed or separately configuration-managed systems
    YYMMDD_productB_v3.4 - Externally deployed or separately configuration-managed systems
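
As a rough illustration, the branch layout above could be set up with a few lines of Python. This is only a sketch: it assumes that "weekly", "monthly" and "annual" sit beneath "archive" and that "stage", "prod" and "rollback" sit beneath "release", and the names BRANCHES, snapshot_name and create_branches are made up for the example.

```python
from datetime import date
from pathlib import Path

# Branch directories from the listing above. The nesting under "archive"
# and "release" is an assumption about how the listing is meant to be read.
BRANCHES = [
    "daily",
    "archive/weekly",
    "archive/monthly",
    "archive/annual",
    "release/stage",
    "release/prod",
    "release/rollback",
]


def snapshot_name(when: date, label: str) -> str:
    """Build a YYMMDD_label directory name for an archive or release."""
    return f"{when:%y%m%d}_{label}"


def create_branches(root: Path) -> list:
    """Create every branch directory under root and return the paths."""
    created = []
    for branch in BRANCHES:
        path = root / branch
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

For example, snapshot_name(date(2012, 7, 11), "foo") gives "120711_foo", which would then be created beneath archive/weekly, archive/monthly or archive/annual as appropriate.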

The next most significant division is into source vs derived documents:

src:-

Human-generated source documents.

dst:-

Automatically generated binaries, documentation, test results, and performance, financial and management reports (dst/bin, dst/doc, dst/doc/performance, etc.).

env:-

Installed third-party dependencies (Python virtualenv etc.).
This could also include images of clean VMs to be picked up by the continuous integration process, or a reference to which EC2 images to use in deployment & testing.
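
Because "src" and "dst" sit at the same level within a branch, a build script can work out where generated output belongs from the path of the source alone. The sketch below assumes that "dst" simply mirrors the layout of "src", which is my simplification for the example rather than a rule of the model, and to_dst is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def to_dst(src_path: str) -> PurePosixPath:
    """Map <branch>/src/<rest> to <branch>/dst/<rest>.

    Assumes a single-segment branch name such as "daily" and a dst tree
    that mirrors the src tree.
    """
    parts = PurePosixPath(src_path).parts
    if len(parts) < 2 or parts[1] != "src":
        raise ValueError(f"not a source path: {src_path}")
    return PurePosixPath(parts[0], "dst", *parts[2:])
```

Calling to_dst("daily/src/library/doc/guide.txt") yields the path daily/dst/library/doc/guide.txt. Archive and release branches have deeper branch names, so a real script would need to locate the "src" segment rather than assume it is the second one.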

The next most significant division is into units chosen to help manage development costs:

bespoke:-

One-off bespoke work for customers.
Cost not amortized.
Cost borne entirely by customer.
Ideally contains configuration information rather than software.

library:-

Re-usable soft components.
Cost amortized across multiple products, systems & projects.

systems:-

Re-usable components with a hardware component.
Ongoing costs amortized across multiple products, systems & projects.
Ideally contains configuration information rather than software.
Includes management and financial reporting systems

products:-

Non-re-usable components.
Cost amortized over multiple sales.

research:-

Activities not directly related to product or internal system development.
Cost borne by enterprise.

sandbox:-

Test activities of little or no consequence.
Costs not tracked.

thirdparty:-

Links to third-party source-level dependencies.
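
Since each unit carries its own cost treatment, a reporting script can look the treatment up from the directory name. A minimal sketch follows; the wording of each entry is paraphrased from the list above, and COST_POLICY and cost_policy are hypothetical names.

```python
# Cost treatment per unit, paraphrased from the list above.
COST_POLICY = {
    "bespoke": "borne entirely by the customer; not amortized",
    "library": "amortized across multiple products, systems and projects",
    "systems": "ongoing costs amortized across multiple products, systems and projects",
    "products": "amortized over multiple sales",
    "research": "borne by the enterprise",
    "sandbox": "not tracked",
    "thirdparty": None,  # no cost treatment is stated for this unit
}


def cost_policy(unit: str):
    """Return the cost treatment for a unit, or None if none is stated."""
    if unit not in COST_POLICY:
        raise KeyError(f"unknown unit: {unit}")
    return COST_POLICY[unit]
```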

The final division is into activities, to help identify the nature of the documents contained in that particular folder. This is a bit unusual in that it mixes software development with other business activities. The idea is that we can take advantage of automation (scripts etc.) to pull together technical reports on spend, ROI and so on in an integrated manner.

spec:-

Specification documents

doc:-

Technical documentation

test:-

Tests

config:-

Configuration files

sales:-

Sales estimates, leads & marketing collateral

support:-

Customer feedback, support tickets etc.

financials:-

Cost estimates, budgets etc.
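
With all four divisions in place, any path in the model can be broken back down into its branch, its src/dst/env kind, its unit and its activity. Here is a small sketch of that idea. It assumes that the first "src", "dst" or "env" segment marks the end of the branch name and that there may be further directories (a project name, say) between the unit and the activity; Location and classify are names invented for the example.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

KINDS = {"src", "dst", "env"}
UNITS = {"bespoke", "library", "systems", "products",
         "research", "sandbox", "thirdparty"}
ACTIVITIES = {"spec", "doc", "test", "config",
              "sales", "support", "financials"}


class Location(NamedTuple):
    branch: str
    kind: str
    unit: Optional[str]
    activity: Optional[str]


def classify(path: str) -> Location:
    """Split a path into branch, kind, unit and activity."""
    parts = PurePosixPath(path).parts
    # The first src/dst/env segment separates the branch from the rest.
    kind_index = next((i for i, p in enumerate(parts) if p in KINDS), None)
    if not kind_index:  # missing, or nothing before it to act as a branch
        raise ValueError(f"no branch/kind found in: {path}")
    rest = parts[kind_index + 1:]
    unit = rest[0] if rest and rest[0] in UNITS else None
    # The activity is the first recognised activity name after the unit.
    activity = next((p for p in rest[1:] if p in ACTIVITIES), None)
    return Location("/".join(parts[:kind_index]), parts[kind_index],
                    unit, activity)
```

Unit and activity come back as None when a path does not follow the convention, which makes it easy for a script to flag stray documents.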

Because the organizational model is fixed, we can automate many things, such as financial and management reporting. Perhaps this amounts to "business-ops" as an equivalent to devops?
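
To give a flavour of what that automation might look like, here is a toy management report that counts documents by unit and activity. It is only a sketch: it assumes it is pointed at a "src" directory whose first level is the unit, it ignores anything that does not fit the convention, and document_counts is a hypothetical name.

```python
from collections import Counter
from pathlib import Path

UNITS = {"bespoke", "library", "systems", "products",
         "research", "sandbox", "thirdparty"}
ACTIVITIES = {"spec", "doc", "test", "config",
              "sales", "support", "financials"}


def document_counts(src_root: Path) -> Counter:
    """Count the files under src_root by (unit, activity)."""
    counts = Counter()
    for path in src_root.rglob("*"):
        if not path.is_file():
            continue
        parts = path.relative_to(src_root).parts
        # Expect at least <unit>/<activity>/<file>.
        if len(parts) < 3 or parts[0] not in UNITS:
            continue
        # Look for an activity directory between the unit and the file name.
        activity = next((p for p in parts[1:-1] if p in ACTIVITIES), None)
        if activity is not None:
            counts[(parts[0], activity)] += 1
    return counts
```

The same walk could just as easily total up the cost estimates held under "financials", provided those files are kept in a format that a script can read.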

Security and access control are managed through the use of sub-repositories that are embedded below this level of the hierarchy.

The organizational model seeks to accomplish its goals using out-of-band communication & organizational structure rather than management fiat. People tend to self-moderate & change to fit in with what is being done around them rather than seeking to make reasoned arguments for change.

The first goal is to facilitate automation by creating a conventional location for documents to be stored.

The second goal is to help manage the cost of systems development by explicitly organizing documents by how costs are amortized.

The third goal is to encourage "DRY" (Don't Repeat Yourself) systems design by encouraging reuse and facilitating development automation techniques (like code generation).

1 comment:

Will, 26 October 2012 at 07:10
I am not totally convinced about the benefits of using the VCS as a CI system & to control pushing to production, as this limits you to "big bang" deploys (or forces you to implement a sophisticated dependency management system), so the "production", "staging" and "rollback" branches should be seen as a tentative suggestion only.

Wednesday, 11 July 2012
