Monday, 21 July 2014
Brooks & Conway in a discussion around CI & development automation.
Discussion arising from this HBR blog post discussing CI:
http://blogs.hbr.org/2014/07/speed-up-your-product-development-without-losing-control/
William Payne:
"One of the reasons that the "Agile" movement has lost credibility in recent years is because many of the consultants selling "Scrum" and similar processes failed to emphasize the fact that a significant investment in automated "testing" and continuous integration is prerequisite for the success of these approaches.
A big barrier to adoption seems to be related to the use of the word "testing". In a modern and effective development process such as those mentioned in the article, the focus of the "test" function isn't really (only) about quality any more ... it is about maximizing the pace of development within an environment that does not tolerate preventable faults.
As a result of this, within my sphere of influence, I have tried to promote the notion of "Development Automation" as an umbrella term that captures the automation of the software build process, module & integration testing, deployment and configuration control, documentation generation and management information reporting ... a term that may help to speed adoption of the techniques mentioned in the article above.
In many ways "Development Automation" is to product development what "DevOps" is to SAAS systems development: Promoting the use of integrated cross-functional systems of automation for the testing and deployment of software and other complex systems.
Indeed, as products become more complex, automation becomes ever more critical, and a carefully considered, well-planned, and aggressively automated integration, test and configuration-management strategy becomes a prerequisite for success.
Nowhere is this more apparent than in my field of expertise: the production of machine vision and other sensor systems for deployment into uncontrolled and outdoor environments, systems where specification and test pose a set of unique challenges with considerable knock-on impacts on the system design and choice of integration strategy."
Bradford Power:
"Do you break down the product into small modules and have small teams that are responsible for design, deployment, AND testing? Do you use simulations to shrink the cycles on new product tests?"
William Payne:
"It depends.
Taken together, Fred Brooks & Melvin Conway have the answer.
Firstly, Brooks' "No Silver Bullet" tells us that we cannot drive down development costs forever. Complexity costs money.
Since we can't meaningfully reduce the cost of complexity, we either have to maximize the top line, or amortize that cost across multiple products. This is product-line engineering taken to the extreme.
Secondly, Conway's law tells us that our team structure will become our de-facto system architecture. Complex systems development is primarily a learning activity and the fundamental unit of reuse is the individual engineer.
Team structure therefore has to be organized around the notion that team expertise will be reused across products within one or more product lines, and the more reuse we have, the more we amortize the cost of development and the more profitable we become.
Whether this means small teams or large teams really depends on the industry and the nature of the product. Similarly, the notion of what constitutes a "module" varies widely, sometimes even within the same organization.
However, in order to facilitate this, you need a reasonably disciplined approach, together with a shared commitment to stick to the discipline.
Finally, and most importantly, none of this works unless you can 100% rely upon your automated tests to tell you if you have broken your product or not. This is absolutely critical and is the keystone without which the whole edifice crumbles.
You can't modify a single component that goes into a dozen different products unless you are totally confident in your testing infrastructure, and in the ability of your tests to catch failures.
I have spoken to Google test engineers, and they have that confidence. I have got close in the past, and it is a transformative experience, giving you (as an individual developer) the confidence to proceed with a velocity and a pace that is otherwise impossible to achieve.
Separate test teams have a role to play, particularly when safety standards such as ASIL and/or SIL mandate their use. Equally, simulations have a role to play, although this depends a lot on the nature of the product and the time and engineering cost required to implement the simulation.
The key point is that there is no silver bullet that will make product development cheaper on a per-unit-of-complexity basis ... only a pragmatic, rigorous, courageous and detail-oriented approach to business organization that acknowledges that cost and is willing to pay for it."
Andy Singleton:
"Yes, I think that Conway's law is very relevant here. We are trying to build a system as multiple independent services, and we use separate service teams to build, release and maintain them.
Yes, complexity will always cost money and time. However, I think that Brooks' "Mythical Man Month" observations are obsolete. He had a 40 year run of amazing insights about managing big projects. During this time, it was generally true that large projects were inefficient or even prone to "failure", and no silver bullet was found. Things have changed in the last few years. Companies like Amazon and Google have blasted through the size barrier.
They did it with a couple of tactics:
1) Using independent service teams. These teams communicate peer-to-peer to get what they need and resolve dependencies.
2) Using a continuous integration machine that finds problems in the dependencies of one team on another through automated testing, and notifies both teams. This is BRILLIANT, because it replaces the most difficult part of human project management with a machine.
The underlying theory behind this goes directly against Brooks' theory. He theorized that the problem is communications - the number of communication channels among N participants grows towards N^2, which causes work and confusion. If you believe this, you organize hierarchically to contain the communications. ACTUALLY, the most scalable projects (such as Linux) have the most open communications.
I think that the real problem with big projects is dependencies. If you have one person, he is never waiting for himself. If you have 100 people, it's pretty common for 50 people to be waiting for something. The solution to this is actually more open communication that allows those 50 people to fix problems themselves.
I have written several blog articles challenging the analysis in the Mythical Man Month, if you are interested."
William Payne:
"What you say is very very interesting indeed.
I agree particularly strongly with what you say about using your CI & build tools to police dependencies. This is key. However, I am a little less convinced that "peer-to-peer" communication quite represents the breakthrough that you suggest. Peer-to-peer communication is unquestionably more efficient than hierarchical communication, with its inbuilt game of Chinese whispers and proliferation of choke-points. However, communicating in a peer-to-peer manner does not, by itself, sidestep the fundamental (physical) problem: you still need to communicate, and that still costs time and attention.
IMHO organizing and automating away the need for communication is absolutely the best (only?) way to improve productivity when working on complex systems. This is achieved either by shifting communication from in-band to out-of-band through appropriate organizational structures, or by setting (automatically enforced) policy that removes the need for communication (aka standardisation).
These are things that I have tried very hard to build into the automated build/test systems that I am responsible for, but it is still a very difficult "sell" to make to professionals without the requisite software engineering background."
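To make the dependency-policing idea above a little more concrete, here is a minimal sketch. It assumes a hypothetical monorepo layout in which each team directory carries a manifest.json listing the upstream components it consumes, and assumes pytest as the test runner; none of these names, paths or conventions come from the discussion itself.

```python
"""Minimal sketch of CI-side dependency policing, under the assumptions
described above: each team directory contains a manifest.json such as
{"team": "vision", "consumes": ["geometry"]}, and tests run via pytest.
All names and paths are illustrative only."""
import json
import subprocess
from pathlib import Path


def consumers_of(component: str, root: Path) -> list[Path]:
    """Find the team directories whose manifests list `component`."""
    hits = []
    for manifest in root.glob("*/manifest.json"):
        data = json.loads(manifest.read_text())
        if component in data.get("consumes", []):
            hits.append(manifest.parent)
    return hits


def police_dependency(component: str, root: Path = Path(".")) -> list[str]:
    """Re-run each consumer's tests; return the teams that are now broken."""
    broken = []
    for team_dir in consumers_of(component, root):
        result = subprocess.run(
            ["python", "-m", "pytest", str(team_dir / "tests"), "-q"],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            # This is the point at which both the producing and the
            # consuming team would be notified automatically.
            broken.append(team_dir.name)
    return broken


if __name__ == "__main__":
    print(police_dependency("geometry"))
```

In a real pipeline this would run as a job triggered by any change to the upstream component, with the notification step wired into whatever messaging the teams already use.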
Sunday, 30 March 2014
In defense of defaults
It is necessary but not sufficient for something to be possible: the possibility also needs to be communicated, and we have limited capacity for in-band communication. Default settings are a great way to communicate desiderata in a way that sidesteps our ubiquitously crippling bandwidth limitations.
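As a toy illustration (the class, field names and values are invented for the example, not taken from any particular library), a well-chosen set of defaults quietly tells the caller what the recommended operating point is:

```python
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    """Hypothetical example: the defaults are the communication.
    A caller who writes RetryPolicy() inherits the behaviour the author
    recommends, while the field names and defaults advertise which knobs
    exist and what 'sensible' looks like."""
    max_attempts: int = 3
    backoff_seconds: float = 0.5
    retry_on_timeout: bool = True


print(RetryPolicy())  # RetryPolicy(max_attempts=3, backoff_seconds=0.5, retry_on_timeout=True)
```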
Friday, 28 March 2014
Social network topology and interpersonal bandwidth as a predictor for conflict
Response to: http://squid314.livejournal.com/329561.html?view=11210073#t11210073
A lot of arguments and disagreements come about because of misunderstood terminology: words and phrases that evoke different imagery and have varying definitions between different groups of people.
These differences come about because of uneven diffusion of information through the social network, and restrictions in interpersonal bandwidth more generally.
We can reduce the long-term aggregate severity of arguments & disagreements (at the expense of some short term pain) if we increase our communications bandwidth both individually and in aggregate; and act to bridge the gaps between super-clusters in the social network.
Tuesday, 25 March 2014
Accidental and essential complexity
Response to: http://250bpm.com/blog:36
This sort of reminds me of Fred Brooks' commentary on essential and accidental complexity. Once you have got rid of the accidental complexity, you are left with the essential complexity, which cannot be further reduced without compromising on functionality.
You can shovel the essential complexity around your system design all you like, but the overall quantity of complexity still remains constant. A practical consequence of this is that you can move the complexity away from the 1500 line function by splitting it up into smaller functions, but that complexity remains embedded in the implicit relationship that exists between all of your 50 line functions. (Plus some additional overhead from the function declarations themselves).
Of course, splitting up the large function into smaller ones gives you other benefits: Primarily the ability to test each small function in isolation, but also (and more importantly) the ability to read and understand the scope and purpose of each small function without being confused by irrelevant details.
Personally, I would like to have the ability to give names to chunks of code within a large function - and to write tests for those chunks without having to extract them into separate smaller functions.
I would also like to see better tools (and widespread usage of those that exist) for building various graphs and diagrams showing the call-graph and other relationships between functions and objects, so that we can explore, understand, and communicate the implicit complexity encoded in program structure more easily.
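As a rough sketch of the kind of tooling I have in mind (assuming Python source, and settling for a static and therefore approximate view), the standard library's ast module is already enough to recover a simple function-level call graph:

```python
"""Rough sketch of static call-graph extraction using only the standard
library. It is deliberately approximate: it only sees direct calls to
plain names (no methods, no aliasing), which is usually enough to start
seeing the implicit structure between all of those 50-line functions."""
import ast
from collections import defaultdict


def call_graph(source: str) -> dict[str, set[str]]:
    """Map each function defined in `source` to the names it calls."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = defaultdict(set)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            callees = graph[node.name]  # register the function even if it calls nothing
            for call in ast.walk(node):
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name):
                    callees.add(call.func.id)
    return dict(graph)


if __name__ == "__main__":
    example = (
        "def helper():\n"
        "    return 1\n"
        "\n"
        "def main():\n"
        "    return helper() + helper()\n"
    )
    print(call_graph(example))  # {'helper': set(), 'main': {'helper'}}
```

Feeding the result into Graphviz or a similar tool is the obvious next step; the value is in being able to see, rather than merely intuit, the relationships that the refactoring left implicit.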
Wednesday, 12 March 2014
Fundamental prerequisites for development
Great technical leadership guides developers to move in the same direction by articulating a clear (and simple) technical philosophy and encouraging the formation of a strong and consistent culture.
Well thought through infrastructure and high levels of automation allow high quality work to be performed at pace.
Intelligent and passionate developers are required so that work is not pulled backwards by inconsistencies introduced because of insufficient levels of attention and concentration.
Once all this (and more) has been achieved; only then is it worthwhile thinking about whether "Agile" makes sense as a risk management and/or stakeholder communications approach.
Sunday, 9 March 2014
The expectation of perfection is poison
If you create an expectation for flawless perfection, you are setting yourself up to be lied to.
It is a distressingly rare thing to find an organisational culture that successfully puts a premium on humility, public acknowledgement of one's own flaws, and the learning of lessons for the future.
I am particularly reminded of my experience whilst working for Fidelity. As a company, they tried very very hard to create a culture that stood apart from the financial-industry mainstream: one of maturity, professionalism, introspection and self-awareness. Yet the testosterone-stewed aggression of some individuals, combined with industry behavioural norms, rapidly undid all that good work.
In time, the company's self-regulating mechanisms kicked in and the offenders were shown the door, but the experience shows how quickly and how easily a shallow message backed by aggression beats a deep message backed by considered thought.
Towards a declaration of independence for the internet?
Shortly after I first heard of bitcoin back in 2011, my first thought was that it could be used to implement a sort of peer-to-peer exchange in which the instruments being traded would not be stocks and shares in the conventional sense, but voting rights in committees that made decisions controlling shared interests. I envisaged that this would operate as a sort of virtual crypto-corporation controlling some real-world assets.
Since then, others have come up with ways of using the Bitcoin infrastructure to provide a general-purpose computing resource, so one could, in principle, replace the virtual crypto-corporation and the real-world assets with some source code and a computing resource; raising the possibility of truly autonomous, distributed, financially independent digital entities; which is about as close to a declaration of independence as the internet is ever going to be able to give.
Lessons from computer games
Operation Flashpoint & Armed Assault:
- In a serious gunfight, pretty much everybody dies.
- Don't join the army: It's a dumb idea.
Total Annihilation:
- There is no such thing as too much.
- People set themselves up for failure by limiting the scope of their imagination.
Planetary Annihilation:
- Attention is the most critical resource.
- Play the person, not the game.
Friday, 7 March 2014
Computer and network security in the robot age
In response to: http://www.theatlantic.com/technology/archive/2014/03/theres-no-real-difference-between-online-espionage-and-online-attack/284233/
It doesn't matter if you are "just" eavesdropping or if you are trying to cause damage directly. If you are trying to take control over somebody else's property; trying to make it do things that the owner of that property did not intend and does not want, then surely that is a form of theft?
Possession isn't just about holding something in your hand: It is also about power and control.
The implications of this might be a little hard to see, because right now computers don't have very much direct interaction with the "real" world.... but it won't be like that forever.
Take me, for example. I am working on systems to control autonomous vehicles: Networked computers in charge of a car or a truck.
This is just the beginning. In a decade or more, it won't just be cars and trucks on the road that drive themselves: airborne drones; and domestic robots of every size and shape will be everywhere you look.
What would the world look like if we allowed a party or parties unknown to seize control of all these computers? What kind of chaos and carnage could a malicious actor cause?
We have an opportunity right now. A tremendous gift. We need to put in place the infrastructure that will make this sort of wholesale subversion impossible; or, at the very least, very very much harder than it is today, and we need to do it before the stakes become raised to a dangerous degree.
Saturday, 1 March 2014
Nature vs Nurture; Talent vs Practice
In response to: http://www.bbc.co.uk/news/magazine-26384712
Practice is the immediate (proximal) cause of high performance at a particular task. The notion that anybody has evolved an "innate" talent at something as specific to the modern world as playing a violin is obviously laughable.
However, the consistently high levels of motivation that an individual needs to practice a skill for the requisite length of time are very much a function of innate characteristics, particularly personality traits: notably those associated with personality disorders such as GAD, OCD & ASPD.
Of course, the extent to which a borderline personality disorder can be harnessed to support the acquisition of extremely high levels of skill and performance is very much a function of the environment. For example:
1. The level of stress that the individual is subjected to.
2. The built environment within which they live.
3. The social culture that they are part of.
4. The support that they get from family and friends.
In summary: To exhibit high levels of performance you do need some innate characteristics, (although not necessarily the innate "talents" that we typically associate with skill), but those characteristics need to be shaped and formed in the right environment: both built and social.
It should be possible to engineer a higher incidence of high levels of performance in selected individuals, but I suspect that the interaction between personality and environment is sufficiently subtle that we would not be able to guarantee an outcome in anything other than statistical terms.
Thursday, 27 February 2014
Taste on the frontier of complexity
Issues of taste and aesthetics are not normal fodder for engineers ... but when you make something sufficiently complex, you are by necessity operating on the frontier, far away from the well-trod paths of tradition and precedent.
Out here, survival is not so much a matter of rules, but a matter of gut instinct: Of taste. Of elegance. Of aesthetics and style.
Here, where the consequences of your decisions can be both impenetrable and dire: vigorous debate and discussion are essential; ego and stubbornness both fatal disorders, and a shared sense of taste an incredible boon.
This is where leadership at its most refined can shine: the creation of a group aesthetic; a culture and identity that is oriented around matters of taste.
Wednesday, 26 February 2014
The role of Machine Vision as a Rosetta Stone for Artificial Intelligence.
My life has not followed a straight path. There have been many detours and deviations. Nevertheless, if I turn my head to one side, turn around twice, squint through one eye, and am very selective about what I pick from my memories, I can piece together a narrative that sort-of makes sense.
It was in this context that the ethical supremacy of abstract (and disciplined) reasoning over unreliable and sometimes destructive emotional intuition was founded: A concept that forms one of the prime narrative threads that bind this story together.
To me, the abstract world was the one thing that held any hope of making consistent sense; and provided (now as then) the ultimate avenue for a near-perpetual state of denial. Not that I have been terribly successful in my quest (by the overwrought standards of my teenage ambitions at least), but the role of science & technology "groupie" seems to have served me and my career reasonably well so far, and has cemented a view of life as a tragedy in which abstract intellectualism serves as a platonic ideal towards which we forever strive, but are cursed never to achieve.
In any case, I quickly came to the conclusion that my intellectual faculties were completely insufficient to grasp the mathematics that my aspirations required. In retrospect, this was less a victory of insight than the sort of failure that teaches us that excessive perfectionism, when coupled with a lack of discipline and determination, will inevitably lead to self-imposed failure.
So, I drifted for a few years before discovering Artificial Intelligence, reasoning that if I was not bright enough to be a physicist in my own right, I should at least be able to get some assistance in understanding the world from a computer: an understanding that might even extend to the intricacies of my own unreliable brain. Indeed, my own (possibly narcissistic) quest to improve my understanding both of my own nature and that of the wider world is another key thread that runs through this narrative.
A good part of my motivation at the time came from my popular science reading list. Books on Chaos theory and non-linear dynamics had a great impact on me in those years, and from these, and the notions of emergence that they introduced, I felt that we were only beginning to scratch the surface of the potential that general purpose computing machines offered us.
My (eventual) undergraduate education in AI was (initially) a bit of a disappointment. Focusing on "good old fashioned" AI and computational linguistics, the intellectual approach that the majority of the modules took was oriented around theorem proving and rule-based systems: A heady mix of Noam Chomsky and Prolog programming. This classical and logical approach to understanding the world was really an extension of the philosophy of logic to the computer age; a singularly unimaginative act of intellectual inertia that left little room for the messiness, complexity and chaos that characterised my understanding of the world, whilst similarly confirming my view that the tantalising potential of general-purpose computation was inevitably destined to remain untapped. More than this, the presumption that the world could be described and understood in terms of absolutist rules struck me as an essentially arrogant act. However, I was still strongly attracted to the notion of logic as the study of how we "ought" to think, or the study of thought in the abstract; divorced from the messy imperfections of the real world. Bridging this gap, it seemed to me, was an activity of paramount importance, but an exercise that could only realistically begin at one end: the end grounded in messy realities rather than head-in-the-clouds abstraction.
As a result of this, I gravitated strongly towards the machine learning, neural networks and machine vision modules that became available towards the end of my undergraduate education. These captured my attention and my imagination in a way that the pseudo-intellectualism of computational linguistics and formal logic could not.
My interest in neural networks was tempered somewhat by my continuing interest in "hard" science & engineering, and the lingering suspicion that much "soft" (and biologically inspired) computing was a bit of an intellectual cop-out. A view that has been confirmed a couple of times in my career. (Never trust individuals that propose either neural networks or genetic algorithms without first thoroughly exploring the alternatives!).
On the other hand, machine learning and statistical pattern recognition seemed particularly attractive to my 20-something-year-old mind, combining a level of mathematical rigour which appealed to my ego and my sense of aesthetics, and readily available geometric interpretation which appealed to my predilection for visual and spatial reasoning. The fact that it readily acknowledged the inherent complexity and practical uncertainty involved in any realistic "understanding" of the world struck a chord with me also: It appeared to me as a more intellectually honest and humble practitioners' approach than the "high church" of logic and linguistics, and made me re-appraise the A-level statistics that I had shunned a few years earlier. (Honestly, the way that we teach statistics is just horrible, and most introductory statistics textbooks do irreparable damage to an essential and brilliant subject).
The humility and honesty were an important component. Most practitioners that I met in those days talked about pattern recognition being a "dark art", with an emphasis on exploratory data analysis and an intuitive understanding of the dataset. Notably absent was the arrogance and condescension that seems to characterise the subject now that "Big Data" and "Data Science" have become oh-so-trendy; attracting the mayflies and the charlatans by the boatload.
In any case, then as now, statistical pattern recognition is a means to an end: An engineering solution to bridge the gap between the messy realities of an imperfect world, low level learning and data analysis and the platonic world of abstract thought and logic. This view was reinforced by the approach taken by the MIT COG team: reasoning that in order to learn how to behave in the world, the robot needs a body with sensors and effectors, so it can learn how to make sense of the world in a directed way.
I didn't have a robot, but I could get data. Well, sort of. At that point in time, data-sets were actually quite hard to get hold of. The biggest dataset that I could easily lay my hands on (as an impoverished undergraduate) was the text files from Project Gutenberg; and since my mind (incorrectly) equated natural language with grammars and parsing, rather than statistics and machine learning, my attention turned elsewhere.
That elsewhere was image data. In my mind (influenced by the MIT COG approach), we needed to escape from the self-referential bubble of natural language by pinning abstract concepts to real-world physical observations. Text alone was not enough. Machine Vision would be the Rosetta Stone that would enable us to unlock its potential. By teaching a machine to look at the world of objects, it could teach itself to understand the world of men.
One of my fellow students actually had (mirabile dictu!) a digital camera, which stored its images on a Zip disk (about the size of a 3.5 inch floppy disk), and took pictures that (if I recall correctly) were about 800x600 in resolution. I borrowed this camera and made my first (abortive) attempts to study natural image statistics; an attempt that continued as I entered my final year as an undergraduate, and took on my final year project: tracing bundles of nerves through light microscopy images of serial ultra-microtome sections of drosophila ganglia. As ever, the scope of the project rather outstripped my time and my abilities, but some important lessons were nonetheless learned.
... To be continued.
Software sculpture
Developing software is a craft that aspires to be an art.
It is both additive and subtractive. As we add words and letters to our formal documents, we build up declarations and relations; descriptions of logic and process. As this happens, we carve away at the world of possibilities: we make some potentialities harder to reach. The subtractive process is *implied* by the additive process, rather than directly specified by it.
If this subtractive process proceeds too slowly, we end up operating in a space with too many degrees of freedom: difficult to describe; difficult to reason about; and with too many ways that the system can go wrong.
If the subtractive process proceeds too quickly, we end up eliminating potentialities which we need, eventually, to realise. This results in prohibitively expensive amounts of rework.
The balance is a delicate one, and it involves intangible concepts that are not directly stated in the formal documents that we write; only indirectly implied.
Thursday, 20 February 2014
The cost of complexity in software engineering is like the sound barrier. How can we break it?
In response to: http://www.pistoncloud.com/2014/02/do-successful-programmers-need-to-be-introverted/
Q: Do successful programmers need to be introverted?
A: It depends.
One unpleasant consequence of network effects is that the cost of communication has a super-linear relationship with system complexity. Fred Brooks indicates that the communication overhead for large (or growing) teams can become significant enough to stop work in its tracks. As the team grows beyond a certain point, the cost quickly shoots up to an infeasible level. By analogy with the sound barrier, I call this the communications barrier; because both present a seemingly insurmountable wall blocking our ability to further improve our performance.
This analysis argues for small team sizes, perhaps as small as n=1. Clearly introversion is an asset under these circumstances.
However, irrespective of their innate efficiency, there are obvious limits to what a small team can produce. Building a system beyond a certain level of complexity requires a large team, and large teams need to break the communications barrier to succeed. Excellent, highly disciplined and highly effective communications skills are obviously very important under these circumstances, which calls for a certain type of (disciplined) extroversion; perhaps in the form of visionary leadership?
My intuition tells me that breaking the communications barrier is a matter of organization and detail-oriented thinking. Unfortunately, I have yet to observe it being done both effectively and systematically by any of the organisations that I have worked for.
Has anybody else seen or worked with an organisation that has successfully broken the communications barrier? If so, how did they do it?
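For concreteness, Brooks' pairwise channel count for a team of n people is n(n-1)/2; a quick back-of-the-envelope run (nothing more than that) shows how fast the wall approaches:

```python
# Brooks' pairwise channel count: n people can form n * (n - 1) / 2
# distinct conversations. The exact constant matters less than the
# quadratic growth, which is what the "communications barrier" refers to.
def channels(n: int) -> int:
    return n * (n - 1) // 2


for n in (2, 5, 10, 50, 100):
    print(f"{n:>3} people -> {channels(n):>5} channels")
# 2 -> 1, 5 -> 10, 10 -> 45, 50 -> 1225, 100 -> 4950
```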
Wednesday, 19 February 2014
Computer security in the machine age.
The complexity of modern technology makes it terribly vulnerable to abuse and exploitation. So many devices have been attacked and compromised that when faced with any given piece of technology, the safest assumption to make is that it has been or will be subverted.
For today's devices, the consequences don't intrude so much into the physical world. Some money may get stolen; a website may be defaced, some industrial (or military) secrets stolen, but (Stuxnet aside), the potential for physical death, damage & destruction is limited.
For tomorrow's devices, the story is quite terrifying. A decade from now, the world will be filled with thousands upon thousands of autonomous cars and lorries, domestic robots and UAVs.
Today, criminals use botnets to send spam and commit advertising fraud. Tomorrow, what will malicious hackers do when their botnet contains tens of thousands of robot cars and trucks?
What can we do to change the trajectory of this story? What can we do to alter its conclusion?
Sunday, 16 February 2014
Performance
The internet is just terrible for our sense of self-worth. We can reach out and connect to the very best and most talented individuals in the world; we can read what they write, and even converse with them if we choose. It is only natural that we should compare ourselves to them and find ourselves wanting.
It is easy to retreat from this situation with a sense of despair and self-pity, but there is another thought that occurs to me also, and that thought is this: Role models are a red herring. Does your role model have a role model of his own? Probably not. You don't get to be an outstanding performer by emulating somebody else, nor (just) by competing with your peers, nor (just) by following your passion, nor (just) by putting in your 10,000 hours. True performance comes from owning an area of expertise; from living and breathing it. From having your identity and sense of self so totally wrapped up in it that you can do nothing else.
Clearly, this sucks for everybody else around you, so it is a path that not many people should follow .... which makes me feel a bit better about my own inadequacies.
So there.
Friday, 14 February 2014
Side channel attacks on silos
Silos of knowledge and expertise build up all too easily, not only as a result of the human tendency towards homophily, but also because of more fundamental bandwidth limitations.
As with most human failings, we look to simple technological fixes to resolve them.
One frequently overlooked technology is the use of ambient "side" channels to encourage or enforce communication:
1. Human environment. (Who do you work with).
2. Built environment. (Who do you sit next to).
3. Informational environment. (Where do your documents live).
Every act of sensory perception during the work day is an opportunity for meaningful communication.
Thursday, 13 February 2014
Code as haiku
Response to: http://www.ex-parrot.com/~pete/but-it-doesnt-mean-anything.html
"Code" is a horrible word.
I prefer "source documents", or, if pressed, "Formal descriptions of the program".
Using the word "code" implies that the document is "encoded" somehow, which is plainly undesirable and wrong.
With some notable (*) exceptions, the primary consumer of a "source document" is a human being, not a machine.
The machine's purpose is to ensure the formal validity and correctness of the document - but the majority of the informational content of the document (variable names, structure, comments) is exclusively directed to human developers.
We will never program in a "wild" natural language, but many programming languages (**) make a deliberate effort to support expressions which simulate or approximate natural language usage, albeit restricted to a particular idiomatic form.
There will always be a tension between keeping a formal language simple enough to reason about and permitting free, naturalistic expression - but this is the same tension that makes poetry and haiku so appealing as an art form.
So many source documents appear to be "in code", not because this is a necessary part of programming, but because it is very very difficult to write things which combine sufficient simplicity for easy understanding, and the correct representation of a difficult and complex problem. In most of these cases, clear understanding is baffled more by the complexity of the real world than by the nature of the programming language itself.
The rigidly deterministic nature of the computer forces the programmer to deal with a myriad of inconsistencies and complications that the non-programmer is able to elide or gloss over with linguistic and social gymnastics. The computer forces us to confront these complications, and to account for them. Legal drafting faces similar (albeit less extreme) challenges.
In the same way that Mathematics isn't really about numbers, but about the skill and craftsmanship of disciplined thought, programming isn't really about computers, but about what happens when you can no longer ignore the details within which the devil resides.
(*) Assembler & anything involving regular expressions.
(**) Python
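As a small illustration of (**), with names invented purely for the example, idiomatic Python can sit remarkably close to its English paraphrase: "every unshipped order whose customer is overdue".

```python
# Names invented purely for this illustration.
from dataclasses import dataclass


@dataclass
class Customer:
    is_overdue: bool


@dataclass
class Order:
    shipped: bool
    customer: Customer


orders = [
    Order(shipped=False, customer=Customer(is_overdue=True)),
    Order(shipped=True, customer=Customer(is_overdue=True)),
    Order(shipped=False, customer=Customer(is_overdue=False)),
]

# Reads close to its English paraphrase: every unshipped order whose
# customer is overdue.
reminders = [
    order
    for order in orders
    if not order.shipped and order.customer.is_overdue
]
print(len(reminders))  # 1
```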
"Code" is a horrible word.
I prefer "source documents", or, if pressed, "Formal descriptions of the program".
Using the word "code" implies that the document is "encoded" somehow, which is plainly undesirable and wrong.
With some notable (*) exceptions, the primary consumer of a "source document" is a human being, not a machine.
The machine's purpose is to ensure the formal validity and correctness of the document - but the majority of the informational content of the document (variable names, structure, comments) is exclusively directed to human developers.
We will never program in a "wild" natural language, but many programming languages (**) make a deliberate effort to support expressions which simulate or approximate natural language usage, albeit restricted to a particular idiomatic form.
There will always be a tension between keeping a formal language simple enough to reason about and permitting free, naturalistic expression - but this is the same tension that makes poetry and haiku so appealing as an art form.
So many source documents appear to be "in code", not because this is a necessary part of programming, but because it is very very difficult to write things which combine sufficient simplicity for easy understanding, and the correct representation of a difficult and complex problem. In most of these cases, clear understanding is baffled more by the complexity of the real world than by the nature of the programming language itself.
The rigidly deterministic nature of the computer forces the programmer to deal with a myriad of inconsistencies and complications that the non-programmer is able to elide or gloss over with linguistic and social gymnastics. The computer forces us to confront these complications, and to account for them. Legal drafting faces similar (albeit less extreme) challenges.
In the same way that Mathematics isn't really about numbers, but about the skill and craftsmanship of disciplined thought, programming isn't really about computers, but about what happens when you can no longer ignore the details within which the devil resides.
(*) Assembler & anything involving regular expressions.
(**) Python
Thursday, 9 January 2014
The UX of large-scale online education.
Written in response to: http://www.fastcompany.com/3021473/udacity-sebastian-thrun-uphill-climb
The number of students that complete the course may not be the right metric to look at. However, there are a number of steps that you could take that I think might improve completion rates.
Human beings are a pretty undisciplined bunch, as a rule. We dislike rigour and crave immediate gratification. Our Puritan work-ethic may predispose us to look down upon such human foibles, but there is no shame in exploiting them in the pursuit of the expansion of learning and the spread of knowledge.
Most of the following suggestions are oriented around giving students more fine-grained control over the timing and sequencing of their studies, as well as increasing the frequency and substance of the feedback. To complement this, some innovation may be required to come up with mechanisms that encourage and support the development of discipline without penalising those who simply cannot fit regular study around their other life commitments.
1. Recognise that study is a secondary or tertiary activity for many students: Study may be able to trump entertainment or leisure, but work and family will always come first.
2. Break the course materials up into tiny workshop-sized modules that can be completed in less than two weeks of part-time study. About one weekend's worth should be about right, allowing "sprints" of study to be interspersed and balanced with a healthy and proper commitment to family life.
3. Each module does not have to stand alone. It can build on prerequisites taught in other modules, but those prerequisites should be documented and suggested rather than enforced programmatically.
4. Assessments within and at the end of each of these micro-modules should be for the benefit of the student, and should not count towards any sort of award or certification.
5. The scheduling of study over the calendar year should be optional. One or more group schedules may be suggested, so collections of students can progress through the material together and interact online and in meet-ups, but others should be allowed to take a more economical pace, each according to their budget of time and attention.
6. Final exams can still be scheduled once or twice per year, coincident with the end of one or more suggested schedules. Students pacing their own study may need to wait a while before exam-time comes around, but the flexibility in study more than compensates for any disadvantage that they may have in the exam.
These suggestions should help lower barriers for students with otherwise packed calendars. In addition, it may be worthwhile experimenting with various techniques to grab students' attention and re-focus it back on their learning objectives: ideas from gamification point to frequent feedback and frequent small rewards to encourage attention and deep concentration. Also from the gaming world, sophisticated algorithms exist that are designed to match players of similar ability in online matches; the same algorithms can be used to match students of similar ability for competitive assessments and quizzes. Beyond gamification, it should be possible to explore different schedules for "pushing" reminders and messages to students, or other prompts for further study: for example, you could send out emails with a few quiz questions that require just one more video to be watched. Finally, you can get people to pledge / commit to a short-term goal, for example, to reach a certain point in the module by a certain point in time (e.g. the end of the weekend).
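As a minimal sketch of the matchmaking idea, assuming the classic Elo update rule stands in for the "sophisticated algorithms" above (the K-factor, ratings and student names are illustrative only):

```python
# Toy Elo-style matcher: all constants and names are illustrative.
# Production matchmaking systems (TrueSkill and relatives) also model
# uncertainty, not just skill.
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one head-to-head quiz."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def closest_opponent(rating: float, pool: dict[str, float]) -> str:
    """Pair a student with the pool member whose rating is nearest."""
    return min(pool, key=lambda name: abs(pool[name] - rating))


pool = {"amy": 1510.0, "ben": 1320.0, "cai": 1495.0}
print(closest_opponent(1500.0, pool))      # 'cai'
print(update(1500.0, 1495.0, a_won=True))  # a small transfer of rating points
```

Even this crude version is enough to keep quiz opponents evenly matched, which is where the motivational benefit comes from.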