Ask anyone who knows me, and they’ll tell you that I’m crazy. Too many mornings, I’ll wake up at 4 am worried about something at work. More often than not, the cause is a yet-unrealized fear. Will we hit the deadline? Did we test the right things? Maybe I should send my boss an email about how the big project is going. These are the scary monsters that live under my bed – or perhaps more accurately, in my inbox. I keep a pen and paper on my nightstand so that I can jot them down, and quickly fall back to sleep, knowing that my 6 am awake self will follow up on my 4 am self’s troubles. It turns out that fear and my love of (and biological need for) sleep are great motivators, so when I’m at work, a single, simple principle governs many of my decisions: What do I need to do today so that I’ll sleep soundly tonight? Does that sound crazy? Crazy like a fox maybe.
As a species, we’re generally risk averse, and it’s not hard to see why. We enjoy doing things that bring us pleasure. We avoid, procrastinate, and otherwise put off things that cause us pain. We prefer to work on fun things. Flashy things. But the corollary to this is that we oftentimes avoid the more substantive, scary monsters that live in our backlog. We put off the pain. However, prudence and our future success and well-being require us to do the opposite. If it hurts, we have to bring the pain forward. To win (and to sleep soundly), we need to modify our behavior. We have to use our fear and uneasiness to give us a competitive edge. Maybe, sometimes pleasure is the absence of pain.
With the advent of automation software, there are so many tools and services that can help put your mind at ease. With Nagios, Splunk, OpenNMS, and a host of other solutions, you can monitor your hardware and network infrastructure. With just a bit of work, you can instrument your own applications so that they’ll phone home when they’re sick. Other tools will let you know when your system has just a case of the sniffles and not a full-blown cold, so you can take action before you have a real problem on your hands. Do this, and you’re well on your way toward a goal of “five nines” of uptime (or roughly 5 minutes and 15 seconds of downtime per year). And even if you’re not quite hitting those goals, at least you can sleep soundly knowing that your infrastructure will let you know when it’s down.
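If you want to check the “five nines” arithmetic for yourself, here is a minimal Python sketch. The function names are my own invention for illustration, and it assumes a 365-day year; it simply turns an availability target into a yearly downtime budget.

```python
def downtime_budget_seconds(availability: float, days: float = 365.0) -> float:
    """Seconds of downtime allowed over `days` at the given availability (0..1)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * days * 24 * 60 * 60


def format_duration(seconds: float) -> str:
    """Render a number of seconds as hours, minutes, and seconds."""
    minutes, secs = divmod(round(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes}m {secs}s"


if __name__ == "__main__":
    targets = [("three nines", 0.999), ("four nines", 0.9999), ("five nines", 0.99999)]
    for label, availability in targets:
        budget = downtime_budget_seconds(availability)
        print(f"{label}: {format_duration(budget)} of downtime per year")
```

For 99.999% availability this works out to about 315 seconds, which is the 5 minutes and 15 seconds quoted above.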
The more difficult part for me has always been making sure that my development teams are aligned with our product and business strategy. That means working through the set of “what ifs” that arise as you’re planning or architecting a product:
- What if my budget gets shrunk or cut? How can we ensure that we still deliver something with real business value?
- What if my due date (or an important demo) gets moved up? How can we still deliver something with real business value?
- What if the product is wildly successful and our servers get overwhelmed with traffic?
- What if we need to one day support mobile platforms in addition to a more traditional browser-based delivery system?
Iterative methodologies like Agile and Lean go a long way toward helping us solve these problems. But the prevailing attitude in software development still seems to be that if something is difficult or expensive (or even just not much fun), we try to do it as few times as possible. This usually means deferring it until as late as possible. Some examples of this include:
- Merging and integrating the work of multiple people
- Merging and integrating the work of multiple teams
- Executing tests
- Testing the integration of your software components & environments
- Deploying into a production-like environment
So what are some strategies for dealing with this mess?
- Identify the most valuable things in your backlog. You should probably work on those first. The only feature that matters is one that’s in your customers’ hands, earning you money. Remember that a feature is “done” not when your developers have finished coding, or your QA team has finished testing it. Done means released.
- Prioritize things so that you can easily identify the point of diminishing returns and do the important things first. An example of this was a performance testing exercise I ran at a former job. My QA lead came to me with a set of 25 scenarios he wanted to test, and boy did he take customer satisfaction very seriously. He was like me – not knowing whether we’d hit our SLAs would cause him to lose sleep. It turned out that running 2 of those scenarios would get him to 80% confidence that we were going to hit our SLA, and that we could run those tests in about a week. Running the top 5 would take 2 weeks and get him above 90% confidence. Those were the critical tasks that would let him sleep at night. So I asked him why those weren’t the first 5 things on his list, and he facepalmed. The moral of the story is to be defensive. Your project’s budget might get cut or a release date might get moved up, and you always want the most to show for it.
- Identify the riskiest thing in your backlog and prioritize it. As an example of this, one of my teams was once working on making pretty code coverage graphs, rather than getting access to and knowledge of a third party’s system that we needed to integrate with. As a development manager, are you more worried about your developers being able to write solid code, or about them integrating with a third-party or legacy system that’s poorly documented and not under your control? If you answered “the first thing”, you need to find yourself some better developers or get over some trust issues. If you answered “the second thing”, you’re in good company. Do that first. In this case, the solution was to integrate early and often. PoCs and stubs go a long way toward ensuring that your integration efforts are successful.
- Build quality in, using strategies like Continuous Integration and Continuous Delivery. Make sure that you’re continuously integrating into a production-like environment. Shore that up with a healthy amount of test automation to help catch defects as early as possible (an application of the principle “bring the pain forward”), and you’re well on your way toward ensuring that your code is always releasable.
- Reduce cycle times as much as possible. Release early & often. This is where Lean, Continuous Delivery, and the DevOps mindset really shine.
- In short: Be Defensive.
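To make the “most valuable things first” idea concrete, here is a rough Python sketch of the kind of ordering my QA lead and I ended up with. Everything in it is hypothetical: the scenario names, the confidence gains, and the effort estimates are made-up numbers chosen to loosely mirror the story above, and ranking by gain per day of effort is just one simple greedy heuristic, not the only way to prioritize.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    confidence_gain: int  # hypothetical estimate, in percentage points
    effort_days: int      # hypothetical estimate, in working days


def prioritize(scenarios):
    """Order scenarios by confidence gained per day of effort, best first."""
    return sorted(scenarios, key=lambda s: s.confidence_gain / s.effort_days, reverse=True)


def plan_within_budget(scenarios, budget_days):
    """Greedily pick the best-ranked scenarios that still fit in the budget."""
    chosen, spent = [], 0
    for scenario in prioritize(scenarios):
        if spent + scenario.effort_days <= budget_days:
            chosen.append(scenario)
            spent += scenario.effort_days
    return chosen


# Made-up backlog, deliberately listed in an unhelpful order.
BACKLOG = [
    Scenario("admin page soak test", 1, 5),
    Scenario("report export", 3, 2),
    Scenario("peak checkout load", 50, 3),
    Scenario("bulk import", 4, 2),
    Scenario("sustained search load", 30, 2),
    Scenario("failover under load", 4, 1),
]

if __name__ == "__main__":
    for budget in (5, 10):
        plan = plan_within_budget(BACKLOG, budget)
        gain = sum(s.confidence_gain for s in plan)
        print(f"{budget} days: {[s.name for s in plan]} -> {gain} points of confidence")
```

The point of writing the list this way is defensive: if the budget shrinks from ten days to five, whatever you have already finished is still the most valuable slice of the work.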
For those of you interested in learning more, check out the following resources:
- Continuous Integration – If something hurts, do it more often by Evan Bottcher
- DevOps and Agile Release Management by Jez Humble
- Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation by Jez Humble and David Farley