 Monday, 23 January 2017

As a teenager, I remember having a passing interest in hacking.  Perhaps this came from watching the movie Sneakers.  Whatever the origin, the fancy passed quickly because I prefer building stuff to breaking other people's stuff.  Therefore, what I know about hacking pretty much stops at understanding terminology and high-level concepts.

Consider the term "zero day exploit," for instance.  While I understand what this means, I have never once in my life sat on the discovery of a software vulnerability for the purpose of using it somehow.  Usually when I discover a bug, I'm trying to deposit a check or something, and I care only about the inconvenience.  But I still understand the term.

"Zero day" refers to the amount of time the software vendor has to prepare for the vulnerability.  You see, the clever hacker gives no warning about the vulnerability before using it.  (This seems like common sense, though perhaps hackers with more derring-do like to give vendors half a day's notice, just to watch them scramble to release something before the hack takes effect.)  The time between announcement and reality is zero.

Increased Deployment Cadence

Let's co-opt the term "zero day" for a different purpose.  Imagine that we now use it to refer to software deployments.  By "zero day deployment," we thus mean "software deployed without any prior announcement."

But why would anyone do this?  Don't you miss out on some great marketing opportunities?  And, more importantly, can you even release software this quickly?  Understanding comes from realizing that software deployment is undergoing a radical shift.

To understand this, think about software release cadences 20 years ago.  In the 90s, Internet Explorer won the first browser war because it managed to outpace Netscape's plodding cadence of 3 years between releases.  With major software products, release cadences of a year or two dominated the landscape back then.

But that timeline has shrunk steadily.  For a highly visible example, consider Visual Studio.  In 2002, 2005, and 2008, Microsoft released versions corresponding to those years.  Then the gap started to shrink with 2010, 2012, and 2013.  Now, the years no longer mark releases, per se, with Microsoft actually releasing major updates on a quarterly basis.

Zero Day Deployments

As much as going from "every 3 years" to "every 3 months" impresses, websites and SaaS vendors have shrunk it to "every day."  Consider Facebook's deployment cadence.  They roll minor updates every business day and major ones every week.

With this cadence, we truly reach zero day deployment.  You never hear Facebook announcing major upcoming releases.  In fact, you never hear Facebook announcing releases, period.  The first the world sees of a given Facebook release is when the release actually happens.  Truly, this means zero day releases.

Oh, don't get me wrong.  Rumors of upcoming features and capabilities circulate, and Facebook certainly has a robust marketing department.  But Facebook and companies with similar deployment approaches have impressively made deployments a non-event.  And others are looking to follow suit, perhaps yours included.

Conceptual Impediments to Zero Day Deployments

If what I just said made you spit your drink at the screen, I understand.  Perhaps your deployment and release process takes so long that the thought of shrinking it to a day made you laugh.  Or perhaps it terrified you.  Either way, I can understand that it may seem quite a leap.

You may conceive of Facebook and other practitioners as so alien to your own situation that you see no path from here to there.  But in reality, they almost certainly do the same things you do as part of your longer process, just optimized and automated.

Impediments take a variety of forms.  You might have lengthy quality assurance and vetting processes, perhaps ones that require many iterations between the developers and quality assurance.  You might still be packaging software onto DVDs and shipping it to customers.  Perhaps you run all sorts of checks and analytics on it.  But all of these fall under the general heading of requiring manual intervention or consuming a lot of time.

To get to zero day deployments, you need to automate and speed up considerably, and this can seem daunting.

What's Common Today

Some good news exists, though.  The same forces that let the Visual Studio team see such radical improvement push on software shops across the board.  We all have access to helpful technology.

For instance, the overwhelming majority of organizations now have continuous integration via dedicated build machines.  Software developers commit code, and these machines scoop it up, compile it, and bundle it into a deployable package.  This activity now happens on the order of minutes whereas, in the past, I can remember shops where this was some poor guy's entire job, and he'd spend days on each build.

And, speaking of the CI server, a lot of them run automated test suites as part of what they do.  Most commonly, this means unit tests.  But they might also invoke acceptance tests and even more exotic things like smoke, GUI, and functionality tests.  You can thus accept commits, build the software, run a bunch of tests, and get it ready to deploy.

Of course, you can also automate the actual deployment.  It stands to reason that, if your build machine can ball it up into a deliverable, it can deliver that deliverable.  This might be harder with physical media involved, but as more software deliveries happen over networks, more of them get automated.
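To make that chain concrete, here is a minimal sketch of a pipeline that runs its stages in order and refuses to deploy when anything earlier fails.  The stage names and the lambdas are stand-ins I made up for illustration; a real build server would shell out to a compiler, a test runner, and a deploy script at those points.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    run: Callable[[], bool]  # returns True on success


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure."""
    completed: List[str] = []
    for stage in stages:
        if not stage.run():
            return False, completed
        completed.append(stage.name)
    return True, completed


# Hypothetical stages: each lambda stands in for a real build, test, or deploy step.
stages = [
    Stage("build", lambda: True),
    Stage("unit tests", lambda: True),
    Stage("acceptance tests", lambda: False),  # simulate a failing suite
    Stage("deploy", lambda: True),
]

ok, completed = run_pipeline(stages)
print(ok, completed)  # False ['build', 'unit tests']
```

Because the acceptance tests fail in this example, the deploy stage never runs.  That ordering is the whole point: the delivery step only fires when everything before it passed.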

What We Need Next

With all of that in place, why don't we have more zero day deployments?  What's missing?

Again, discounting the problem of physical media, I'd say quality checks present the biggest issue.  We can compile, run automated tests, and deploy automatically.  But does this guarantee acceptable production behavior?

What about the important element of code reviews?  How do you assure that, even as automated tests pass, the application isn't piling up mountains of technical debt and impeding future deployments?  To get to zero day deployments, we must address these issues.

Don't get me wrong.  Other things matter here as well.  Zero day deployments require robust production checks and sophisticated "oops, that didn't work, rollback!" capabilities.  But I think that nothing will matter more than automated quality checks.
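As a rough illustration of the rollback capability, consider the sketch below.  Everything in it is hypothetical: the `activate` and `is_healthy` callables stand in for whatever your environment uses to switch versions and to check production health, and the version numbers are invented.

```python
from typing import Callable


def deploy_with_rollback(
    current_version: str,
    new_version: str,
    activate: Callable[[str], None],
    is_healthy: Callable[[], bool],
) -> str:
    """Activate new_version, reverting to current_version if the health check fails.

    Returns the version left running.
    """
    activate(new_version)
    if is_healthy():
        return new_version
    activate(current_version)  # "oops, that didn't work, rollback!"
    return current_version


# Hypothetical in-memory stand-in for a production environment.
live = {"version": "1.4.0"}


def activate(version: str) -> None:
    live["version"] = version


def is_healthy() -> bool:
    # Assumption for the example: pretend version 1.5.0 fails its production check.
    return live["version"] != "1.5.0"


running = deploy_with_rollback("1.4.0", "1.5.0", activate, is_healthy)
print(running)  # 1.4.0
```

The design choice worth noticing is that the rollback path uses the same `activate` mechanism as the deployment itself, so going backward is no more exotic than going forward.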

Each time you commit code, you need an intelligent analysis of that code, one that fails the build as surely as failing tests do if issues crop up.  In a zero day deployment context, you cannot afford best practice violations.  You cannot afford slipping quality or mounting technical debt, and you most certainly cannot afford code rot.  Today's rot in a zero day deployment scenario means tomorrow's inability to deploy that way.
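Here is one way such a gate might look, sketched with Python's standard `ast` module.  The metric is a crude approximation of cyclomatic complexity (one plus the number of branching constructs per function), and the threshold of 3 is an arbitrary number I picked for the example, not a recommendation.

```python
import ast
from typing import Dict, List

# Node types counted as branches; a simplification, not a full complexity metric.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)


def branch_counts(source: str) -> Dict[str, int]:
    """Map each function name to 1 + the number of branching nodes inside it."""
    counts: Dict[str, int] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(child, BRANCH_NODES) for child in ast.walk(node))
            counts[node.name] = 1 + branches
    return counts


def quality_gate(source: str, max_complexity: int = 10) -> List[str]:
    """Return the functions over the threshold; an empty list means the gate passes."""
    scores = branch_counts(source)
    return sorted(name for name, score in scores.items() if score > max_complexity)


# Hypothetical code under review.
sample = '''
def simple(x):
    return x + 1

def tangled(a, b, c):
    if a:
        for item in b:
            if item and c:
                while c:
                    c -= 1
    return a
'''

violations = quality_gate(sample, max_complexity=3)
print(violations)  # ['tangled']
```

A build script could exit with a non-zero status whenever the list of violations is non-empty, which makes the build fail in the same way a failing test would.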

About the Author

Erik Dietrich

I'm a passionate software developer and active blogger.

 Saturday, 05 November 2016

During my younger days, I worked for a company that made a habit of strategic acquisitions.  They didn't participate in Time Warner style mergers, but periodically they would purchase a smaller competitor or a related product.  And on more than one occasion, I inherited the lead role for assimilating the software from one of these organizations.  Lucky me, right?

If I think in terms of how to describe this to someone, a plumbing analogy comes to mind.  Over the years, I have learned enough about plumbing to handle most tasks myself.  And this has exposed me to the irony of discovering a small leak in a fitting that had been plugged by grit or debris.  I find this ironic because two wrongs make a right.  A dirty, leaky fitting reaches a sub-optimal equilibrium, and you spring a leak when you clean it.

Legacy codebases have this issue as well.  You inherit some acquired codebase, fix a tiny bug, and suddenly the defect floodgates open.  And then you realize the perilousness of your situation.

While you might not have come by it in the same way that I did, I imagine you can relate.  At some point or another, just about every developer has been thrust into supporting some creaky codebase.  How should you handle this?

Put Your Outrage in Check

First, take some deep breaths.  Seriously, I mean it.  As software developers, we seem to hate code written by others.  In fact, we seem to hate our own code if we wrote it more than a few months ago.  So when you see the legacy codebase for the first time, you will feel a natural bias toward disgust.

But don't indulge it.  Don't sit there cursing the people that wrote the code, and don't take screenshots to send to the Daily WTF.  Not only will it do you no good, but I'd go so far as to say that this is actively counterproductive.  Deciding that the code offers nothing worth salvaging makes you less inclined to try to understand it.

The people that wrote this code dealt with older languages, older tooling, older frameworks, and generally less knowledge than we have today.  And besides, you don't know what constraints they faced.  Perhaps bosses heaped delivery pressure on them like crazy.  Perhaps someone forced them to convert to writing in a new, unfamiliar language.  Whatever the case may be, you simply didn't walk in their shoes.  So take a breath, assume they did their best, and try to understand what you have under the hood.

Get a Visualization of the Architecture

Once you've settled in mentally for this responsibility, seek to understand quickly.  You won't achieve this by cracking open the code and looking through random source files.  But, beyond that, you also won't achieve it by looking at their architecture documents or folder structures.  Reality gets out of sync with intention, and those things start to lie.  You need to see the big picture, but in a way that lines up with reality.

Look for tools that map dependencies and can generate a visual of the codebase.  Plenty of these tools exist for you and can automate visual depictions.  Find one and employ it.  This will tell you whether the architecture resembles the neat diagram given to you or not.  And, more importantly, it will get you to a broad understanding much more quickly.
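To give a flavor of what such tools do under the hood, here is a small sketch that builds a dependency map by parsing import statements with Python's standard `ast` module.  The three-module codebase is invented for the example, and a real tool would read files from disk and render an actual diagram rather than printing text.

```python
import ast
from typing import Dict, Set


def imported_modules(source: str) -> Set[str]:
    """Collect the top-level module names that a piece of Python source imports."""
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def dependency_graph(sources: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each module to the other modules in `sources` that it imports."""
    internal = set(sources)
    return {
        name: (imported_modules(code) & internal) - {name}
        for name, code in sources.items()
    }


# Hypothetical codebase: module name -> source text.
codebase = {
    "billing": "import orders\nimport json\n",
    "orders": "from billing import invoice\nimport customers\n",
    "customers": "import datetime\n",
}

graph = dependency_graph(codebase)
for module, deps in sorted(graph.items()):
    print(module, "->", sorted(deps))
```

In this made-up example, `billing` and `orders` import each other.  That kind of cycle rarely shows up on the neat architecture diagram, which is exactly why you want a picture generated from the code itself.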

Write Characterization Tests

Once you have the picture you need of the codebase and the right frame of mind, you can start doing things to it.  And the first thing you should do is to start writing characterization tests.

If you have not heard of them before, characterization tests have the purpose of, well, characterizing the codebase.  You don't worry about correct or incorrect behaviors.  Instead, you accept at face value what the code does, and document those behaviors with tests.  You do this because you want to get a safety net in place that tells you when your changes affect inputs and outputs.

As this XKCD cartoon ably demonstrates, someone will come to depend on the application's production behavior, however problematic.  So with legacy code, you cannot simply decide to improve a behavior and assume your users will thank you.  You need to exercise caution.

But characterization tests do more than just provide a safety net.  As an exercise, they help you develop a deeper understanding of the codebase.  If the architectural visualization gives you a skeleton understanding, this starts to put meat on the bones.
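A characterization test might look something like the sketch below.  The `legacy_shipping_cost_cents` function is a hypothetical stand-in for inherited code, complete with a quirk I invented: parcels over 20 kg ship for free.  The tests record that behavior instead of judging it.

```python
import unittest


def legacy_shipping_cost_cents(weight_kg: int) -> int:
    # Hypothetical legacy routine.  The free shipping over 20 kg looks like a bug,
    # but nobody remembers whether it was intentional.
    if weight_kg > 20:
        return 0
    return 499 + weight_kg * 150


class CharacterizeShippingCost(unittest.TestCase):
    """Pin down what the code does today, not what it ought to do."""

    def test_typical_parcel(self):
        self.assertEqual(legacy_shipping_cost_cents(2), 799)

    def test_heaviest_paid_parcel(self):
        self.assertEqual(legacy_shipping_cost_cents(20), 3499)

    def test_heavy_parcel_ships_free(self):
        # Suspicious, but someone may depend on it, so record it as-is.
        self.assertEqual(legacy_shipping_cost_cents(25), 0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CharacterizeShippingCost)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If a later change makes `test_heavy_parcel_ships_free` fail, you have not necessarily broken anything.  But you now know that you altered an observable behavior, and you can make that decision on purpose instead of by accident.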

Isolate Problems

With a reliable safety net in place, you can begin making strategic changes to the production code beyond simple break/fix.  I recommend that you start by finding and isolating problematic chunks of code.  In essence, this means identifying sources of technical debt and looking to improve, gradually.

This can mean pockets of global state or extreme complexity that make for risky change.  But it might also mean dependencies on outdated libraries, frameworks, or APIs.  In order to extricate yourself from such messes, you must start to isolate them from business logic and important plumbing code.  Once you have them isolated, fixes will come more easily.
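One common way to do this isolation is to hide the old dependency behind a small interface, so that only a single class knows it exists.  The sketch below assumes an invented `LegacyMailLib` with an awkward `SendMsg` call; neither is a real library, and the names are purely illustrative.

```python
from typing import Protocol


class LegacyMailLib:
    """Hypothetical stand-in for an outdated third-party library."""

    def SendMsg(self, to_addr: str, subj: str, body: str, flags: int) -> int:
        return 0  # in this imagined API, 0 means success


class Notifier(Protocol):
    def notify(self, recipient: str, message: str) -> bool: ...


class LegacyMailNotifier:
    """The only class allowed to know about LegacyMailLib."""

    def __init__(self, lib: LegacyMailLib) -> None:
        self._lib = lib

    def notify(self, recipient: str, message: str) -> bool:
        return self._lib.SendMsg(recipient, "Notification", message, 0) == 0


def close_account(account_id: str, owner_email: str, notifier: Notifier) -> bool:
    # Business logic depends on the Notifier interface, not on the old library.
    return notifier.notify(owner_email, f"Account {account_id} has been closed.")


sent = close_account("A-17", "owner@example.com", LegacyMailNotifier(LegacyMailLib()))
print(sent)  # True
```

With the library quarantined like this, replacing it later means writing one new class that satisfies `Notifier`, and the business logic never notices.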

Evolve Toward Modernity

Once you've isolated problematic areas and archaic dependencies, it certainly seems logical to subsequently eliminate them.  And I suggest you do just that as a general rule.  Of course, sometimes isolating them gives you enough of a win since it helps you mitigate risk.  But I would consider this the exception and not the rule.  You want to remove problem areas.

I do not say this idly nor do I say it because I have some kind of early adopter drive for the latest and greatest.  Rather, being stuck with old tooling and infrastructure prevents you from taking advantage of modern efficiencies and gains.  When some old library prevents you from upgrading to a more modern language version, you wind up writing more, less efficient code.  Being stuck in the past will cost you money.

The Fate of the Codebase

As you get comfortable and take ownership of the legacy codebase, never stop contemplating its fate.  Clearly, in the beginning, someone decided that the application's value outweighed its liability factor, but that may not always continue to be true.  Keep your finger on the pulse of the codebase, while considering options like migration, retirement, evolution, and major rework.

And, finally, remember that taking over a legacy codebase need not be onerous.  As shocked as I initially found myself at the state of some of those acquisitions, some of them turned into rewarding projects for me.  You can derive a certain satisfaction from taking over a chaotic situation and gradually steering it toward sanity.  So if you find yourself thrown into this situation, smile, roll up your sleeves, own it, and make the best of it.

