No one spends any significant amount of time working in an “agile” software shop, or going through any type of “agile” training, without encountering the distinction between “agile” and “waterfall”. In fact, the distinction is so important, it might be fair to say that the only real definition of an “agile” process is that it isn’t waterfall. At least that is the case in this age of big consulting firms teaching software shops how to be “properly agile” (Big Agile, or Agile, Inc., as we call it). Waterfall is the ultimate scapegoat. However much we try to pin down its essential features, what matters, and what ultimately verifies whether we have correctly described it, is that it is the cause of everything that ever went wrong in the software world.
You may think I’m being unfair, and I’m not necessarily saying this was conscious on anyone’s part (I’m also not necessarily saying it wasn’t conscious), but if you haven’t already, eventually you will encounter debates over process that amount to, “that’s waterfall, and waterfall is bad”. Accusing something of being “waterfall” is the ultimate diss in the agile world. The defense will always be to deny the charge as ridiculous.
This really hits home if you, like the proverbial heretic going through a crisis of faith, start doing some research and discover that “waterfall” is a hypothetical process created in a paper as an unrealistic and over-simplified thought experiment. No company has ever even tried to follow it. It was conjured up in order to use it as a comparison point for a more realistic process. The now prevalent charge that the software business was following waterfall, and that’s why things were so expensive, bug-prone, slow-to-adapt, or whatever else, is literally a straw man!
This is so important because, like I said earlier, the most coherent definition of “agile” in the various training programs I have been able to identify is:
agile = !waterfall
(for those non-programmers reading this, the “!” means “not”)
So then the software business has been “agile” (not doing waterfall) all along! Why, then, are we paying this company millions of dollars to teach us how to stop doing what we’ve been doing?
In the effort to comprehend this, I have often seen people identify the essential feature of “waterfall” as being the existence of a sequence of steps that are performed in a specific order:
feature design -> architecture/implementation -> testing -> delivery
Any time this kind of step-by-step process rears its ugly head, we scream “waterfall!” and change what we’re doing. The results are humorous. It’s like yelling at someone to put their hands up and drop their pants at the same time.
These steps are necessary steps to building a piece of software, and no process is ever going to change that. You can’t implement a feature before the feature is specified. You can’t test code that doesn’t exist yet (TDD isn’t saying “test before implement”, it’s saying “design the test first and use it to drive implementation”), and you at least shouldn’t deliver code before testing it. The only way to not follow these steps, in that order, is to not build software. From the paper:
One cannot, of course, produce software without these steps
(While these are the unavoidable steps of building software, this does not imply that other things which have become commonplace in the industry are unavoidable. This includes organizing a business into siloed “departments” around each of these steps, believing that “architecture” and “coding” are distinct steps that should be done by different people, etc.)
As the Royce paper explains, the existence and order of these steps is not the essential feature of the “waterfall” model he was describing. The essential feature, the “fall” of “waterfall”, is that the process is unidirectional and there is no opportunity to move “up”. Once a feature is specified and coding begins, feature design is over and cannot be revisited. Once coding is finished and testing begins, we can’t revisit coding (not in the sense that we can’t fix bugs uncovered by testing, but that we can’t go back and rearchitect the code). Royce then “corrects” the process by letting the water “flow up”. Not only can something that occurs in one step induce a return to a previous step, it can induce a return to any of the previous steps, even the first one.
If “agile” is promising a way to build software without this sequence of steps, it is promising something impossible, even nonsensical. So then what is agile promising? What is its point?
Let’s remind ourselves of what the word “agility” actually means. Anyone who’s played an action RPG like the Elder Scrolls should remember that “Agility” is one of the “attributes” of your character, for which you can train and optimize. In particular, it is not “Speed”, and it is not “Strength”. Agility is the ability to quickly change direction. The easiest way to illustrate agility, and the fact it competes with speed, is with air travel vehicles: airplanes and helicopters. An airplane is optimized for speed. It can go very fast, and is very good at making a beeline from one airport to another. It is not very good at making a quick U-turn mid-flight. A helicopter, on the other hand, is much more maneuverable. It can change direction very quickly. To do so, it sacrifices top speed.
An airplane is optimized for the conditions of being 30,000 ft in the air. There are essentially no obstacles, anything that needs to be avoided is large and detectable from far away and for a long time (like a storm), and the flight path is something that can be pretty much exactly worked out ahead of time.
A helicopter is optimized for low altitude flight. There are more smaller obstacles that cannot feasibly be mapped out perfectly. The pilot needs to make visual contact with obstacles and avoid them “quickly” by maneuvering the helicopter. There is, in a sense, more “traffic”: constantly changing, unpredictable obstacles that prevent a flight path from being planned ahead of time. The flight path needs to be discovered and worked out one step at a time, during flight.
(This is similar to the example Eric Ries uses in The Lean Startup, where he compares the pre-programmed burn sequence of a NASA rocket to the almost instantaneous feedback loop between a car driver’s eyes and his hands and feet, operating the steering wheel, gas and brakes)
The goal of an “agile” process in software development is to optimize for business agility over speed. Instead of deciding now that 5 years from now we want one specific software product, with all the requirements and features worked out today, and to get there in one fast-as-possible straight shot, we accept that the software industry is rapidly growing and changing, and both the evolution of technology and the changes in market trends are unpredictable. If we even know where we want to be in 5 years, we don’t know the best path to get there. It’s more likely we really don’t know what kind of software we want to have in 5 years.
That is fairly abstract. To put it more concretely: instead of building a large set of features into infrequent releases of software, say once or twice per year, we would like to frequently release small updates (anywhere from once or twice per week to several times per day), so we can get quick feedback on them. Why? So we can feed the data we can only acquire by releasing into the market back into the process of deciding what features to build next. That is the essential goal, I believe, of agile software development. We want to be able to discover empirically what software product we want by quickly, iteratively releasing small pieces of it, and using the results of release as part of the information on what the product will eventually look like.
To be clear, “releasing” to internal groups like product owners doesn’t count. Agility isn’t prototyping. This is another standard practice that’s been going on for decades that Agile, Inc. is trying to claim it invented. Giving a rough alpha build every week with a new feature to the designer isn’t something new. What I’m talking about is giving new builds on a weekly (or similar) basis to your paying customers (or for a “free” app, the public).
What this is not about is being able to build a piece of software faster. In fact, as I will explain more, being able to achieve this type of rapid iteration process comes at the cost of raw speed, though this tends to be obscured. Becoming agile is often the force that pushes a software shop to adopt good engineering practices that actually do make them faster. Agility is not going to magically make it take less time than it otherwise would to build a large, complex software product. It enables the product to be built and released one piece at a time, instead of all-at-once.
Unlike agile consultants, I have no claim that this is, or should be, the goal of every software shop. In fact, I can imagine some industries that would find this quite useless. Does the control software for an automatic transmission in a car need to be iterated and released to customers every two weeks? What can this software do with the same mechanical parts of the car (which aren’t updated more frequently than once per year) that it couldn’t do before? I’m not claiming to have an answer for that, and I don’t think agile consultants have a blanket right answer either.
As I mentioned, agility gets mixed up with mature engineering practices like test-driven development, design patterns, modularizing into APIs, test automation, and so on. Those are practices that every software shop will benefit from adopting, because they represent the proper way to build software. But those can be adopted without worrying about rapid release cycles. The two are, fundamentally, orthogonal to each other, even if the ability to rapidly release tends to come up against bad engineering practices more dramatically.
Regardless, most software shops, especially small businesses, very highly value agility. They need it to help guide exactly what software they should be building. It helps them beat larger, slower-moving corporations to market with new features. Assuming that a business values being able to rapidly release new features one-by-one, the question is now: what does it take to do this? There are businesses who are used to delivering software infrequently, and now want to do it frequently. It is this “transition to agile” problem that the consultant shops are really trying to solve. To understand the problem, we need to understand why a software shop would become optimized for infrequent release cycles to begin with.
First, let’s think about this sequence of steps in building software. Each one of these steps has an associated group of specialists who are skilled primarily, or exclusively, in one of these steps. As in any case where a structure has specialized parts for performing one step in a sequence, the steps will be pipelined. The organization won’t work on just one step at a time. The designers will design some features, hand them to engineering, then immediately begin designing the next round of features. By the time they finish this new round, hopefully engineering has finished the first round of implementation, handed it off to testers, and is ready to accept this new batch of features. Since any real software pipeline is not unidirectional like, say, a pipelined CPU (the fabled “waterfall”), it’s a bit more complicated, but the essentials are unchanged. Each group is constantly busy working on whatever is the next thing available for them to do.
Like any pipeline, this is only as fast as its slowest stage. If one stage in this pipeline takes 2 months, then nothing can flow from one end of the pipe to the other in less than 2 months. If stage X can complete something in 1 week, but stage X + 1 takes 1 month on that same piece of work, then a clog is going to occur at the junction between these stages. At the end of month 1 (that is, a month after Stage X + 1 has begun), 3 weeks’ worth of work still remains. At the end of month 2, 6 weeks’ worth is now piled up. Then 9, then 12, and so on. It is, in fact, quite pointless for the other stages to produce any more output than this one limiting stage can handle. This is the central consideration we need to keep in mind.
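The pile-up is easy to reproduce in a few lines of Python. This is only a sketch under stated assumptions: a month is treated as exactly 4 weeks, each piece of work is one week of upstream effort, and the function name and parameters are made up for illustration.

```python
def backlog_after(months, upstream_weeks_per_item=1,
                  downstream_weeks_per_item=4, weeks_per_month=4):
    """Items waiting between stage X and stage X + 1 after `months` months.

    Assumes both stages work continuously and a month is exactly 4 weeks.
    """
    weeks = months * weeks_per_month
    produced = weeks // upstream_weeks_per_item    # finished by stage X
    consumed = weeks // downstream_weeks_per_item  # finished by stage X + 1
    return produced - consumed


for month in range(1, 5):
    print(month, backlog_after(month))
```

Each unit of backlog is one week of upstream work, so the loop prints 3, 6, 9 and 12 for the first four months. The gap grows by the same amount every month and never shrinks, which is why the faster stage gains nothing by running ahead.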
Now, we need to decide how “long” the pipe is going to be. Not how large, how long. Large refers to how much work can flow in parallel through the pipeline. This means how many workers you have in each stage, assuming they can all be kept busy. We need to decide how long it takes for each batch of work to flow through this pipeline. A very long pipe processes a very large amount of work in each cycle, and delivers a large quantity of finished work infrequently. A very short pipe processes a very small amount of work in each cycle, and delivers a small quantity of finished work frequently. Note the total rate of delivery is, we can assume, the same for both. What varies is the delay, in time, between input and output.
Agility is about building a very short pipe. An agile shop runs this whole pipeline, from beginning to end, with a delivered product coming out each time, on a small time scale, adding a small set of new features each time. A non-agile shop runs the whole pipeline on a large time scale, adding a large set of new features to each release. The essential goal of “agile” processes is to figure out how to operate a short pipeline.
Let’s assume that the time each stage takes to complete its part of the work is directly proportional to the “size” of the work. Let’s also assume a large software product can be divided into a large number of small, equally “sized” (in terms of effort) features (and this is a big assumption, we’ll need to relax it eventually). Now let’s say it takes designers X many days to work out the UI/UX specs for each of those features. It takes the engineers Y many days to implement each one. And it takes testers/devs Z many days to test, fix, and eventually deliver each one. The implicit assumption is that it takes 2X days to design 2 features, 2Y days to implement 2 features, and 2Z days to test/fix/deliver 2 features. We are assuming that the time every step of the pipeline takes is directly proportional to the number of (equally “sized”) features they are working on.
If this assumption is reasonably accurate (it does not need to be exact), then it would make no essential difference how long we make the pipe. We could push a single feature through the whole pipe, beginning to end, pretty rapidly, or we could push a big batch through, with each step completing all of them before handing them all off to the next step. It would be trivial to make the pipe shorter. Just reduce the batch size. That’s it.
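To make the proportional model concrete, here is a small Python sketch. The per-feature day counts (2 for design, 5 for implementation, 3 for test/fix/deliver) are invented for illustration, and the function names are hypothetical. The model assumes every stage takes time exactly proportional to the batch size and that the stages are pipelined.

```python
def lead_time_days(batch_size, days_per_feature):
    """Days for one batch to travel through every stage, end to end."""
    return batch_size * sum(days_per_feature)


def throughput_per_day(batch_size, days_per_feature):
    """Features delivered per day once the pipeline is full.

    In steady state a new batch comes out every time the slowest
    stage finishes one batch.
    """
    cycle_days = batch_size * max(days_per_feature)
    return batch_size / cycle_days


stage_days = (2, 5, 3)  # hypothetical X, Y, Z: design, implement, test

for batch in (1, 20):
    print(batch, lead_time_days(batch, stage_days),
          throughput_per_day(batch, stage_days))
```

With these made-up numbers a batch of 1 takes 10 days end to end and a batch of 20 takes 200 days, yet both deliver 0.2 features per day. Under proportionality, batch size changes only the delay, not the rate.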
But it isn’t this simple, because our assumption of proportionality is completely wrong for some of the steps. The reason why is that very important variables change over time and do not reset after each cycle. We need to keep in mind that the goal is not to release one tiny standalone app every week. The goal is to continue growing one app to have a larger and larger set of features. The crucial variable that is going to be different at the start of each cycle is the size of the current feature set that has already been delivered, and will need to be delivered again at the end of the next iteration. Today the app does X and Y. Tomorrow, we won’t want it to do just Z. We’ll want it to do X, Y and Z. It still needs to do X and Y. That basic fact of software development totally spoils our proportional time model.
We can assume proportional time is reasonably accurate for the design step. Designers will typically need about the same amount of time to whip up the new UI/UX for each new feature of roughly the same scope, and while admitting I am not a designer, based on what I’ve seen, the presence of other screens/experiences in the app doesn’t wildly impact the design time (if anything, it may make the time shorter because already designed widgets/concepts can be reused).
What about implementation? Does adding a button to a brand new project look wildly different, in terms of effort, than adding a button to a million-line project? Well, it depends. A million-line codebase with terrible design will make adding a button a Herculean task. A very well-designed million-line codebase will easily accommodate a new button and, like with feature design, provide reusable tools to make new work even easier. So now we’re starting to see what the answer to, “what does it take to be agile?” involves. Part of that answer is excellent code design.
Have you figured out yet that I am obsessed with design, and preach that it is the most important part of a developer’s job?
Now let’s move onto testing. This is where things get really interesting. Remember, we aren’t trying to deliver an isolated micro-app every week. We’re trying to add a feature to an existing feature set while preserving that existing feature set. One of the most fundamental lessons that any software QA department learns is that you can’t “just” test the new features. At every release, you have to test everything. It is crucially important that you do what is called regression testing: making sure the existing features weren’t broken by the latest round of development. The obvious implication is that the workload for the testers grows proportionally with the total number of features, not the total number of new features. As the app grows older, each cycle of testing is going to involve more and more work. This completely screws up our wish to deliver new features every week. If there are 200 existing features, they all have to be regression tested. How in the world are you going to do that every week!?
The answer (unless you solve this problem, which we’ll talk about next) is you don’t, and either accept it and don’t even try to release frequently, or deny it and have your QA team scrambling and working overtime just to either fail builds and slip deadlines, or worse allow large quantities of bugs to escape to customers. Eventually you’ll end up releasing infrequently even though you “plan” to release frequently (most of the release candidates will fail).
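A rough Python sketch shows how quickly manual regression outgrows a weekly cycle. The numbers are assumptions chosen only for illustration: 5 new features per cycle and a quarter of a day of manual testing per feature. The function names are hypothetical.

```python
def manual_test_days(cycle, new_features_per_cycle=5, days_per_feature=0.25):
    """Manual testing effort in a given cycle, counting full regression.

    Every feature delivered so far is retested, not just the new ones.
    """
    total_features = cycle * new_features_per_cycle
    return total_features * days_per_feature


def first_cycle_over_budget(budget_days):
    """First cycle whose testing no longer fits in `budget_days`."""
    cycle = 1
    while manual_test_days(cycle) <= budget_days:
        cycle += 1
    return cycle


print(manual_test_days(1))         # effort in the very first cycle
print(manual_test_days(40))        # effort once 200 features exist
print(first_cycle_over_budget(5))  # when a 5-day week stops being enough
```

Under these assumptions the first cycle needs 1.25 days of testing, the fifth already needs more than a working week, and by the time there are 200 features a single pass takes 50 days. That is before counting any fix-and-retest loops.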
It’s even worse than what I’ve described. The “testing” phase doesn’t just include testing. It’s a repeated cycle of:
test, find bugs, fix bugs, retest, find more bugs, fix bugs, retest, etc.
Just like the first round of testing needs to include regression, every one of these cycles needs full regression. This is another place where if people are under intense time pressure, they’ll try to “cheat”. We found a bug on screen X, got a new build from devs that fixed that bug, but we tested screen Y in the first round and found no bugs. Do we really need to test it again? Yes! That bug fix? That’s a change just like any other. It can break things. In fact, this is the end of the development process, when everyone is getting impatient, and deadlines are creeping up (or already slipping). This is where people get sloppy. Devs start throwing in patches and bandaids hoping to make one little symptom uncovered by QA disappear. This stage can easily generate most of the escaping defects.
The sensible way to deal with this problem, if the testing time cannot be prevented from growing linearly, is what successful software companies chose to do for decades: make the release cycle long enough to accommodate regression testing. If it’s going to take a few months to test and stabilize a large software product before releasing, then we can bundle feature designs and new implementation work so that those steps also take a few months each. This makes all the steps of the pipeline “fit” together nicely, everyone is kept busy, and the pipeline spits out working software iterations consistently.
A lot of the design and engineering practices of yesteryear were created to optimize for these conditions. If months of testing, bug fixing, retesting, and so on is necessary before each release, then it’s best to leave all the testing and bug fixing (or at least most of it) to this stage, and not worry about stability during development of new features. Remember, things are pipelined, so the people in the implementation step are busy coding the next round of new features at the same time the people in the test/stabilize step are tightening the screws on the upcoming release. A fundamental paradigm for working in this way is branching, and more specifically a branching policy called unstable master. The “trunk” of the codebase is the “latest but not the greatest”. This is where the “implementation” guys are coding new features, and spending little to no time making sure everything is still working fully (they will stabilize, but only enough to unblock their work on new features).
Simultaneously, there is a “release branch”, which is the code that will be delivered to customers at the end of the cycle. This branch is subject to a code freeze. No work is done on this code except to fix bugs found by the testers. In particular, adding a new feature to a release branch is the exception, not the rule, and typically requires a special “change request” process to allow it. This may be where some of the misconceptions about “waterfall” arose. Yes, within the release branch, going “back” to feature design or implementation is avoided as much as possible, but it isn’t forbidden. In fact there’s a special change request process made just for going back, even if it isn’t invoked haphazardly. This does legitimately express the inflexibility of this way of building software. If we decide this late that some feature really needs to be shipped with the other features in this upcoming release, inserting them is difficult, error-prone and likely to slip the delivery date for the entire batch of features. But this is out of necessity. If regression testing takes a really long time, then a stabilizing phase is required, and introducing large changes while trying to stabilize is disruptive. This is a consequence of the difficulty in testing large complex software, not of any “process” for building software.
An important point to consider is whether it is inherently “wasteful” to operate a long pipeline like this. If testing is the phase where some kind of flaw in the architecture, or even the feature design, gets uncovered, then didn’t we waste a lot of time building out feature specs and implementations that have to be scrapped anyways? This is a more complex problem than I think people give it credit for. The time at which a flaw is discovered that necessitates revisiting (thus triggering the “upflow” that a “waterfall” would forbid) is certainly shifted right. But this shift occurs within the large time window of a single release cycle. Since the testing phase is longer, there’s actually more time, potentially, between discovery and the upcoming planned release date. The issue here is not so much that the testing of any one feature began much later, but the very fact that you have to get all the way to testing to discover the problem. The same waste is implied, it’s just shifted around, unless that waste spills over into other features that were batched into the release.
For example, let’s say a whole batch of features are all handled with one overall architecture, and a flaw in that architecture is discovered during testing that necessitates reimplementing all of those features. It would be less wasteful to discover the flaw after implementing just one of those features, right? Yes, but this implies high coupling among those features, which typically means the whole architecture has to be laid down just to deliver any one of them. This also tends to imply coupling in the testing of features. In this scenario, the features aren’t really conceived of as standalone, so the prospect of building a single feature just to get to testing earlier will typically entail significant rethinking of the features and architecture themselves, just to make this kind of isolation possible. This is extra cost. Whether it outweighs the saved cost of implementing features that end up having to be reimplemented will depend on the exact details of a situation.
The real solution to eliminating the waste of late discovery is shifting the discovery to earlier in the process. This is a whole different matter worthy of its own discussion: how do you ensure design flaws are caught at the moment they appear instead of some later part of the process? Whatever the answer is, it can be applied to a large, infrequent release paradigm and a small, frequent release paradigm (spoiler alert, the solution is BDD and TDD).
Once a release branch is stabilized, it is merged back to master, hopefully carrying along all the stability with it (though some will be lost or made worse by merge conflicts, but since master is expected to be unstable, this is okay). Then, immediately, a new release branch is created for the next release from master (meaning it contains all the new features that were added after the last release branch was made), and work begins on stabilizing that.
If you want a shorter pipeline, you have to get rid of the linear growth of testing time. By this point, the industry knows full well what the solution is: test automation. Human testers will never be able to test 200 features in anywhere near the amount of time they could test 10 features. But computers? Now we’re talking. How long does it take a computer to test a feature? 200ms? Even if it’s 1 minute per feature, that means a computer could test 200 features in 3 hours and 20 minutes. Not bad. Get that down to 1 second per feature, and you’re running full regression in less than 4 minutes.
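The arithmetic for a full automated regression run is simple to check. This sketch assumes the tests run one after another with no parallelism; the function name is made up for illustration.

```python
from datetime import timedelta


def full_regression_time(features, ms_per_feature):
    """Wall-clock time to test every feature sequentially."""
    return timedelta(milliseconds=features * ms_per_feature)


print(full_regression_time(200, 60_000))  # one minute per feature
print(full_regression_time(200, 1_000))   # one second per feature
print(full_regression_time(200, 200))     # 200 ms per feature
```

For 200 features this prints 3:20:00, 0:03:20 and 0:00:40. Even the slowest case fits inside a single working day, which no manual pass over 200 features can do.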
Now, you might hear this and think: “why do I need to retest everything? If I make a change to this screen, sure I should retest the stuff on that screen, but why do I need to retest everything?” The problem this brings up is: can you divide an app into truly separate components that can be validated independently? The answer is yes, and that is a very important part of the strategy to become agile, but for now I can only say you don’t get this for free. The fact something is on a different “screen” is an arbitrary division, and any company that has tried to get away with this kind of “only test around what you changed” strategy inevitably gets burned by it. Especially if the code is badly designed (there I go about design again), it will contain tons of tight coupling (“spooky action-at-distance”, as we call it) that will almost guarantee “random” bugs popping up everywhere after a change to one single area.
Assuming the problem of modularizing hasn’t been solved, which is a highly complex engineering problem, the only other solution is to somehow make the amount of time it takes to test a growing set of features independent of the total size of those features. Now obviously it can’t be completely independent. More features will take some more time to test, period. But whatever that proportionality constant is, we have to make it as small as possible.
It’s instructive to think about how the engineering practices change when testing becomes a rapid step and there is no longer a need to bundle large numbers of features together into “releases”. When we do this large batching, we create branches for each release. When, instead, we want to release frequently, one feature at a time, the branching policy becomes inverted. We switch to a stable master paradigm, where the trunk is “the latest and the greatest”. Automated regression enables a large master branch to both be updated with new work frequently, and also remain stabilized. Developers instead create feature branches off of master, complete that one feature, then attempt to merge it back to master. If testing is automated, it can be integrated into the integration pipeline (the pipeline that takes a feature branch and merges it into master), so that a dev can simply submit a merge request, and it will either be automatically merged after running the test suite or, if a test fails, get rejected and notify the dev.
By decentralizing the new work, ideally a problem introduced in one feature branch only delays delivery of that feature. By preventing it from getting into master, other features being worked on can, if they have no issues, be merged to master and get delivered to customers. This is very much unlike the older paradigm, where the “pass/fail” condition is on the entire batch of features. This is continuous integration (CI). If master is kept stable by it, then there’s no reason to not automate the delivery of every new snapshot of master, which becomes continuous delivery (CD). So then with a frequent release cycle, CI/CD becomes a crucial part of the tools.
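The gating logic can be sketched in a few lines of Python. This is a toy model, not how any particular CI system works: `MergeRequest`, `integrate` and the branch names are hypothetical, and `run_tests` stands in for running the full automated suite against the branch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class MergeRequest:
    branch: str
    run_tests: Callable[[], bool]  # hypothetical: True if the whole suite passes


def integrate(requests: List[MergeRequest]) -> Tuple[List[str], List[str]]:
    """Merge every request whose tests pass; reject the rest individually."""
    merged, rejected = [], []
    for request in requests:
        if request.run_tests():
            merged.append(request.branch)    # goes into master and ships
        else:
            rejected.append(request.branch)  # only this feature is delayed
    return merged, rejected


merged, rejected = integrate([
    MergeRequest("feature/search", run_tests=lambda: True),
    MergeRequest("feature/export", run_tests=lambda: False),
])
print(merged, rejected)
```

The point of the sketch is that the pass/fail decision is made per feature branch. The failing branch stays out of master while the passing one is merged and can be delivered.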
The more extreme form of this is to take feature branches down to the level of single commits, which removes the need to branch altogether. If master is protected by automated tests, and especially if those automated tests are so well-designed they take mere seconds to run, you might as well make quick, rapid single working changes and immediately commit and push them to master. But that means “half-finished” features are in master, and automatically getting delivered to customers. How do we handle that? With “dark releasing”, meaning some way of toggling access to a feature in code. This decouples making a feature “available” to the end user from having the code from that feature in the app the user is running.
There is then no reason to not get the code into master as quickly as possible. The less time spent between code getting typed into your code editor, and getting pushed to master, the less you have to deal with merge conflicts, and other unpleasant side effects of maintaining multiple divergent copies of the code (what’s really going on here is realizing that the problem we’ve delegated to “version control” software like Git isn’t really a problem of version control but a problem of variation, and that’s better solved with the same code that handles all the other variations in software).
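A minimal Python sketch of a feature toggle, assuming a simple in-memory dictionary as the flag store. The names `ENABLED_FEATURES`, `is_enabled`, `checkout` and the `new_checkout` flag are all hypothetical; a real app would more likely read flags from configuration or a remote service.

```python
# Hypothetical flag store: the new code ships in the app but stays switched off.
ENABLED_FEATURES = {"new_checkout": False}


def is_enabled(name, flags=ENABLED_FEATURES):
    """Unknown flags default to off, so unfinished work stays hidden."""
    return flags.get(name, False)


def checkout(cart_total, flags=ENABLED_FEATURES):
    if is_enabled("new_checkout", flags):
        # Half-finished feature: already in master, invisible until toggled on.
        return f"new checkout flow: {cart_total:.2f}"
    return f"classic checkout flow: {cart_total:.2f}"


print(checkout(10))                          # what customers see today
print(checkout(10, {"new_checkout": True}))  # what testers can switch on
```

Both code paths live in the same build, so the variation is handled by ordinary code rather than by a long-lived branch.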
Anyone who’s attempted this paradigm of working on a codebase with bad design can attest to how utterly impossible it is. You cannot make changes to master multiple times per day and expect it to stay “release-worthy” if you have not worked to keep your code in excellent shape.
So in summary, we’ve identified three engineering problems that need to be solved to enable agility:
- Excellent code design that ensures the development effort for new features does not grow with the number of current features, and hopefully shrinks
- Modularizing the software into small components, each of which can be independently worked on, tested/validated and delivered in isolation
- Automating the regression tests, so that the only work in each cycle for testers is to implement automation for the new features. Not everything needs to be automated, as long as the set of non-automated features remains small enough that manual testing of those can be completed quickly
Note that all of these problems are nonexistent at the beginning of an app’s life, and only become issues later, and they continue to become bigger and bigger issues if not addressed. This is the honeymoon period. I think the combination of starting a fresh new project and bringing in the expensive agile consultants at the same time is like mixing alcohol with coed college kids. It’s gonna be a fun, exhilarating time, until you wake up the next morning with a disaster on your hands trying to trace your steps and figure out what went wrong. It’s easy to be agile at the beginning. None of these problems need to be solved. The consultants might even tell you, “those aren’t essential problems. As long as you follow our process of goofy team names, backlog groomings, and Fibonacci sequences, you’ll get the agility you seek” (this isn’t a straw man. On multiple occasions I had the “Agile gurus” specifically tell my dev team to eschew good design and “just get the feature working”). And at this stage, that will appear true. It’s extremely seductive, especially for the product/marketing side who never fully understand the engineering problems anyways and get annoyed they exist in the first place.
Don’t fall for it.
Now that it is clear test automation is the central enabler of business agility for software, we can realize it’s quite an unfair accusation that the software industry wasn’t agile three decades ago. Test automation is a new, cutting-edge technology. It simply didn’t exist that long ago. The only choice companies had was to manually test, and the process they built around this necessity was quite well-optimized. Today we’re spoiled by all the advanced tools for test automation, plus the knowledge gained from experience of engineering practices that effectively produce trustworthy automation (TDD and BDD).
I want to touch briefly on the problem of modularizing in order to allow “isolated testing”. This deserves an entire article, or series even, on its own, but I want to mention what that entails. If you want to break an app up into pieces like this, each piece has to be its own product. It has to have its own pipeline, its own product requirements (even if it is a code library, the library needs requirements to be precisely defined), and its own deliveries. The central engineering practice that needs to be adopted for this is API design.
The goal is to build software like manufacturers build PCs. A PC is made up of a motherboard, a CPU, memory, peripherals, etc., and they can all be purchased independently and assembled in endless combinations. The number of possible builds is the product of the number of options for each component, so it multiplies with every component type you add (this is called combinatorial explosion). It is simply not possible for every motherboard manufacturer to plug in every single combination of CPU, memory, graphics card, hard disks, etc. that exists for testing. Instead, they design specifications for how these components will work together.
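To get a feel for the scale, here is a rough Python sketch. The component counts are invented purely for illustration (they are not real market figures), and the function names are my own.

```python
from math import prod

# Hypothetical number of compatible models available for each slot.
# These figures are made up for illustration only.
models_per_slot = {
    "cpu": 40,
    "memory": 60,
    "graphics_card": 50,
    "disk": 80,
}


def exhaustive_builds(options: dict) -> int:
    """Builds to assemble if every combination had to be tested together."""
    return prod(options.values())


def spec_based_checks(options: dict) -> int:
    """Checks needed if each model is validated once against a shared spec."""
    return sum(options.values())


print(exhaustive_builds(models_per_slot))  # 9600000
print(spec_based_checks(models_per_slot))  # 230
```

Testing against a spec turns a product into a sum, and that is the whole trick.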
This starts with physical specifications: the number of pins, the shape and size of the plugs, and so on. By following these specs, we can be sure a CPU will actually fit into the CPU slot on a motherboard. Then, they have to define specs for what voltage levels the pins work on, and other aspects of the electronics. Finally, they have to define logical specs: what sequence of bits gets presented on each pin, what they mean, and what is expected to come back to another pin. Each manufacturer takes a very precisely defined specification and uses it to test that:
- Its own components follow the specification correctly
- Its own components perform the desired function when connected to other components that follow their specifications correctly
A motherboard manufacturer then tests its components across the range of valid behavior for CPUs, memory, etc. in the spec. If the spec says the pins operate between 1 and 3 volts, they’ll test to make sure the motherboard works within that range. Meanwhile, the CPU manufacturers test to make sure their CPUs work within that entire voltage range.
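In software terms, testing "across the range in the spec" might look something like the sketch below. It is only an illustration: `VoltageSpec`, `sample_points`, `motherboard_accepts` and `conforms` are hypothetical names, and the board's tolerance is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoltageSpec:
    """Hypothetical shared spec: the valid operating range for a pin, in volts."""
    min_volts: float = 1.0
    max_volts: float = 3.0


def sample_points(spec: VoltageSpec, steps: int = 4) -> list:
    """Evenly spaced voltages across the spec range, including both boundaries."""
    span = spec.max_volts - spec.min_volts
    return [spec.min_volts + span * i / steps for i in range(steps + 1)]


def motherboard_accepts(volts: float) -> bool:
    """Stand-in for a real bench test; this imaginary board tolerates a bit more than the spec."""
    return 0.9 <= volts <= 3.1


def conforms(component_check, spec: VoltageSpec) -> bool:
    """A component conforms if it works at every sampled point in the spec range."""
    return all(component_check(v) for v in sample_points(spec))


print(conforms(motherboard_accepts, VoltageSpec()))  # True
```

Notice that the motherboard test never mentions any particular CPU. It only mentions the spec.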
You may have conceptually divided your software up into areas of concern, even defined them as “libraries” or “modules” in your project. The system architecture may have drawn clear lines between areas (the most obvious is the line between the client and the server). But this doesn’t mean you’ve truly modularized the software. There’s a simple check you can make: do you test the modules on their own, or do you test them by connecting them to the app and then testing the app? If your workflow on the server side is: make server updates, run unit tests you don’t really trust, then have the client developers connect to your new endpoint and make sure all the client features that use your service still work, then even the server isn’t a standalone module (and if it isn’t, you had better not be making updates to it that go to the production environment without going through a normal “release cycle” for the client).
Let’s say, instead, that you have an API spec that is a first-class citizen in the requirements world, and the server code is updated, tested and delivered only against this spec (using acceptance tests for each item in the spec). You are developing and validating the client app against a mock server based, again, on this API spec. And whenever a defect is found in a final, end-to-end test, it triggers root cause analysis to determine which of the two components’ acceptance tests is broken and why this wasn’t caught earlier. Only then can you safely say the server is a standalone module, and that you can make changes to it without having to regression-test the client. Until then, you had better do proper regression testing after every change, and since those tests will necessarily be end-to-end, they’ll probably take a long time to run.
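Here is a minimal sketch of that arrangement, assuming a toy spec format of my own invention (endpoint names mapped to required fields and types). Everything in it, including `API_SPEC`, `real_server`, `mock_server` and `client_greeting`, is hypothetical; a real project would more likely use a proper API description format and tooling built around it.

```python
# Hypothetical API spec: each endpoint maps to the fields its response must contain.
API_SPEC = {
    "GET /user": {"id": int, "name": str},
    "GET /orders": {"user_id": int, "items": list},
}


def satisfies_spec(endpoint: str, response: dict) -> bool:
    """True if the response has exactly the fields and types the spec requires."""
    required = API_SPEC[endpoint]
    return response.keys() == required.keys() and all(
        isinstance(response[field], kind) for field, kind in required.items()
    )


def real_server(endpoint: str) -> dict:
    """Stand-in for the real server implementation."""
    data = {
        "GET /user": {"id": 7, "name": "Ada"},
        "GET /orders": {"user_id": 7, "items": ["book"]},
    }
    return data[endpoint]


def mock_server(endpoint: str) -> dict:
    """Mock built from the spec alone: a default value of the right type per field."""
    return {field: kind() for field, kind in API_SPEC[endpoint].items()}


def client_greeting(server) -> str:
    """Stand-in client feature; it relies only on what the spec guarantees."""
    user = server("GET /user")
    return f"Hello, {user['name']} (#{user['id']})"


def server_acceptance_passes() -> bool:
    """Server-side acceptance test: every endpoint in the spec is honored."""
    return all(satisfies_spec(e, real_server(e)) for e in API_SPEC)


print(server_acceptance_passes())    # True
print(client_greeting(real_server))  # Hello, Ada (#7)
```

The server is tested against the spec, the client against a mock derived from the same spec, and neither test suite needs the other component to run.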
That brings us, finally, to the product side. At the beginning I said, “let’s assume a product has been divided up into a large set of small features”, and that this is a big assumption. As it turns out, this is no more trivial to accomplish than the engineering capabilities we just discussed. The agile consultants parrot the words “vertical slicing, vertical slicing” over and over, as if that provided any meaningful advice on how to conceive of a new capability as a composition of small capabilities. In reality, what I’ve seen is that a large feature is “broken up” into its various technological facets: the database, third-party library requirements, server-side work, client-side backend work, client UI, and so on. This is the kind of “horizontal slicing” we’re instructed not to do. But how do we not do it!? Usually the team can’t imagine how else to do it (this is evident in the creation of the so-called “enabler story” to formalize the workflow of building up a feature in multiple steps, with nothing being deliverable until the last step is completed).
What “vertical slicing” really looks like is defining minimum viable products (MVP). Even if you can find a way to “vertically” slice, say, a new screen in an app, if it’s useless to customers until every widget is built, then what’s the point? You aren’t going to release the half-finished screen to customers anyways, so this isn’t “agility” (getting something new all the way to customers frequently) in the sense we mean here. That doesn’t mean getting working widgets into master one at a time isn’t valuable (it is). What does count as agility (business agility) is figuring out the bare minimum screen that is useful to customers.
This probably isn’t a screen that is blank and just has a “back” button. You have to figure out what is “essential” to the new feature, and what is an enhancement. This can either be an aspect of the feature, or an aspect of quality (for example, an MVP might only cover the happy path, acknowledging that it might explode if any unexpected edge cases occur).
This tends to make product owners upset. No one wants to release “incomplete” or “unpolished” work like this. And no discussion of how to be agile can tell you whether you actually want to release incomplete/unpolished work, or exactly how complete/polished it needs to be before releasing it. All we can say is, this is the question you have to work out for your business to determine how agile you really want to be. The obvious question that comes up is: what’s the point? If you’re going to make a new screen with no bells or whistles, why give it to customers before it’s polished?
Well, there are some good answers to this question. Maybe doing so will let you beat your competition to market. They’re worried about perfecting it, but you get something out there quickly, and you’ll polish it by the time your competitor releases their polished version. In the meantime you can genuinely say you’re offering more than the competition.
Even more critically, are you sure this feature is worth the business’s time and money? Maybe you think this new screen will double your app’s purchases per month. But will it? Maybe a rough draft of it in customers’ hands will help answer this. If that experiment provides clear evidence that customers don’t care about this new screen, then it’s a waste of time to polish it! Better to know that before you’ve polished it than after.
However, if you’re dead set on this batch of features, they’re all already designed, you’re not worried about getting to market quickly, and no market feedback is going to make those features budge, then it’s quite likely that it serves no purpose to iterate like this (again, this doesn’t mean there’s no purpose in making small, frequent working changes to master). It will inevitably make the eventual goal of getting to the polished feature set take longer. It is extra work to make sure early versions of the features are a useful customer experience by themselves and good enough to release, compared to there being no requirement whatsoever that any of it works or provides value until everything, in its polished form, is working at the end. It’s just a needless cost if you’re going to ignore any feedback this early releasing will afford you anyways.
Okay, so dividing a new capability into small features that can be iteratively released quickly involves defining MVP. What does this mean exactly? Primarily it means making difficult choices. Designers typically conceive of features in the final state they hope to one day achieve. The main thing they need to do is decide now what is going to be sacrificed for the time being, to enable what’s left to get out the door. The more they’re willing to sacrifice, the faster what’s left gets delivered. Product owners and designers have trouble with this. The stripped down, bare minimum feature isn’t something they really want to ever exist. It is only a temporary stop on the way to a long-term destination. But they have to let it go. They have to avoid the emotional attachment to the finished product, and recognize arguments like, “we’ll get bad reviews if we don’t polish it” as rationalizations of this emotional attachment.
Okay, so the product people let go of their pot of gold at the end of the rainbow. How do you carve a feature up in a way that identifies a small piece as the first minimally viable one? This is the practice of product development that requires continual practice, ideally some training, and patience. It’s a craft/skill just like software engineering. The techniques of behavior-driven development, particularly expressing requirements in Gherkin, are well suited to this task. The Gherkin, developed in collaboration among the Three Amigos, helps to flesh out all the minutiae involved in building a feature.
This will, hopefully, result in a very large scenario with a lot of givens and, even better, a lot of whens and thens (from what initially looked like a small, “atomic” or indivisible feature). This provides a path forward to slicing it up. Separate along the “whens” and the “thens”. Then try to identify “given-when-then” sequences embedded in the given of one scenario, and factor them out. Then start organizing the collection of small scenarios. Some of them are independent. Others are tied sequentially to each other (the “then” of one is the “given/when” of another). This will reveal dependencies among these scenarios. Dependent ones can’t be implemented without implementing the ones they depend on first. So identify where the dependency chains begin. How long does that chain need to be to provide some valuable unit of customer experience?
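The dependency bookkeeping can be sketched in a few lines of Python. The scenario names below are invented for illustration, and `chain_for` is a hypothetical helper of my own; the only library call is `graphlib.TopologicalSorter` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical scenarios. Each one lists the scenarios whose "then"
# sets up its own "given", i.e. the ones that must be built first.
SCENARIOS = {
    "sign in": [],
    "browse catalog": [],
    "add item to cart": ["sign in", "browse catalog"],
    "check out": ["add item to cart"],
    "save card for later": ["check out"],
    "filter catalog by price": ["browse catalog"],
}


def chain_for(goal: str, scenarios: dict) -> list:
    """Every scenario needed to deliver `goal`, in an order that respects dependencies."""
    needed = {}
    stack = [goal]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed[name] = scenarios[name]
            stack.extend(scenarios[name])
    # TopologicalSorter takes a mapping of node -> predecessors
    # and yields predecessors before the nodes that depend on them.
    return list(TopologicalSorter(needed).static_order())


mvp = chain_for("check out", SCENARIOS)
print(mvp)  # ends with "check out"; "save card for later" is left for later
```

If "check out" is the smallest thing a customer would actually value, then the chain leading up to it is the candidate MVP, and everything outside that chain is an enhancement.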
This is, like the engineering problems above, a dedicated topic of its own. But that is, roughly, what starting to solve the problem of product development would look like.
The main point here is that solving these engineering and product development problems is the essential task of increasing and optimizing for business agility. If you want to release working software frequently, this is where your time needs to be spent. If you work on learning about and solving these problems, you will work toward achieving business agility.
If you spend all your effort thinking about how to define your org chart, whether to call this organization a “release train” or “solution train”, how many planning meetings to have per week and what to call them, etc., and you’re ignoring or downplaying the engineering and product development problems, then you won’t.