Done right, Continuous Integration (CI) can have a transformative effect on software development. C++ teams benefit greatly from CI practices and robust automation: CI helps us integrate, build and test our code, and also helps mitigate some of the traps and pitfalls in our language.
Almost twenty years ago now, these words appeared in a short book with an unusual title - Kent Beck’s “Extreme Programming Explained: Embrace Change”, first published in 1999.
Although some XP approaches have gone mainstream, other ideas seem as polarising now as the day that Kent first proposed them. For lasting controversy, there have been few works about software development to rival it.
The tools progressed quickly from the manual integration that Kent described, with most teams moving to the early CI build servers like CruiseControl and Buildbot. Martin Fowler expanded on the idea of CI as an umbrella of related practices, and his revised 2006 article is still my go-to resource if I want to refresh my memory on the fundamentals. I recommend that everyone read that article at least once, and the rest of this series is going to assume that level of familiarity with CI concepts.
After twenty years and a growing body of knowledge, I think we can agree that the concept of CI is not new. Nor is the practice - or at least some semblance of it. Of all the XP practices, CI seems like the most visibly successful, with build monitors visible wherever development teams work.
But despite these encouraging signs, it’s surprising how few teams are getting all the benefits from CI that they should be, and how many teams are sleepwalking through the motions with almost no benefit at all.
We’ll look at some of the barriers that teams hit with CI adoption, but for context, here’s a quick graphical recap of the classic CI workflow. The picture shows the perspective of three programming pairs working on different tasks on the same codebase, but committing to the same code mainline.
Each pair experiences the CI build differently - their interactions and timeline are different - but there’s a single, canonical view of the build shared by all three pairs.
It’s clear from the picture that when we say Continuous Integration, we do indeed mean integration that happens continuously, and when we say integration, we mean that changes are integrated into the same mainline source branch. The build is a focal point of the team’s work, and the feedback it gives is both immediate and immediately relevant - nobody is more than a few hours from pulling those changes into their local copy (or repository).
What’s less clear from this picture are the activities that need to happen for high-frequency integration to be safe and predictable. CI is a holistic practice and is driven by humans and their actions as much as tools and build servers.
Many teams seem to have what social scientists would call a Value-Action Gap. They believe that writing tests, automating the build, and integrating small changes would give a better outcome, but then don’t act on those values, or sometimes even work against them.
In my experience, this gap doesn’t happen - except in rare cases - because the team are lazy or ill-intentioned. It almost always happens because they are inexperienced, disempowered and unsupported in the changes in development approach they need to make.
Isolation → zero integration: I’ll talk about source control workflow in a later post because it’s a deep, dark rabbit hole that I’m not planning to disappear down in this introduction. For now, I’ll say that having multiple workstreams operating in isolation is a damaging anti-pattern for CI.
Mega-batches → sporadic integration: It’s hard to continuously integrate changes when individual developers are given enormous tasks, with no obvious place to begin and no apparent end. In a later post, I’ll outline a test approach that helps break up large features into smaller, more concrete and testable chunks.
Halfhearted developer testing → unverified integration: Writing a unit test is simple. Developing an effective unit test style is not - it’s hard work, with a long learning curve. When CI automation is retro-fitted to older codebases, teams often find that their tests - if they have any at all - are far less effective at detecting regressions than they imagined, and cover only a small part of the codebase. In these situations, changes can be integrated, but with problems arising long after the code integration took place.
Skills gap → fragile build infrastructure: For most C++ developers, build server configuration, DevOps automation and containerisation will be a secondary skill. Build servers are often maintained on a “best endeavours” basis and are fragile when they have configuration changes applied to them.
Proxy measures → misplaced confidence: In the absence of a solid test suite, some teams take the easy route and place their faith instead in lint checks and static analysis. These tools can be a useful addition to a CI build but are far less effective than unit and acceptance tests at detecting issues. The fact that code complies with a standard does not mean that it’s defect-free, nor does a clean bill of health from a static analysis tool mean your code is functionally correct. Nothing is a substitute for tests.
Dashboards → quality erosion: Managers love dashboards: static analysis results, code coverage, size and complexity metrics: all rendered in reassuringly simple red, amber and green. There’s a place for tracking dashboards on problematic codebases, but if you’re building a new system, it’s better to remediate issues immediately. Favour tripwires over tracking.
Rigid constraints → poor returns: With C++, there’s no does-everything tool, and a useful CI build pipeline will draw on a disparate set of tooling: mostly open-source, but possibly with proprietary tools in the mix too, depending on your technology. If an organisation places constraints on tools, practices and infrastructure, you’ll find CI adoption to be an uphill struggle.
Lack of shared ownership → neglect: When a team doesn’t feel fully invested in CI, members of the team don’t work consistently, and the result is an unequal contribution. Anyone who ends up as “the person who writes a lot of tests” or “the person who knows how to fix the build server” will quickly become disenchanted if they feel like their efforts are unappreciated or not reciprocated.
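To make “tripwires over tracking” from the dashboards point a little more concrete, here’s a minimal sketch of a check that fails the build the moment a metric crosses its limit, rather than plotting it on a chart. Everything in it is hypothetical: the `Metric` struct, the metric names and the thresholds are assumptions for illustration, and a real pipeline would read these values from whatever reports your own tools produce.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical metric reading, e.g. taken from a coverage or complexity report.
struct Metric {
    std::string name;
    double value;           // what the build measured
    double limit;           // threshold agreed by the team (assumed values)
    bool higher_is_better;  // true for coverage, false for complexity
};

// Returns true when the metric is on the wrong side of its limit.
bool breaches(const Metric& m) {
    return m.higher_is_better ? m.value < m.limit : m.value > m.limit;
}

// Collects the names of every metric that has tripped its wire.
std::vector<std::string> tripped(const std::vector<Metric>& metrics) {
    std::vector<std::string> names;
    for (const auto& m : metrics) {
        if (breaches(m)) {
            names.push_back(m.name);
        }
    }
    return names;
}

// Exit status for the CI step: 0 lets the build continue, 1 fails it.
int exit_code(const std::vector<Metric>& metrics) {
    return tripped(metrics).empty() ? 0 : 1;
}
```

A CI step could return `exit_code(metrics)` from `main`, so that a non-zero status stops the pipeline there and then instead of adding another amber tile to a dashboard.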
C++ is one of the hardest languages to get right for CI, but also one that benefits the most from the combination of CI and rigorous developer testing.
Every language has traps and pitfalls, and C++ isn’t any different. Bjarne Stroustrup is often quoted on the potential for self-inflicted injuries with C++. You should read his commentary on this quote, though, because there’s more to it than commonly assumed:
“C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off.” – Bjarne Stroustrup
In his commentary, Bjarne points out that all languages have similar pitfalls, and that trying to cosset users by “protecting” them from simple dangers leads to other hazards more insidious and complex.
C++ has an unrivalled span of application, from high-level generic algorithms and functional programming, to direct interaction with hardware in bare-metal embedded systems. These simple dangers will always be present in C++, but the good news is that modern practices and tools help mitigate the risks in two ways.
Firstly, you’ll be working in tiny increments and committing changes frequently, so goofs and bloopers are likely to be localised. Frequent small commits and a rebuild on every commit mean there’s no question about when. Small and manageable diffs tell us where. The results of tests help to answer both the what and the why.
Secondly, you’ll be leveraging the investment you’ve made in tests by instrumenting builds with sanitizers, as well as running static analysis, lint checks, code coverage reports and size and complexity metrics. Intelligently combined, these increase your insight into the fitness of your code.
You may still sustain occasional language-related injuries, but you’ll notice these in time for them to be treatable.
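As a rough illustration of tests and sanitizers working together, here’s a small sketch of tests doing double duty. The functions `nth_from_end` and `checked_add` are hypothetical examples, not taken from any real codebase. Built with GCC or Clang using `-fsanitize=address,undefined`, the same assertions that check behaviour also give the sanitizers something to watch; the exact flags and their availability depend on your toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: fetches the n-th sample counting back from the end
// (n == 0 is the last sample). Returns false when n is out of range.
bool nth_from_end(const std::vector<int>& samples, std::size_t n, int& out) {
    // Without this guard, the index below would wrap around and read outside
    // the vector; that kind of out-of-bounds access is what AddressSanitizer
    // is designed to report.
    if (n >= samples.size()) {
        return false;
    }
    out = samples[samples.size() - 1 - n];
    return true;
}

// Hypothetical helper: adds two ints, reporting overflow instead of causing it.
bool checked_add(int a, int b, int& out) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    // Without this guard, a + b could overflow a signed int, which is
    // undefined behaviour that UndefinedBehaviorSanitizer can flag.
    if ((b > 0 && a > max - b) || (b < 0 && a < min - b)) {
        return false;
    }
    out = a + b;
    return true;
}

// The tests exercise the boundaries, which is where the sanitizers earn their keep.
void run_tests() {
    int value = 0;
    assert(nth_from_end({10, 20, 30}, 0, value) && value == 30);
    assert(nth_from_end({10, 20, 30}, 2, value) && value == 10);
    assert(!nth_from_end({10, 20, 30}, 3, value));

    int sum = 0;
    assert(checked_add(2, 3, sum) && sum == 5);
    assert(!checked_add(std::numeric_limits<int>::max(), 1, sum));
    assert(!checked_add(std::numeric_limits<int>::min(), -1, sum));
}
```

The design choice worth noting is that the tests deliberately poke at the edges (one past the end, one past the maximum). A sanitizer only reports what actually executes, so it’s the tests that decide how much of the code it gets to see.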
We have some things in our favour. On the whole, we tend to have a flatter and more manageable dependency graph than other technologies, and this helps the stability and reproducibility of builds. If you have a colleague who works on Java enterprise applications, ask them to show you the transitive dependency graph for their project. Gaze upon it in wonder and horror.
Our tools are stable too. Even with the C++ language evolving more rapidly than ever, it’s not uncommon for an entire project to run to completion using a single, fixed version of a compiler and standard library. Other languages and ecosystems have tools and libraries that shift constantly. How they ever get any work done, I do not know.
We’re also lucky that C++ toolchains are oriented to command-line use.
On the skills front, C++ developers tend to be comfortable with raw, low-fidelity configuration and toolchain files, and don’t immediately reach for a GUI tool that makes things “easier”. It’s painful to get the magic command-line incantations right, but it’s far better to end up with these in a version-controlled text file rather than buried in an obscure IDE dialog box. I expect this point will resonate with Xcode developers.
We have challenges too, plenty of them.
To get the most out of CI, we need a mixed bag of tools. C++ has a long heritage, and there’s no single, standard way of doing…well, anything really. All of our tools have unique config-fu, and it’s tricky to get this right.
C++ has historically been less well served by documentation. The world is very skewed towards other more fashionable languages, and the online documentation on how to do CI well for C++ builds is flimsy. I hope to change that in this series.
Another perennial challenge with C++ is that compilation times on large codebases can slow down the feedback we get from the CI build. We can mitigate this by prioritising the highest value steps to the left side of our pipeline and moving to a more modularised build. Slow builds are still going to need some infrastructural horsepower though.
In the next post, I’m going to be covering Docker as the foundation for reproducible, version-controlled build server configuration, but I’ll also touch on some of the other platforms on this image (Conan.io isn’t included here, although it will be covered in a later post):
A CI build needs these things to work in concert, so over the series, I’ll be taking a vertical slice through the build pipeline and picking apart what we need to do in our CMake build, Jenkins pipeline, and software design, referencing back to the Docker container when we need to.
In this series, I’m going to show how to get the most from both the practices and the tooling. Apart from this opinionated and wordy opening post, it’s going to be concrete and practical, with examples and configuration aplenty, and of course, all of it specific to C++.
I’ll also mention that an early draft outline of this series started out life as “CI for Embedded C++”. It’s still going to focus on embedded in later posts, but much of the groundwork I’m going to cover applies to most C++ development.
I’ll leave you now with another nugget of wisdom from Kent Beck, this time being quoted by Martin Fowler in his 1999 book “Refactoring: Improving the Design of Existing Code”:
Continuous Integration is one of those habits.
--- Mike Ritchie, 25 March 2018 ---