This chapter covers
- Dealing with slow builds
- Scaling the CI process
- Measuring the maturity of the CI process
Now that you have your CI process up and running and you’ve added some capabilities, you may have found that not everything is a bed of roses. Some things don’t work as you expect or become slower as you add more capabilities. You may also have found political or legal issues that cause you pain.
This chapter will help you. We’ll wrap up our discussion of CI by talking about extending your process and how to overcome some of these obstacles. The topics we’ll present in this chapter aren’t necessarily directly related to each other, but they’re important enough that we need to discuss them. You’ll learn how to work with large projects, lots of projects, large and geographically separated teams, and legal roadblocks, and see where your CI process should be headed to make it more efficient and able to handle more of your software pipeline.
We’ll discuss the seven deadly sins of slow software builds and how to receive absolution. You’ll tweak the MSBuild scripts. We’ll look at how to technically scale the TeamCity server builds. And after dealing with Sarbanes-Oxley, we’ll examine the Enterprise CI Maturity Model. Let’s begin with team and project problems.
How many times have you started a project and, when it was finished, found that it required no enhancements, no bug fixes, and no more work? We’re guessing rarely, if ever. It’s inevitable that projects become bigger and more complex over time. Additional functionality means that more is happening in your CI process, which slows it down. You may be creating more and more projects. It’s rare to have a single project in your shop. And what happens when you’re working in California and the rest of your team is in New York?
The solutions to all these problems are pretty much the same. Let’s talk about the not-so-obvious first. You can remove code. Yes, you read that correctly: remove code. At some point, all projects have old code that’s no longer needed. Why do you carry around the old code? Because you might need it someday? Delete the old code from the source. You won’t lose anything. It’s still in your source control system, so you can always get it back. After you strip out the old code, your compile time should drop; the difference may only be a few seconds, but it may be a minute or more if the codebase is large.
Did you take out the unit tests with the old code? That will save even more time during your build process.
Now for the more obvious: you can get a bigger, faster build box. That should speed things up. You can also spread out the build to multiple machines. Most CI servers support build agents running on more than one machine. When you do this, one machine controls the build process and farms out the actual build to other machines; then it aggregates the results into the feedback mechanism. In other words, you’re working with parallel builds. We’ll have more to say about this in a moment, when we discuss the seven deadly sins of slow software builds.
You can also set up a different build machine for each project. We don’t recommend this: it makes feedback more complicated because you’ll need multiple feedback mechanisms or have to do some fancy configuration to pull all the build results into a single reporting tool. We also don’t recommend running geographically separated build servers. But there are other things you can do without spending money on more hardware.
Do you need to run all those unit or integration tests with every check-in? Remember, the build should ideally take less than ten minutes, and preferably less than five. Separate the unit and integration tests. Typically, integration tests take longer to run than true unit tests. Run the integration tests at night when you can afford the time for longer tests.
Next, you can look for other steps in your CI process that don’t need to run at every check-in. You may be doing additional testing or different types of statistical analysis on the code. These items can wait until the nightly or even weekly build.
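One way to picture this categorization is a simple lookup from build type to the steps that run in it. The following Python sketch is only an illustration; the build types and step names are hypothetical and aren’t taken from any particular CI server.

```python
# Hypothetical build categories and step names, chosen for illustration only.
STEPS_BY_BUILD = {
    "checkin": ["compile", "unit_tests"],
    "nightly": ["compile", "unit_tests", "integration_tests"],
    "weekly": ["compile", "unit_tests", "integration_tests", "code_analysis"],
}


def steps_for(build_type):
    """Return the ordered list of steps to run for the given build category."""
    try:
        # Return a copy so callers can't modify the shared configuration.
        return list(STEPS_BY_BUILD[build_type])
    except KeyError:
        raise ValueError(f"unknown build type: {build_type!r}") from None


if __name__ == "__main__":
    for build_type in STEPS_BY_BUILD:
        print(build_type, "->", ", ".join(steps_for(build_type)))
```

The check-in build stays short because the slow steps appear only in the nightly and weekly lists.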
One other thing you may want to do for geographically separated teams is to use a distributed source control system such as Git. Doing so allows developers to work locally, without check-in/checkout wait times.
Table 12.1 describes your options for dealing with slow builds.
|Problem|Solution|
|Slow compile time|Delete old code.|
|Slow build machine|Scale using multiple build agents.|
|Slow tests|Categorize your tests, and don’t run them all every time.|
|Overall slow build|Categorize your builds.|
Now you know the obvious solutions for speeding up the build. But how about darker lore? Let’s examine the seven deadly sins of slow builds.
When you get right down to it, serialized builds, where one step has to complete before the next begins, are slow. When your build runs slowly, you may have one or more of the seven deadly sins of slow software builds. This is a term coined by Usman Muzaffar of Electric Cloud, a company that provides high-end build-management and build-analysis software. The seven deadly sins lead to serialized processes. Why not speed up that large project or multiple projects by running things in parallel as much as possible? Table 12.2 lists the seven deadly sins and contrasts them with good guidelines.
|Deadly sin|Good guideline|
|Make at the bottom|Let make drive your build.|
|Targets with side effects|Update only one file in a make rule.|
|Multiple updated files|Write each file only once during a build.|
|Pass-based builds|Visit each directory once.|
|Output in the source directory|Write output to its own directory.|
|Monoliths|Split long jobs into multiple targets.|
|Bad dependencies|Specify relationships in a make file.|
Now that you’ve been introduced to the seven deadly sins, let’s dig into them to understand what they mean and how you can counteract them with the good guidelines. We’ll paraphrase Usman’s explanations of the seven deadly sins.
Throughout this book, we’ve looked at MSBuild. Have you enjoyed spelunking into the XML that makes MSBuild run? We didn’t think so. You may think it would be easier to wrap MSBuild in a cool PowerShell or Python script and have it call MSBuild multiple times. The problem is, you’re not letting MSBuild do the things it does best, such as comparing timestamps on files and dealing with dependencies.
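To make the timestamp comparison concrete, here is a minimal Python sketch of the kind of up-to-date check a build tool can perform before deciding whether a step needs to run. It’s an illustration of the idea under simple assumptions, not MSBuild’s actual implementation; the function name is ours.

```python
import os


def is_up_to_date(inputs, outputs):
    """Return True if every output exists and none is older than any input."""
    # A missing output (or no outputs at all) always forces a rebuild.
    if not outputs or not all(os.path.exists(path) for path in outputs):
        return False
    oldest_output = min(os.path.getmtime(path) for path in outputs)
    # With no inputs there is nothing that could be newer than the outputs.
    newest_input = max((os.path.getmtime(path) for path in inputs), default=0.0)
    return newest_input <= oldest_output
```

A wrapper script that calls the build tool several times in a fixed order tends to bypass this kind of check, so work is repeated even when nothing has changed.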
MSBuild is designed to understand what you want to happen, and then go back and figure out how to do it. When externally scripted, you make multiple calls to MSBuild. You’re telling it not only what you want, but how to get there. This causes unnecessary serialization. The external script is at the top of the process, not MSBuild. You want MSBuild to drive the actual compile, not have some external script do it.
There are two ways to fix this. First, you can remove the external scripts and put MSBuild back at the top. But this approach can lead to monoliths, where you have one big job; or it can create bad dependencies. These are two of the seven deadly sins. Don’t have a single target in the MSBuild script; use multiple targets to break things up.
The second approach is a much better solution. It’s called separation of powers. Don’t have MSBuild do everything for you; if you need to do some setup work, such as checking out code or cleaning up folders from a previous build, do so outside of your MSBuild script. If you’ve followed us through this book, that’s exactly what we’ve done.
Now that the first sin has been resolved, let’s look at the second: targets with side effects.
Do you need to pass data from one build target (see chapter 3 for details) to another? For example, does one target get the new version number and then pass it to another? Or does a target figure out if you’re doing a debug or release build, only to send that information to another target? This sounds innocent enough, doesn’t it?
The problem is that you have serialization, where one build target implicitly calls another. To make it worse, this serialization is hidden. And even worse, it’s often impossible to tell if it isn’t working properly. But wait: there’s more bad news. If the two targets run in parallel, things can fail unpredictably. In this case, you should introduce serialization with an explicit dependency. Or, even better, merge the multiple targets into a single target.
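The effect of an explicit dependency can be sketched in a few lines of Python. The example below assumes Python 3.9 or later (for the standard `graphlib` module) and uses hypothetical target names; it groups targets into waves, where everything in one wave can safely run in parallel because its dependencies have already finished.

```python
from graphlib import TopologicalSorter


def build_waves(dependencies):
    """Group targets into waves; targets in the same wave can run in parallel.

    `dependencies` maps each target to the set of targets it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies completed.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical targets: Compile needs the version number that GetVersion produces.
TARGETS = {
    "GetVersion": set(),
    "Compile": {"GetVersion"},
    "UnitTest": {"Compile"},
    "Docs": {"Compile"},
}

if __name__ == "__main__":
    for number, wave in enumerate(build_waves(TARGETS), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Because the dependency of Compile on GetVersion is declared rather than hidden, the two can never be scheduled in the same wave, while UnitTest and Docs are free to run side by side.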
The next sin occurs when you update a file more than once.
The build process updates many files, some of them more than once. Examples include build logs, program database (.pdb) files, and updated zip files. The problem is that updating one file multiple times (see figure 12.1) takes longer than creating multiple intermediate output files and combining them once into the final file (see figure 12.2).
Figure 12.2. Multiple output files that are combined only once to create the final output can be significantly faster than updating the output file multiple times.
Figure 12.1 shows a serialized process that updates an output file more than once. Source file 1 is processed, and the result placed in the output file. A second source file is processed, and the output from that process updates the output file. Depending on how the output code is written, the output for source 2 can do a simple append to the output file, or it can cause the output file to be completely rewritten. This process gets even more lengthy when source file 3 is added.
Figure 12.2 shows an alternative process. Each source file produces its own output file after processing. Each separate output file is then combined one time to create the final output file. This way of updating a file only once can provide significant speed increases, especially if the source files can be processed in parallel.
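The combine-once approach can be sketched as follows. This Python example is a simplified illustration: the `process` function is a hypothetical stand-in for whatever work a real build step does on one source file, and strings stand in for files.

```python
from concurrent.futures import ThreadPoolExecutor


def process(source):
    """Stand-in for a build step: turn one source into its own output."""
    return source.upper()


def build_combined(sources):
    """Process every source independently, then combine the results one time."""
    with ThreadPoolExecutor() as pool:
        # map() preserves input order even though the work runs concurrently.
        partial_outputs = list(pool.map(process, sources))
    # The final output is written exactly once, after all parts are ready.
    return "\n".join(partial_outputs)


if __name__ == "__main__":
    print(build_combined(["source1", "source2", "source3"]))
```

No step ever touches another step’s output, so nothing forces the steps to run one after another.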
In the case of the .pdb file, you can compile each source file separately. Use the /pdb switch on the compiler (either csc or vbc) command line to specify that each source file gets its own .pdb file. Don’t worry about combining them at the end of the build. You don’t debug on the build server, so why take the extra step to combine the .pdb files?
Next we turn to pass-based builds.
Pass-based dependencies occur when you have to build things in a particular order or you say, “We have cyclical dependencies, so we have to control the order of the build.” This completely serializes your build.
First, there’s no such thing as a truly cyclical dependency in a build that completes. MSBuild doesn’t allow it. If you think you have one, you don’t. Analyzing this more closely, we get figure 12.3.
Figure 12.3. Project A compiles and fails. We then compile project B, which succeeds. We then compile A again. Because it’s successful, we assume we have a cyclical dependency.
In this figure, we know that project B depends on project A; but when we compile A, we get errors, and it doesn’t fully build. We then build B, and it succeeds. If we then build A again and it succeeds, we assume that we have a cyclical dependency. But what’s really happening is depicted in figure 12.4.
Figure 12.4. What really happens is that part of A builds, and then B. Then, the remainder of A that relies on B can build.
The reality is that B relies on only part of A. That part builds. The other part of A relies on B, so A fails the first time. Because it’s partially built, B can now build, and then A can build properly when the build is run a second time. The solution is to break A into two parts and serialize them. There’s always a serialization that works.
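The following Python sketch shows the same reasoning with a dependency graph. It assumes Python 3.9 or later for the standard `graphlib` module, and the names `A_core` and `A_rest` are hypothetical labels for the two halves of project A.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(dependencies):
    """Return a valid serial build order, or None if the graph has a cycle."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        return None


# A and B appear to depend on each other, so no single-pass order exists.
APPARENT = {"A": {"B"}, "B": {"A"}}

# Split A in two: B needs only A_core, and A_rest needs both A_core and B.
SPLIT = {"A_core": set(), "B": {"A_core"}, "A_rest": {"A_core", "B"}}

if __name__ == "__main__":
    print(build_order(APPARENT))
    print(build_order(SPLIT))
```

Once A is split, the graph has exactly one valid order, and the build no longer needs a second pass.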
Now we move on to where to place output from the build process.
When you build your Visual Studio project, where does the output go? By default, it goes in either the bin\debug or the bin\release folder under the project directory. You can then manually or automatically copy the generated assemblies to some other folder for unit testing. The additional copy takes time.
Do you clean out the generated files before each build? If you have to specify each individual file, it takes more time to delete these generated files. What if you have to clean the output folders for multiple Visual Studio projects?
The solution is obvious: point all the generated output from the build to a single location instead of scattering it across project folders. For example, you can have one output folder for the compiled assemblies and a second output folder for the build artifacts, such as the build log, unit test results log, and so on. Also, if you put this output on the local disk instead of a server drive, you’ll get better performance.
Next, we look at monoliths.
Monoliths exist when you have a single, large build target or a build target that performs many tasks. This can get tricky. It’s fairly easy to identify monoliths, but it can be difficult to break them up. Where they occur in the build process can make a difference. You may have one at the beginning of your build that identifies dependencies. At the end, you may have many file copies or copy a large file.
The bad news is, monoliths are often unavoidable. You have to use them. But if you do have a monolith, you can ask several questions to try to break it up:
- Is it necessary?
- If it’s necessary, is there a faster way?
- Can it be rewritten as MSBuild targets?
- Can it be pushed later in the build process?
- Can it be made optional?
- Should it be run locally (on the build server) rather than on a file server?
- Can it be cached?
- Is it really part of the build or the setup?
- Can it be broken into smaller steps, each one a separate target?
There’s one more sin to cover: bad dependencies.
Dealing with bad dependencies is another tricky area. It’s almost impossible to resolve all dependencies so that everything builds cleanly. The problem is that sometimes dependencies build correctly, and other times the build fails. Even worse, you may not know exactly why the build failed. Things may build correctly when you run the build serially but fail on a parallel build. Or the build may not fail every time, only when race conditions are just right. A race condition exists when the output of one process is dependent on other processes completing in a specific sequence, but they run in parallel, thus completing in an unexpected order.
How do you deal with this issue? First, you can run parallel builds often. The more you run them, the more often the race conditions are likely to show up. This will alert you to where the errors are so you can fix them.
Second, whenever possible, add the missing dependency. Make sure it exists. Maybe you can serialize that part of the build process and run the rest in parallel.
Finally, you can centralize build rules. Doing so helps eliminate mistakes. Don’t copy the same build code into multiple build files for the same project; this increases the chances that one build script will be changed and another won’t. In other words, follow the same rule for your build scripts that you would for source code: don’t copy and paste code into multiple files. Put things in one place and one place only.
We’ve covered a lot of ground discussing ways to speed up your build. We’ve talked about things as simple as adding more hardware and as varied as the seven deadly sins of slow builds. But you may have to deal with other issues as your CI process grows. Next, let’s scale.
Your CI setup grows. You have more and more projects working on the build server. You’ve done your best to minimize the time the build takes to run, but the CI server is overloaded. You have many projects, and you work with many developers. The changes are pouring in more quickly than the build server can process them. One option is scaling up (in other words, scaling vertically) by buying more memory for the server, switching the processor for a better one, or adding disk space. It’s easy, but you may hit the ceiling. What if you can’t scale higher? Try scaling out.
Most modern build servers can scale horizontally. This means you can add more physical machines. We touched on this topic briefly in chapter 4 when we discussed Microsoft Team Foundation Server. Let’s dig a little deeper.
The theory is simple. There’s one central CI server and a bunch of build agents. The server is responsible for build management. The server doesn’t process any builds itself; it passes the order to build to one of the build machines. The server checks whether there’s something to do, and if so, it queues the build or chooses the build agent to do the work (see figure 12.5).
The algorithm to assign jobs to build agents is a science in itself. It’s based on measuring the build-agent workload. You can measure the workload by analyzing the build results. Does one particular agent take longer and longer to build or is it sitting idle most of the time?
When using an agent, the CI server assigns a job to the build agent that is least used or one that’s idle. It can also direct a build to a given build agent because of the build agent’s characteristics; for example, it may have the proper operating system to perform the build. The server can start builds simultaneously on different machines—for example, to test the software under various environments and give feedback more quickly. Build agents are often categorized, and builds are marked to check for compatibility.
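As a rough illustration of this assignment logic, the Python sketch below picks the least-busy agent among those compatible with a build. It’s a simplified model with hypothetical names and properties, not the algorithm any particular CI server uses.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    running_builds: int = 0
    properties: dict = field(default_factory=dict)


def pick_agent(agents, required):
    """Return the least-busy agent whose properties satisfy `required`, or None."""
    compatible = [
        agent for agent in agents
        if all(agent.properties.get(key) == value for key, value in required.items())
    ]
    if not compatible:
        return None  # the build stays queued until a compatible agent appears
    return min(compatible, key=lambda agent: agent.running_builds)


if __name__ == "__main__":
    farm = [
        Agent("win-1", running_builds=2, properties={"os": "windows"}),
        Agent("win-2", running_builds=0, properties={"os": "windows"}),
        Agent("linux-1", running_builds=0, properties={"os": "linux"}),
    ]
    chosen = pick_agent(farm, {"os": "windows"})
    print(chosen.name if chosen else "queued")
```

A real server would weigh more than the number of running builds, but the two-stage shape (filter by compatibility, then choose by load) is the same.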
Build agents often don’t need to communicate with the source repository. The CI server deals with getting the last version to build.
TeamCity lets you set up a build grid. It’s a TeamCity server with a farm of build agents. Setting up such a farm is easy; you’ll build one now. Follow these steps:
1. Prepare a separate machine onto which to install a build agent.
2. Go to the TeamCity website on your server, and switch to the Agents tab. You’ll see something like figure 12.6.
Figure 12.6. Default build agent running together with the TeamCity server. From here, you can install a new build agent.
3. Install the build agent on the new machine. If the machine uses Windows, you’re asked whether to install the build agent as a Windows service (if you want to use it in production, you should choose to install it as a service). On other operating systems, you can use Java Web Start or take care of the installation yourself (a zip package of shell scripts is available to help you).
4. When you’re finished with the installation, you’re given the opportunity to configure the newly installed build agent; figure 12.7 shows the details.
Figure 12.7. Configuring the TeamCity build agent. Pay close attention to the ownPort variable (the port needs to be open for communication) and the serverUrl variable (it’s the TeamCity server location).
You can change the configuration variables using this window. The build agent comes with its own Java Runtime Environment, and the path is set to it by default. The build agent listens on port 9090 by default (the ownPort variable). Remember to open the port on the firewall to make communication possible. Change the serverUrl variable to match your TeamCity server installation. You can change the temporary and working directories if you like. You can always change the variables later by editing the XML configuration file.
5. After installing, you’re finished on the build-agent side. Go back to the server website. On the Agents tab is a new Unauthorized agent. Click its tab, and click the Unauthorized link to authorize the build agent (see figure 12.8).
You now have one build agent installed together with the TeamCity server, plus an additional build agent on a separate machine. If both your machines use Windows and all the projects are .NET projects, you have no compatibility issues, and TeamCity will always assign builds to the build agent that is least occupied. As a result, TeamCity can execute more builds simultaneously. You’ve achieved horizontal scaling, as you can see in figure 12.9.
Build agents in TeamCity are characterized using system properties and environment variables. You can freely set the requirements for build agents at the project level. For example, you can say that you want your build to run only on Windows machines with .NET Framework 4.0 installed. Let’s configure a project to do so:
1. Go to the build configuration, and choose the seventh wizard step: Agent Requirements.
2. Add two system property requirements for the build, as shown in figure 12.10 (you can use the Frequently Used Requirements link on the page if you wish).
Figure 12.10. Build agent with additional system requirements (Windows and installed .NET Framework 4.0). Both of the connected build agents are compatible.
You can also set the requirements by using environment variables on the build-agent machine. If you wish, you can give your custom variables a condition (like “exists” or “contains”). The condition will be checked before assigning a job to the build agent. If it’s fulfilled, the job will be assigned. If not, another build agent will be used.
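The way such conditions are evaluated can be sketched in Python. This is a simplified model of the idea, not TeamCity’s implementation; the variable names are hypothetical, and only the two conditions mentioned in the text are handled.

```python
def meets(requirement, agent_vars):
    """Check one (name, condition, expected) requirement against an agent."""
    name, condition, expected = requirement
    value = agent_vars.get(name)
    if condition == "exists":
        return value is not None
    if condition == "contains":
        return value is not None and expected in value
    raise ValueError(f"unsupported condition: {condition!r}")


def is_compatible(requirements, agent_vars):
    """An agent is compatible only if it fulfills every requirement."""
    return all(meets(requirement, agent_vars) for requirement in requirements)


# Hypothetical requirements: a Windows machine with .NET Framework 4.0 installed.
REQUIREMENTS = [
    ("OS_NAME", "contains", "Windows"),
    ("DOTNET_4_PATH", "exists", None),
]
```

An agent that lacks either variable is skipped, and the job goes to the next agent that passes both checks.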
As you can see, scaling a modern CI server is easy. If you’re using CruiseControl.NET, you’re in a more difficult situation; you can configure a project trigger to react according to changes on another CCNet server, but you can’t design more sophisticated scaling scenarios.
On the other hand, if you’re using TFS 2010, you have even more possibilities. You can use build queuing and a grid of build agents to perform simultaneous integrations. You can use TFS Proxy for distributed teams. Let’s say your headquarters are in America, and you have one offshore team in Asia. If you let the offshore team connect directly to the TFS server at headquarters, you’ll most likely end up with internet traffic as a communication bottleneck. It helps to set the TFS Proxy in the offshore location. The developers connect only to the TFS Proxy server, and it optimizes communication with the main TFS server.
Some setups use network load balancing with multiple TFS instances to lighten the workload on one TFS application tier. You can find a good paper about scaling TFS 2010 at http://blogs.msdn.com/b/tfsao/archive/2009/11/05/scaling-tfs-2010-beta-2.aspx.
Next, let’s change directions and look at softer topics. We’ll begin with legal issues related to CI.
The last thing a developer wants to hear about is a legal roadblock to their application being fully tested and deployed. You may well be in such an environment. Federal, state, or local laws may impose restrictions on moving your application internally in your company.
One such law in the U.S. is the Sarbanes-Oxley Act, commonly called SOX. Passed in 2002, SOX applies to all publicly traded companies regardless of size. The bill was created after several major accounting scandals at firms such as Enron, Tyco, and WorldCom. The law creates tough restrictions on corporate accounting procedures and reporting. It requires that documented processes be in place so that similar accounting scandals don’t happen again.
But what does a law governing accounting have to do with CI? If you’re creating internal applications that do any type of accounting, inventory, financial management, and so on, you may have to comply.
Briefly, to comply with SOX, developers can’t touch QA or production systems. QA can’t touch production or development systems. Production can’t touch QA or development systems. This may make it more difficult for your CI system to function cleanly. But here’s what you may be able to do. The development CI system compiles the code and runs unit and integration tests. It may run Sandcastle and do static code analysis. It may not run other tests such as acceptance, stress, scalability, load, performance, and so on. It certainly can’t push an application directly into production. If the build succeeds, it can push the compiled assemblies or even an install set onto a shared server. Think of it as a demilitarized zone (DMZ) (see figure 12.11) where no work takes place. It’s a drop point for the QA files. Source code never goes here.
Figure 12.11. Under SOX, there’s separation between development, QA, and production. One team can’t access the systems of another. It’s as if a brick wall exists between the teams.
If a build is deemed ready for release, the QA team pushes the install set out to another DMZ between them and production. The production team then picks up the bits and installs them, and users begin working with the new version.
We must stress that we aren’t lawyers and aren’t giving legal advice. You should consult with an attorney to determine if any laws pertain to your environment and, if they do, what specifically you need to do to comply with them.
To wrap things up, we now move on to a topic that seems out of place in a book on CI: a CI maturity model.
Different models that show the maturity of a process have grown out of the Capability Maturity Model (CMM). It’s a methodology for businesses to help improve their processes. In software, CMM is most often associated with application lifecycle management (ALM) in shops that use waterfall project-management methodologies. That said, it seems as though a maturity model has no place in CI, a practice that came out of the Agile movement, which is the complete opposite of waterfall.
Capability Maturity Model is a service mark of Carnegie Mellon University.
But if we look at a maturity model as a way to improve processes, we start to see areas that can be improved. The Enterprise CI Maturity Model (ECIMM) was developed by Eric Minick and Jeffrey Fredrick at Urban Code, a leading company in build and release solutions. It was the result of a discussion at the Continuous Integration and Testing Conference (CITCON).
ECIMM breaks a CI process into four distinct areas: building, deploying, testing, and reporting. Each area is ranked on five levels of compliance: introductory, novice, intermediate, advanced, and insane. We’ll look at these in a moment. Additionally, the industry norm and a best practice target level for each area are identified. You can use ECIMM to rate where your company is compared to others and set a target for where your company should be in relation to best practices. But to do this, you need to understand ECIMM, starting with building.
As we discuss ECIMM, you’ll see that many companies are in the introductory stage for most areas. Building (see figure 12.12) is no exception.
Figure 12.12. The building area of ECIMM specifies levels for storing source code and compiling the application. The industry norm (smiley face) and target levels (star) are specified in the Urban Code ECIMM document. The graphics in this section are adapted from that document.
Building refers to using a source code repository, and the way your CI process performs the actual build. Look carefully at the introductory level, where most companies are. If you’re performing manual builds that check out the latest changes, you’re probably at this level.
Compare that to the best practice. Are your builds continuous, meaning that they run with every check-in? Do you use a single machine, or are your builds clustered? (Remember that earlier in this chapter we talked about parallel builds.) What will it take to get you to the intermediate level? If you’ve been following our advice, you should be well on your way to complying with the best practices of the intermediate level. You should have a dedicated build machine executing build scripts automatically and using a source control repository.
Now it’s time to move on to deploying.
How do you get your application from development into QA? What about getting your application to your users in the production environment? That’s what the next step, deploying (see figure 12.13), is all about.
Figure 12.13. The deploying stage of ECIMM maps out levels to help you make your deployment easier and more complete.
Chances are, you have a few scripts in your CI process that push the compiled bits out to your QA department. But what if QA isn’t ready for that build? Did you just overwrite the version they had only partially tested?
You may also have multiple environments that you support, such as 32- and 64-bit. Do you have different configurations of the software for each environment?
Why not let QA pull the latest build when they’re ready for it? That’s the idea behind self-service test deploys. And a single, standardized configuration for each environment is essential. It not only makes it easier to program, it also makes testing significantly easier.
Automatic deployment to production, although it sounds ideal, falls into the insane level. It’s difficult to do properly, and few companies do it because of the level of complexity. One of the biggest issues here has to do with SOX compliance as discussed earlier in this chapter.
We just talked about your QA department getting the latest build, so it’s a good time to examine ECIMM testing.
We’ve spent several chapters talking about different types of testing. ECIMM addresses testing your application (see figure 12.14).
Many companies have some areas of unit testing. And perhaps other types of testing have some automation attached to them. But are you doing static code analysis with FxCop or StyleCop? How much unit testing do you have in place? How about security scans? Even managed code can have security issues that need to be addressed.
It’s interesting that 100% test coverage is placed at the insane level. We agree with this, because 100% test coverage is not only almost impossible to achieve, but also undesirable. Not everything needs to be unit tested, and doing so slows down the build process.
The final step of ECIMM is reporting.
Getting good feedback is a key step of any CI process. It’s also important to ECIMM (see figure 12.15). After all, without good reporting, you have no way of knowing if your CI process is doing its job or needlessly running through the steps.
Reporting is the only ECIMM stage that has novice as the industry norm. This level is where we find reporting from most CI tools. Think about the reports you get from a build. You see what the latest build result was, how many unit tests passed, and possibly other important information such as other test results, static code analysis, documentation, and so on.
But these standard reports don’t give you much in the way of trend analysis. Are bug counts going up or down over time? Is the percentage of code covered by unit tests increasing? Is the speed of the build staying at a manageable level?
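A basic trend can be computed from numbers you probably already collect. The Python sketch below fits a least-squares line through a series of measurements, such as build durations in seconds, and returns its slope; the function name and the sample data are hypothetical.

```python
def trend_slope(values):
    """Least-squares slope of values against their index (change per build)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical build durations, in seconds, for six consecutive builds.
    durations = [240, 245, 251, 250, 262, 270]
    slope = trend_slope(durations)
    direction = "slower" if slope > 0 else "faster or steady"
    print(f"builds are getting {direction}: {slope:+.1f} s per build")
```

A positive slope means the measurement is growing from build to build; the same calculation works for bug counts or code coverage percentages.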
Cross-silo analysis is important too. This can involve collecting data across different projects or teams inside your company. One team may be better at creating and running unit tests than another. Perhaps there’s something you can learn from this, which you can apply to other teams.
The ECIMM is a valuable tool to use in your business as you expand the use of CI. By using it as a guide for where you should be, you can improve your CI process.
In this chapter, we presented several topics for you to consider as your CI process grows. We discussed areas that cause the build to slow down, including the seven deadly sins of slow builds, and we presented several ideas that you can implement to speed up your CI process.
We then turned to legal issues that can impose roadblocks in moving applications through your company. Specifically, we talked about SOX and one idea for helping you comply with the law.
Finally, we discussed the Enterprise CI Maturity Model, which presents a way for you to determine how your CI process compares with other companies and what you should be targeting as a best practice.
We’re at the end of our journey in this book. We’ve dealt with various aspects of a well-designed CI process. By now, you should know all you need to build your own setup and to maintain and extend it, from creating a solid source control system to automating the build, setting up a CI server, automating various types of testing, performing code analysis, generating documentation, creating setup routines, and incorporating database integration and scaling. You’ll profit greatly from this knowledge. We wish you well on your journey with continuous integration in .NET!