Software Life Cycle

We've looked at some of the infrastructure that supports an open-source project. Now let's look at some of the activities that need to happen as the project progresses.

Planning And Decision Making

All of the planning and decision making for an open-source project should take place either on the project's public mailing lists or at public community meetings. Small groups can, and should, get together in private to work up proposals and suggestions, but as soon as possible these need to be opened for community feedback.

Reaching a decision through public discussion generally takes longer than it does within a proprietary development group, but because an open-source effort brings a greater diversity of viewpoints, the resulting decision is likely to be of higher quality. This can translate into a shorter overall development cycle: subsequent work is less likely to be discarded, because the real issues come up during the discussion period rather than after it.

In an open-source project without explicit discussion cut-offs, the discussion can sometimes go on for a long time with diminishing benefit. In many of these cases, it is relatively clear what the sense of the decision is, but unless the discussion is cut off at the right time, the community will not seem crisp and well run. Therefore, it is crucial that someone be sensitive to when the discussion seems to be winding down. At that point, it is important to post a wrap-up message that summarizes the main issues and the consensus on what should be done. Part of the wrap-up is to list who has volunteered to actually do the work. Often the person who has initiated the discussion is the one to wrap things up, but some developers--especially those used to working in a hierarchical company--will expect someone else to make a decision and tell them what to do, which is not the way open-source works.

One of the most common reasons that a software project fails is that it does not meet the needs of its intended audience. This is much less likely to occur if the actual users of the project can join in the discussions where the project's functionality is being defined. For a really successful open-source project, it is probably the users who started this discussion in the first place. Only after the need has been clearly articulated and possible solutions have been debated can the developers work on a plan to implement a solution. If there are several proposed solutions, then the developers may choose to experiment by implementing trial versions of each, so that the users can try them all out to decide which is the best. Note that this process is called user-centered design or evolutionary design and is too seldom practiced by proprietary projects; the open-source process naturally encourages a user-centered approach.

Design decisions, and the rationale behind them, need to be recorded in the project's design documents. There should be a person assigned to keep them up to date. Scheduling decisions should be recorded in a project road map document. As already mentioned, the road map allows developers and users to get a sense of what changes they can expect and when they might happen. For a developer looking for a way to help, the road map can point them at work that needs to be done. It is very important that the road map be kept current; it is the definitive source of decision information on the project.

Scheduling takes on a very different flavor for an open-source project because many people are volunteers. Internal developers can be assigned to tasks based on your company's priorities, but outside developers choose what they will work on and set their own schedules. We say more later about how to do a release, but two points should be mentioned here. First, for a typical company-run project, the features that will go into the next release are defined, a schedule is made up, and the features are then coded. For an open-source project this process is reversed: new features are implemented, and, when there are a significant number of them, a release date is set. Second, setting a release date will motivate people who are working on almost finished modules to get them done in time for the next release. This process of setting a release date and then including in the release only those modules that are ready by that date is sometimes referred to as the "train model": any modules not ready must wait for the next train (that is, the next release).
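The train model described above can be sketched in a few lines of Python. This is only an illustration: the function name `board_train`, the module names, and the dates are hypothetical, and a real project would decide readiness by discussion and testing rather than by a single date.

```python
from datetime import date


def board_train(modules, release_date):
    """Split modules into those that ship in this release and those that wait.

    `modules` maps a module name to the date it became ready,
    or to None if it is not ready yet (hypothetical data layout).
    """
    shipping = sorted(
        name for name, ready in modules.items()
        if ready is not None and ready <= release_date
    )
    waiting = sorted(
        name for name, ready in modules.items()
        if ready is None or ready > release_date
    )
    return shipping, waiting


# Hypothetical example: the release date is fixed, modules either make it or not.
modules = {
    "parser": date(2003, 1, 10),
    "gui": date(2003, 2, 20),
    "plugin": None,
}
shipping, waiting = board_train(modules, date(2003, 2, 1))
print("On this train:", shipping)
print("Waiting for the next train:", waiting)
```

The point of the sketch is that the date is the fixed input and the feature list is the output, which is the reverse of the typical company-run schedule.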

Reliance on volunteers is another reason that it is important to ensure that the project uses good modular design to enable decentralized development. In his book Open Source Development with CVS, Karl Fogel points out that "free projects optimize in favor of a distributed burden, lessening the vulnerability of the module to any one person's schedule (or lapse in judgment for that matter)" (p. 147).

In a hybrid open-source project directed by a company, the scheduling can have some aspects of company-run projects. Remember that in such a project, decisions are still made transparently, but the volunteers may agree that a defined feature set will be the primary determinant of when a release happens, as long as the company developers do that work. In this case, other features may still be added, and perhaps the release will be delayed so that important or interesting additional features can be completed.

Deciding what internal developers focus on is a matter of private planning. However, if they are all working on matters peripheral to the main community activity, that can be bad--project leadership requires that they work to help achieve community goals. On the other hand, a company doesn't always need to demonstrate leadership. For example, your company might join an existing open-source project with the sole aim of adding a few features or porting it to another platform. But for an open-source project that your company initiates, ignoring the wishes of the community can be fatal: At best your company will be seen as irrelevant and the project will fork, and at worst the community will cease to participate and the project will be seen as a public failure that only your company cares about. In some cases, a company that starts an open-source project is best off ceding leadership to the community; cases where this makes sense include projects where the company is trying to make inroads in open-source-dominated markets or where the experts in the area are in the community and not in the company.

In the end, it's the people who write the code and who integrate the contributions who have the final say. The phrase "show me the code" often comes up in open-source project discussions.

Integrating Contributions

In a successful open-source effort, many developers contribute bug fixes and new features. Although everyone is allowed to read the source code archive, only a small group of developers are granted permission to directly modify it. In some projects, such as Linux or Mozilla, each module has one or more owners--the module owners--who are the only ones who can edit the source code for that module. Others wanting to make changes must submit their contributions to the official module owner, who may or may not accept them. Other projects, such as Apache, rely on a core group of developers--called committers--that jointly oversee the entire project; any one of them can modify code in any of the project's modules, although major changes are often reviewed and voted on by the entire core team. For simplicity, we refer to all the developers who can make changes to the official source code as module owners, but please keep in mind that some projects do not necessarily control access on a module-by-module basis. Also, for smaller projects, the original author may act as the project owner and be the only person with the right to make any changes.

When a developer who does not have write access to the source code submits a bug fix or modification, it is part of the job of the appropriate module owner to review the contribution and integrate it into the code base if it is both well written and in line with the project's goals. Accepting a contribution may involve some work, but it is fairly straightforward. Rejecting a contribution, however, requires some sensitivity.

Some proposed new features may introduce more complications than they are worth, or they may be questionable in the first place. Module owners need to ensure the quality of the code they are responsible for and sometimes need to reject a contribution. For the health of the community, however, people making contributions should never be made to feel rejected. They have donated their time to make a contribution, and you want to encourage them to keep working on the project. Requesting that they make some changes to make their work better is one possible way to do this. Another is to reserve an official part of the CVS archive for nonstandard contributions; others can see them and incorporate them in their builds if they want to. Many open-source developers have gone on from receiving constructive rejections of their work to become productive project members. A healthy community helps to educate and develop its members.

Module owners must also be aware of the attitude of the community to their decisions. Module owners are allowed to make decisions only as long as they have the community's respect. Their authority is largely based on merit. If enough other developers disagree with the direction they are taking the project, they will either be replaced or the project will split into two factions--that is, the code will fork. It can become more difficult when a company sponsors an open-source project, because initially all the module owners are company employees. Module owners then have a manager to answer to in addition to the community. Moreover, that manager may or may not be responsive to community feedback. For this reason, it's important for a company starting an open-source project to educate the management chain and to put in place performance goals that reflect the success of the community and the open-source project.

How Decisions Get Made Varies Among Open-Source Projects

Roles associated with software development include architects, designers, implementors (sometimes called coders), quality assurance people (or testers), and release managers. Some of these roles have to do with managing or executing part of the development process (the implementors, release managers, and quality assurance people), whereas others are thought of as producing or creating important artifacts associated with the software (the architects and designers).

In the conventional view of software development, the requirements, specifications, architecture, and design are formal artifacts produced before implementation, although in even the most traditional development projects there is iterative refinement of them. In open-source projects, if there are similar artifacts, they may be either informal or reflected in the source code itself. Open source uses continuous (re)design, in which the design and even the architecture is fluid based on the use of the software and the desires of its developers. Any descriptions of the architecture and designs typically reside in discussion archives and other peripheral documents as well as in the source code itself. Requirements and specifications are likewise at best informal, captured in archives, source comments, and the source itself.

The majority of open-source projects have fewer than 20 developers working on them and rely on a single person, usually the originator of the code, to act as a "benevolent dictator" making the major decisions. For larger projects, decisions are often delegated to the senior developer in charge of each code module. Other open-source projects use a more democratic method. For example, each Apache Software Foundation project has a group of committers who vote on changes to the software and on who can become committers. Committers can be elected to be Foundation Members.

We say more on governance in the section Who's in Charge? later in this chapter when we discuss community issues.

Code Reviews

A module owner should do at least an informal review of the code before accepting a contribution or bug fix. But who reviews the module owner's code? Actually, all of the code is continually being looked at by various developers. If there is code that is badly written, inefficient, confusing, or buggy, then someone is likely to complain and maybe even rewrite it. Some projects have a mailing list that is automatically sent a message whenever anyone makes a change to a file so that everyone subscribing to the list can review the changes as they happen. Code review can also take place in discussion about implementation issues on the project mailing lists. Some larger open-source projects, such as Mozilla, have a formal code review process that any new code must go through before it can be checked in.

An open-source project effectively has an ongoing, informal peer review of its code. This is what allows continuous evolution of the code. A proprietary project is much more likely to have important sections of code that only the original author has ever looked at.

Daily Builds

A crucial requirement for an open-source project to be successful is that developers be able to make a small change to the latest source code, compile it, and see the outcome of the change. If they can't do this, they will be much less motivated to continue working on the project. This is why it is important to have working code that does something useful before asking developers to join a new open-source project. However, a problem can arise if someone has checked in code that includes fatal bugs or, even worse, causes the code not to build correctly; then other developers cannot test any of their work.

It is vital that there be someone who is responsible for making sure that the build is not broken. This buildmaster needs to identify the cause of any problems and fix the build as soon as possible, usually by reverting bad files to earlier, good versions. The buildmaster then needs to contact the developers who checked in the bad code and get them to fix it.
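As a rough illustration of the buildmaster's first step, the following Python sketch picks the revision to revert to. It assumes a hypothetical build history kept as a list of (revision, build_passed) pairs, oldest first; the function name and the revision numbers are made up for the example and do not come from any particular tool.

```python
def last_good_revision(history):
    """Return the most recent revision whose build passed, or None.

    `history` is a list of (revision, build_passed) tuples, oldest first
    (a hypothetical record of the project's builds).
    """
    for revision, passed in reversed(history):
        if passed:
            return revision
    return None


# Hypothetical history: revisions 1.3 and 1.4 broke the build.
history = [("1.1", True), ("1.2", True), ("1.3", False), ("1.4", False)]
print("Revert to:", last_good_revision(history))
```

Knowing the last good revision also narrows down which check-ins, and therefore which developers, the buildmaster needs to follow up with.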

The Mozilla project has a special tool called tinderbox [1] that consists of a farm of machines whose sole purpose is to continually check out and build the source tree on various platforms and then display the status of the builds on a continually updated web page. This lets people know when it is safe to check in new changes.
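The idea behind such a status page can be sketched as follows. This is not how tinderbox itself is implemented; it is a minimal Python sketch that assumes a hypothetical mapping from platform name to the result of the latest build on that platform.

```python
def broken_platforms(platform_results):
    """Return the sorted names of platforms whose latest build failed."""
    return sorted(name for name, ok in platform_results.items() if not ok)


def tree_is_open(platform_results):
    """The tree is safe for check-ins only if every platform built successfully.

    An empty result set is treated as "not safe", since nothing has been verified.
    """
    return bool(platform_results) and not broken_platforms(platform_results)


# Hypothetical latest results from the build farm.
results = {"linux": True, "windows": False, "macos": True}
print("Safe to check in:", tree_is_open(results))
print("Broken on:", broken_platforms(results))
```

Treating an empty result set as unsafe is a design choice of this sketch: with no build results at all, nobody can tell whether a new check-in would land on a working tree.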

It is also very important to have the most recent stable build available for download. This is for your cutting-edge users who want the latest features that have been initially tested but have not yet gone through a full release cycle.

To facilitate testing, it is important that the latest build (executable) be available for download by the testers. You want the smallest delay possible between when a developer checks in a fix for a bug and when the testers can download a new version containing that bug fix. The faster the feedback received the better. In the early days of Linux, Linus Torvalds on occasion made several releases of the Linux kernel in a single day. That's giving users and developers immediate gratification.

Testing

In a very real sense, everyone who uses an open-source application is part of the testing effort. The easier you make it for users to report a bug, the more likely they are to do so. However, the most significant difference between user-based testing in an open-source project and traditional alpha/beta testing for proprietary software is that, in addition to reporting a bug, some developers will isolate or track down the cause, and perhaps even submit a fix for the bug.

For some open-source projects, there is no formal testing; all the bugs are reported from actual use. For some other efforts, testers are recruited as part of the release cycle. In some sense, people are volunteering to help test code whenever they download the latest build or the last stable build. The cutting-edge users and developers are the first line of bug finders.

Looking at the quality of Linux, especially in comparison with other PC operating systems, we can see how successful this user testing can be. Another example is that the early users of Mozilla's 1.0 alpha and beta releases filed about 1000 bug reports per month.

Having users do the testing has extra benefits because they test not just that the new functionality has been correctly implemented but also that the functionality meets the users' actual needs. This is combining usability testing with QA.

It's important to thank your testers. Their efforts do not get the same visibility as those of the developers who contribute code or the folks who write documentation. They should be acknowledged somewhere on the project's website and also in a README file that is part of the standard distribution.

For a larger project with more resources, doing some form of automatic regression testing can really help make sure old bugs do not recur. Jikes, for example, has a regression testing framework--which is itself an open-source project, called Jacks. Likewise, the Visualization Toolkit project uses extensive automated testing. If full-time QA people are available to do more formal testing, then so much the better.

Releases

Every time someone checks in a change to the source code, that is a new release. This is what open-source projects mean by continuous releases. For active developers this is great: They are guaranteed that they are working on the latest code. They won't waste time fixing a bug that someone else has already fixed. Other people can start using their contributions immediately--and report any bugs they find in those contributions.

But for some users, continuous releases are less desirable. Users want some stability in the programs they rely on. However, the amount of stability desired varies from one user to the next. Some want to be able to use the latest features, whereas others want something really solid and bug-free. It should be easy for a newcomer who wants to learn about your project to locate and download an executable that is known to work and has been well tested. This gives newcomers something to try out that will provide the best initial experience possible.

To satisfy these conflicting needs, many projects do a series of frequent, small incremental releases using code that has been mostly debugged, with infrequent major releases after the remaining bugs have been discovered and fixed. So the more adventurous users become the QA team to flush out the bugs in the minor releases.

When to do a major release is determined by the current state of the code: Have a number of serious bugs been fixed? Have significant new features been added? Has it been a long time since the last release? If the community decides the answer is yes, then the project enters a release cycle aimed at creating a stable version suitable for release.

The release process for an open-source project is very similar to that used for proprietary products. The main difference is that the open-source process is looser. For example, if the code for a new module is fairly stable and does useful things, then it may be included in a release even though the documentation for it is slim or nonexistent.

A release manager often is needed to coordinate the release process. This includes recruiting testers, coordinating the testing process, and even making sure that the testers are properly acknowledged afterward.

The release manager's job also includes helping to decide what goes into the release and what is not yet ready. This generally involves a code freeze, during which new functionality is not being added. Once it has been decided to do a new release, developers who are in the midst of writing new modules will be motivated to finish them quickly so that they can be included in the release. Motivating developers is good, but finishing the implementation of a module is not sufficient; it also must be debugged so that it is stable. Part of the release manager's job is to allow only stable code into the release. Even some bug fixes may not be included if the benefit of fixing the bug is outweighed by the possibility of new bugs that may be introduced by the fix. The release cycle is not the time to make big changes to the code.

In an open-source project (unlike many proprietary projects), ongoing development is likely to continue during the release process. Developers may want to continue to work on new and experimental modules that will not be included in the release. Projects that use a source control system such as CVS can start up a branch for the release activity while normal development continues on the main trunk. There are pros and cons to code freezing versus code branching. When code is branched for a release, developers can continue the work they're doing that's not related to the release; this maximizes ongoing work. But the cost is that any bug fixes made to the release version of the code must be merged into the main source branch, and changes made to the main branch while the release was being done can make such merges difficult. When code is frozen while a release is being done, developers who want to continue to work on new project code can be frustrated when they cannot check in their code and test it promptly. Such frustrations can sometimes lead developers to abandon the project.

When most of the known bugs have been fixed and the release is becoming stable, it is often a good idea to put out a beta release. More people are willing to try out a beta version because it has already undergone substantial testing. This second batch of testers will help catch the remaining bugs.

Finally, when the last major bugs have been fixed, the release is ready to be packaged for distribution and announced to the world. It is important that every release be given a unique release number so that everyone knows which is the newest version and so bugs can be reported against the correct version of the source code. Some projects, such as Linux, have adopted the convention of giving even minor version numbers (the second component of the version) to stable releases and odd ones to development releases. So in October 2002, the latest stable Linux kernel was version 2.4.19, the development version was 2.5.44, and the beta (prepatch) for the next stable kernel was version 2.4.20-pre11.
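The even/odd numbering convention is easy to check mechanically. The Python sketch below is only an illustration: the function name is hypothetical, and the pattern it accepts is an assumption that covers just the three version forms quoted above (the second number in the version decides between stable and development, and a "-preN" suffix marks a prepatch).

```python
import re

# Assumed pattern: major.minor.patch with an optional "-preN" suffix.
VERSION_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)(-pre\d+)?")


def classify_version(version):
    """Classify a version string as 'stable', 'development', or 'prepatch'."""
    match = VERSION_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    if match.group(4) is not None:
        return "prepatch"
    minor = int(match.group(2))
    return "stable" if minor % 2 == 0 else "development"


for version in ("2.4.19", "2.5.44", "2.4.20-pre11"):
    print(version, "->", classify_version(version))
```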

Companies such as Red Hat make their living packaging the major Linux releases and selling them to users who value stability, ease of installation, and product support. If your company plans on offering a branded product based on the open-source code, then it may follow a similar model.

Support

The various open-source licenses clearly state that open-source software comes with no support whatsoever. This is generally discussed as an opportunity for third parties to sell support, and indeed companies such as Red Hat make good money doing so.

But the support story is not so clear-cut. In fact, often the greatest source of support is from the other users and developers working on the project. These are the people who care about the software the most and know it the best. In a successful open-source project, the main mailing list is used by people to ask questions about problems they are having and quickly get answers. Some of the answers may not be the best, but generally with a little persistence people get the information to solve their problems. As long as people realize that they are asking for help, rather than demanding support, then the mailing list can be a major benefit for users of open-source software. But for those users who need someone to solve their problems for them, purchasing or contracting support from a third party is the way to go.

This is another way that user feedback can help improve the software. Developers get to see the problems real users are having and can modify the code accordingly. It's also just a small step from asking how to do something to suggesting something useful that the program could do.

Adding A New Module or Subproject

Contributing an entirely new module is one of the most exciting ways a developer can move an open-source project forward. This can add totally new capabilities to the software, often taking it in an unforeseen direction. But not every proposed new module will be a winner--some will be useful only to a small group of users, some will never work out, and some will just seem wacky--so it is important for the project to have a thorough but flexible approval process for new modules.

Many open-source projects have an experimental area where anyone can easily set up a new subproject to develop a new module. This provides a sandbox for developers to test new ideas and make them available to the community to try out. Generally the new module is not included in any of the project's official releases, but the developers working on the new module can create an experimental version that does include it and make it available for anyone who wants to download and play with it.

When the community decides that a new module provides important functionality, then it's time to move it out of the experimental area and make it an official part of the project. This can involve lots of effort, because the standards for an experimental module are quite different from those for an official one. For an experimental module, all that matters is that it does something useful or neat, but making it "product quality" can require adding user documentation (possibly including online help), internationalization (I18N [2]), localization (L10N), accessibility (A11Y), usability testing, a build script to add it to the project build process, and a test suite; adopting the official project look and feel; and polishing whatever rough edges the module currently has.

That's a lot of work, probably more than the original developer originally put into writing and debugging the module. Bringing the module up to the project's standards requires other developers to help out. That could involve employees from your company--UI team, graphics designers, technical writers, and QA folks--just as if it were a module you had developed.

There's a potential tension between wanting people to innovate by creating new modules and having high standards for the official product quality. You need to be careful not to make too high a barrier that prevents volunteer contributions from becoming part of the official release. One way to go about this is to let the community decide what the standards are for the open-source project, and then your company can choose to do additional work--such as writing more complete user documentation or more formal testing--to create your own branded version. Just be sure to contribute back as much as you can to the open-source effort.

Finally, it's important to periodically go through the list of subprojects and weed out those that are not being used and are no longer being worked on. Having too many abandoned subprojects makes it more difficult for people to find the live ones, and it may make the whole project seem dead. Approving subprojects and weeding out the dead ones are part of the community coordinator's job.

Making It Happen

As you can see, lots of work is needed for an open-source project to be successful. Here is a list of some of the jobs that you must commit resources to:

Evangelist/community coordinator to encourage and coordinate developers, get publicity for the project, increase community involvement, host mailing lists, and in general just keep the project moving along. (We say more about these activities in the section on creating a community of developers.)
Module owners to be responsible for the development of the code and to integrate contributions and bug fixes from other developers. They also need to participate on the project mailing lists.
Infrastructure support for the CVS archive, mailing lists, bug database, and website.
Website editor to keep the website alive with new content.
People to document the system architecture and record reasons for design decisions.
Buildmaster to oversee the build process and fix problems with broken builds.
Release manager to coordinate release activities.

Going open-source is not a way to get something for nothing. It takes real work to make an open-source project successful. But putting in the required effort can yield a project that grows much more quickly than if you tried to do it all by yourself. For example, a year after Sun started the open-source Tomcat project it had 31 developers with commit access, only 9 of whom were Sun employees.

1. http://mozilla.org/tinderbox.html

2. This is a cute abbreviation style programmers and user interface designers have come to use for certain very long words. Here, "I18N" means that the abbreviated word starts with the letter "I," ends with the letter "N," and has 18 letters in between. The particular abbreviations shown here are widely used in the software engineering community; others are G11N (globalization) and D11N (documentation).
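The abbreviation rule in this note is mechanical enough to write down. The following Python sketch applies it; the function name `abbreviate` is just a label chosen for this example.

```python
def abbreviate(word):
    """Abbreviate a word as first letter + count of letters in between + last letter."""
    if len(word) < 3:
        # Nothing lies between the first and last letters, so there is nothing to count.
        return word.upper()
    return f"{word[0]}{len(word) - 2}{word[-1]}".upper()


for word in ("internationalization", "localization", "accessibility",
             "globalization", "documentation"):
    print(word, "->", abbreviate(word))
```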

Innovation Happens Elsewhere
Ron Goldman & Richard P. Gabriel
Send your comments to us at IHE at dreamsongs.com.
