If you compare the way software is developed under an open-source approach and a traditional proprietary approach, you might reasonably conclude that both approaches have their strengths and weaknesses – and that those strengths and weaknesses spring from the very different organizational models they use. An open-source project has the advantage of mobilizing a whole lot of eyeballs: getting many different programmers with different perspectives and backgrounds – many of whom are also actual users of the software – to contribute to solving problems, adding features, and fine-tuning the code. But it has the disadvantage of relatively weak coordination – the programmers are usually at a distance from one another, which impedes collaboration. A proprietary project, by contrast, has the advantage of rich collaboration – the programmers typically work for the same company, have the same bosses, go to the same meetings, and are even in the same building. On the other hand, the team is smaller and more homogeneous and probably not as tightly connected to the users of the software – there are a lot fewer eyeballs.
But an interesting research paper by three Harvard scholars, Alan MacCormack, John Rusnak, and Carliss Baldwin, reveals that the apparent weakness of the open-source organizational model – the constraints on close collaboration among programmers – may actually be a hidden strength. The researchers compared the architecture, or design, of one open-source software program (Linux) with that of one proprietary program (the last version of the closed-source Netscape Navigator, which became the first version of the open-source Mozilla). They wanted to test a hypothesis: that the organizational structure of a software development effort will directly manifest itself in the design of the product. In particular, they believed that the tight organization of a proprietary project would result in a tightly integrated program, with a lot of interdependencies, while the loose organization of an open-source project would lead to a more modular design, with fewer interdependencies.
Their analysis of the source code of Linux and the original Mozilla validated their hypothesis. They found many more interdependencies among source files, or code groups, in Mozilla than in Linux. “Specifically,” they write, “a change to one element in Mozilla is likely to impact three times as many other elements as a similar change in Linux. We conclude that the first version of Mozilla was much less modular than a comparable version of Linux.” They also found that Mozilla, after its release as open source, was rapidly and successfully redesigned to become much more modular – at least as modular as Linux, in fact. That shows that there isn’t anything inherently less modular about a browser application than an operating system. The differences in code appear to result from differences in organization.
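The researchers’ interdependency measure can be illustrated with a minimal sketch: treat source files as nodes in a dependency graph and count how many other files a change to one file can reach. The file names and dependency edges below are invented for illustration, not taken from the paper’s data.

```python
from collections import deque

def change_impact(deps, start):
    """Count how many other files a change to `start` can propagate to.
    `deps` maps a file to the files that depend on it."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in deps.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) - 1  # exclude the changed file itself

# Hypothetical graphs. Tightly coupled: nearly everything touches everything.
tight = {"a.c": ["b.c", "c.c", "d.c"], "b.c": ["c.c", "d.c"], "c.c": ["d.c", "a.c"]}
# Modular: dependencies stay confined within small clusters.
modular = {"a.c": ["b.c"], "c.c": ["d.c"]}

print(change_impact(tight, "a.c"))    # a change here reaches 3 other files
print(change_impact(modular, "a.c"))  # a change here reaches only 1
```

In the tightly coupled graph a single change ripples across the whole codebase, which is the pattern the paper reports for the original Mozilla; in the modular graph the ripple stops at a cluster boundary, as in Linux.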
On the surface, this isn’t surprising. As a proprietary commercial product, Netscape Navigator was designed to optimize its performance. Modularity was neither a goal nor a necessity. Having a lot of interdependencies in the code was, it seems likely, the easiest way for the close-knit team to enhance the performance of their product. No doubt, the team saw their ability to collaborate closely on the software design as a great strength. Linux, on the other hand, had no choice but to have a more modular design – it was a necessity given the loose, informal organization of Linux’s far-flung volunteer workforce.
What’s interesting is that, as we move from an age of isolated software, in which the performance of each individual program was judged separately, to an age of plug-and-play software, in which performance will be judged on the ease with which different programs can be assembled and disassembled, the barrier to close collaboration in the open-source model may turn out to be not a weakness but a strength. If true, this would have broad consequences for the future of the software industry. It would suggest that the traditional model of software development – even if used to produce free software – will be at a significant disadvantage to the open-source model.
There’s also a bit of an irony here. The research implies that open source’s advantage doesn’t stem from the strength of the programmer community. It stems from the weakness of that community.
The collective wisdom is that more modular software architectures generally provide more flexible solutions. Is this then an argument to move from the single room software development methodologies (which include XP and the agile family) to more distributed approaches with teams distributed geographically? Mirroring the open source model, an in-house team carves a problem into a suite of loosely coupled components—blending on- with off-shore, and in-house and out-sourced depending on the requirements of each component. This is the opposite of current approaches (exemplified by XP et al) where the development team and SMEs are corralled into a single room.
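The component-carving approach described above can be sketched as a narrow interface contract between teams: each team depends only on the contract, never on another team’s internals. Everything in this sketch (the class names, the toy word list) is invented for illustration.

```python
from abc import ABC, abstractmethod

class SpellChecker(ABC):
    """Narrow, explicit contract: the only thing the editor team
    needs to know about the (possibly off-shore) checker component."""
    @abstractmethod
    def misspelled(self, text: str) -> list[str]: ...

class SimpleChecker(SpellChecker):
    # One team's implementation; swappable without touching any caller.
    WORDS = {"the", "quick", "brown", "fox"}
    def misspelled(self, text):
        return [w for w in text.lower().split() if w not in self.WORDS]

def report(checker: SpellChecker, text: str) -> int:
    # The calling component depends only on the SpellChecker interface.
    return len(checker.misspelled(text))

print(report(SimpleChecker(), "The quikc brown fox"))  # 1 misspelling found
```

Because the coupling point is a single small interface, the two sides can be built in different rooms – or different countries – without the constant shoulder-tapping a single-room methodology relies on.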
The loosely-coupled approach may find strength in increased modularity, which can provide greater flexibility. However, such an approach combined with a high degree of complexity also renders a project more vulnerable to what Perrow terms “normal accidents” than a more tightly-coupled and linear project might be. Error correction in a loosely-coupled project is dependent on the participants noticing the problems. There is nothing inherent in open source software development that ensures that the “many eyeballs” will be there to spot coupling failures that, in the end, could bring down the enterprise. It is happenstance that the Linux development community is large and detail-oriented, but there is no guarantee that other open source communities would generate the required number of participants to ensure problem detection.
I think you’ve linked the wrong cause with the outcome of modularity. As Peter stated above, “The collective wisdom is that more modular software architectures generally provide more flexible solutions.” The last version of Netscape Navigator was famous for being spaghetti code. I don’t think any software engineer in their right mind would intentionally try to get to the state in which the NN team found itself – for the purpose of performance or otherwise.
Here’s a quote from an old Wired magazine article:
As I understand the situation, Netscape would have loved to rewrite major portions of their browsing engine much earlier than they did, but they were in a high-pressure browser war with Internet Explorer and felt it was more feasible to keep tossing new features on top of the unstable base than to take a year or two to reset by building a more stable foundation. Indeed, if you look at the modern descendant of Netscape Navigator – Mozilla Firefox – it’s much more modular than NN and has a much better reputation for performance, standards-compliance, and pluggability than Internet Explorer.
I think the major factor affecting the quality (including modularity) of systems like Linux and Firefox is having experienced engineers design and implement the systems.
It’s darn near impossible to create a great design the first time you attempt a certain type of system. Look at Windows: it used to be a bunch of unstable spaghetti (Windows 3.x/9x). Then MS created NT/2000/XP, which was much more stable, robust, secure, etc. Why did this happen? Because they brought in some top-notch engineers, such as Dave Cutler from DEC, who had already created new operating systems several times and had hundreds of person-years of experience with what worked and what didn’t.
Likewise with Mozilla/Firefox: they had a plate of spaghetti as well in NN, because the code that became NN originated near the beginning of the world wide web. Very few people in the world had experience with creating a web browser, so it was really learn-as-you-go. By the late 1990s, the web standards had evolved a good deal and the NN creators saw a lot of things they wished they could do differently. But they couldn’t because of the competitive situation. When they finally took the plunge and could rewrite major parts of the application … five years later we have Firefox.
I’m not trying to sound condescending here (honestly!), but have you ever written any software and/or participated in a software development team, open source project, etc.? If not, I think you should consider using some of your copious spare time to become an amateur software developer; perhaps participate in an open source project or two. I think this experience would lead to some amazingly insightful writing. Your writing on economics and the software industry is very interesting, but when you venture into software development topics, it seems to get a bit iffy.
– Bill
This is actually a disadvantage:
“A proprietary project, by contrast, has the advantage of rich collaboration – the programmers typically work for the same company, have the same bosses, go to the same meetings, and are even in the same building.”
“Rich collaboration” is actually some poor schmuck trying to ride herd on a bunch of programmers that resent it, are under the gun to get something out the door, and who don’t actually report to the schmuck. They resent working long hours (see EMC’s 8-day work week for the latest example), don’t really care, and are under constant pressure of jobs being eliminated, moved overseas, outsourced, etc. All the while, upper management gleans wonderful salaries, bonuses, perks, etc.
Open source ships it when it’s ready (it’s ready when it’s ready), cares about the product (as you said, they’re users, too), and there is no pressure or some poor schmuck trying to ride herd because of an artificial “deadline.” And they care.
Bill:
Thanks for emphasizing the role of talented engineers – you’re surely correct. But I don’t really see how anything you’re saying contradicts the implication of the paper: that organization influences software architecture, and that more tightly knit organizations will tend to produce less modular architectures. Do you think that’s wrong?
I realize I may have to wait a long time for you to scare up a little spare time in which to compose a reply.
Nick
you may enjoy a post I recently wrote, “Breaking the shackles of the software ‘iron triangle,’” linked below. Whether it be the open source community, Indian vendor investments in CMM and Six Sigma, communities forming around Elance and Amazon, or mushrooming software industries around the globe, it is clear that traditional software development has to change and reflect new quality, schedule and cost benchmarks…
http://dealarchitect.typepad.com/deal_architect/2006/01/the_iron_triang.html
I think this strength of open source doesn’t necessarily exploit a weakness in the community. Rather, it explicitly addresses a weakness of human interaction that is always present. The geographic distance makes communication obviously sub-optimal, so it is addressed directly by strong component design (in the best case). In a more close-knit environment, communication is often less of a concern because you can always tap your colleague on the shoulder and ask a question. Yet this human accessibility is masking an underlying communication problem that still exists: the engineers can’t read each other’s minds. This can lead to the corporate black hole of too many meetings and no productivity.
I think the software analysis you discuss here is really more appropriately viewed as an engineering case study. People working in the same office are perfectly capable of producing wonderful code. A single case study of the development process isn’t sufficient; one could easily find an open source project that floundered and compare it to some proprietary software that was driven to release through some combination of paycheck pressure and colleagues pushing each other.
Interestingly, Linus was well aware very early on of the importance of code modularity for the success of the Linux development model. As he told me in 1996 when I interviewed him for a Wired feature (later used in my book Rebel Code):
The link between organizational structure and software development has some real implications for managers in terms of the levers they can pull. I did a project a few years back that looked at how organizations were designed versus how communication really flows (using social network analysis). The boiled-down takeaways are: 1) the “on paper” organizational design is often 100% different from how things are actually done; 2) understanding how things are actually done will allow you to influence them by affecting real communication patterns. In traditional software development, for instance, your code might be dependent on the relationship between two developers in two different groups who have a friendship. This doesn’t appear on paper, but can have real impact because they talk and influence one another. In open source, you might think development is very distributed, but if you look at linkages within the open source community, you will often find that there is distributed “lumpiness”; this might result in less modularity than would be expected.
“A proprietary project, by contrast, has the advantage of rich collaboration – the programmers typically work for the same company, have the same bosses, go to the same meetings, and are even in the same building.”
I agree with ordaj: this is a disadvantage. While I think the close proximity would be a plus, the entire environment defeats it and pulls everything down with it.
In a company, there’s always politics. There are always deadlines. There are always power struggles.
This “rich collaboration” is rarely used effectively. It often winds up in endless meetings scheduled by managers, which only kills productivity.
I think the most substantial aspect of the open source community is the word “community”. Members are there because they want to be there, not because they have a mortgage. Even better, they don’t have the productivity and spirit killers that plague corporate environments. Their participation is entirely voluntary which makes it that much better all around–the individual, the project, the community.
Regarding the geographical separation, that too is better. Chatting software like ICQ, e-mail, forums, etc. are very effective methods of communication. Being within walking distance of one another lends itself to chatting and constant interruptions. Effective communication methods coupled with separation provide a custom throttle between work and communication. Balancing the two as desired improves performance immensely. It’s no wonder telecommuters who enjoy their work are more productive than their office-shackled counterparts.
Think about the last time you attended a collaboration meeting with your colleagues. Now think about the last time you went out with your friends. Both are social gatherings to disseminate information…
Deborah Lafky points out that error correction relies on errors being noticed, and comments that Linux depends on a large number of participants who notice errors. However, there is nothing intrinsic to Open Source here — many, many factors are involved. The willingness to incorporate regression testing, or any sort of testing. The pressure to release early. The willingness to devote resources to quality assurance. The predilection, or not, to release buggy code to the user base. The size of the user base. The ease with which communication regarding bugs makes its way back to the developers. The availability of development releases. The availability of development-release testers and their quality. Etc., etc.
In general, the entirely transparent nature of the Open Source development model and its tendency to encourage direct communication with developers would, theoretically, tend to favor the Open Source development model when it comes to early error detection, diagnosis, correction, and incorporation into future releases.
Having said all that, I believe that Deborah Lafky misses the point. Modularity, by definition, reduces complexity.
In closing, not to belittle the Harvard work, as folks have to start somewhere, it’s not at all clear to me that any sort of general conclusions can be drawn from such a minuscule sample size.
Why hire an engineer to do a job when you can make it open source and let others do it for free? Less expense, less time spent, more happy users. I like Linux better than Windows (hear that, Bill?) and I’d like all software to adopt an open source model.