Sunday, May 3, 2009

Everything I needed to know about Maintenance...

So much has been written about software development: there are good books and blogs on software engineering, agile methods, design patterns, requirements, software development lifecycles, testing, project management. But so little has been written about how most of us spend most of our careers in software: essentially maintaining and supporting legacy software, software that has already been built.

I learned most of what I needed to know about successful software maintenance and support a long time ago, when I worked at a small and successful software development company. At Robelle Solutions Technology we developed high-quality technical tools, mostly for other programmers: an IDE (although we didn’t call it an IDE back then, we called it an editor – but it was much more than that), database tools, and an email system, which were used by thousands of customers worldwide. I worked there in the early 1990s as a technical support specialist – heck, I was half of the global technical support team at the time. Besides the two of us on support, there were three programmers building and maintaining the products, another developer building and supporting our internal tools, a technical writer (who moonlighted as a science fiction writer), and a small sales and administrative team. The developers worked from home – the rest of us worked out of a horse ranch in rural BC for most of the time I was there. The company succeeded because of strong leadership, talent, a close-knit team built on trust and respect, clear communications, and a focus on doing things right.

Looking back now I understand better what we did that worked so well, and how it can be applied today, with the same successful results.

We worked in small, distributed teams, using lightweight, iterative design and development methods; building and delivering working software to customers every month, responding to customer feedback immediately and constantly improving the design and quality of the product. Large changes were broken down into increments and delivered internally on a monthly basis. All code was peer reviewed. The developers were responsible for running their own set of tests, then code was passed on to the support team for more testing and reviews, and then on to customers for beta testing at the end of the monthly timebox. Back then we called this “Step by Step”.

Incremental design and delivery in timeboxes is perfectly suited to maintenance. Except for emergency patch releases, enhancement requests and bug fixes from the backlog can be prioritized, packaged up and delivered to customers on a regular basis. Timeboxing provides control and establishes a momentum to releases, and customers see continuous value.
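To make the idea of packaging backlog items into a timebox concrete, here is a minimal sketch in Python. It is only an illustration, not how we did it at Robelle: the `Item` fields, the `plan_timebox` function and the simple "highest priority first, take what fits" rule are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    priority: int         # hypothetical scale: 1 = most important
    estimate_days: float  # rough effort estimate


def plan_timebox(backlog, capacity_days):
    """Pick items for the next timebox, highest priority first.

    An item that does not fit in the remaining capacity is deferred
    to a later timebox rather than squeezed in.
    """
    selected, deferred = [], []
    remaining = capacity_days
    # Within the same priority, try the smaller items first.
    for item in sorted(backlog, key=lambda i: (i.priority, i.estimate_days)):
        if item.estimate_days <= remaining:
            selected.append(item)
            remaining -= item.estimate_days
        else:
            deferred.append(item)
    return selected, deferred


backlog = [
    Item("Fix crash on save", 1, 10),
    Item("Faster index rebuild", 2, 5),
    Item("Update install notes", 2, 3),
    Item("New report format", 3, 8),
]
selected, deferred = plan_timebox(backlog, capacity_days=20)
print("This timebox:", [i.title for i in selected])
print("Deferred:", [i.title for i in deferred])
```

Emergency patches would bypass a plan like this entirely; the point is only that everything else can wait for the next regular release.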

We maintained a complete backlog of change requests, bug reports, and our product roadmaps in a beautiful, custom-built issue management system that was indexed for text lookup as well as all of the standard key searching: by customer, product, date, engineer, priority, issue type. Using this we could quickly and easily find information about a customer’s issue, and check whether we had run into something similar before and whether we had a workaround available or a fix scheduled in development. When a new version was ready, we could identify the customers who asked for fixes or changes and see if they were interested in a pre-release for testing. There are issue management systems today which still don’t have the capabilities that we had back then.
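The kind of lookup described above can be sketched in a few lines of Python. This is not the original system, which I can't reproduce here: the `Issue` fields, the `IssueIndex` class and its method names are hypothetical, and the "text index" is just a naive word-to-issue map.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    issue_id: int
    customer: str
    product: str
    issue_type: str                 # e.g. "bug" or "enhancement"
    priority: int                   # hypothetical scale: 1 = most urgent
    summary: str
    fixed_in: Optional[str] = None  # release containing the fix, if any


class IssueIndex:
    """In-memory index supporting key lookups and simple word search."""

    KEY_FIELDS = ("customer", "product", "issue_type", "priority", "fixed_in")

    def __init__(self):
        self.issues = {}
        self.by_key = defaultdict(set)   # (field name, value) -> issue ids
        self.by_word = defaultdict(set)  # lowercase word -> issue ids

    def add(self, issue):
        self.issues[issue.issue_id] = issue
        for name in self.KEY_FIELDS:
            self.by_key[(name, getattr(issue, name))].add(issue.issue_id)
        for word in issue.summary.lower().split():
            self.by_word[word].add(issue.issue_id)

    def find(self, text="", **keys):
        """Return issues matching every key given and every word in text."""
        ids = set(self.issues)
        for name, value in keys.items():
            ids &= self.by_key.get((name, value), set())
        for word in text.lower().split():
            ids &= self.by_word.get(word, set())
        return [self.issues[i] for i in sorted(ids)]

    def customers_waiting_on(self, release):
        """Customers who reported something that this release addresses."""
        return sorted({i.customer for i in self.find(fixed_in=release)})


index = IssueIndex()
index.add(Issue(1, "Acme", "Editor", "bug", 1, "Crash when saving large file", "4.2"))
index.add(Issue(2, "Globex", "Editor", "enhancement", 3, "Add regex search", "4.2"))
index.add(Issue(3, "Acme", "Database", "bug", 2, "Crash on index rebuild"))

print([i.issue_id for i in index.find(text="crash", product="Editor")])
print(index.customers_waiting_on("4.2"))
```

The last call is the one that mattered most on the support desk: given a new release, who should be offered the pre-release?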

Technical debt was carefully and continuously managed: of course we didn’t know about technical debt back then either. One of the developers’ guiding principles was that any piece of code could be around for years: in fact, some of the same code is still in use today, more than 25 years after the original products were developed! If you had to live with code for a long time, you had better be happy with it, and be able to change it quickly and confidently. The programmers were careful in everything that they wrote or changed, and all code was peer reviewed, checking for clarity and consistency, encapsulation, error handling, portability, and performance optimization where that was important. The constraints of delivering within a timebox also pushed the development team to come up with the simplest design and implementation possible, knowing that in some cases the solution might have to be thrown away entirely or rewritten in a later increment because it missed the requirement.

The same principles and practices apply today, taking advantage of improvements in engineering methods and technology, including automated unit testing, continuous integration, static analysis tools and refactoring capabilities in the IDE. All of this helps ensure the quality of the code; allows you to make changes with confidence; and helps you avoid falling into the trap of having to rewrite a system because you can no longer understand it or safely change it.

We knew our products in detail: on the support desk, we tested each release and worked with the developers and our technical writer to write and review the product documentation (which was updated in each timebox), and we used all of our own products to run our own business, although we did not push them to the same limits as our customers did. Once or twice a year we could take a break from support and work in development for a timebox, getting a chance to do research or help with a part of the product that we felt needed improvement. All of the developers worked on the support desk a few times a year, getting a chance to hear directly from their customers, understand the kind of problems they were facing or what kind of improvements they wanted, and think about how to improve the quality of the products and how to make troubleshooting and debugging customer problems easier.

Since we delivered to customers as often as once per month, we had a well-practiced, well-documented release process, with release and distribution checklists, release notes, updated manuals, install instructions that were carefully tested in different environments. All of this was developed iteratively, constantly improved as we found problems or new tools or new ideas. Today teams can take advantage of the ITIL release management practice framework, books like Release It! and Visible Ops to build effective release management processes.

I have learned a lot since working at Robelle, but sometimes I find that I am relearning old lessons, only now truly understanding the importance and the value of these practices, and how they can be applied and improved on today.
