In his widely read Refactoring: Improving the Design of Existing Code, Martin Fowler defines refactoring like this:
Refactoring is a controlled technique for improving the design of an existing code base. Its essence is applying a series of small behavior-preserving transformations, each of which ‘too small to be worth doing’. However the cumulative effect of each of these transformations is quite significant. By doing them in small steps you reduce the risk of introducing errors. You also avoid having the system broken while you are carrying out the restructuring – which allows you to gradually refactor a system over an extended period of time.
In other words, refactoring makes the code’s design simpler and clearer, without affecting what, from the user’s point of view, the code does. More technically, the structure of the software is improved without changing its external behaviour.
Refactoring is a crucial part of Agile. Agile is famous for the way it improves an organisation’s adaptability and responsiveness to change. But Agile isn’t just about delivery. Agile places equal emphasis on the quality of the product it delivers, on the ease with which future development can be carried out, and so on. As far as such priorities are concerned, the efficiency and speed of code need to be balanced with its maintainability and extensibility, which are the basic objectives of refactoring.
What does refactoring look like?
When refactoring, you’ll find yourself doing many small tasks that look something like this:
- Deduplicating code.
- Changing the names of updated components so that they continue to describe clearly and precisely what those components do.
- Moving newly developed code from where it was originally written to where it belongs in the codebase as a whole.
- Redistributing functionality between classes.
- Splitting methods when they start to be too complex to understand or change easily, or when a single method starts to perform multiple weakly-related tasks.
- Re-ordering inheritance hierarchies to support shifting functionality.
- Restructuring the code as a whole to reflect standard design patterns.
And so on.
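As a rough illustration of several of these tasks at once (deduplicating, renaming and splitting a method), here is a minimal sketch. The order-pricing scenario, the function names and the discount figures are all invented for this example; they are not taken from any particular codebase.

```python
# Hypothetical "before": one function mixes pricing and formatting,
# repeats the discount rule in both branches, and uses opaque names.
def process(o):
    if o["type"] == "retail":
        t = sum(p * q for p, q in o["items"])
        if t > 100:
            t = t - t * 0.1
        return "Total: %.2f" % t
    else:
        t = sum(p * q for p, q in o["items"])
        if t > 100:
            t = t - t * 0.1
        t = t * 0.95
        return "Total: %.2f" % t


# Hypothetical "after": the duplicated rule lives in one place, names
# describe intent, and each function does a single job.
BULK_THRESHOLD = 100
BULK_DISCOUNT = 0.1
TRADE_PRICE_FACTOR = 0.95


def subtotal(items):
    """Sum of price * quantity over all order lines."""
    return sum(price * quantity for price, quantity in items)


def apply_bulk_discount(amount):
    """Apply the bulk discount to amounts above the threshold."""
    if amount > BULK_THRESHOLD:
        return amount - amount * BULK_DISCOUNT
    return amount


def order_total(order):
    """Total for an order; non-retail (trade) orders get a further reduction."""
    total = apply_bulk_discount(subtotal(order["items"]))
    if order["type"] != "retail":
        total = total * TRADE_PRICE_FACTOR
    return total


def format_order_total(order):
    """Render the total exactly as the original function did."""
    return "Total: %.2f" % order_total(order)
```

The arithmetic is deliberately left untouched, so `format_order_total` returns the same string as `process` for any given order. Only the structure has changed.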
Catalogues of refactoring techniques
It’s probably impossible to write a truly complete list of refactoring techniques, though many people have tried.
Such lists tend to be very long – 70+ techniques is quite normal.
In a sense all these lists do is restate good coding and programming practice. However, by organising refactoring techniques systematically, they make it easier to develop an equally systematic approach to the problem of code quality. The same is true of lists of code smells.
So catalogues are worth reviewing carefully:
- To learn individual techniques.
- To analyse the patterns among refactoring techniques themselves.
- To improve coding skills.
Principles of refactoring
There aren’t any convincing methods for identifying opportunities for refactoring yet, but it is possible to lay down some basic principles:
- Refactoring is largely about implementing all those coding and programming standards and practices developers aren’t usually allowed to implement!
- Refactor early, refactor often.
  - While the necessary changes are still fresh in the developer’s mind.
  - So the mistakes you make are smaller, more localised and easier to fix.
- Refactoring must be limited to changes to the code that don’t affect its external behaviour (functionality, usage, security, etc.).
  - Is code’s performance part of its ‘external behaviour’?
    - Refactoring that worsens performance should almost certainly be rejected.
    - However, even when refactoring improves performance, you should check carefully whether this has any negative effects.
  - Comprehensive unit tests are essential for verifying that refactoring hasn’t affected the code’s external behaviour.
- Keep refactoring separate from changing features and functionality.
  - It’s easier to refactor if you’re not also trying to change functionality.
  - It’s easier to change functionality if you don’t have to worry about writing elegant code at the same time.
  - Confusing these tasks undermines both.
- Especially when dealing with code that hasn’t previously been refactored, consider refactoring the code as you review it in preparation for your own changes.
  - Refactored code is much easier to understand and change.
- When you’ve finished refactoring:
  - The software shouldn’t do anything new or different.
  - The software should still pass all its tests.
  - There should be no need to change or add any tests to see whether the software is still working properly.
    - Unless it is an extremely low-level test that actually verifies implementation details, not features or functionality as such.
    - When this occurs, refactor the tests themselves, and make sure that you retest them fully too.
  - All code smells have gone.
- Test frequently:
  - Testing reassures the developer that refactoring hasn’t broken the code’s intended functionality.
  - Test refactoring changes at least once between each refactoring.
  - … and ideally before each commit.
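To make the point about tests concrete, here is a minimal sketch using Python’s standard `unittest` module. The leap-year function is an invented example, and keeping the original implementation alongside the refactored one is an assumption made purely so the two can be compared. In practice the old version would usually be deleted once the tests pass.

```python
import unittest


def is_leap_year_original(year):
    # Hypothetical original: nested conditionals that are hard to scan.
    if year % 4 == 0:
        if year % 100 == 0:
            if year % 400 == 0:
                return True
            return False
        return True
    return False


def is_leap_year(year):
    # Refactored: the same rule expressed as a single condition.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearBehaviour(unittest.TestCase):
    """These tests describe external behaviour only, so they need no
    changes when the implementation is refactored."""

    cases = {1900: False, 2000: True, 2023: False, 2024: True}

    def test_known_years(self):
        for year, expected in self.cases.items():
            self.assertEqual(is_leap_year(year), expected)

    def test_refactoring_preserved_behaviour(self):
        # Compare the old and new implementations over a wide input range.
        for year in range(1, 3000):
            self.assertEqual(is_leap_year(year), is_leap_year_original(year))
```

Saved as a module, the tests can be run with `python -m unittest` after each small refactoring step and again before each commit.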
Refactoring is often motivated by ‘code smells’. ‘Code smell’ is the name given by developers to a symptom of a more serious illness in the code.
Technically code smells are heuristics, and should tell the developer:
- Where to look for problems.
- What sort of problems to expect.
- What sort of refactoring techniques they should try.
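Because smells are heuristics, some of them can be approximated mechanically. The following sketch uses Python’s standard `ast` module to flag two commonly cited smells, long parameter lists and long functions. The thresholds and the function name `find_smells` are assumptions chosen for illustration, not established standards.

```python
import ast

# Assumed thresholds for this sketch; real teams choose their own.
MAX_PARAMETERS = 4
MAX_LINES = 30


def find_smells(source):
    """Return (function name, smell) pairs for functions over a threshold."""
    smells = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Counts ordinary positional parameters only.
        if len(node.args.args) > MAX_PARAMETERS:
            smells.append((node.name, "long parameter list"))
        # end_lineno is available from Python 3.8 onwards.
        if node.end_lineno - node.lineno + 1 > MAX_LINES:
            smells.append((node.name, "long function"))
    return smells


SAMPLE = """
def create_user(name, email, street, city, postcode, country):
    return name

def greet(name):
    return "Hello " + name
"""

print(find_smells(SAMPLE))  # [('create_user', 'long parameter list')]
```

A result like this only says where to look. Whether `create_user` really needs refactoring (for example, by grouping the address fields into one object) remains a judgement for the developer.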
Code smells aren’t bugs:
- Bad practice may not affect how well the code implements the story it was built for, or be technically wrong.
- But bad practice may mean that future development is harder or slower or more buggy, or that small problems need surprisingly large solutions.
- Generally speaking bad smells mean more technical debt.
Code smells are also often symptoms of more than one disease, so investigating one code smell may well lead to other smells being detected.
Although ‘code smell’ applies only to code, similar smells can exist in any part of the delivery cycle, from Product Vision and Plan to architecture to code libraries and development or production environments.
Code smells often result from external pressures on the development cycle, such as ‘time-to-market’ taking priority over the long-term investment value of the code.
Look here for a taxonomy of code smells.
Criticisms of code smells
Although code smell is an intuitively plausible idea, it is open to criticism:
- Code smells are subjective. There is evidence that code smell depends on:
  - The individual developer.
  - The developer’s level of experience.
  - The task they are performing (e.g., development vs maintenance).
- So without a more systematic approach to evaluating code, code smell is a weak basis for driving refactoring.
Refactoring is not limited to small tweaks to low-level code. Refactoring to design patterns (and eliminating anti-patterns) helps developers achieve the same objective. Indeed, design patterns are often adopted specifically to ensure that code-level refactoring is kept to a minimum from the start.
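As one example of refactoring towards a pattern, the sketch below replaces a growing conditional with a simple form of the Strategy pattern, implemented as a dictionary of functions. The shipping methods and prices are invented for illustration.

```python
# Hypothetical "before": every new shipping method means editing this conditional.
def shipping_cost_before(method, weight_kg):
    if method == "standard":
        return 3.0 + 0.5 * weight_kg
    elif method == "express":
        return 8.0 + 1.0 * weight_kg
    elif method == "collect":
        return 0.0
    raise ValueError("unknown shipping method: " + method)


# Hypothetical "after": each pricing rule is a separate strategy.
def standard(weight_kg):
    return 3.0 + 0.5 * weight_kg


def express(weight_kg):
    return 8.0 + 1.0 * weight_kg


def collect(weight_kg):
    return 0.0


# Adding a method now means adding an entry here, not editing existing logic.
SHIPPING_STRATEGIES = {
    "standard": standard,
    "express": express,
    "collect": collect,
}


def shipping_cost(method, weight_kg):
    """Look up the strategy for the method and delegate to it."""
    try:
        strategy = SHIPPING_STRATEGIES[method]
    except KeyError:
        raise ValueError("unknown shipping method: " + method) from None
    return strategy(weight_kg)
```

Both versions return the same cost and raise the same error for an unknown method, so this still counts as refactoring rather than a change in functionality.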
For more on design patterns:
- See Wikipedia’s article on Design Patterns for a list and explanation of overall code structures for different purposes.
- See the Refactoring Guru.
Criticisms of design patterns
Like refactoring itself, design patterns are not immune to criticism. These are valid criticisms, but they don’t prevent patterns from being valuable to developers.
- Emergent design is an important attribute of Agile development:
  - The optimal design is often unclear until a good deal of the codebase has been written.
  - So patterns should not be applied too rigidly in advance, or enforced too rigidly as a ‘standard’.
- Design patterns are often no more than abstractions any experienced developer would eventually apply.
  - In other words, using patterns is not really a different approach.
  - Even with poorly structured code, a few cycles of low-level refactoring will often lead to higher-level patterns emerging spontaneously.
- Like refactoring, using patterns is a trade-off between the long-term benefits of maintainable, extensible structure and the short-term value of performance, code efficiency, etc.
- If you only have a hammer, every problem looks like a nail.
  - And this is true even if you have a whole toolbox of patterns.
Many tools include some automated support for refactoring, although the range of capabilities they offer is generally limited.
Wikipedia maintains a useful list of refactoring tools.
Refactoring databases is considerably more difficult than refactoring application code. This is because refactoring a database requires any changes to preserve informational semantics (i.e., information content) as well as external behaviour.
- E.g., data normalisation must also avoid changing the interpretation of the data.
- E.g., migrating a legacy database must also avoid changing its semantics.
This in turn is difficult because databases are typically massively ‘coupled’ to a wide range of system components, from application code to database management tools to documentation.
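The sketch below shows a small normalisation carried out with Python’s standard `sqlite3` module, followed by a check that the information content has survived. The schema and data are invented, and the sketch assumes that a (name, city) pair uniquely identifies a customer. That assumption is itself a question of data semantics, and would have to be confirmed before a refactoring like this was applied to a real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical starting schema: customer details repeated on every order.
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_name TEXT, customer_city TEXT, amount REAL);
    INSERT INTO orders (customer_name, customer_city, amount) VALUES
        ('Ada', 'London', 10.0),
        ('Ada', 'London', 25.5),
        ('Grace', 'New York', 40.0);
""")

# Record what the data says before the refactoring.
before = conn.execute(
    "SELECT id, customer_name, customer_city, amount FROM orders ORDER BY id"
).fetchall()

# Refactoring: move the repeated customer details into their own table.
conn.executescript("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY, name TEXT, city TEXT, UNIQUE (name, city));
    INSERT INTO customers (name, city)
        SELECT DISTINCT customer_name, customer_city FROM orders;
    CREATE TABLE orders_new (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id), amount REAL);
    INSERT INTO orders_new (id, customer_id, amount)
        SELECT o.id, c.id, o.amount
        FROM orders o
        JOIN customers c
          ON c.name = o.customer_name AND c.city = o.customer_city;
    DROP TABLE orders;
    ALTER TABLE orders_new RENAME TO orders;
""")

# The same report, rebuilt from the normalised tables.
after = conn.execute("""
    SELECT o.id, c.name, c.city, o.amount
    FROM orders o JOIN customers c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()

assert before == after, "normalisation changed the information content"
```

Even in this toy case, any application code, report or tool that queried `orders.customer_name` directly would now break, which illustrates the coupling problem described above.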
For more on database refactoring, see Scott Ambler’s:
Refactoring is like getting your car serviced: if you don’t, one night your car may leave you stranded miles from home.
For want of a little careful attention from time to time, you are massively inconvenienced and have a huge bill to pay for getting towed home. Of course, it might not happen at all, but is it worth taking the risk?
The ultimate purpose of refactoring is to improve the quality of your code. Code quality is an issue on at least two levels:
- Code that successfully provides a given feature or function isn’t always optimally maintainable and extensible.
- Code is subject to ‘code rot’, as unanticipated changes in the surrounding environment affect it.
In fact refactoring is needed anywhere that the basic rules of object-oriented development are violated. Indeed, the invention of object-oriented programming can be seen as a truly radical attempt to industrialise refactoring. And the same applies wherever the counterparts of object-oriented development are not adhered to at higher levels, such as the principles of good design or architecture.
Much of what follows can be summarised in a single fact – that, apart from the most short-term or minimal developments, it is both more efficient and more effective to create simple software than it is to search for the easiest, most convenient or even the fastest methods of development.
The reason for this is summarised in this diagram. In brief, the benefits of developing the easy way are soon overwhelmed by the costs – and the reverse is true when you emphasise the simplicity of the software and other artefacts.
While developing new code (especially when using Test-Driven Development), refactoring offers benefits at many levels.
Improvements to code
Refactoring improves the working software you deliver in many ways:
- Refactoring makes code more maintainable.
  - Rationalising the structure (i.e., improved and more explicit mapping of the code’s functionality onto its structure) makes functionally complicated code structurally simpler.
  - The structure of the code (its internal architecture or object model) is made more expressive:
    - It makes explicit its authors’ intentions, or what they want the code to do rather than how to do it.
  - Increased readability makes the code easier to understand.
    - The improved expressivity of properly refactored code makes its intention, functioning and purpose clear to other developers.
    - As more time is spent reading useful code than writing it, and it is generally read more by other people than by its author, this is critical to its value.
    - This in turn makes it easier to evaluate the problems and identify potential solutions.
  - Refactoring tends to uncover and remove undiscovered bugs.
- Refactoring makes code more extensible.
  - Improving the code’s logical structure makes it easier to change.
  - Isolating functions:
    - Makes them more accessible to the developer.
    - Means less knowledge is required to use them properly.
    - Creates a system of reusable building blocks.
  - More expressive code is less risky to alter, with fewer unexpected effects.
Benefits to the developer
From the individual developer’s perspective:
- Previously refactored code:
  - Is more expressive and less ‘noisy’ – there are fewer problems with understanding and changing it.
  - Can be improved with less effort, rework and stress.
- Refactoring their own code increases job satisfaction.
Benefits to the IT profession
From the point of view of the IT profession as a whole, refactoring is one of the key elements of software development that allow it to be thought of as true engineering.
- It ensures that robust engineering disciplines are applied.
- It increases the quality of work.
- It encourages the creation of technically advanced disciplines, development strategies, tools and techniques.
Benefits to the business/organisation
From the point of view of the business and the organisation as a whole:
- Refactoring accelerates the delivery of working software, and therefore of real business value.
- Refactoring makes plans and strategies easier and more reliable to implement.
- Refactoring makes it more likely that budgets and deadlines will be met reliably.
- Refactoring is good value for money: it reduces the cost and duration of future development by more than it increases the cost and duration of current development.
- Refactoring is a basic tool for maximising the return on your investment in code.
- Refactoring reduces Technical Debt.
As code accumulates it tends to ‘rot’. That is, it gets worse even though it has not been changed.
How is this possible? Software rots because of small, unplanned, unnoticed interactions. When fixing bugs, modifying interfaces, adding minor features, updating libraries, etc., other code is affected indirectly.
- Code is duplicated as new functionality is added.
- Code is poorly located for future development.
- Names lose their meaning as code is modified.
- Methods and classes become overloaded with new and only loosely connected functions.
- Hastily created dependencies turn good code to spaghetti.
- The environment changes, leading to anything from small changes in how older code works to making it impossible to run older code at all.
- Entire programming languages (or other system components) become obsolete, and support is no longer available.
- Previously unused code comes into use, exposing previously unrecognised bugs.
- The code can no longer be understood because the original developers have moved on, leaving poorly structured code and no documentation.
More generally, code rot makes coding harder, more expensive, less reliable, and much more frustrating for the developer.
By constantly reviewing and refining the structure of code and its relationship to the software’s functionality, refactoring aims to prevent code rot.
NB: Not all of these forms of code rot can be realistically reversed by refactoring unless refactoring is extended to all aspects of the system, including non-code components. Even if technically possible, this can massively undermine the business justification for doing so.
Issues and risks
Refactoring is subject to a wide range of issues and risks: