
Tuesday, June 23, 2009

Writing poor code requires skill

Recently I read an interesting post on decaying code and the various approaches that could be adopted to improve code quality. It briefly outlined three approaches to improving code (presumably on a code base which has evolved through enhancements and bug fixes):
  • Detect the bad code and fix it. But this is too expensive...
  • Don't write it in the first place. But this requires you (or a tool) to be able to spot bad code (consistently) in the first place.
  • Formal Training. This is fine, but how do you ensure that the training is put into practice correctly? And it is all too easy to fall back into bad habits which won't get spotted.
This got me thinking. Are there any other approaches to prevent bad code (or code smells)? Well, I reckon there are. In fact, it should take skill to write poor code given the amount of help now available for constructing software.

Increase use of automatic code generation

When I started programming last century, there were two languages I could choose to write my software in: a high-level language and assembly language. While all but the most time-critical code was written in the high-level language, I sometimes found it necessary to look at the generated assembly to understand why my program was failing or to optimise the code. Now I never look at the object code, because I never question the quality of what the compilers generate. However badly the source is written, the compiler will normally ensure that the generated code is efficient; this often means that engineers can get away with sloppy or bad code because the compiler quietly reworks it into something more optimal. It also encourages, in my opinion, the lazy programmer who knows that the compiler does all the hard work of 'writing' good code.
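
As a rough illustration (a sketch of my own, not something from the post I read), consider the two fragments below. An optimising compiler will often hoist the loop-invariant strlen() call out of the first version, so the sloppy source and the tidy source end up generating much the same object code:

    #include <stddef.h>
    #include <string.h>

    /* Sloppy: strlen() is called on every iteration even though the
     * string never changes inside the loop - O(n^2) as written. */
    size_t count_spaces_sloppy(const char *s)
    {
        size_t count = 0;
        for (size_t i = 0; i < strlen(s); i++)
            if (s[i] == ' ')
                count++;
        return count;
    }

    /* Tidy: the invariant call is hoisted by the programmer,
     * not left to the compiler. */
    size_t count_spaces_tidy(const char *s)
    {
        size_t count = 0;
        size_t len = strlen(s);
        for (size_t i = 0; i < len; i++)
            if (s[i] == ' ')
                count++;
        return count;
    }

Built with optimisation enabled (e.g. gcc -O2), the two versions typically behave almost identically, which is exactly why the sloppy source survives unnoticed.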

The increasing use of model driven development (MDD) as part of a model driven architecture (MDA), as a way of improving software productivity, is moving the goalposts again. In its purest form, development uses visual tools, and the generated code (in a high-level language) and subsequent object code are never seen by a human. However, I have yet to see any real-world MDD which doesn't involve some algorithmic code still being written by hand in a conventional high-level language. Over time, the increasing use of MDD should result in less 'algorithmic' code being written, which by implication reduces the amount of code that can decay.

MDD offers the opportunity for engineers to focus on the design rather than the implementation, which should result in more maintainable systems that can readily be adapted to include enhancements and changes throughout their life. As the generated code is not the primary artifact which engineers work with, the quality of the generated code becomes less important. However, a key question to consider is whether the use of MDD can, over time, lead to a decaying model, and if so, how do you prevent it? And can you perform 'bad' MDD?

Reuse (of code snippets)

There are an increasing number of source code repositories (for example see here) now available on the Internet, offering a variety of resources from simple algorithms to reusable components. All of these repositories help by providing a (hopefully!) proven way of solving a particular problem, and the snippets should also be written in such a way as to be readable in case they need to be tweaked in the future. Although I cannot guarantee that every snippet will avoid being 'bad' code, it is reasonable to expect that many will be examples of 'good' code. If a snippet can be used 'as is' without modification, the code should remain maintainable; if it is modified, the style of the original code should normally be preserved so that the code remains 'good'.

Although the repositories do not require that the code passes any quality checks with regard to maintainability and the like, it should become obvious that the better code will be downloaded more frequently.

Use Multiple Compilers

I have always advocated compiling code with two different compilers as a way of improving code quality. No two compilers are ever the same, as each has different strengths and weaknesses. I have also always promoted 'clean' compilation, i.e. ensuring that all code compiles without warnings once the set of compile options has been defined. If the code compiles cleanly with two different compilers, there is an increased probability that the code is well-structured, which IMO implies that the code will be more maintainable. It should also help testing, as a number of latent faults can often be removed prior to run-time.
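
The fragment below is a small sketch of my own (the file name and the fault are invented) showing the kind of latent fault this approach removes before run-time; the build lines in the comment assume gcc and clang as the two compilers:

    /* lookup.c - illustrative sketch only.
     *
     * Example 'clean' builds, assuming gcc and clang are the chosen pair:
     *   gcc   -Wall -Wextra -Werror -c lookup.c
     *   clang -Wall -Wextra -Werror -c lookup.c
     */
    #include <stdio.h>

    static const int table[] = { 3, 5, 7, 11 };

    int lookup(int index)
    {
        /* An earlier draft compared the signed index directly against the
         * unsigned result of sizeof:
         *
         *     if (index < sizeof(table) / sizeof(table[0]))
         *
         * A negative index is then promoted to unsigned, the bounds check
         * silently passes, and table[-1] is read at run-time. Both gcc and
         * clang flag the signed/unsigned comparison at these warning
         * levels, forcing the fix below before testing even starts. */
        if (index >= 0 && (size_t)index < sizeof(table) / sizeof(table[0]))
            return table[index];
        return -1;
    }

    int main(void)
    {
        printf("%d\n", lookup(-1));  /* safely returns -1 */
        return 0;
    }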

Clearly, if the multiple-compiler approach is adopted it must be used for all subsequent code evolutions; if not, the code will still decay, albeit more slowly than if a single compiler had been used.

Conclusion

Writing maintainable software requires skill. With a bit of thought (and resisting the temptation to code the first thing that comes into your head), quality code can be produced using one or more of the techniques outlined above, which will support future product evolutions in a cost-effective manner.

Friday, June 6, 2008

Assessing code quality

How do you assess a software module's quality? It is a question I have been struggling with for some time as I try to perform a peer review of a large code base.

Over time, a software module evolves from its intended form into something less than beautiful as bugs are discovered (and fixed) and enhancements beyond the original requirements are implemented. This is particularly true for code developed on a multi-person project, where personnel change and a module is often modified by different engineers. Although I adhere to the rule that the code structure should reflect the original author's style (and how many people change the comment at the top of the file to record that they have been one of the authors? This assumes the information isn't added automatically by the configuration management system), it can become increasingly difficult to make changes.

So what is the best way to assess code quality throughout its development?

Wednesday, April 2, 2008

SPA2008 - some final thoughts


It is now 2 weeks since the end of the SPA 2008 Conference and I have finally finished writing my notes. (Yippee!) So what are my final thoughts, and what am I going to take away to explore in the next few weeks and months?

I thought the conference was excellently run with a good mix of software practitioners attending. I was spoilt for choice in selecting the sessions to attend but there are some good notes appearing which summarise the sessions I didn't attend. There are also a number of articles appearing on various blogs. The dialogue during and between sessions was also stimulating and I hope to continue this when time permits.

Technically, I am particularly interested in following up these topics:

I also learnt a lot about myself. The conference confirmed that, generally, the work I do is pretty well in line with what the rest of the software world in the UK is currently doing (or trying to do!). There are differences, but these appeared to be related to the domains, size of developments and team dynamics.

Saturday, March 29, 2008

Smelly Software Architectures

At the SPA 2008 Conference, Mark Dalgarno led an interesting workshop which explored the deterioration of software architecture over time. Mark outlined a number of different conditions which could indicate that the software architecture (assuming there was one to start with!) is decaying from the as-intended architecture. There was an interesting debate about the importance of architecture, with a clear distinction between the architectural minimalism of those practising agile techniques and those engaged in large and complex systems of which software is only a small part of the overall solution.

I firmly believe that a key principle is that any architecture should be resilient to change and should be an enduring artifact. It should also be flexible and scalable, although this can depend on the expected lifetime of the architecture (architectures that are only expected to last for a year or two clearly don't need to exhibit the same attributes as an architecture which is intended to last for 10 years or more). This also requires that architecture and design are seen as distinct activities - I have seen many examples where the architecture has been assumed and the detailed design and implementation have proceeded without any thought about the architectural options or future maintenance requirements.

How an architecture can be assessed was an interesting discussion: metrics can be used (e.g. dependency counts), or some potential change scenarios can be considered, although the more agile developments thought this unnecessary. Example scenarios were offered, such as a change in the operating system version or a database upgrade, and the resilience of the architecture to that change. However, it is clearly unachievable to think of all such change scenarios in advance. Examples of changes that weren't originally anticipated, but were then implemented with significant pain, included the addition of error messages in multiple languages. A project driven by time pressures (time-driven coding) was also considered likely to lead to a less enduring solution and a less robust architecture.

The economics of architectural decay was considered, with a view that 'rewriting is considered harmful', particularly for solutions which are deployed widely and where there is no acceptable alternative to maintaining what is currently implemented. While I sympathise with this view, any maintenance regime should always consider UUD (upgrade, update and disposal) of an in-service solution, and recognise that there may come a time when multiple versions of the same solution, operating on potentially different architectures, have to be maintained in parallel.

While the session didn't offer any solutions to a difficult problem, it did pose some interesting thoughts which will require some further consideration for future software architectures.

Thursday, March 27, 2008

Metrics that are useful



As an experiment, I ran one of the BoF sessions at the SPA 2008 Conference on the topic of 'Metrics which are useful'. An interesting discussion ensued in the group consisting of academics, software practitioners and quality specialists.

The following is a summary of my notes:

Why Capture Metrics and what are you trying to achieve?
  1. Purpose of metric capture depends on customer and business

  2. Provide 'Bird's eye view of projects'

  3. Used to improve quality

  4. Used to provide evidence to support quality measure e.g. CMMI

Some thoughts on metrics (good and bad)
  1. SLOC (source lines of code). Easy to calculate once agreement has been reached on what constitutes 'a line of code', but not considered a good measure. Not particularly appropriate when the system includes COTS components as part of the solution. SLOC can change depending on language choice. There is no incentive to encourage reuse (see later), abstractions etc, and it can lead to excessive cut and paste.

  2. Coupling/Cohesion of interfaces as a mechanism for showing well-designed modules which can be reused

  3. Use metrics to monitor 'right first time' during integration. The key issue is how and what to measure. Can also be used as a measure of achievement by the Project Manager.
    Monitoring reuse is valuable but difficult to measure or demonstrate. A potential measure could be the number of hours saved. Reuse is dependent on expertise, team, business processes and functionality.

  4. Measuring quality of code by examining use (or not!) of framework primitives and higher levels of abstractions.

  5. Number of dependencies (Java) – the more a class is used, the greater the chance that bugs could have been found. Also consider number of interfaces used by component.

  6. Number of tests passed. Not a good measure, as it says nothing about requirements achievement. A better measure would be the number of requirements passed – each test would have to reference the requirement(s) which are being tested (in part); a sketch of this idea follows this list.

  7. Measuring source code changes between different phases, e.g. unit test and functional test.
    Code coverage and the number of tests (and tests completed) are not particularly useful. Code coverage for TDD is always 100%.

  8. Measuring capabilities delivered can be a useful metric, particularly when adapted to meet business needs.

  9. Key Performance Indicators (KPIs) are often used as measures of performance across a diverse set of projects (i.e. not just software).
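
The fragment below is a minimal sketch of the idea in point 6 (the harness, test names and requirement IDs are all invented for illustration): each test case carries the requirement(s) it exercises, so the results can be reported in terms of requirements rather than raw test counts.

    /* req_trace.c - illustrative sketch only. */
    #include <stdio.h>

    struct test_case {
        const char *name;
        const char *requirements;   /* comma-separated requirement IDs */
        int (*run)(void);           /* returns 1 on pass, 0 on fail */
    };

    /* Unit under test: a trivial clamp function. */
    static int clamp(int value, int lo, int hi)
    {
        if (value < lo) return lo;
        if (value > hi) return hi;
        return value;
    }

    static int test_clamp_low(void)  { return clamp(-5, 0, 10) == 0; }
    static int test_clamp_high(void) { return clamp(99, 0, 10) == 10; }
    static int test_clamp_mid(void)  { return clamp(4, 0, 10) == 4; }

    static const struct test_case tests[] = {
        { "clamp_low",  "REQ-017",         test_clamp_low  },
        { "clamp_high", "REQ-017,REQ-021", test_clamp_high },
        { "clamp_mid",  "REQ-021",         test_clamp_mid  },
    };

    int main(void)
    {
        int passed = 0;
        size_t n = sizeof(tests) / sizeof(tests[0]);

        for (size_t i = 0; i < n; i++) {
            int ok = tests[i].run();
            passed += ok;
            printf("%-12s [%s] %s\n", tests[i].name,
                   tests[i].requirements, ok ? "PASS" : "FAIL");
        }
        printf("%d of %zu tests passed\n", passed, n);
        return passed == (int)n ? 0 : 1;
    }

A fuller harness would aggregate the results per requirement ID to give the 'requirements passed' figure; the point here is simply that the traceability has to live alongside the test.
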
Using Metrics
  1. What to do with the data once calculated/presented? Metrics must be presented in an easy-to-understand form (examples include traffic-light reports with appropriate thresholds for each colour, and graphs (are these always clear?)).

  2. How frequently should the data be reviewed and corrective action instigated? The period should probably be a function of the size of the project/development and the anticipated development time. For small projects it may be appropriate to measure daily (probably as part of an overnight build); for other projects it may be more appropriate to report weekly or monthly, depending on the likely changes between each report.

  3. The cost of measuring/calculating the metrics should be negligible.

  4. Metrics are always a snapshot. Examine the trends/dynamics rather than the absolute values, and take appropriate action if the trend is going in the wrong direction (a small sketch of this follows the list). It doesn't matter if you have 1000 bugs this week; what matters is that next week you have fewer than 1000 bugs! Any thresholds need to be appropriate to the project and should be reviewed and revised continuously. New developments may be able to start with a threshold of no compilation warnings when all code is new; this threshold might not be appropriate for a legacy system.
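
The fragment below is a tiny sketch of point 4 (the weekly figures and the choice of metric are invented): it reports only the direction of travel of a warning count, not its absolute value.

    #include <stdio.h>

    /* Weekly compiler-warning counts for a project (invented data). */
    static const int warnings_per_week[] = { 140, 132, 125, 131, 150 };

    int main(void)
    {
        size_t n = sizeof(warnings_per_week) / sizeof(warnings_per_week[0]);

        /* Report the trend between consecutive weeks, not the absolute
         * count: a rising figure is what calls for corrective action. */
        for (size_t i = 1; i < n; i++) {
            int delta = warnings_per_week[i] - warnings_per_week[i - 1];
            printf("week %zu: %d warnings (%+d) %s\n",
                   i + 1, warnings_per_week[i], delta,
                   delta > 0 ? "<-- trend worsening" : "");
        }
        return 0;
    }
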
Some Recommendations
  1. Measure the business value and not the code. Measuring something that is significant to the business is more important. Examples include how many times the software has been used.

  2. Measurement must be understood by EVERYONE (and described in appropriate language), easy to calculate (i.e. not subjective) and explain what it means to the business. The types of metrics selected depend on the type of organisation (and the business structure) and frequently change!

  3. Metrics should be used to encourage good practice (e.g. reuse, abstractions, frameworks) and not used to punish offenders!

  4. Starting a project from scratch is the ideal time to set good practices. However, regardless of where metrics are introduced, the key practice is to monitor the dynamic nature of the project.