Tuesday, July 20, 2010

Lessons in measurement and data analysis

Recently I attended a very interesting and entertaining lecture by Peter Comer, from Abellio Solutions, to the BCS Quality Management Specialist Interest Group (NW) on lessons learnt in measurement and data analysis following a recent quality audit of an organisation’s quality management system (QMS).

The talk started by highlighting the requirements for measurement in ISO 9001 (section 8). Key aspects included:

  • Measuring processes within the QMS to demonstrate conformity with, and the effectiveness of, the QMS
  • Monitoring and measuring processes, products and customer satisfaction with the QMS
  • Handling and controlling defects in products and services
  • Analysing data to determine the suitability and effectiveness of the QMS
  • Driving continual improvement through corrective and preventive actions

It was noted that every organisation has KPIs (Key Performance Indicators) to measure the effectiveness of its products and services, although each organisation uses them slightly differently.

Peter outlined the context of the audit: an internal audit, in preparation for a forthcoming external audit, of a medium-sized organisation with a small software group working in the transport domain. The audit found a number of minor non-conformances, all relatively straightforward to correct. However, an interesting discussion ensued afterwards about the software development process: the group was finding more bugs in its bespoke software development than anticipated, and the bugs were proving a lot harder to fix. Initial suggestions included:

  • Look at risk management practices. However, the organisation had already done this by reviewing an old (2002) downloaded paper on risk characteristics.
  • Look at alternative approaches to software development.

It was the approach to risk that intrigued Peter, and the quality of the downloaded paper was immediately considered. Had it been peer-reviewed? Was it still current and relevant?

Peter then critiqued the paper. The paper proposed a number of characteristics supplemented by designators; it was quickly observed that there was considerable overlap between the designators. The data had been analysed across a number of different sources, yet with no indication of what the counting rules were (nor whether they had been applied rigorously and consistently). The designators were not representative of all the risk factors that may affect a development, and said nothing about their relevance to the size of the development. The characteristics focused on cultural issues rather than technical issues, whereas risk characteristics should cover both. Simply counting risk occurrences does not demonstrate the impact that a risk could have on the project.

Turning to the conclusions, Peter considered whether they were valid. Would the conclusions be different if the data were analysed in a different way? Can we be assured that the data was analysed by someone with prior experience of software development? It was observed that the designators had been shaped to particular criteria, which is appropriate, but one size does not fit all. Only by analysing the data in a number of different ways can its significance be established; such analysis can also reveal whether the data is unbalanced, which can in turn lead to skewed results. In the paper under review, it was clear that qualitative data was being used quantitatively.

Peter concluded that ignoring simple objective measures can lead to the wrong corrective approach, one that may not be appropriate to the organisation's process and product, because 'you don't know what you don't know'. It is essential to formally define what to count (this is the metric), with the aim of making the data objective. Whatever the collection method, it must be stated explicitly so that it is applied consistently.
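As an illustration of what 'formally define what to count' might look like in practice, here is a minimal sketch (my own, not from the talk) in which the counting rule for a hypothetical 'escaped defect' metric is written down as executable code, so the rule cannot drift between collections:

    from dataclasses import dataclass

    @dataclass
    class Defect:
        severity: str      # e.g. "minor", "major", "critical"
        component: str
        found_in: str      # phase in which the defect was found, e.g. "test", "live"

    # Counting rule, stated explicitly: only defects found after release ("live")
    # with severity "major" or "critical" are counted. Changing the rule changes
    # the metric, so the rule is part of the metric's definition.
    def escaped_defect_count(defects: list[Defect]) -> int:
        return sum(
            1 for d in defects
            if d.found_in == "live" and d.severity in {"major", "critical"}
        )

    if __name__ == "__main__":
        log = [
            Defect("minor", "ui", "test"),
            Defect("major", "billing", "live"),
            Defect("critical", "billing", "live"),
        ]
        print(escaped_defect_count(log))  # 2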

The talk was very informative and left much food for thought. I have always aimed to automate the collection process in order to make it consistent. However, that counts for little if the data is then interpreted incorrectly or inconsistently. It is also difficult to know whether you are collecting the right data in the first place, but that is what experience is for!

Wednesday, March 17, 2010

Creating Intelligent Machines



I have just attended the excellent IET/BCS 2010 Turing Lecture 'Embracing Uncertainty: The New Machine Intelligence' at the University of Manchester which was given this year by Professor Chris Bishop who is the Chief Research Scientist at Microsoft Research in Cambridge and also Chair of Computer Science at the University of Edinburgh. The lecture allowed Chris to share his undoubted passion for machine learning, and although there were a number of mathematical aspects mentioned during the talk, Chris managed to ensure everyone was able to understand the key concepts being described.

Chris started by explaining that his interest is in building a framework for embedding intelligence in computers, something which has been a goal for many researchers for many years. This is becoming increasingly important due to the vast amount of data now available for analysis. With the amount of data doubling every 18 months, there is an increasing need to move away from purely algorithmic ways of reviewing the data towards solutions based on learning from the data. This has traditionally been the goal of machine (or artificial) intelligence and, despite what Marvin Minsky wrote in 1967 in 'Computation: Finite and Infinite Machines' that "within a generation ... the problem of creating 'artificial intelligence' will substantially be solved", the problem still does not have a satisfactory solution for many classes of problem.

A quick summary of the history of artificial intelligence showed that neither expert systems, which were good at certain applications but required significant investment in capturing and defining the rules, nor neural networks, which provide a statistical learning approach but have difficulty in capturing the necessary domain knowledge within the model, were adequate for today's class of problems. An alternative approach that could integrate domain knowledge with statistical learning was required, and Chris's answer was a combination of three ingredients:
  1. Bayesian Learning, which uses probability distributions to quantify the uncertainty in the data. The distributions are updated once 'real data' is applied to the model, which reduces the uncertainty (a small sketch of this idea follows the list).
  2. Probabilistic Graphical Models, which allow domain knowledge to be captured as a directed graph, with each node having a probability distribution.
  3. Efficient inference, which keeps the computation tractable.
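To make the first ingredient concrete, here is a minimal sketch of a Bayesian update (my own toy example, not one used in the lecture): a Beta prior over an unknown success rate is updated with observed counts, and the variance, i.e. the uncertainty, shrinks as data arrives.

    # Beta-Binomial update: a Beta(a, b) prior over an unknown success rate is
    # sharpened as observed successes and failures arrive, so uncertainty shrinks.
    def beta_update(a: float, b: float, successes: int, failures: int) -> tuple[float, float]:
        return a + successes, b + failures

    def beta_mean_and_variance(a: float, b: float) -> tuple[float, float]:
        mean = a / (a + b)
        variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
        return mean, variance

    if __name__ == "__main__":
        a, b = 1.0, 1.0                                      # uniform prior: maximum uncertainty
        print(beta_mean_and_variance(a, b))                  # (0.5, ~0.083)
        a, b = beta_update(a, b, successes=27, failures=3)   # 'real data' is applied
        print(beta_mean_and_variance(a, b))                  # (~0.875, ~0.003): uncertainty reduced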
To explain the approach, Chris sensibly used real-life case studies to demonstrate the application of the theory in three very diverse domains.

His first example was a Bayesian ranking system for producing a global ranking from noisy partial rankings. The conventional approach is the Elo rating system, a method for calculating the relative skill levels of players in two-player games; however, Elo cannot handle team games or more than two players. As part of the launch of the Xbox 360 Live online playing service, Microsoft developed the TrueSkill algorithm to match opponents of similar skill levels. TrueSkill converges far faster than Elo by managing the uncertainty in a more efficient way; it also operates quickly enough that users can find suitable opponents within a few seconds from a user population of many millions. Further details on TrueSkill(TM) are available at http://research.microsoft.com/en-us/projects/trueskill/
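For comparison, this is a sketch of the classic two-player Elo update that TrueSkill was designed to replace (TrueSkill itself models each player's skill as a Gaussian and is not reproduced here); the numbers are illustrative only:

    # Classic Elo update for a two-player game.
    def expected_score(rating_a: float, rating_b: float) -> float:
        # Probability that A beats B, given the current ratings.
        return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

    def elo_update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0) -> tuple[float, float]:
        delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
        # The two ratings move by the same amount in opposite directions.
        return rating_a + delta, rating_b - delta

    if __name__ == "__main__":
        # An upset (the lower-rated player wins) moves the ratings further than an expected result.
        print(elo_update(1400, 1600, a_won=True))   # approximately (1424, 1576)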

The next example concerned a website serving adverts: how to determine which advert to show, based on the probability of it being clicked and the value of a click. The proposed approach used Gaussian probability distributions to assign weights to a number of features, which are then used to determine the ranking. However, it is important that the system continually learns and re-evaluates the ranking, so that the solution accurately reflects the dynamics of the adverts. If this were not the case, it would be very difficult for a new advert ever to be served.
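The lecture described Gaussian weights over advert features; the sketch below is a deliberately simplified stand-in of my own (a per-advert smoothed click-rate rather than a feature model) just to show the two moving parts: rank by estimated click probability times value per click, and keep updating the estimate after every impression so that new adverts still get a chance.

    from dataclasses import dataclass

    @dataclass
    class Advert:
        name: str
        value_per_click: float
        clicks: int = 0
        impressions: int = 0

        def click_probability(self) -> float:
            # Laplace-smoothed estimate, so a brand-new advert is not starved of impressions.
            return (self.clicks + 1) / (self.impressions + 2)

        def expected_value(self) -> float:
            return self.click_probability() * self.value_per_click

    def choose_advert(adverts: list[Advert]) -> Advert:
        # Serve the advert with the highest expected value under the current estimates.
        return max(adverts, key=lambda ad: ad.expected_value())

    def record_outcome(ad: Advert, clicked: bool) -> None:
        # Continual learning: every impression updates the estimate used for ranking.
        ad.impressions += 1
        if clicked:
            ad.clicks += 1

    if __name__ == "__main__":
        ads = [Advert("shoes", value_per_click=0.40), Advert("holidays", value_per_click=1.20)]
        shown = choose_advert(ads)
        record_outcome(shown, clicked=False)
        print(shown.name, round(shown.expected_value(), 3))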

The final example was the Manchester Asthma and Allergy Study which is working with a comprehensive data set acquired over 11 years. The data set is continually being augmented with new types of data (recently genetic data has been added) and the study has been successful at establishing the important variables and features and their relationships. By defining a highly structured model of the domain knowledge, it has been possible to assign each variable a probability distribution. By placing the data at the heart of the study and applying some machine learning techniques, a number of key observations are now being reported which might not have been apparent if more traditional statistical techniques had been used.
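To illustrate what 'each variable has a probability distribution' means in a directed model, here is a tiny two-node example; the variables and numbers are invented for illustration and are not taken from the study:

    # Toy directed graphical model: Exposure -> Symptom.
    # Each node carries a distribution; the child's is conditional on its parent.
    p_exposure = {"high": 0.3, "low": 0.7}                   # P(Exposure)
    p_symptom_given_exposure = {                              # P(Symptom | Exposure)
        "high": {"yes": 0.6, "no": 0.4},
        "low":  {"yes": 0.1, "no": 0.9},
    }

    def marginal_symptom(symptom: str) -> float:
        # Marginalise out the parent: P(Symptom) = sum over Exposure of P(Exposure) * P(Symptom | Exposure).
        return sum(p_exposure[e] * p_symptom_given_exposure[e][symptom] for e in p_exposure)

    if __name__ == "__main__":
        print(marginal_symptom("yes"))   # 0.3*0.6 + 0.7*0.1 = 0.25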

As a closing remark, Chris promoted Infer.NET, a product from Microsoft Research which provides a framework for further experimentation in developing Bayesian models for a variety of machine learning problems.

As is now traditional with the Turing Lecture, it is presented at several locations around the country. A webcast of the version presented at the IET in London is available on the IET TV channel.

Friday, November 27, 2009

Performance Management for a Complex World


A thought-provoking presentation on Performance Management in a Fast Moving World was recently hosted by the Manchester branch of the IET, given by Dr Therese Lawlor-Wright (University of Manchester), Elizabeth Jovanovic (EEF) and Andrew Wright of Dynamic Technologies Ltd.

After a brief introduction to what performance measurement is (determining the quality of execution, competence and effectiveness of an actor or operation compared against a standard) and how it can be applied both to people (typically in the form of appraisals) and to processes and systems (typically in the form of project reviews), the presentation made a few key observations about how performance measurement is currently carried out.
  • Organisations define performance measurements at the strategic level to see how they can achieve their goals, often stated in terms of cost, quality and time and reported against a number of key performance indicators (KPIs). These can then be used to communicate and confirm (or adjust) progress.
  • Any KPIs used must be balanced, fair and transparent
  • The performance measurements must include indicators of the effectiveness of the organisation (the extent to which the strategic objectives are being met) and the efficiency of the organisation (how economically its resources are being used to provide that level of performance).
Whilst the measurements can be useful, there was a word of caution about the unwanted effects of performance management systems (citing De Bruijn (2002)):
  • The measurements will take too long to accumulate and will soon be out of date (data lag)
  • Competition between teams or business units can result in a tendency to not share valuable data
  • The measurements can encourage efficiency at the expense of innovation
  • There will be 'game' playing to maximise the 'score'
These effects will dominate in the long term. To counter this, the measurements must be refreshed regularly, ideally every 2 to 3 years: long enough for the data to be used for benchmarking and comparison, but short enough to counter the complacency that can otherwise arise.

Managing Performance

Having established the measurements to be made, the presentation moved on to applying those measures in the form of performance management. Performance management needs to be both process-oriented and people-oriented, and must be a continuous process (not just an annual activity, as is often the case). Being continuous helps to clarify expectations, standards and targets, and allows corrective actions to be addressed as soon as they arise. The most common approach to people-oriented performance management, the appraisal, should link individual targets to organisational targets but must also be an opportunity to praise and develop. As expected, the audience was reminded that all objectives must be SMART. However, an alternative method of specifying a target was proposed - 'Positive - Personal - Present' - which can be used to change behaviour. Done correctly it can improve staff morale, as it apparently tricks your subconscious mind into acting positively (the targets should be written starting 'I ...'). There was a strong suggestion that, contrary to practice in many organisations, performance management must not be linked to remuneration, since doing so warps the process to fit the inevitable budgetary constraints.

The Complex World

The complex world was defined as fast moving (continual change, increasing hierarchies of complexity) combined with increasing challenges (timescales, budgets, mergers, multiple dependencies). These require that performance management is aligned with the business strategy. However, the classical approach to strategic management, with a top-down controlling hierarchy, was considered unsuitable. For complex systems, a more holistic approach is required, with ideas contributed from the bottom up.

In a fast-changing environment, long-term plans quickly lose touch with reality, with inflexible KPIs driving behaviours that fail to respond to real-world challenges. Inconsistency between the strategy, the reality on the ground, the KPIs and the objectives quickly leads to poor performance. Clearly an alternative approach to performance measurement is needed, one which is both flexible and efficient.

Taking some of the ideas from the Agile Manifesto, a more lively and dynamic approach was proposed: one developed by consensus which adapts to the changing environment. This approach addresses many of the unwanted effects of traditional performance measurement schemes by being much more efficient and flexible, with an empowered organisation sharing a common vision. Techniques such as the Balanced Scorecard and the EFQM Excellence Model can clearly help in communicating a comprehensive view of an organisation.

Key Conclusions
  • Strategy must become change oriented with a dynamic response in a controlled manner. The route to the strategic vision may change.
  • Long-term plans leave companies without direction
  • KPIs and objectives must respond to the changing needs
  • The measurement process must be flexible and efficient
  • Good performance must be encouraged by reducing uncertainty
  • Organisations need to move from optimising the 'simple' status quo to optimising 'complex' continuous change

Sunday, September 20, 2009

Open Source Certification







I have just been browsing the relaunched website of the British Computer Society and came across an interesting article on Open Source Certification. There are some pretty important and successful open source applications out there, but there is limited experience of 'certification' in the same way that you can become, for example, Microsoft certified. Red Hat does offer courses leading to Red Hat Certified Engineer (RHCE) status, but this is the exception among open source applications.

The big question is: does it matter? It all depends on your point of view regarding certification. Does the fact that a product is 'certified' make it a better product? Does the fact that an engineer is 'certified' make them a better engineer than one who isn't? As ever, it depends. A certified engineer should certainly have independently demonstrated a degree of competence in using or configuring a product. However, certification without experience to back up the qualification is no use to anyone. Similarly, a certified product might simply demonstrate that the product has become so large and cumbersome that it has to be entrusted to a select band of engineers who understand it better than those who have just learned to tame it to meet their specific requirements. A certified engineer should also probably be aware of a few tricks and tips which are not widely known.

So should all open source products offer a certification programme? In my view, no. However there is clearly a point at which certification becomes necessary or expected by the customer community. I would suggest that this can occur in a number of cases:
  • When the product is becoming widely accepted as one of the market leaders across multiple platforms.
  • When the product is now developed on 'commercial' lines with a funding line.
In either case, a professional certification programme should be promoted and managed, but recognizing that significant experience of a product should be automatically rewarded (on request) with certification, particularly if the experience has been gained through the formative years of the product.

A similar approach was adopted a few years ago by the BCS when it launched the Chartered IT Professional (CITP) qualification. To date, this has yet to become a widely accepted, recognized (and demanded) qualification for key roles within the IT industry. Until recognized qualifications or certifications become a pre-requisite for certain roles in the IT industry, the certifications people achieve will be little more than another certificate to put on the wall or in the drawer. Until that changes, open source certification will be little more than a commercial exercise in raising funds for future product developments.

Tuesday, June 23, 2009

Writing poor code requires skill

Recently I read an interesting post on decaying code and the various approaches that could be adopted to improve code quality. It briefly outlined three approaches to improving code (presumably a code base which has evolved through enhancements and bug fixes):
  • Detect the bad code and fix it. But this is too expensive...
  • Don't write it in the first place. But this requires you (or a tool) to be able to spot bad code (consistently) in the first place.
  • Formal Training. This is fine, but how do you ensure that the 'training' is put into practice correctly? And it is all too easy to fall back into bad habits which won't get spotted.
This got me thinking. Are there any other approaches to preventing bad code (or code smells)? Well, I reckon there are. In fact, it should take skill to write poor code given the amount of help now available when constructing software.

Increase use of automatic code generation

When I started programming last century, there were two languages I could choose to write my software in: one was a high-level language and the other was assembly language. While all but the most time-critical code was written in the high-level language, I sometimes found it necessary to look at the generated assembly code to understand why my program was failing or to optimise the code. Now I never look at the object code, as I never question the quality of the code generated by compilers. However badly the source is written, the compiler will normally ensure that the generated code is efficient; this often means that engineers can get away with sloppy or bad code, as it is effectively refactored in the background by the compiler into something more optimal. In my opinion it also encourages the lazy programmer, who knows that the compiler does all the hard work of 'writing' good code.

The increasing use of model driven development (MDD) as part of a model driven architecture (MDA), as a way of improving software productivity, is moving the goalposts again. In its purest form, the development uses visual tools, and the generated code (in a high-level language) and subsequent object code are never seen by a human. However, I have yet to see any real-world MDD which doesn't involve some algorithmic code still being written by hand in a conventional high-level language. Over time, the increasing use of MDD should result in less 'algorithmic' code being written, which will, by implication, reduce the potential for code to decay.

MDD offers the opportunity for engineers to focus on the design rather than the implementation, which should result in more maintainable systems that can readily be adapted to include enhancements and changes throughout their life. As the generated code is not the primary artifact which engineers work with, the quality of the generated code becomes less important. However, a key question to consider is whether the use of MDD can, over time, lead to a decaying model, and if so, how do you prevent it? And can you perform 'bad' MDD?

Reuse (of code snippets)

There are an increasing number of source code repositories (for example see here) now available on the Internet, offering a variety of resources from simple algorithms to reusable components. All of these repositories help by providing a (hopefully!) proven way of solving a particular problem, and the code should also be written in such a way as to be readable in case the snippet needs to be tweaked in the future. Although I cannot claim that every snippet will avoid being 'bad' code, it is reasonable to expect that many will be examples of 'good' code. If the code snippets can be used 'as is', without modification, the code should remain maintainable; if the code is modified, the style of the original code should normally be preserved so that the code remains 'good'.

Although the repositories do not require that the code passes any quality checks with regard to maintainability and the like, it should become obvious that the better code will be downloaded more frequently.

Use Multiple Compilers

I have always advocated compiling code with two different compilers as a way of improving code quality. No two compilers are ever the same, as each one has different strengths and weaknesses. I have also always promoted 'clean' compilation, i.e. ensuring that all code compiles without warnings once the set of compile options has been defined. If the code compiles cleanly with two different compilers, there is an increased probability that the code is well-structured, which IMO implies that the code will be more maintainable. It should also help testing, as a number of latent faults can often be removed before run-time.
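A minimal sketch of how this might be wired into a build, assuming gcc and clang as the two toolchains (substitute whichever pair your project actually uses) and treating warnings as errors:

    import subprocess
    import sys

    # The compiler names and flags are assumptions for illustration; adapt to your toolchain.
    COMPILERS = [
        ["gcc", "-Wall", "-Wextra", "-Werror", "-c"],
        ["clang", "-Wall", "-Wextra", "-Werror", "-c"],
    ]

    def compiles_cleanly(source_file: str) -> bool:
        """Return True only if every configured compiler accepts the file without warnings."""
        clean = True
        for command in COMPILERS:
            result = subprocess.run(command + [source_file, "-o", "/dev/null"],
                                    capture_output=True, text=True)
            if result.returncode != 0:
                print(f"{command[0]} rejected {source_file}:\n{result.stderr}")
                clean = False
        return clean

    if __name__ == "__main__":
        results = [compiles_cleanly(f) for f in sys.argv[1:]]
        sys.exit(0 if all(results) else 1)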

Clearly, if the multiple-compiler approach is adopted, it must be used for all subsequent code evolutions; if not, the code will still decay, albeit more slowly than if a single compiler had been used.

Conclusion

Writing maintainable software requires skill. With a bit of thought (and resisting the temptation to code the first thing that comes into your head), quality code can be produced using one or more of the techniques outlined which will support future product evolutions in a cost effective manner.

Friday, April 17, 2009

Some thoughts on Kanban in Software Development

I recently watched a presentation given by David Anderson at QCon 2008 on his experience and observations of a kanban system applied in a software engineering environment. I found some of the observations very interesting, particularly that optimising the system for lowest cost didn't actually result in the most efficient system in terms of throughput. As with many great ideas, kanban appears to be a very simple concept; however, there is clearly more to it than simply managing a flow of post-it notes on a whiteboard.

I was keen to understand more about kanban and how it might be applied in various project scenarios. At the SPA2009 conference, there was a session presented by Karl Scotland on his take on kanban (Kanban, Flow and Cadence), together with a very interactive BoF session on Lean and Kanban. The comparison of lean, based on the Toyota model of optimising production, with kanban, which limits work in progress, was simple and very understandable, but the debate got really interesting when it moved on to how both lean and kanban could be applied to software development.

Clearly software developments come in all shapes and sizes, and the audience represented a good cross-section of developments and associated practices. Both approaches are clearly well suited to an agile style of working in which features are developed to provide a flow of (increasing) value to the end user. This works particularly well in an iterative development where frequent delivery is encouraged, especially where the features are being evolved based on user feedback. It also requires a tolerant customer who can accommodate some failures. An interesting statement of 'get it right second time' was promoted as acceptable provided you learn from mistakes. I find that hard to accept as the 'norm' because it completely ignores the need to embed quality in a delivered product. It might be acceptable in a prototype development but not in a production environment. Clearly there was something missing. There was! The integration step. I consider integration to be a very important (and potentially very expensive) stage, which is often squeezed in terms of time (how do you know you have finished?), and it can't be omitted.

So could kanban work in a more traditional waterfall development approach? Maybe. In a traditional approach, the requirements are analysed before design, implementation and testing are performed. Using a kanban system, a flow of features could be pulled through the separate stages of design, implementation and testing before being handed over to integration, with limits imposed at each stage. This would probably require that the three life cycle stages are performed by different team members, not an unreasonable expectation in a traditional development. Provided the requirements are relatively stable and the ordering and independence of features is organised to support a sensible integration approach, delivering increasing value (or functionality), then this could be an interesting approach. The queues at each stage of the life cycle need to be managed carefully, since the cost of managing each queue must be proportionate to its size, otherwise the process becomes very inefficient.
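As a minimal sketch of the mechanics being described (my own illustration, not from the talk): each stage carries a work-in-progress limit, and a feature only moves into a stage when that stage is below its limit.

    # Each stage has a work-in-progress (WIP) limit; a feature is only pulled into a
    # stage when that stage is below its limit, which is what throttles the flow.
    WIP_LIMITS = {"design": 2, "implementation": 3, "testing": 2}

    class Board:
        def __init__(self, backlog: list[str]) -> None:
            self.backlog = list(backlog)
            self.stages: dict[str, list[str]] = {name: [] for name in WIP_LIMITS}

        def pull(self, stage: str, source: list[str]) -> bool:
            # Refuse the pull if the source is empty or the stage is already at its WIP limit.
            if not source or len(self.stages[stage]) >= WIP_LIMITS[stage]:
                return False
            self.stages[stage].append(source.pop(0))
            return True

    if __name__ == "__main__":
        board = Board(["feature-A", "feature-B", "feature-C"])
        while board.pull("design", board.backlog):
            pass
        print(board.stages["design"], board.backlog)   # ['feature-A', 'feature-B'] ['feature-C']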

Now for the challenge: will each feature take approximately the same time to flow through the life cycle? Probably not. In my experience, it is highly unlikely that each requirement (or feature) will be equivalent in terms of effort expended, particularly when the software forms part of a complex system. This will result in some initial inefficiency, as it will take some time for the queues to become populated with tasks. There is a temptation to process the easy requirements first as a way of seeding the work queues quickly. However, from a project management perspective this is probably not what is required, as it leaves the difficult tasks (and hence the risks) to the end, and the most difficult tasks will probably impinge on the easier ones in some form, resulting in additional unplanned work.

Looking back at the David Anderson presentation, it is clear why a kanban approach worked well in one of the case studies: it was applied only to bug fixing rather than to new development. I believe that applying kanban to a full, multi-stage development needs a different approach, with the queues placed at different points in the system life cycle (as distinct from the software life cycle) to avoid bottlenecks building up. I would advocate that the same developer should be responsible for the design, implementation and testing of each feature, so that the queue limits are placed on the exit from analysis (or requirements definition) and the entry into integration, rather than within the different stages of software development. In a lean or agile development this approach is normal, since the developers perform all stages of the life cycle; in waterfall developments, and particularly large systems, it is not the norm.

Adopting kanban in waterfall certainly isn't new (see Kenji Hiranabe's article), but there is currently little evidence to demonstrate whether it can deliver tangible benefits across all types of project approach. Are there any case studies to prove or disprove my hypothesis?