Wednesday, July 27, 2011

How good are your requirements?

I read an interesting post which was trying to determine how much detail a requirement should contain. As with any question like this, it all depends on a number of factors and it is not possible to give a rule which can be religiously followed. Experience over time will determine what works for you - there is always a cost if the balance isn't right. Too few requirements and there is likely to be a very difficult verification phase; too many requirements and it will prolong the development and testing phases.

The first rule of requirements is to ask your customer 'why?'. If they can't explain simply why the requirement is needed, then it isn't essential and it should be discounted. If the response is acceptable, the next question is 'How am I going to validate that I have satisfied the requirement?'. If you can't agree on how the requirement is to be validated, then the requirement clearly needs clarifying, perhaps with some constraints explicitly included in the requirement wording. The language of requirements matters a great deal, and the differences between shall, should, will and may are VERY important. Any requirement containing phrases such as 'such as' or 'for example' should always be rejected, as these are open-ended and can never be fully satisfied.

As an example, I encountered a requirement recently which included the phrase 'any printer shall be supported'. After discussions (and any customer that doesn't engage in discussions is a clear sign of a very difficult customer!), it became very clear why the requirement had been written in that way - a third party would be supplying the as yet unspecified printer. However, the discussion did enable the requirement wording to become slightly more achievable by rephrasing it as 'a network connected printer supporting PostScript'. At least I would have a chance of testing this (and it constrained the requirement to exclude printers with parallel or USB interfaces).

The hardest part of requirements is not accepting the requirement; it is the validation at the end. How often have you encountered 'This isn't what I wanted'? The only way to avoid this is to remain in constant contact with your customer to validate any assumptions throughout the development, so that there aren't any surprises at the end.

Other than experience, I have yet to find a reliable and objective way of assessing the 'goodness' of a requirement set. It would clearly be possible to perform some analysis of the language used, looking for ambiguous phrases for example, but would this be a sufficient measure?
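As a rough illustration of the kind of language analysis suggested above, the sketch below scans requirement statements for a deliberately incomplete, entirely illustrative list of open-ended phrases. The phrase list and the requirement texts are my own examples, not a definitive rule set, and such a check would only ever be a crude first filter.

import re

# Illustrative (and deliberately incomplete) list of phrases that tend to make
# a requirement open-ended or untestable.
WEAK_PHRASES = [
    "such as", "for example", "etc", "and/or", "as appropriate",
    "if possible", "user friendly", "any",
]

def check_requirement(req_id, text):
    """Return a list of warnings for a single requirement statement."""
    warnings = []
    lowered = text.lower()
    for phrase in WEAK_PHRASES:
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            warnings.append("%s: contains open-ended phrase '%s'" % (req_id, phrase))
    if "shall" not in lowered:
        warnings.append("%s: no 'shall' - is this actually a requirement?" % req_id)
    return warnings

if __name__ == "__main__":
    requirements = {
        "REQ-001": "Any printer shall be supported.",
        "REQ-002": "The system shall print to a network-connected PostScript printer.",
    }
    for req_id, text in sorted(requirements.items()):
        for warning in check_requirement(req_id, text):
            print(warning)

Running this flags REQ-001 for the word 'any' while letting the reworded printer requirement through - which is roughly the judgement a human reviewer would make, but it says nothing about whether the requirement is actually the right one.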

Saturday, June 4, 2011

Computer Forensics

BCS Manchester recently hosted an interesting evening on the growing importance of computer forensics. The session was led by Sam Raincock, an experienced expert in performing forensic analyses. Whilst the session did not reveal the secrets behind a successful analysis (or give hints on how to make an analysis more difficult), it did explore, in general terms, some of the approaches that can be used in establishing evidence. Whilst a typical forensic investigation does include a significant amount of technical work, this only accounts for about 20% of an analysis; the remaining time is concerned with communication with lawyers and report writing. As in all legal cases, it is crucial to review and present all the evidence to piece together a coherent case rather than relying on circumstantial evidence.

While computer forensics is primarily the examination of any device that permanently stores data (the range of devices is ever-expanding, from traditional hard disc drives, CD-ROMs and USB memory sticks to mobile phones and cameras), it also includes reviewing system functionality in its goal of establishing what happened and who did it. It is used in a variety of cases, including criminal, civil and fraud cases.

Edmond Locard, an early pioneer of forensic science, stated that 'every contact leaves a trace'. This is very true of computer usage: every file that is created, every web page that is browsed and every document that is printed is recorded somewhere, and each person's pattern of computer usage is unique to them.

Some key points that I took away from the session included:
  1. Never assume anything
  2. All humans are unpredictable, and different
  3. Personnel cause more damage than they discover
  4. Do not assume that common sense prevails
  5. The IT department are not forensically trained and don't necessarily understand the value of every piece of data
  6. Forensics is not about data recovery
  7. Ownership of data must be established
A forensic examination looks at where the offence was allegedly committed, how the event occurred and who performed the activity. A typical examination can normally be performed on a single device (once a forensic image has been taken) by an appropriate expert, without normally needing to consult outside agencies (e.g. internet service providers) to obtain specific information. The examination will review data such as cookies, the various system logs and network connections (IP addresses and the type of connection, particularly whether it was local, remote, fixed, wireless etc.). The usage patterns of a computer will reveal a significant amount, as every human has particular behaviour traits.

The various system logs that reside on a computer or within a network can reveal significant and valuable data; these logs should be actively monitored, as they are often the first sign that something unusual is being performed that may merit investigation. The sooner something is detected, the greater the chance of limiting the damage (or gathering the evidence needed to establish a conviction). When an incident is detected within a business, the primary aim is to return the business to normal as quickly as possible. This is where policies are vitally important; it is equally important that they are actively used, policed and maintained.
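As a purely illustrative sketch of the kind of active log monitoring mentioned above (my own example, not something presented in the session, and using a made-up log format), a script might flag logins that occur outside a user's normal working hours:

from datetime import datetime

# Hypothetical log format: "2011-06-01 03:12:45 LOGIN alice 192.168.0.12"
WORK_START, WORK_END = 7, 19   # assumed 'normal' hours, purely illustrative

def unusual_logins(log_lines):
    """Yield log entries that record a login outside normal working hours."""
    for line in log_lines:
        parts = line.split()
        if len(parts) < 5 or parts[2] != "LOGIN":
            continue
        timestamp = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
        if not (WORK_START <= timestamp.hour < WORK_END):
            yield line

if __name__ == "__main__":
    sample = [
        "2011-06-01 09:15:02 LOGIN alice 192.168.0.12",
        "2011-06-01 03:12:45 LOGIN bob 10.0.0.7",
    ]
    for entry in unusual_logins(sample):
        print("Unusual:", entry)

A real monitoring setup would of course draw on many more sources than a single login log, but the principle is the same: decide what 'normal' looks like and surface the exceptions early.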

Whilst there is no formal qualification required to become a forensic expert (an inquiring mind would probably be useful), it is clearly a growing and important aspect of computing. There are many challenges in the continually evolving usage of computers; the growing importance of the cloud will require different techniques to those employed when examining a physical item such as a laptop. The session left me wondering what traits my computer usage would reveal about me, but also wanting to find out more about what is being recorded without my knowledge.

Monday, October 11, 2010

Open Source Document and Content Management

BCS Manchester recently hosted a meeting on open source content management and document management in the Public Sector. The speaker, Graham Oakes, explained that the catalyst for this had been an article in the Guardian about the use of commercial software by Birmingham City Council and its attempt at building websites. They had already spent £2.8M and the question was asked ‘Why not use Open Source Software instead?’ Graham stated that there is an awful lot of misinformation around - in particular, using OSS does not mean it costs nothing! This led the BCS Open Source Software Group to run a conference in January 2010 looking at the use and adoption of OSS in the public sector.

After briefly outlining what OSS is (source code owned by the community, from which new works can be derived), Graham described a typical software stack found in many organisations. At every level, there are good open source solutions available. Content Management normally fits in at the application layer, below portals and web servers. Content Management has a number of very strong options including Plone, Hippo, TYPO3, Drupal, Joomla, eZ and Umbraco. Many of these have already been adopted by the public sector in developing websites, for example by a number of police forces. Document Management is not as well developed as Content Management and there are fewer options.

Gartner and Forrester enterprise software reviews both report that OSS should be adopted and that it is becoming more amenable to use in the UK, but it is still necessary to consider the full life-cycle costs. OSS should be considered equivalent to proprietary software - the public sector should now consider, and even contribute to, OSS projects. However, the UK is some way behind other European nations (specifically the Netherlands, Germany, Italy and Denmark), with the OSOSS project in the Netherlands urging public administration to use more OSS and open standards.

Advantages

The key advantages of adopting OSS within the public sector were identified as

  • The main reason for adoption is low up-front cost. The initial cost of ownership is low, but the organisation still needs to follow its normal software selection processes and consider the risks, the requirements for the software and so on. OSS should be treated no differently from commercial software: it is still necessary to look at the total cost of ownership (TCO), and an organisation may still need to involve a system integrator in order to deploy effectively.
  • OSS applications do not constrain the design. The public sector can use them to start small and think big.
  • OSS is often easier to work with because access to the source code is always available (only useful if you have the skills to use it!). This provides an additional option alongside the documentation. OSS is also increasingly important with the cloud, as proprietary licence models don't adapt readily.
  • OSS helps the public sector to demonstrate openness (a commitment to visibility and being open to the public).

Apart from the last one, these advantages are not particularly specific to the public sector.

Risks

Of course, there is always a downside, commonly called risks in procurement circles. The risks identified included

  • Getting over the perception that everything is free. This misunderstands the true costs of OSS. Most of the cost of using software is not in the licences - it is in the content created and managed by the software. Typically less than 10% of project costs go on technology and licences, and migration costs must always be considered as these can often be many times more expensive than the base software costs.
  • Misperceptions around content management and the reusability of OSS. Just because some OSS can be used securely does not mean that all OSS is secure.
  • The public sector is used to working with large organisations. OSS needs a different approach, which the public sector may not readily adopt due to the mismatch in scale. OSS developers often move at a far faster pace than the public sector does (or can). This can be difficult at the procurement stage, as the existing procurement model (a level playing field) is broken. The procurement approach needs to recognise that OSS is no more or less secure than proprietary solutions.
  • Unreasoned decisions still dictate major procurements. OSS might not be a perfect match to the requirements but may still provide a suitable solution.

Conclusions

Graham presented a set of conclusions which, if I am honest, apply equally to proprietary software.

  • Remember that not all OSS is the same - quality levels and capabilities differ
  • Always choose carefully - consider usage scenarios
  • Look beyond licensing costs to user adoption, change management and migration
  • The team, rather than the technology, still creates the success. Choose the right team!
  • OSS supports evolutionary delivery - try before you buy - which encourages innovation and supports agile and lean practices. However, this is good practice for all software.
  • Licence fees for software bring costs forward and commit the project for the duration (unless trial licences are available). OSS does not carry this commitment and it is (relatively) easy to change OSS software without excessive upfront costs.

So would OSS have solved Birmingham’s problem? No. The problem was not the cost of licences; it was not understanding the problem well enough. OSS would have helped to examine the problem in the small before the initial financial commitment was made, which might have produced a more realistic budget.

Tuesday, July 20, 2010

Lessons in measurement and data analysis

Recently I attended a very interesting and entertaining lecture by Peter Comer, from Abellio Solutions, to the BCS Quality Management Specialist Interest Group (NW) on lessons learnt in measurement and data analysis following a recent quality audit of an organisation’s quality management system (QMS).

The talk started by highlighting the requirements for measurement in ISO9001 (section 8). Key aspects included:

  • Measure process within QMS to show conformity with and effectiveness of QMS
  • Monitoring and measurement of processes, products and customer satisfaction with QMS
  • Handle and control defects with products and services
  • Analyse data to determine suitability and effectiveness of QMS
  • Continual improvements through corrective and preventative actions

It was noted that everyone has a KPI (Key Performance Indicator) to measure the effectiveness of products and services although every organisation will use the KPIs slightly differently.

Peter outlined the context: an internal audit in preparation for a forthcoming external one, at a medium-sized organisation with a small software group working in the transport domain. A number of minor non-conformances were found, which were relatively straightforward to correct. However, after the audit an interesting discussion ensued regarding the software development process: the organisation was finding more bugs in its bespoke software development than anticipated, and they were proving a lot harder to fix. Initial suggestions included:

  • Look at risk management practices. However, the organisation had already done this by reviewing an old (2002) downloaded paper looking at risk characteristics.
  • Look at alternative approaches to software development.

It was this approach to risk which intrigued Peter, and the quality of the paper was immediately questioned. Had it been peer-reviewed? Was it still current and relevant?

Peter then critiqued the paper. The paper proposed a number of characteristics supplemented by designators; it was quickly observed that there was considerable overlap between the designators. The analysis of the data was drawn from a number of different sources, although there was no indication of what the counting rules were (or whether they were rigorous and consistent). The designators were not representative of all the risk factors that may affect a development, and said nothing about their relevance to the size of the development. The characteristics focused on cultural issues rather than technical issues - risk characteristics should cover both. Just counting risk occurrences does not demonstrate the impact that a risk could have on the project.

Turning to the conclusions, Peter considered whether they were valid. What would happen if you analysed the data in a different way - would the conclusions be different? Can we be assured that the data was analysed by someone with prior experience of software development? It was observed that the designators were shaped to the criteria, which is appropriate, but one size doesn't fit all. Only by analysing the data in a number of different ways can its significance be established. This can also show whether the data is unbalanced, which can in turn lead to skewed results. In the paper under review, it was clear that qualitative data was being used quantitatively.

Peter concluded by stating that ignoring simple objective measures can lead to the wrong corrective approach, one which might not be appropriate to the process and product, because 'you don't know what you don't know'. It is essential to formally define what to count (this is what a metric is), with the aim of making the data objective. Whatever the method of collection, it must be stated so that it can be applied consistently.

The talk was very informative and left much food for thought. I have always aimed to automate the collection process to make it consistent. However, this achieves nothing if the data is interpreted incorrectly or inconsistently. It is also difficult to know whether you are collecting the right data, but that is what experience is for!
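As a trivial sketch of what 'formally defining what to count' might look like in practice (my own example with made-up field names, not one from the talk), the counting rule is written down explicitly so that it is applied the same way every time the data is collected:

from collections import Counter

# The counting rule, stated explicitly so it can be applied consistently:
# count a defect only if it is confirmed, is not a duplicate, and was found
# after code review sign-off.
def count_defects(defects):
    """Return a Counter of qualifying defects keyed by severity."""
    counts = Counter()
    for defect in defects:
        if (defect["status"] == "confirmed"
                and not defect["duplicate"]
                and defect["found_after_review"]):
            counts[defect["severity"]] += 1
    return counts

if __name__ == "__main__":
    defects = [
        {"status": "confirmed", "duplicate": False, "found_after_review": True, "severity": "major"},
        {"status": "confirmed", "duplicate": True, "found_after_review": True, "severity": "major"},
        {"status": "open", "duplicate": False, "found_after_review": False, "severity": "minor"},
    ]
    print(count_defects(defects))  # Counter({'major': 1})

The value is not in the code itself but in the fact that anyone re-running the count against the same data will get the same answer.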

Wednesday, March 17, 2010

Creating Intelligent Machines

I have just attended the excellent IET/BCS 2010 Turing Lecture 'Embracing Uncertainty: The New Machine Intelligence' at the University of Manchester which was given this year by Professor Chris Bishop who is the Chief Research Scientist at Microsoft Research in Cambridge and also Chair of Computer Science at the University of Edinburgh. The lecture allowed Chris to share his undoubted passion for machine learning, and although there were a number of mathematical aspects mentioned during the talk, Chris managed to ensure everyone was able to understand the key concepts being described.

Chris started by explaining that his interest is in a framework for building intelligence into computers, something which has been a goal of many researchers for many years. This is becoming increasingly important due to the vast amount of data now available for analysis. With the amount of data doubling every 18 months, there is an increasing need to move away from purely algorithmic ways of reviewing the data towards solutions based on learning from the data. This has traditionally been the goal of machine (or artificial) intelligence, and despite Marvin Minsky writing in 1967 in 'Computation: Finite and Infinite Machines' that "within a generation ... the problem of creating 'artificial intelligence' will substantially be solved", the problem still does not have a satisfactory solution for many classes of problem.

A quick summary of the history of artificial intelligence showed that expert systems, which were good at certain applications but required significant investment in capturing and defining the rules, and neural networks, which provide a statistical learning approach but have difficulty capturing the necessary domain knowledge within the model, were not adequate for today's class of problems. An alternative approach that could integrate domain knowledge with statistical learning was required, and Chris's solution was a combination of techniques:
  1. Bayesian Learning, which uses probability distributions to quantify the uncertainty in the data. The distributions are updated once 'real data' is applied to the model, which reduces the uncertainty (see the sketch after this list).
  2. Probabilistic Graphical Models, which enable domain knowledge to be captured in directed graphs with each node having a probability distribution.
  3. Efficient inference, which keeps the computation tractable.
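As a minimal sketch of the Bayesian idea in point 1 (my own illustration, not an example from the lecture), consider modelling the probability that a player wins a game with a Beta distribution: the posterior mean shifts and its variance shrinks as results are observed, which is the reduction in uncertainty described above.

# A Beta(alpha, beta) prior over a win probability, updated with observed results.
def beta_update(alpha, beta, wins, losses):
    """Return the posterior Beta parameters after observing wins and losses."""
    return alpha + wins, beta + losses

def beta_mean_variance(alpha, beta):
    """Mean and variance of a Beta(alpha, beta) distribution."""
    mean = alpha / float(alpha + beta)
    variance = (alpha * beta) / float((alpha + beta) ** 2 * (alpha + beta + 1))
    return mean, variance

if __name__ == "__main__":
    a, b = 1.0, 1.0                          # uniform prior: maximum uncertainty
    print(beta_mean_variance(a, b))          # (0.5, 0.0833...)
    a, b = beta_update(a, b, wins=7, losses=3)
    print(beta_mean_variance(a, b))          # mean ~0.67, variance much smaller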
To explain the approach, Chris sensibly used real-life case studies to demonstrate the theory in three very diverse applications.

His first example was a Bayesian ranking system for producing a global ranking from noisy partial rankings. The conventional approach is the Elo rating system, a method for calculating the relative skill levels of players in two-player games. The Elo system cannot handle team games or more than two players. As part of the launch of the Xbox 360 Live online playing service, Microsoft developed the TrueSkill algorithm to match opponents of similar skill levels. TrueSkill converges far faster than Elo by managing the uncertainty in a more efficient way; it also operates quickly, so that users can find suitable opponents within a few seconds from a user population of many millions. Further details on TrueSkill(TM) are available at http://research.microsoft.com/en-us/projects/trueskill/
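For comparison, the standard Elo update that TrueSkill improves upon can be sketched in a few lines (the K-factor of 32 is an arbitrary but common choice; this is my own illustration rather than code from the talk):

def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Update two Elo ratings after a single two-player game.

    score_a is 1.0 if player A wins, 0.0 if A loses and 0.5 for a draw.
    """
    expected_a = 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

if __name__ == "__main__":
    # A lower-rated player beating a higher-rated one gains more points.
    print(elo_update(1400, 1600, score_a=1.0))   # roughly (1424.3, 1575.7)

Note that Elo keeps only a single number per player; TrueSkill also tracks the uncertainty in each player's skill, which is a large part of why it converges so much faster.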

The next example was a website serving adverts and how to determine which advert to show, based on the probability of it being clicked and the value of a click. The proposed approach was to use Gaussian probability distributions to assign a weight to a number of features, which are then used to determine the ranking. However, it is important to ensure that the system continually learns and re-evaluates the ranking so that the solution accurately reflects the dynamics of the adverts. If this were not the case, it would be very difficult for a new advert to ever be served.
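The underlying ranking idea can be sketched as ordering adverts by expected value, i.e. the probability of a click multiplied by the value of a click (my own simplified illustration; the real system learns the click probabilities from many features and keeps updating them):

def rank_adverts(adverts):
    """Order adverts by expected value: P(click) * value of a click."""
    return sorted(adverts, key=lambda ad: ad["p_click"] * ad["click_value"], reverse=True)

if __name__ == "__main__":
    adverts = [
        {"name": "ad_a", "p_click": 0.02, "click_value": 1.50},
        {"name": "ad_b", "p_click": 0.05, "click_value": 0.40},
        {"name": "ad_c", "p_click": 0.01, "click_value": 5.00},
    ]
    for ad in rank_adverts(adverts):
        print(ad["name"], ad["p_click"] * ad["click_value"])

In practice the click probabilities are themselves uncertain and must be continually re-estimated; without that ongoing learning a new advert would never have the chance to improve its estimate and so would never be served.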

The final example was the Manchester Asthma and Allergy Study which is working with a comprehensive data set acquired over 11 years. The data set is continually being augmented with new types of data (recently genetic data has been added) and the study has been successful at establishing the important variables and features and their relationships. By defining a highly structured model of the domain knowledge, it has been possible to assign each variable a probability distribution. By placing the data at the heart of the study and applying some machine learning techniques, a number of key observations are now being reported which might not have been apparent if more traditional statistical techniques had been used.

As a closing remark, Chris promoted a product from Microsoft Research (Infer.net) which provides a framework for further experimentation in developing Bayesian models for a variety of machine learning problems.

As is now traditional with the Turing Lecture, it is presented at several locations around the country. A webcast of the version presented at the IET in London is available on the IET TV channel.

Friday, November 27, 2009

Performance Management for a Complex World

The Manchester branch of the IET recently hosted a thought-provoking presentation on Performance Management in a Fast Moving World, given by Dr Therese Lawlor-Wright (University of Manchester), Elizabeth Jovanovic (EEF) and Andrew Wright of Dynamic Technologies Ltd.

After a brief introduction to what performance measurement is (determining the quality of execution, competence and effectiveness of an actor or operation compared against a standard) and how it can be applied both to people (typically in the form of appraisals) and to processes and systems (typically in the form of project reviews), the presentation made a few key observations about how performance measurements are currently performed.
  • Organisations define performance measurements at the strategic level to see how they can achieve their goals, often stated in terms of cost, quality and time and reported against a number of key performance indicators (KPIs). These can then be used to communicate and confirm (or adjust) progress.
  • Any KPIs used must be balanced, fair and transparent
  • The performance measurements must include indicators of the effectiveness of the organisation (the extent at which the strategic objectives are being met) and the efficiency of the organisation (how economically the resources of the organisation are being used to provide the level of performance).
Whilst the measurements can be useful, there was a word of caution about the unwanted effects of performance management systems (citing De Bruijn (2002)):
  • The measurements will take too long to accumulate and will soon be out of date (data lag)
  • Competition between teams or business units can result in a tendency to not share valuable data
  • The measurements can stifle innovation in the pursuit of efficiency
  • There will be 'game' playing to maximise the 'score'
These effects will dominate in the long term. To counter this, the measurements must be refreshed regularly - ideally every 2 to 3 years, which is long enough for the data to be used for benchmarking and comparison purposes but short enough to counter the complacency that can arise.

Managing Performance

Having established the measurements to be made, the presentation moved on to the application of the measures in the form of performance management. Performance management needs to be both process-oriented and people-oriented and must be a continuous process (not just an annual activity, as is often the case). Being continuous helps to clarify expectations, standards and targets and allows corrective actions to be addressed as soon as they arise. The most common approach to people-oriented performance management, the appraisal, should link individual targets to organisational targets but must also be an opportunity to praise and develop. As expected, the audience was reminded that all objectives must be SMART. However, an alternative method of specifying a target was proposed - 'Positive - Personal - Present' - which can be used to change behaviour. It can improve staff morale if done correctly, as it apparently tricks your subconscious mind into acting positively (the targets should be written starting 'I ...'). There was a strong suggestion that, contrary to the practice in many organisations, performance management must not be linked to remuneration, since that results in a warped approach designed to fit in with the inevitable budgetary constraints.

The Complex World

The complex world was defined as fast moving (continual change, increasing hierarchies of complexity) together with increasing challenges (timescales, budgets, mergers, multiple dependencies). These require that performance management is aligned with the business strategy. However, the classical approach to strategic management, with a top-down controlling hierarchy, was considered unsuitable. For complex systems, a more holistic approach is required, with thoughts being contributed from the bottom upwards.

In a fast-changing environment, long-term plans quickly lose touch with reality, with inflexible KPIs driving behaviours that fail to respond to real-world challenges. Inconsistency between the strategy, real-world reality, the KPIs and the objectives quickly leads to poor performance. Clearly there needs to be an alternative approach to performance measurement which is both flexible and efficient.

Taking some of the ideas from the Agile Manifesto, a more lively and dynamic approach was proposed, developed by consensus and adapting to the changing environment. This approach addresses many of the unwanted effects of traditional performance measurement schemes by being much more efficient and flexible, relying on an empowered organisation with a shared vision. Techniques such as the Balanced Scorecard and the EFQM Excellence Model can clearly help in communicating a comprehensive view of an organisation.

Key Conclusions
  • Strategy must become change oriented with a dynamic response in a controlled manner. The route to the strategic vision may change.
  • Long-term plans leave companies without direction
  • KPIs and objectives must respond to the changing needs
  • The measurement process must be flexible and efficient
  • Good performance must be encouraged by reducing uncertainty
  • Organisations need to move from optimising the 'simple' status quo to optimising 'complex' continuous change

Sunday, September 20, 2009

Open Source Certification

I have just been browsing the relaunched website of the British Computer Society and came across an interesting article on Open Source Certification. There are some pretty important and successful open source applications out there, but there is limited experience of 'certification' in the same way that you can become, for example, Microsoft certified. Red Hat does offer courses leading to the Red Hat Certified Engineer (RHCE) qualification, but this is an exception among open source applications.

The big question is: does it matter? It all depends on your point of view regarding certification. Does the fact that a product is 'certified' make it a better product? Does the fact that an engineer is 'certified' make him a better engineer than one who isn't? As in all cases, it depends. A certified engineer should certainly have independently demonstrated a degree of competence in using or configuring a product. However, certification without experience to back up the qualification is no use to anyone. Similarly, a certified product might indicate that the product has become so large and cumbersome that it really needs to be entrusted to a select band of engineers who have demonstrated that they understand the product better than those who have just learned to tame it to meet their specific requirements. A certified engineer should also probably be aware of a few tricks and tips which are not widely known.

So should all open source products offer a certification programme? In my view, no. However there is clearly a point at which certification becomes necessary or expected by the customer community. I would suggest that this can occur in a number of cases:
  • When the product is becoming widely accepted as one of the market leaders across multiple platforms.
  • When the product is now developed on 'commercial' lines with a funding line.
In either case, a professional certification programme should be promoted and managed, but recognizing that significant experience of a product should be automatically rewarded (on request) with certification, particularly if the experience has been gained through the formative years of the product.

A similar approach was adopted a few years ago by the BCS when it launched the Chartered IT Professional (CITP) qualification. To date, this has yet to become a widely accepted, recognized (and demanded) qualification for key roles within the IT industry. Until recognized qualifications or certifications within the IT industry become a pre-requisite for certain roles, the certifications people achieve will be little more than another certificate to put on the wall or in the drawer. Until this is the case, open source certification will be little more than a commercial exercise in raising funds for future product developments.