How to Improve Safety Critical Systems Standards
Norman Fenton
Centre for Software Reliability, City University, London
Abstract
1 Introduction and Background

Between 1990 and 1994 researchers at CSR City University were involved in a collaborative project (SMARTIE) whose primary objective was to propose an objective method for assessing the efficacy of software engineering standards [Pfleeger et al 1994]. The method was based on the simple principle that a software standard is effective if, when used properly, it improves the quality of the resulting software products cost-effectively. We considered evidence from the literature and also conducted a small number of empirical studies of specific company standards. We found no evidence that any of the existing standards are effective according to our criteria. This will come as no surprise to anybody who has sought quantitative evidence about the effectiveness of any software engineering method or tool. However, what concerned us more was that, in general, software engineering standards are written in such a way that we could never determine whether they were effective or not. There was certainly no shortage of standards to review. We came across over 250 standards (from various international and national bodies) that we considered to fall within the remit of software engineering. The common feature of all of them was that they define some aspect of perceived best practice relevant for developing or assuring high quality software systems or systems with software components. Unfortunately, there is no consensus about what constitutes best practice, and it follows that there is no consensus as to how to distinguish those best practice techniques that should always be applied. Thus, for standards of similar names and objectives we came across very different models of software quality and the software development process. This was especially true of the safety critical software standards, of which IEC SC65A [IEC 1992] and DEF-STAN 00-55 [MOD 1991] were two significant examples.
We discovered the following general problems in the standards we reviewed:
In this paper we propose a framework for improving standards. The approach (which is based largely on the SMARTIE philosophy) is applicable to any software standards, but is especially pertinent to the safety critical ones. The latter can be viewed as simply the most demanding of the software standards; if you remove the safety integrity requirements material from such standards then they can be applied to any software system with high quality requirements. Our framework for interpreting standards is to view a standard as a collection of requirements with which developers have to comply and for which assessors have to be able to determine conformance. In Section 2 we discuss the notion of clarity and objectivity in these respects. Our objective is to provide recommendations on how to rationalise and refine standards in such a way that we move toward the scenario where at least the obligations for the assessor are clear and objective. In Section 3 we explain how to classify requirements according to whether they focus primarily on one of three categories: process, product, or resource. Using this classification, we show how the safety critical standards concentrate on process and resource requirements at the expense of clear product requirements. We explain how to shift the focus toward the product requirements. In Section 4 we explain how requirements could be interpreted in such a way that there is greater objectivity, especially for the assessor. Our emphasis is on how we can interpret and use standards despite their current weaknesses. We do not question the importance of standards to safety critical systems development. However, clearly some standards are better than others and some requirements are more important than others, even though a priori we do not know which. Thus in Section 5 we discuss the need for assessing the effectiveness of standards, and describe the basic principles behind a measurement-based procedure.
Throughout the paper we concentrate on the recently issued, and highly significant, IEC 1508 [IEC 1995] (of which Parts 1 and 3 are relevant) as an example of applying our method. This standard is the updated version of IEC SC65A.

2 Clarity of Requirements in Standards

A standard is a collection of individual requirements. Our main concern is to consider the clarity of each mandatory requirement in the following two key respects:
Generally, obligation (2) will follow from (1). For example, in IEC 1508, Part 1, there are a number of requirements concerning the Safety Plan. Of these, 6.2.2e asserts that the Safety Plan shall include a description of the safety lifecycle phases to be applied and the dependence between them. The developer knows that certain specific information must appear in the document. The assessor only has to check that this information is there. Conversely, however, it is not necessarily true that (1) will follow from (2). For example, for the software safety lifecycle we have:
Obligation (2) is clear. The assessor has, strictly speaking, only to check the existence of a specific report for each specified phase. However, the developer's obligation for this requirement is unclear; a subsequent requirement (in the software verification section) sheds little light on what constitutes an acceptable verification report:
Unfortunately, in a key standard like IEC 1508 most requirements are unclear in both respects. For example, requirement 7.4.6.1a asserts that:
It is unclear what is expected of developers, while an assessor could only give a purely subjective view about conformance. In traditional engineering standards it is widely accepted that the necessary clarity for both obligations (1) and (2) has to be achieved for all requirements [Fenton et al 1993]. Partly because of the immaturity of the discipline, software engineering standards do not have this clarity. Our objective here is to provide recommendations on how to rationalise and refine standards in such a way that we move toward the scenario where at least the obligations for the assessor are clear and objective.

3 Classifying requirements in standards

3.1 Processes, Products, and Resources

Our approach to interpreting standards begins by classifying individual requirements according to whether they focus primarily on processes, products, or resources. A Process is any specific activity, set of activities, or time period within the manufacturing or development project. Examples of process requirements are:
A Product is any new artefact, deliverable or document arising out of a process. Examples of product requirements are:
A Resource is any item forming, or providing input to, a process. Examples include a person, a compiler, and a software test tool. Examples of resource requirements are:
Ideally, it should be absolutely clear for each requirement which process, product, or resource is being referred to and which property or attribute of that process, product, or resource is being specified. The example requirements above are reasonably satisfactory in this respect (even though they do not all have the desired clarity discussed in Section 2). However, in many requirements, it is necessary to tease out this information. Consider the following examples:
Although this refers explicitly to the software production process, this requirement really only has meaning for the resulting product, namely the source code. Moreover, the three specified product attributes are quite different and should be stated as separate requirements (preferably in measurable form as discussed below in Section 4).
Although this requirement refers to two processes (design and modification) its primary focus is a resource, namely the design method. Three very different attributes of the method are specified. The reference to modification is out of place here, since the specified properties are only conjectured to be beneficial when subsequent modifications take place.
This is, strictly speaking, a combination of two separate requirements (and should be treated as such). One is a product requirement: the existence of a document (Software Module Test Specification) to accompany each module. The other is a process requirement that specifies that a certain type of testing activity has to be carried out. The following requirement also says something about the testing process, but is driven by much more specific properties of the product (and hence we would classify it as a product requirement):
The above classification of standards' requirements represents only the first stage in our proposed means of interpreting standards. It is important because it forces us to identify the specific object of the requirement, and to naturally seek clarification where this is unclear. As a final example, consider the following requirement:
By thinking about our classification we can interpret this rather vague and confusing requirement. First of all we tease out the fact that this is a product requirement, but that there are two levels of product being considered: the software as a whole; and the set of individual functions which are being implemented. We need to break up the requirement into the following sub-requirements:
3.2 Internal and external attributes For product requirements, we make a distinction between attributes which are internal and those which are external. An internal attribute of product X is one that is dependent only on product X itself (and hence not on any other entity, be it another product, process or resource). For example, where X is source code, size is an internal attribute. An external attribute of a product X is one that is dependent on some entities other than just product X itself. For example, if product X is source code then the reliability of X is an external attribute. Reliability of X cannot be determined by looking only at X; it is dependent on the machine and compiler running X, the person using X, and the mode of use. If any of these are changed then the reliability of X can change. We have already seen numerous examples of external attributes in the above requirements (testability, maintainability, readability). Attributes like modularity (in 7.4.5.3) can, with specific definitions, be regarded as internal [Fenton and Pfleeger 1996]. The distinction between internal and external attributes is now a widely accepted basis for software evaluation. Clearly, external attributes are the ones of primary concern, especially as our ultimate objective here is to determine acceptance criteria for safety critical systems. This means that we have to determine whether the system's external attributes like safety, reliability, and maintainability are acceptable for the system's purpose. In practice, these attributes cannot be measured directly. We may be forced to make a decision about the acceptability of these attributes before the system is even extensively tested. This means that we are forced to look for evidence in terms of internal product attributes, or process and resource attributes. 
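The classification developed in Sections 3.1 and 3.2 can be captured in a small data model. The following Python sketch is purely illustrative (the type names, field names, and the example classifications are our own, not part of any standard):

```python
from dataclasses import dataclass
from enum import Enum

class Focus(Enum):
    PROCESS = "process"
    PRODUCT = "product"
    RESOURCE = "resource"

@dataclass(frozen=True)
class Requirement:
    clause: str      # e.g. a clause number such as "7.4.6.1a"
    focus: Focus     # primary object: process, product, or resource
    attribute: str   # the property specified, e.g. "readability"
    # Meaningful for product requirements only: True when the attribute
    # can be determined from the product alone (e.g. size of source code),
    # False when it also depends on other entities (e.g. reliability).
    internal: bool = False

# Hypothetical classifications of two attributes mentioned in the text
size = Requirement("-", Focus.PRODUCT, "size", internal=True)
reliability = Requirement("-", Focus.PRODUCT, "reliability", internal=False)

def needs_indirect_evidence(req: Requirement) -> bool:
    """External product attributes cannot be measured directly before
    deployment, so acceptance must rest on indirect evidence from
    internal product, process, or resource attributes."""
    return req.focus is Focus.PRODUCT and not req.internal

print(needs_indirect_evidence(reliability))  # True
print(needs_indirect_evidence(size))         # False
```

The predicate encodes the observation above: an external attribute such as reliability can only be judged via indirect evidence, whereas an internal attribute such as size can be measured from the product itself.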
Requirements in standards which simply state that certain desirable external attributes should be present are invariably vacuous and should be removed (since they are nothing more than objectives).

3.3 Balance between types of requirements

The Oxford Encyclopaedic English Dictionary defines a standard as
This definition conforms to the widely held intuitive view that standards should focus on specifying measurable quality requirements of products. Indeed, this is the emphasis in traditional engineering standards. This point was discussed in depth in [Fenton et al 1993], which looked at specific safety standards for products (such as pushchairs). These explicitly specify tests for assessing the safety of the products. That is, they provide requirements for an external attribute of the final product. The measurable criteria for the testing process are also specified. There is therefore a direct link between conformance to the standard and the notions of quality and safety in the final product. Standards such as BS 4792 [BSI 1984] also specify a number of requirements for internal attributes of the final product, but only where there is a clearly understood relationship between these and the external attribute of safety. We contrast this approach with software safety standards. Very few requirements in these standards are well-defined product requirements. For example, [Fenton et al 1993] provided a detailed comparison of the requirements in BS 4792 with those of DEF-STAN 00-55. The latter consists primarily of process requirements (88 out of a total of 115), with 14 internal product and 13 resource requirements. There is not a single external product requirement. In contrast, BS 4792 consists entirely of product requirements (28 in total), of which 11 are external. The distribution of requirements in 00-55 seems fairly typical of the software standards studied in SMARTIE. The standard IEC 1508 is slightly different in that there is a very large number of resource requirements, but again we find far more process than product requirements. The difference between requirements in standards such as 00-55 and IEC 1508 compared with those in BS 4792 is that, generally, there is no conclusive evidence that satisfying them will help achieve the intended aim of safer systems.
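The contrast in these counts is easier to see as proportions. A minimal Python sketch (the function name and layout are ours; the figures are those quoted above for DEF-STAN 00-55 and BS 4792):

```python
def profile(counts):
    """Express a {category: count} tally as whole percentages of the total."""
    total = sum(counts.values())
    return {k: round(100 * v / total) for k, v in counts.items()}

# Requirement counts quoted in the text
def_stan_00_55 = {"process": 88, "internal product": 14,
                  "external product": 0, "resource": 13}   # 115 in total
bs_4792 = {"process": 0, "internal product": 17,
           "external product": 11, "resource": 0}          # 28 in total

print(profile(def_stan_00_55))  # process requirements dominate (~77%)
print(profile(bs_4792))         # product requirements only, ~39% external
```

The profiles make the imbalance stark: 00-55 has no external product requirements at all, while in BS 4792 they account for well over a third of the standard.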
For example, the following are typical internal product requirements from IEC 1508:
Each of these (which would need further clarification to be usable anyway) represents a particular viewpoint about internal structural properties that may impact on system safety. Unfortunately, there is no clear evidence that any of them really does [Fenton et al 1994]. The many process and resource requirements in standards such as IEC 1508 have an even more tenuous link with final system safety.

4 Classifying standards' requirements by level of objectivity

The above classification of standards' requirements into process, product, or resource represents only the first stage in interpreting standards. The next stage is to further classify the requirements according to the ease with which we can assess conformance. Our objective is to identify the rogue requirements. These are the requirements for which the assessor's obligation (as discussed in Section 2) is unclear; that is, where an assessment of conformance has to be purely subjective. Assuming that a requirement refers to some specific, well-defined process, product or resource, we distinguish four degrees of clarity for each requirement (as shown in Table 1).
Table 1. Codes for degree of detail given in any requirement

Ideally, the vast majority of requirements should be in categories ** and *** (with a small number of necessary Rs for definition). In the BSI pushchair safety standard BS 4792 every one of the 28 requirements is in category ***. Although IEC 1508 is more objective than the vast majority of software standards reviewed during SMARTIE (and is indeed a significant improvement on its earlier draft IEC SC65A), many requirements (including most of the examples presented so far) still fall into the R and * categories. This means that conformance to such requirements can only be assessed purely subjectively. It is difficult to justify their inclusion in a safety critical standard. How are we to assess, for example, requirements such as:
It would be near impossible to determine convincingly whether it is satisfied or not, so it is effectively redundant. Alternatively, we could attempt to re-write it in a form which enables us to check conformance objectively. As long as there is mutual agreement (between developer and assessor) on the overall value of a requirement (however vague), this is the option we propose. First of all, we stress that there is a considerable difference between
Option (a) is generally very difficult and often impossible; in an immature discipline there is even some justification for allowing a level of subjectivity in the requirements. It is only option (b) that is being specifically recommended. The following example explains the key difference between (a) and (b) and shows the different ways we might interpret requirements to achieve (b). There are generally many ways in which this can be done: Example 1: We consider how we might interpret requirement 7.4.6.1a above in order that we can assess conformance objectively. First of all we note that there are actually three separate product requirements, namely:
We concentrate on just (i) here. Consider the following alternative versions:
Each of the above versions can be checked for conformance in a purely objective manner even though a large amount of subjectivity is still implicit in each of the requirements. In the case of A we have only to check the existence of a specific document. This is a trivial change to the original requirement since we have still said nothing about how to assess whether the document adequately justifies whether the module is readable. Nevertheless we have pushed this responsibility firmly onto the developers and not the assessors. Alternative B is a refinement of A in which we identify some specific criteria that must be present in the document (and which might increase our confidence in the readability argument). For alternative C we have only to check that the module has the right measures. A simple static analysis tool can do this. In alternative D we have only to check that the rating given by the independent reviewer is a 3 or 4 and check that this person does indeed have the specified qualifications and experience. In each of the alternative versions measurement plays a key, but very simple, role. In the case of version D the requirement is based on a very subjective rating measure. Nevertheless we can determine conformance to this requirement purely objectively. None of the alternative requirements except C is a requirement for which the module itself (a product) is the focus. Alternatives A and B are both requirements of a different product, while alternative D concentrates on the results of a reviewing process. Example 1 confirms that being able to assess conformance to a requirement objectively does not mean that the requirement itself is objective. Nor, unfortunately, does it always mean that assessment will be easy. The approach that we are proposing is to move toward identifying measurable criteria to replace ill-defined or subjective criteria. This is consistent with the traditional measurement-based approach of classical engineering disciplines.
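Alternative C relies on a simple static analysis tool. Since the full text of the alternatives is not reproduced here, the following Python sketch assumes two illustrative measures (non-blank line count and nesting depth) and thresholds of our own choosing; the point is only that, once measures and limits are fixed, conformance becomes a mechanical check:

```python
# Purely illustrative assumptions, not taken from any standard:
MAX_LINES = 200    # assumed ceiling on module length
MAX_NESTING = 4    # assumed ceiling on control-structure nesting

def nesting_depth(source: str) -> int:
    """Crude proxy for nesting: deepest indentation level, assuming
    4-space indents (a real tool would parse the code properly)."""
    depths = [(len(line) - len(line.lstrip())) // 4
              for line in source.splitlines() if line.strip()]
    return max(depths, default=0)

def conforms(source: str) -> bool:
    """Objective check in the spirit of alternative C: the module
    conforms iff its measures fall within the specified limits."""
    n_lines = len([l for l in source.splitlines() if l.strip()])
    return n_lines <= MAX_LINES and nesting_depth(source) <= MAX_NESTING

module = "def f(x):\n    if x:\n        return 1\n    return 0\n"
print(conforms(module))  # True: 4 lines, nesting depth 2
```

Note that the check is objective even though the chosen measures and thresholds are themselves matters of judgment, which is exactly the distinction between (a) and (b) drawn above.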
Texts such as [Fenton and Pfleeger 1996] explain how to move toward quantification of many of the subjective criteria appearing in a standard such as IEC 1508. The following example further illustrates the method: Example 2: Requirement 7.2.2.5a asserts "To the extent required by the integrity level the Software Safety Requirements Specification shall be expressed and structured in such a way that it is as clear, precise, unequivocal, verifiable, testable, maintainable and feasible as possible commensurate with the safety integrity level". Each of the required attributes here (which need to be treated as separate requirements) is ill-defined or subjective. In the case of maintainable there are a number of ways we could interpret this so that we could assess conformance objectively. The most direct way is to specify a mean or maximum time in which a change to the SSRS can be made. Since such measures are hard to obtain it may be preferable to specify certain internal attributes of the SSRS, such as: the electronic medium in which it must be represented; the language in which it has to be written; that it has to be broken up into separately identifiable functions specified using fewer than 1000 words each; etc. Specification measures such as Albrecht's Function Points [Albrecht 1979] might even be used. A radically different approach is that of alternative D in Example 1, where we simply specify what expert rating of maintainability has to be achieved.

5 Measurement Based Standards Evaluation

So far we have concentrated on how we can interpret and use standards despite their many weaknesses. We do not question the general importance and value of standards to safety critical systems development. Nevertheless, there are very wide differences of emphasis in specific safety-critical standards.
For example, 00-55 and IEC1508 are totally different in their underlying assumptions about what constitutes a good software process; 00-55 mandates the use of formal specification (and is structured around the assumption that formal methods are used), while 1508 mentions it only as a technique which is highly recommended at the highest safety integrity level (level 4). Clearly the standards cannot all be equally effective. They are certainly not equally easy to apply or assess. Therefore we have to assume that some standards are better than others and some requirements in standards are more important than others. Unfortunately, a priori we do not know which. It follows that there is a need for assessing the effectiveness of standards, especially when we consider the massive technological investments which may be necessary to implement them. What we have described so far may be viewed as a front-end procedure for standards evaluation. This is like an intuitive quality audit, necessary to establish whether a given standard satisfies some basic criteria. It also enables us to interpret the standard, identify its scope, and check the ease with which it can really be applied and checked. However, for proper evaluation we need to demonstrate that, when strictly adhered to, the use of a standard is likely to deliver reliable and safe systems at an acceptable cost. The SMARTIE project looked at how to assess standards in this respect [Pfleeger et al 1994]. The basic impediment to proper evaluation is the sheer flabbiness of the relevant standards. Many of the standards address the entire development and testing life-cycle, containing massive (and extremely diverse) sets of requirements. It makes no scientific sense, and is in any case impractical, to assess the effectiveness of such large objects. Thus we use the notion of a mini-standard. 
Any set of requirements, all of which relate to the same specific entity or have the same specific objective, can be thought of as a standard in its own right, or a mini-standard. Rather than assess an entire set of possibly disparate requirements, we instead concentrate on mini-standards. The need to decompose standards into manageable mini-standards is a key stage in the evaluation procedure described in [Pfleeger et al 1994]. Many software-related standards are written in a way which makes this decomposition extremely difficult. However, the software part of IEC 1508 is structured in a naturally decomposable way. We can identify seven key mini-standards in the relevant parts of IEC 1508:
The formal obligations for evaluating the efficacy of a mini-standard reduce to measuring the following criteria in a given application of the standard:
Essentially, a mini-standard successfully passes an evaluation for a specific environment if, in such an environment, it can be shown that the greater the degree of conformance to the standard, the greater are the benefits, providing that such improvements merit the costs of applying the standard. The problem of over-emphasis on process requirements in safety-critical standards has an important ramification when it comes to the evaluation procedure. Specifically, we found that, for many process requirements, the intended link to a specific benefit is unclear. For example, 00-55 contains the requirement:
Even if we could determine conformance to such a requirement objectively (the appendix of the standard provides some crude guidelines for this), it is unclear what the specific intended benefit is. Only from reading the rest of the standard do we discover that an intended major benefit is that it helps to make implementations provable (that is, it makes possible a mathematical proof of correctness). However, this in itself would be of little interest as a benefit to users. Rather, we have to assume the implicit benefit to be implemented code which is more reliable.

6 Summary and Conclusions

For safety-critical standards to be usable we expect the individual requirements to be clear to:
Unfortunately, many requirements in the relevant standards are not clear in either of these respects (although IEC 1508 shows a significant improvement on its previous incarnation IEC SC65A in many of the specific respects identified here). We have shown how to interpret unclear requirements in both respects, but with special emphasis on the assessors' needs. There is a significant difference between:
While (a) is generally very difficult, we have shown how to achieve (b) in a rigorous manner. The vast majority of requirements in existing safety-critical systems standards are unnecessarily unclear. Our approach to interpreting such requirements begins by teasing out the relevant process, product or resource that is the primary focus. In many cases this means breaking down the requirement into a number of parts. This technique alone can often achieve the required level of clarity. We provided numerous examples drawn from IEC 1508 of how to do this. When the requirements in safety critical standards are classified according to products, processes and resources, we find a dearth of external product requirements (in stark contrast to safety-related standards in traditional engineering disciplines). The emphasis is on process and resource requirements, with a smaller number of internal product requirements. This balance seems inappropriate for standards whose primary objectives are to deliver products with specific external attributes, namely safety and reliability. Finally, we discussed the need for assessing the effectiveness of standards. The sheer size of existing standards makes them too large to assess as coherent objects. Thus we used the notion of mini-standards, whereby we identify coherent subsets of requirements all relating to the same specific process, product, or resource. The identification of mini-standards helps us not only in assessment but also in rationalising and interpreting standards. We proposed a decomposition of IEC 1508 into mini-standards. We have presented some simple practical advice on how to improve safety-related standards. Unfortunately, the standards-making process is long and tortuous. In many cases this process itself contributes to some of the problems highlighted earlier.
Perhaps it is time that the software industry paid for the development of good, timely standards rather than continuing to rely on the contributions of individuals who volunteer their effort to standards-making bodies. While such contributions are, more often than not, heroic and unsung, they are nevertheless entirely ad hoc. As such, we deserve nothing better than the ad hoc standards we have at present.

7 Acknowledgements

The contents of this report have been influenced by material from the SMARTIE project (funded by EPSRC and DTI) in which the author was involved, and also by an earlier assessment of IEC SC65A that the author performed as part of the ESPRIT CASCADE project (funded by the CEC). The new work carried out here was partly funded by the ESPRIT projects SERENE and DEVA. The author is indebted to Colum Devine, Miloudi El Koursi, Simon Hughes, Heinrich Krebs, Bev Littlewood, Martin Neil, Swapan Mitra, Stella Page, Shari Lawrence Pfleeger, Linda Shackleton, Roger Shaw and Jenny Thornton for comments that have influenced this work.

8 References

Albrecht AJ, Measuring application development productivity, Proceedings of IBM Applications Development Joint SHARE/GUIDE Symposium, Monterey CA, pp 83-92, 1979.
British Standards Institute, Specification for safety requirements for pushchairs, BS 4792, 1984.
Fenton NE and Pfleeger SL, Software Metrics: A Rigorous and Practical Approach (2nd Edition), International Thomson Computer Press, 1996.
Fenton NE, Littlewood B, and Page S, Evaluating software engineering standards and methods, in Software Engineering: A European Perspective (Eds: Thayer R, McGettrick AD), IEEE Computer Society Press, pp 463-470, 1993.
Fenton NE, Pfleeger SL, Glass R, Science and substance: a challenge to software engineers, IEEE Software, 11(4), 86-95, July 1994.
IEC (International Electrotechnical Commission), Software for computers in the application of industrial safety related systems, IEC 65A, 1992.
IEC (International Electrotechnical Commission), Functional safety of electrical/electronic/programmable systems: generic aspects, IEC 1508, 1995.
Ministry of Defence Directorate of Standardization, Interim Defence Standard 00-55: The procurement of safety critical software in defence equipment, Parts 1-2, Kentigern House, 65 Brown Street, Glasgow G2 8EX, UK, 1991.
Pfleeger SL, Fenton NE, Page S, Evaluating software engineering standards, IEEE Computer, 27(9), 71-79, September 1994.