CS 410/510 - Software Engineering class notes

Reference: Sommerville, Software Engineering, 10th ed., Chapter 10

The big picture

For many computer-based systems, the most important system property is the dependability of the system. The dependability of a system reflects the user's degree of trust in that system: the extent of the user's confidence that it will operate as users expect and that it will not 'fail' in normal use. Dependability covers the related system attributes of reliability, availability and security. These are all inter-dependent.

System failures may have widespread effects with large numbers of people affected by the failure. Systems that are not dependable and are unreliable, unsafe or insecure may be rejected by their users. The costs of system failure may be very high if the failure leads to economic losses or physical damage. Undependable systems may cause information loss with a high consequent recovery cost.

Causes of failure:

Hardware failure: Hardware fails because of design and manufacturing errors or because components have reached the end of their natural life.
Software failure: Software fails due to errors in its specification, design or implementation.
Operational failure: Human operators make mistakes. Operator errors are now perhaps the largest single cause of system failures in socio-technical systems.

Dependability properties

Principal properties of dependability:

Availability: The probability that the system will be up and running and able to deliver useful services to users.
Reliability: The probability that the system will correctly deliver services as expected by users.
Safety: A judgment of how likely it is that the system will cause damage to people or its environment.
Security: A judgment of how likely it is that the system can resist accidental or deliberate intrusions.
Resilience: A judgment of how well a system can maintain the continuity of its critical services in the presence of disruptive events such as equipment failure and cyberattacks.
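
As a rough illustration of the availability definition above, availability is often estimated as the fraction of an observation period during which the system was usable. The sketch below assumes that uptime and downtime have been measured in hours; the function name and the sample figures are hypothetical and are not taken from the textbook.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Estimate availability as the fraction of observed time the system was up."""
    if uptime_hours < 0 or downtime_hours < 0:
        raise ValueError("durations must not be negative")
    total = uptime_hours + downtime_hours
    if total == 0:
        raise ValueError("observation period must be longer than zero")
    return uptime_hours / total


# Hypothetical example: 2 hours of downtime in a 1000-hour observation period.
estimate = availability(uptime_hours=998.0, downtime_hours=2.0)
print(f"{estimate:.3f}")  # 0.998
```

Note that this figure says nothing about reliability: a system can be up almost all the time and still deliver incorrect results.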

Other properties of software dependability:

Repairability reflects the extent to which the system can be repaired in the event of a failure;
Maintainability reflects the extent to which the system can be adapted to new requirements;
Survivability reflects the extent to which the system can deliver services whilst under hostile attack;
Error tolerance reflects the extent to which user input errors can be avoided and tolerated.
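
Error tolerance can be illustrated with a small input-handling sketch. The function below is a hypothetical example, not something prescribed by the textbook: it tolerates harmless variations in what a user types (extra spaces, a trailing percent sign, a decimal comma) and rejects values that cannot be valid instead of passing them on to the rest of the system.

```python
def parse_percentage(text: str) -> float:
    """Parse a percentage typed by a user.

    Tolerates surrounding whitespace, a trailing '%', and a decimal comma.
    Rejects anything that is not a number between 0 and 100.
    """
    cleaned = text.strip().rstrip("%").strip().replace(",", ".")
    try:
        value = float(cleaned)
    except ValueError:
        raise ValueError(f"not a number: {text!r}") from None
    # This comparison also rejects NaN and infinity.
    if not 0.0 <= value <= 100.0:
        raise ValueError(f"percentage out of range: {text!r}")
    return value


print(parse_percentage(" 12,5 % "))  # 12.5
```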

Many dependability attributes depend on one another. Safe system operation depends on the system being available and operating reliably. A system may be unreliable because its data has been corrupted by an external attack. Denial of service attacks on a system are intended to make it unavailable. If a system is infected with a virus, you cannot be confident in its reliability or safety.

How to achieve dependability?

Avoid the introduction of accidental errors when developing the system.
Design V & V processes that are effective in discovering residual errors in the system.
Design systems to be fault tolerant so that they can continue in operation when faults occur.
Design protection mechanisms that guard against external attacks.
Configure the system correctly for its operating environment.
Include system capabilities to recognize and resist cyberattacks.
Include recovery mechanisms to help restore normal system service after a failure.
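
Two of the items above, fault tolerance and recovery, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not a technique prescribed by the textbook: it retries a failing operation a bounded number of times and then falls back to a degraded result so that service continues. The names `call_with_recovery` and `flaky_sensor_read`, and the fallback value, are hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_recovery(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    attempts: int = 3,
) -> T:
    """Try `primary` up to `attempts` times; if every attempt raises, use `fallback`."""
    for _ in range(attempts):
        try:
            return primary()
        except Exception:
            # A fault occurred; tolerate it and try again.
            continue
    # All attempts failed: recover by delivering a degraded service.
    return fallback()


# Hypothetical component that fails twice before succeeding.
calls = {"count": 0}


def flaky_sensor_read() -> float:
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("sensor not responding")
    return 21.5


result = call_with_recovery(flaky_sensor_read, fallback=lambda: 20.0)
print(result)  # 21.5
```

Catching every exception is a simplification for the sketch; a real system would normally tolerate only the faults it knows how to handle and would log each failure so that it can be investigated.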

Dependability costs tend to increase exponentially as higher levels of dependability are required, for two reasons. First, achieving higher levels of dependability requires more expensive development techniques and hardware. Second, more testing and system validation are needed to convince the system client and regulators that the required levels of dependability have been achieved.

Socio-technical systems

Software engineering is not an isolated activity but is part of a broader systems engineering process. Software systems are therefore not isolated systems but are essential components of broader systems that have a human, social or organizational purpose.

Layers in the socio-technical systems stack:

Equipment: hardware devices, some of which may be computers; most devices will include an embedded system of some kind.
Operating system: provides a set of common facilities for higher levels in the system.
Communications and data management: middleware that provides access to remote systems and databases.
Application systems: specific functionality to meet some organizational requirements.
Business processes: a set of processes involving people and computer systems that support the activities of the business.
Organizations: higher level strategic business activities that affect the operation of the system.
Society: laws, regulation and culture that affect the operation of the system.

There are interactions and dependencies between the layers in a system and changes at one level ripple through the other levels. For dependability, a systems perspective is essential.

Emergent properties

Emergent properties are properties of the system as a whole rather than properties that can be derived from the properties of components of a system. Emergent properties are a consequence of the relationships between system components. They can therefore only be assessed and measured once the components have been integrated into a system.

Some examples of emergent properties:

Property	Description
Volume	The volume of a system (the total space occupied) varies depending on how the component assemblies are arranged and connected.
Reliability	System reliability depends on component reliability but unexpected interactions can cause new types of failures and therefore affect the reliability of the system.
Security	The security of the system (its ability to resist attack) is a complex property that cannot be easily measured. Attacks may be devised that were not anticipated by the system designers and so may defeat built-in safeguards.
Repairability	This property reflects how easy it is to fix a problem with the system once it has been discovered. It depends on being able to diagnose the problem, access the components that are faulty, and modify or replace these components.
Usability	This property reflects how easy it is to use the system. It depends on the technical system components, its operators, and its operating environment.
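
The reliability row can be made concrete with a small calculation. If every component must work for the system to work, and component failures are assumed to be independent, the system reliability is the product of the component reliabilities. Both assumptions are simplifications introduced here for illustration; as the table notes, unexpected interactions between components can make the real figure worse. The function name and the component values are hypothetical.

```python
import math


def series_reliability(component_reliabilities: list[float]) -> float:
    """Reliability of a system that works only if all of its components work,
    assuming independent component failures (an idealization)."""
    for r in component_reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each reliability must be a probability in [0, 1]")
    return math.prod(component_reliabilities)


# Three hypothetical components, each individually quite reliable.
system = series_reliability([0.99, 0.98, 0.95])
print(round(system, 4))  # 0.9217
```

Even in this idealized model the system is less reliable than its weakest component, which is why reliability has to be assessed for the integrated system and not component by component.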

Two types of emergent properties:

Functional properties: These appear when all the parts of a system work together to achieve some objective. For example, a bicycle has the functional property of being a transportation device once it has been assembled from its components.
Non-functional emergent properties: Examples are reliability, performance, safety, and security. These relate to the behavior of the system in its operational environment. They are often critical for computer-based systems as failure to achieve some minimal defined level in these properties may make the system unusable.

Regulation and compliance

Many critical systems are regulated systems, which means that their use must be approved by an external regulator before the systems go into service. Examples: nuclear systems, air traffic control systems, medical devices. A safety and dependability case has to be approved by the regulator. Therefore, critical systems development has to create the evidence to convince a regulator that the system is dependable, safe, and secure.

Regulation and compliance (following the rules) applies to the socio-technical system as a whole and not simply the software element of that system. Safety-related systems may have to be certified as safe by the regulator. To achieve certification, companies that are developing safety-critical systems have to produce an extensive safety case that shows that rules and regulations have been followed. It can be as expensive to develop the documentation for certification as it is to develop the system itself.

Redundancy and diversity

Redundancy: Keep more than a single version of critical components so that if one fails then a backup is available.
Diversity: Provide the same functionality in different ways in different components so that they will not fail in the same way.
Redundant and diverse components should be independent so that they will not suffer from 'common-mode' failures.
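
The ideas above can be sketched as a small voting arrangement: several independently written versions of the same function run on the same input, and a voter accepts the result that the majority agree on. This is a minimal illustration under stated assumptions, not the textbook's design; the three `largest_*` implementations and the `majority_vote` helper are hypothetical, and one version contains a deliberately seeded fault.

```python
from collections import Counter
from typing import Callable, Sequence


def largest_builtin(values: Sequence[int]) -> int:
    return max(values)


def largest_sorted(values: Sequence[int]) -> int:
    return sorted(values)[-1]


def largest_loop(values: Sequence[int]) -> int:
    # Seeded fault: starting from 0 gives the wrong answer for all-negative input.
    best = 0
    for v in values:
        if v > best:
            best = v
    return best


def majority_vote(
    versions: Sequence[Callable[[Sequence[int]], int]],
    values: Sequence[int],
) -> int:
    """Run every version and return the result that a strict majority agree on."""
    results = [version(values) for version in versions]
    winner, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError(f"no majority among results: {results}")
    return winner


versions = [largest_builtin, largest_sorted, largest_loop]
print(majority_vote(versions, [-5, -2, -9]))  # -2
```

The voter masks the fault only because the versions fail differently. If all three shared the same defect, a common-mode failure, the vote would confirm the wrong answer.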

Process activities, such as validation, should not depend on a single approach, such as testing, to validate the system. Redundant and diverse process activities are especially important for verification and validation. Multiple, different process activities that complement each other and allow for cross-checking help to avoid process errors, which may lead to errors in the software.

Dependable processes

To ensure a minimal number of software faults, it is important to have a well-defined, repeatable software process. A well-defined, repeatable process is one that does not depend entirely on individual skills; rather, it can be enacted by different people. Regulators use information about the process to check whether good software engineering practice has been used. For fault detection, the process activities should include significant effort devoted to verification and validation.

Dependable process characteristics:

Explicitly defined: A process that has a defined process model that is used to drive the software production process. Data must be collected during the process that proves that the development team has followed the process as defined in the process model.
Repeatable: A process that does not rely on individual interpretation and judgment. The process can be repeated across projects and with different team members, irrespective of who is involved in the development.

Dependable process activities

Requirements reviews to check that the requirements are, as far as possible, complete and consistent.
Requirements management to ensure that changes to the requirements are controlled and that the impact of proposed requirements changes is understood.
Formal specification, where a mathematical model of the software is created and analyzed.
System modeling, where the software design is explicitly documented as a set of graphical models, and the links between the requirements and these models are documented.
Design and program inspections, where the different descriptions of the system are inspected and checked by different people.
Static analysis, where automated checks are carried out on the source code of the program.
Test planning and management, where a comprehensive set of system tests is designed.
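
As an illustration of the static analysis activity listed above, the sketch below uses Python's standard `ast` module to flag bare `except:` clauses, which can silently hide faults. The choice of check and the function name `find_bare_excepts` are assumptions made for this example; real projects would normally rely on an established analysis tool and not on a hand-written check.

```python
import ast


def find_bare_excepts(source: str) -> list[int]:
    """Return the line numbers of `except:` clauses that name no exception type."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


# Hypothetical code under analysis; it is parsed, never executed.
SAMPLE = """\
try:
    value = int(raw)
except:
    value = 0
try:
    other = int(raw)
except ValueError:
    other = 0
"""

print(find_bare_excepts(SAMPLE))  # [3]
```

The check inspects the program text without running it, which is what distinguishes static analysis from testing.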

Dependable software often requires certification, so both process and product documentation have to be produced. Up-front requirements analysis is also essential to discover requirements and requirements conflicts that may compromise the safety and security of the system. These needs conflict with the general approach in agile development, which co-develops the requirements and the system and minimizes documentation. An agile process may be defined that incorporates techniques such as iterative development, test-first development and user involvement in the development team. So long as the team follows that process and documents its actions, agile methods can be used. However, additional documentation and planning are essential, so 'pure agile' is impractical for dependable systems engineering.