Software in Practice
a series of four lectures on why software projects fail, and what you can do about it - with particular emphasis on safety-critical systems

Martyn Thomas
Founder: Praxis High Integrity Systems Ltd
Visiting Professor of Software Engineering, Oxford University Computing Laboratory

Lecture 1: What is the problem with software?
  The state of practice
  Scale
  Complexity
  What does testing tell us?

When I started in 1969 ...
  IBM 360/65
  Computing service for 1000s of users.
  Now I have more computing power in my ‘phone.

The Software Crisis
  First digital computer, Manchester 1948
  First commercial computer, LEO 1951
  We are still in the very early stages of software engineering ...
  … like studying civil engineering when Archimedes was still alive!
  NATO Software Engineering conferences in 1968 and 1969 to address the growing crisis in software dependability.

1972 Turing Award Lecture
  “The vision is that, well before the 1970s have run to completion, we shall be able to design and implement the kind of systems that are now straining our programming ability at the expense of only a few percent in man-years of what they cost us now, and that besides that, these systems will be virtually free of bugs”
  E W Dijkstra

Software in the 21st Century
  Fifty years on, yet still at the beginning.
  We are planning drive-by-wire cars, guiding themselves on intelligent roads
  We are dreaming if we believe we can build such real-world systems safely, with today’s attitudes to software engineering.
  We have still not achieved Dijkstra’s vision of thirty years ago!
Thirty years later…
  Most computing system projects fail:
    Project cancellation
    Major cost or time overrun
    Much less functionality than planned
    Security inadequate
    Major usability problems
    Excessive maintenance / upgrade costs
    Serious in-service failure
  I’ll talk about some specific failures in later lectures

Most software projects fail
  Cancelled before delivery: 31%
  Exceed timescales & costs, or greatly reduced functionality: 53%
  On time and budget: 16%
  Mean time overrun: 190%
  Mean cost overrun: 222%
  Mean functionality delivered: 60%
  Large companies much worse than smaller
  Recent figures better, but still poor
  Source: The Chaos Report (1995), http://www.standishgroup.com

Most computing projects fail
  Of 1027 projects, 130 (12.7%) succeeded
  Of those 130:
    2.3% were development projects
    18.2% maintenance projects
    79.5% data-conversion projects
  Of the 500+ development projects in the sample, 3 (0.6%) succeeded.
  Source: BCS Review 2001, page 62.

Why does it happen?
  Because:
    scale matters. Small processes don’t scale up
    process matters. Most developers lack discipline
    rigour matters. Most developers are afraid of mathematics
    engineering is conservative, whereas the software industry is ruled by fashion
      CAA licensing system; C vs Ada at Lockheed Martin; eXtreme this, Agile that ...
  Who can make things better? You!

Scale
  How many valid paths are there through a 200-line module? We have found around 750,000
  How big are modern systems?
    Windows is ~100M LoC
    Oracle talk about a “gigaLoC” code base.
  How many paths is that?
  How many do you think they have tested?
  What proportion will ever be executed?
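To give a feel for why path counts on the Scale slide grow so quickly, here is a minimal sketch under a deliberately simple assumption: a module made of independent two-way branches executed one after another, so each branch doubles the number of paths. This is an illustrative model only, not the method used to arrive at the 750,000 figure, and the function names are hypothetical.

```python
def paths_through_sequential_branches(num_branches: int) -> int:
    """Number of distinct paths through num_branches independent
    two-way branches executed in sequence (assumed model: no loops,
    no infeasible combinations)."""
    return 2 ** num_branches


def branches_needed(target_paths: int) -> int:
    """Smallest number of sequential two-way branches that yields
    at least target_paths distinct paths under the same model."""
    count = 0
    while paths_through_sequential_branches(count) < target_paths:
        count += 1
    return count


if __name__ == "__main__":
    # Under this model, how many branches reach the slide's figure?
    n = branches_needed(750_000)
    print(n, paths_through_sequential_branches(n))
```

Under this model, about twenty simple decisions are enough to pass 750,000 paths, which is plausible for 200 lines of code; real modules have loops and infeasible paths, so actual counts differ.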
A medium-scale system: En Route ATC at Swanwick
  RS 6000 workstations
  Control Room
  Airspace
  [Figure: NERC sectorisation map with equivalent LATCC sector names. Publication date: 20 Jun 01. Not for operational use.]

A medium sized system
  114 controller workstations
  20 supervisory/management positions
  10 engineering positions
  48-workstation simulator
  2 15-workstation test systems
  2.5 million lines of software
  >500 processors

Operational data
  1,667,381 flights in 2002
  Continuous operation, one 3-hour failure
  (other flight delays caused by NAS failures at West Drayton)

Challenges for the future
  Current ATC safety depends on the controller’s ability to clear their sector with radio only.
  Future traffic growth requires > 10 a/c on frequency. Controllers would be overloaded
  So future ATC will depend on automatic systems, which must not fail.
  Target? At least the avionics standard: 10^-8 pfh
  No current air traffic management systems are built to such standards.
This could be your job

How can we be sure a system works?
  Assurance: showing that a system works
  Much harder than just developing a system that works
    you need to generate evidence that it works
    what evidence is sufficient?
  How safe or reliable is a system that has never failed?
  What evidence does testing provide?
  How can we do better?

How safe is a system that has never failed?
  If it has run for n hours without failure, and if the operating conditions remain much the same, the best estimate for the probability of failure in the next n hours is 0.5
  To show that a system has a pfh of <10^-4 with 50% confidence, we need about 14 months of fault-free testing. (10,000 hours is 13.89 months)

What evidence does testing provide?
  “Testing shows the presence, not the absence, of bugs” - Dijkstra
  We cannot test every path.
  Testing individual operations or boundary conditions may find faults, but such tests provide no evidence of pfh.
  Statistical testing, under operational conditions, provides evidence of pfh. But it takes a very long time.

Statistical testing
  To show an MTBF of n hours, with 99% confidence, takes around 10n hours of testing with no faults found.
  So avionics (10^-8 pfh) would need around 10^9 hours (>100,000 years).
  With good prior evidence, e.g. from a strong process, using a Bayesian approach may reduce this to ~10,000 years
  Actual testing is trivially short by comparison.

Summary
  Developing reliable software is difficult because of the size and complexity of real-life systems.
  The software industry is very young, amateurish and immature.
  Most significant projects overrun dramatically (and unnecessarily) or totally fail.
  In future lectures, I will explore why some failures have occurred (Therac, Ariane, LAS, Taurus …) and talk about what you need to know if you are to become a professional amongst all these amateurs.
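The arithmetic on the "How safe is a system that has never failed?" and "Statistical testing" slides can be checked with a short sketch. The slides do not state their statistical model, so everything below is an assumption: a constant but unknown failure rate, a flat prior for the 0.5 figure, a classical exponential argument for the 99% figure, a 30-day month and a 365.25-day year. All function names are hypothetical.

```python
import math

HOURS_PER_MONTH = 30 * 24        # assumption: 30-day month
HOURS_PER_YEAR = 365.25 * 24     # assumption: 365.25-day year


def hours_to_months(hours: float) -> float:
    return hours / HOURS_PER_MONTH


def hours_to_years(hours: float) -> float:
    return hours / HOURS_PER_YEAR


def survival_probability(failure_free_hours: float, next_hours: float) -> float:
    """Predictive probability of no failure in the next next_hours,
    given failure_free_hours of failure-free operation.
    Assumed model: constant unknown failure rate with a flat prior,
    which gives t0 / (t0 + t)."""
    return failure_free_hours / (failure_free_hours + next_hours)


def failure_free_hours_needed(target_mtbf_hours: float, confidence: float) -> float:
    """Failure-free test hours t such that 1 - exp(-t / MTBF) equals
    the requested confidence (assumed classical exponential model)."""
    return -math.log(1.0 - confidence) * target_mtbf_hours


if __name__ == "__main__":
    print(round(hours_to_months(10_000), 2))            # 10,000 hours in months
    print(survival_probability(10_000, 10_000))         # n hours, then n more
    print(failure_free_hours_needed(1.0, 0.99))         # multiplier of n at 99%
    print(hours_to_years(1e9))                          # 10^9 hours in years
```

Under these assumptions the multiplier at 99% confidence comes out at roughly 4.6n, the same order of magnitude as the slide's "around 10n", and 10^9 hours is a little over 114,000 years, consistent with ">100,000 years".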