


TDT4735 Software Engineering, Directed Study

Observed Changes in Software Architecture

Author: Andreas Tørå Hagli
Supervisor: Reidar Conradi
Co-supervisor: Thomas Østerlie

Department of Computer and Information Science, NTNU, Trondheim, Norway
andrhag@stud.ntnu.no

Assignment title: Observed Changes in Software Architecture

Assignment text: SEVO (Software EVOlution) is a project, started in early 2004, that studies software evolution, especially the evolution of component-based systems and their architecture. More information is available on the SEVO homepage (http://www.idi.ntnu.no/grupper/su/sevo.html). The assignment is to carry out a literature study on the subject and to perform an empirical study of a software system. The focus is on identifying the most important factors that can explain and perhaps foresee software evolution.

Date: 26 November 2004

Abstract

Software development is rapidly changing, and software systems are increasing in size and expected lifetime. To cope with this, several new languages and development processes have emerged, but there is also a trend towards more focus on design and software architecture, and on development with consideration for evolution and future changes in requirements. There is a clear need for improvement: research shows that the portion of development cost used for maintenance is increasing and can be as high as 50%. We also see many software systems that grow into uncontrollable complexity, where large parts of the system cannot be touched because of the risk of unforeseeable consequences. There is therefore a need to understand the nature of software evolution, especially with respect to software architecture, in order to be better prepared when developing future software systems.

This pre-diploma thesis (depth project) describes the various parts of the research fields around software evolution and software architecture.
It gives a background on the current state of the research, explores different ideas on why changes occur, and presents views and metrics for measuring them. Two research questions and three hypotheses were formulated, and an empirical study was done on the open source project Portage from the Gentoo Linux project to test them.

Keywords: Software evolution, software architecture, software metrics, empirical study

Preface

This report is written as part of the course TDT4735 System development, in the final year of a master's study at the Department of Computer and Information Science at the Norwegian University of Science and Technology in Trondheim.

This pre-diploma thesis (depth project) is intended as a literature study of research relevant to the evolution of software architecture, combined with an empirical study. It looks at research focused on what causes software to change and on how empirical studies can be used to foresee these changes. The empirical study was intended to compare architectures from two or three industrial examples, but no data source was found in time. Instead, the open source project Portage, developed by the Gentoo Linux project, was used. It provided valuable data on software evolution, though limited data on software architecture, which affected the focus of the empirical study. By writing this report, the author hopes to gain better insight into software evolution and empirical studies and to use this insight in his diploma thesis.

I would like to thank my supervisors, Reidar Conradi and Thomas Østerlie, for valuable feedback and support during the project.

Andreas Tørå Hagli
26 November 2004
Trondheim, Norway

Table of Contents

1. Introduction
2. Background
2.1 Software Evolution
2.2 Software Architecture
2.3 Evolution of Software Architecture
3. Measuring Evolution
3.1 Suggested Untraditional Methods
3.2 Software Metrics
3.2.1 Software Metrics for Software Evolution
3.3 Change Types
3.4 Origins for Data
4. Research Questions and Hypotheses
4.1 Basis for Formulation
4.2 Why Changes Occur
4.3 Metrics Used in this Report
4.4 Research Questions
4.5 Hypotheses
5. Research Context
5.1 Study Objects
5.1.1 Gentoo Linux and Portage
5.1.2 Overview of Releases
5.1.3 Context and Structure of the Data Sources
6. Results
7. Discussion
7.1 Research validity
7.1.1 Conclusion validity
7.1.2 Internal validity
7.1.3 Construct validity
7.1.4 External validity
7.2 Lessons learned
7.2.1 Change Categorization
7.2.2 Variation in Activity
7.2.3 Analyzing the Data
8. Conclusion and Future Work
9. References
Appendix A

1. Introduction

Most computer systems experience some form of software evolution over time. Lehman's first law of software evolution (Lehman 1974) says that software systems that address problems in the real world must be continually adapted and changed if they are to offer satisfactory service. Functionality is added and quality attributes are improved. Efforts are made to maintain control over the system, and one of them is to plan an overall architecture for the system and keep its basic structure intact. But the architecture also has to evolve to accommodate unforeseeable events and to adapt to changes in its purpose and focus. This report discusses how software evolution affects the architecture of a system.

In the software industry today, maintenance is becoming a larger and larger part of the cost of software development. Evans (2001) says that “evolving and maintaining computer systems is expensive. This cost can be anywhere from 50% of total programming effort (Lientz 1980) to 75% of total available effort (McKee 1984).
In addition, the proportion of effort devoted to maintenance has been increasing: from 35-40% in the 1970s, through 40-60% in the '80s up to 70-80% in the 1990s (Pfleeger 1991)”. In order to prevent such high maintenance costs and to focus on business-critical issues, we must therefore explore the nature of these changes and learn how to foresee and handle them better.

There are various definitions of what activities are included in software evolution. The definition adopted in this report is all activities and phenomena associated with adding functional requirements and improving non-functional requirements that cause physical and conceptual changes to the structure of the software system. It is important to note that although evolution is often a negative side effect of a wanted change, it can also be, and often is, a necessity for the system. At the end of this report there is an empirical study in which Portage, a Gentoo Linux project, is used to study software evolution.

The remainder of this report is organized as follows. Chapter 2 gives the background of software architecture evolution as a research field. Chapter 3 discusses existing and suggested ways to measure information relevant to software evolution. Chapter 4 argues for a number of research questions and hypotheses about software evolution. Chapter 5 describes the context for the data used in an empirical study to answer the research questions and test the hypotheses from chapter 4. Chapter 6 presents the results from the empirical study. Chapter 7 discusses these results. Chapter 8 concludes the report and gives suggestions for further work.

2. Background

2.1 Software Evolution

Although discussion of software evolution has existed for a long time, it is only recently that it has become a well-known phenomenon in mainstream research. As Lehman (2001b) states, “the software evolution phenomenon was first identified as such in early 70s.
It is reflected in an intrinsic need for continuing maintenance and development of software use to address an application or solve problem in real world domains. Until recently, however, it did not arouse general interest”.

The definition of software evolution is debated; different authors have different ideas on what should be considered evolution. Some consider evolution to be something that happens during the whole lifetime of the software system, while others limit it to a certain development phase (Rajlich 2001). Some consider software evolution to be the phenomenon behind all changes in the software system, while others consider it to involve only abstract changes and thus not to include things like correction of earlier mistakes and porting to a new platform. Also, some limit software evolution to the software system itself, while others include the development process and organizational structure.

Lehman et al. (2001a) say that “the more common approach sees the most important evolution issues as those concerning the methods and means whereby a software system may be implemented from ab initio conception to operational realisation. The focus of this approach is the how of software evolution”. A less frequent approach “is concerned with the what and the why of evolution. It addresses the issues of the nature of the evolution phenomenon, its drivers and impact.” Yet another view on software evolution is used by some authors: “they regard it as being limited to software change and implicitly exclude, for example, defect fixing, functional extension, restructuring.” Still others consider it “a stage in the operational lifetime of a software system, intermediate between initial implementation and servicing”.

A central issue in defining software evolution is whether to distinguish between software evolution and software maintenance.
Software maintenance includes the minimum set of activities required to keep the software system running as it is, with the functionality it currently holds. This includes fixing bugs, applying security updates, and making sure the software system works on new hardware and platforms. If not maintained, the software will be considered outdated and possibly useless after some time. Software maintenance is important, and numerous scientific studies of large-scale software systems have shown that the bulk of the total software development cost is devoted to software maintenance (Mens 2001). Even when software maintenance is not considered part of software evolution, it is a tightly related phenomenon.

Software systems are developed in different phases, e.g. writing the requirement specification, initial design, writing code, testing and installing. But execution of these is rarely sequential; errors are uncovered and underlying assumptions change after installation of the system. The process is one of successive transformation (Lehman 1984). “It is driven by human creative and analytic power as influenced and modified by developing insight and understanding, but feedback from later steps that leads to iteration over earlier steps, together with changes in the external world that must be reflected in the system, also play a role” (Lehman 2001a). Different areas of the development are in constant interaction, e.g. the process affects the physical software system, which in turn affects the releases, and so forth.

Lehman (2001a) describes five areas of software-related evolution, each of which interacts with, impacts and affects the others. “If software evolution is to be mastered, they must be understood and mastered individually and collectively. They must be planned, driven and controlled.” The five areas are:

1. Implementation of a software system from an initial functional concept to the final, released, installed version.
Often the relative benefits of alternatives cannot be established beforehand and need realistic trials. This evolution is thus driven by feedback mechanisms and evaluation of changes made to the system.
2. At the next level up, there is a sequence of versions, releases and upgrades of a software system. It is driven by a release process where changes are made to remove defects and implement improvements and extensions. This is usually referred to as “maintenance”.
3. Applications, or activities, to support the development of software systems that address problems in the real world. There is an unending process to meet new functionality, procedures, needs, opportunities and so forth when dealing with a loose user community.
4. The process of software development is the aggregate of all activities that implement one or other of the above levels of evolution. An estimated 60% to 80% of lifetime expenditure on a software system is incurred after first release (Pigoski 1996). It is therefore important to make improvements that produce gains in quality, cost, and reduced development time.
5. Modelling, using a variety of approaches, is an essential tool for study, control and improvement of the process. The models facilitate reasoning about it, to explore alternatives or assess the impact of change, for example. The process evolves. So must the model of it.

In this report, a wide definition of software evolution is used. It includes the changes to a software system from its design to its end. It involves changes in the software system, in business requirements, and in the development process and organizational structure. Of the five areas defined above, this report places special emphasis on the first and second levels: the physical software system and its releases.

2.2 Software Architecture

As with software evolution, there is no single clear definition of architecture. A popular one is that used by Bass et al.
(2003), which states that the software architecture of a program or computer system is the structure or structures of the system, which comprise software elements, the externally visible properties of those elements, and the relationships among them.

Architecture is a fairly new topic in software engineering, but has roots far back (Perry et al. 1992). “Software design received a great deal of attention by researchers in the 1970s. This research arose in response to the unique problems of developing large-scale software systems first recognized in the 1960s. The premise of the research was that design is an activity separate from implementation, requiring special notation, techniques, and tools”. During the 1980s, software engineering integrated design and the design process into the development process and its management, and saw great advances in the ability to describe and analyze software systems. In the 1990s, software architecture emerged as a separate field representing the high-level abstractions of software design.

Increasingly large and complex software, together with increased knowledge and experience of software development, has created a trend in software engineering toward working with software at a high level of abstraction, aiding the implementation of a system with guidelines. “Programming projects are becoming ever more complex and being able to design at the software architectural level allows a group of programmers to understand the overall approach that is to be taken on a design. When considering the design of a system at this level, many kinds of mistakes and misunderstandings can be avoided as there is a shared knowledge of what a particular architecture implies” (Evans 2001). Although architecture might seem like a natural step, there is criticism of its immaturity. It takes time for experience to build and for successful strategies to be widely adopted.
An important reason for having an architecture for a software system is to have a higher degree of control throughout the lifetime of the system. As Evans (2001) puts it, “it is hoped that by considering the architecture of the system at an early stage, the resulting system will be more easily evolved, at this and later phases of its development”. The work being done in the field of software architecture pays a lot of attention to this; as Evans continues, “there is a large body of literature on the subject and the kinds of architectures that are being used, such as pipes and filters, objects, events and process control paradigms, are becoming better understood”.

2.3 Evolution of Software Architecture

New software systems are usually based on pre-existing functioning systems. This means that only the first system can be properly and optimally architected, and thus architecture is mostly a matter of impact analysis and reengineering, and only secondarily about designing (Mikkonen 2001). The consequence is that the initial architecture of a system has to be built for unforeseeable evolution by focusing on qualities like flexibility, modifiability, scalability and maintainability.

As new requirements are continuously implemented, the software system and its architecture will meet increased complexity over time. Mikkonen (2001) suggests three stages of an architecture's life cycle, common for commercial products, that explain this process. First the system has an evolving architecture, then a controlled architecture, and finally it moves towards a legacy architecture.

At the first stage of a new software system, before a “1.0 release”, flexibility is often the main design concern. The purpose of flexibility is to prepare for sudden changes in requirements, major conceptual changes and experiments. The development process plays a major role here since it determines how requirements are handled and how new changes are supported.
Frequent releases adding new functionality give the developers more control for adapting to changes in requirements during development. Whatever development process is used, it is important to design the system not only for the current features, but also for unforeseen and predicted functional and non-functional requirements.

Once the first release of the software system is completed, the system enters a new stage with a new set of architectural concerns. These include issues of cost related to hardware usage, development schedule, scalability, performance, standards compliance and so on. These concerns are largely irrelevant to the first release, and they also conflict with flexibility. The architectural focus shifts from flexibility for adding new features to non-functional properties that affect much larger parts of the system and architecture than a new feature, which can typically be addressed in a single code block. Balancing support for flexibility for future features against non-functional properties is hard but, as Mikkonen suggests, can be done by giving them concrete priorities on a scale and thus relative importance.

The final stage of a software system is when it moves towards a legacy system: a system that, although in use, is out of date and often incompatible with newer systems. This is usually caused by non-technical issues, like political, marketing or financial aspects. During the stage of controlled architecture, the system steadily produces new releases on schedule. However, non-technical issues can stir up a panic-driven upgrade and force poor upgrade cycles upon the development team. Under stronger time pressure, the development process delays any major changes to the underlying architecture, which means that the structure of the system never accommodates increased domain knowledge and possible future requirements.
Knowledge and documentation of the architecture become out of date, and the architecture ultimately becomes impossible to change without expecting unforeseen consequences, especially if key developers with important knowledge of the system leave the team. Modification becomes increasingly complex, to a degree where new features need more and more discussion and consideration in order to avoid the risk of doing something wrong. Regardless of the technical difficulties, legacy systems are often kept operational for economic reasons.

Mikkonen concludes by saying that “software engineering is not about software skills only, but domain understanding is also needed. The challenge of software evolution is to keep these aspects closely and explicitly related, but fundamentally separated”. At the beginning of the product life cycle, the domain is the original problem setting, which is mapped to an implementation with up-to-date architecture and design documentation. However, during the lifetime of a software system, the domain tends to become intertwined with the software architecture, high-level design and individual code blocks of the system, and so the software architecture and design documentation become out of date. This must be remedied by upgrades of the underlying architecture and careful design decisions, or else the system will become too dependent on committed key developers with knowledge of the internal parts of the system.

The need to support new business requirements is usually considered a major factor in causing evolution of software architecture. In this context, Rajlich (2001) emphasizes the role of domain concepts through a case study of a Java-based point-of-sale application that sells products in small stores. A point made is that when support for payment using credit cards was added, the concept of payment was already introduced.
So it is the concept of payment that causes the architecture to change, not adding support for credit cards, which is often a matter of adding new classes already accounted for by the architecture. So when dealing with how changes in business logic affect the architecture, it is necessary to think in terms of how the domain concepts change, not just in terms of added functionality. In the case above, adding the concept of payment causes architectural change while adding support for credit cards does not.

Another view on evolution of software architecture is described by Dikel et al. (1997), who state the importance of finding an architectural rhythm. Their study suggests that focusing too much on today's specific features is dangerous since it creates complications with poor architecture and thus higher maintenance cost, but focusing too much on tomorrow's capabilities, by having strict rules for a long-term architecture and a lack of priority on completing features on schedule, is also dangerous because it causes schedules to slip. Therefore, to remain profitable, it is important to balance between too much and too little architectural focus.

A software system should therefore have regular releases that help coordinate actions and expectations and ensure that all involved (including customers, engineers, suppliers, managers and executives) understand important issues and their own responsibilities. For this it is also important to have a predictable rhythm. To find this balance, they suggest following six organizational principles: focus on simplification, adapt the architecture to future needs, release regularly, cooperate closely with stakeholders, maintain a clear architectural vision, and proactively manage risks and opportunities.

3. Measuring Evolution

In order to understand, deal with and manage evolution, we need to measure and analyse it in a structured manner. Research on software evolution has traditionally been done through empirical research.
“Observations gathered during many years' of measurement and interpretation of industrial software processes provide the primary inputs to theory formation. They include qualitative and quantitative observations of behaviour.” (Lehman 2001b). Software is complex and its usage varies in terms of languages, processes, people etc.; therefore a single, generally good way to study software systems is not possible. The choice of method will depend on the context in which the result is needed. Various alternatives have been suggested.

Dag Sjøberg (2001) claims that “there is an increasing understanding in the software engineering community that empirical studies of software evolution are needed to improve existing and develop new processes, methods and tools for software development and maintenance [...] how do we judge whether one kind of structure supports evolution better than another kind of structure without comparing them in a fairly controlled way?” He suggests using experiments involving both students and professionals to supplement the less resource-demanding alternative of using case studies.

As with software development in general, the approach to measuring software evolution boils down to a choice between case studies, which are cheaper but harder to generalise, and experiments, which require more resources but are easier to generalise and specify.

3.1 Suggested Untraditional Methods

In order to do a better job and cope with the increase in complexity of software development, it is important to think of new ways to deal with evolution. As Mens (2001) puts it: “better tool support for evolution is essential, since numerous scientific studies of large-scale software systems have shown that the bulk of the total software-development cost is devoted to software maintenance. This is mainly due to the fact that software systems need to evolve continuously to cope with the ever-changing software requirements”.
There are several ways of analyzing a set of data. Below are three examples of more radical new suggestions.

Davey et al. (2001) suggest using software clustering and concept analysis to analyse software evolution. By applying these two methods, which are used for determining the structure of a software system, to different versions, it is possible to discover evolutionary trends. These similar methods provide a structured way to create a visual representation of a system. They use the concepts of entities (functions, data types, variables, files etc.) and features (the number of times a function is called or a variable accessed) to create diagrams grouping related entities together. This way one can see a representation of the cohesive sets of entities for a given system.

Figure 1 Visual presentation as suggested by Davey. Software clustering (left) and concept analysis (right)

By using software clustering or concept analysis on multiple versions of a system and observing what happens to initially cohesive sets, one can try to find specific initial patterns that would cause the architecture to change. There is still not much research done in the field, although the authors plan to try out the methods on an industrial software system. The authors also mention a metric called MoJo that compares two representations of a system (Tzerpos 1999).

Nakatani (2001) makes use of statistical visual representations to help interpret and evaluate the meaning of object evolution. This is done on top of an empirical study measuring the number of lines of code, classes and methods. Using these measures, he presents a statistical model that shows whether the code base has a strong convention and is uniformly organised or whether it has more room for programming decisions. Nakatani argues that these two states signal two different development stages: adding functionality or refactoring.
Either the developers will pursue the goal of meeting new requirements or they will work to support extensibility. The development process will evolve by first adding new functionality until the developers see the need to pay more attention to structure in order to make further extensions. Using his statistical model, managers will be able to know more about the actual situation the development is in and the direction in which it is moving. “To understand the current development condition, we must observe class states over time. If project managers understand the statistical evolution characteristics described in this paper, they will be able to compare planned activities and implemented activities quantitatively and take proper action” (Nakatani 2001). It should be noted that Nakatani's empirical study used an open source project called Jun (2004), which probably had a more ad hoc design and requirement specification than most industrial projects.

Yet another approach to visualization is presented by Knight et al. (2001), in which three-dimensional visualization techniques are used to “transform the way that tacit knowledge gathering and retrieval takes place during many software maintenance activities”. Knight et al. say that when analysing a development process, management and organisational issues should be considered, because they tend to have profound and far-reaching implications for the software. Traditional methods often ignore the relationships between objects, the history of an object and any of its projected futures, plus the provision of deadlines.

Knight et al. state that there are complications in visualizing any data set in an acceptable way. It is important for the user to be able to explore and navigate the complex environment in which the visualizations are located. There is also the issue of converting suitable data into well-founded information, such as metaphors, and the issue of communicating increasingly abstract problems accurately.
Traditionally, techniques such as data mining have been used to present interesting information to users by using algorithms to find interesting patterns. In their current state, such methods restrict the flexibility needed for analysis. It is therefore necessary to provide various filtering and overview mechanisms that let the user grasp limited portions of the data and see detail when necessary. This ought also to be user driven.

Knight et al. (2001) say that “an object (any component part of the software system including management influences) has a history through time and it also has any number of possible predicted futures at for any given point in time. These predicted futures are used for management and impact analysis based on set deadlines such as releases of software and system changeovers.” Furthermore, Knight et al. suggest a three-dimensional visualization that presents the timeline using the vertical dimension. A number of time ribbons can be placed in the space in predicted (and actual) time to show the progress of individual objects. A user can explore the plane for any moment in time to examine a more detailed view of the situation of the system. By analysing the timeline, it is also possible to see how the different objects impact each other.

Figure 2 Visual presentation as suggested by Knight et al.

A major benefit of using this and other forms of visualization is that programmers and managers can interact with the same data at different levels of abstraction and thus communicate more easily. In this way, key programmers with knowledge of the system will be able to spend less time on management-related tasks and more time working on the actual system.

3.2 Software Metrics

A software metric is a measure of some attribute of software or its specification. Examples of software metrics are lines of source code and bugs per line of source code.
They have long been studied as a way to assess the quality of large software systems (Fenton 1997). Software metrics are used when a certain property of a software system is of interest: a set of one or more attributes is considered to represent that property, and a metric is used to extract values for the attributes.

Figure 3 The role of a metric

Figure 3 describes the term metric. When an attribute is measured, the result is a measure. In this process, a metric is defined as the attribute together with the way the measurement is performed. The metric describes what the attribute is supposed to be measured as, like a scale or a set of values, and how the measuring is performed, e.g. manually or automatically.

Metrics can be used in a number of ways to support software evolution and software engineering in general. They are simple, precise, general and scalable, and they provide quantitative results that can be duplicated and compared. Mens (2001) says that “improving software quality, performance and productivity is a key objective for any organisation that develops software. Quantitative measurements – and software metrics in particular – can help with this, since they provide a formal means to estimate software quality and complexity”.

When using metrics to better understand a software system and its evolution, it is important to know what you are looking for. Mens (2001) says that “initial experiments have indicated that metrics can detect different types of evolution, such as restructuring and extension. Nevertheless, it remains an open question which types of software evolution can be identified by which metrics. A related question is whether it is possible to reconstruct the motivation behind an evolution step (e.g., why was a certain change made between two successive releases)?” Therefore, although metrics might be relatively easy to use, actually making sense of the output can be more challenging.

Gall et al.
(1998) distinguish between the use of software metrics before the evolution has occurred (i.e., predictive) and after the evolution has occurred (i.e., retrospective).

3.2.1 Software Metrics for Software Evolution

Evolution proneness

A term suggested by Mens (2001) is evolution-prone parts, which are parts of the software that are likely to evolve. The reason is not necessarily poor structure or quality, but that the software requirements can change or disappear quickly.

A way to detect this quantitatively is to investigate the release history of the software and identify which parts have been changed most frequently. To keep track of changes, most large software projects use some form of version control system, which can be used for analyzing changes between releases. Metrics can be used to measure the amount of change made to a part of the system at a specified granularity, such as module, file, class or method. When deciding on granularity, it is important to know the limitations of the tools being used; measuring changes to a class is probably more demanding than measuring changes to a file.

Two factors are important for this measurement. Firstly, the size of the parts analyzed matters, because it is the relative amount of change that is important. Secondly, the time at which the change was made matters; recent changes are more important for the current status of the system.

Different visualization techniques might help deal with scalability issues by adapting to the cognitive limitations of humans. As Mens writes: “to cope with the scalability issue, typical examples of this approach visualise the measurements. For example, Ball and Eick (Ball 1996) notate code views with colours showing code age, and Jazayeri et al. (1999) use a three-dimensional visual representation for examining a systems software release history.
Lanza (2001) combines software visualisation and software metrics as a simple and effective way to recover the evolution of object-oriented software systems”.

Evolution sensitivity

Another useful term used by Mens is evolution-sensitive parts, which are parts of the software that can cause problems upon evolution or where the estimated effort of managing the impact of changes is very high. This typically happens where the important design goals of loose coupling and high cohesion are not met. When different parts of the software are tightly interwoven, changes to one part might have a high impact on other related parts.

One type of metric used to detect evolution-sensitive parts is coupling metrics. Suggested coupling metrics include coupling between object classes (CBO) (Chidamber 1994), coupling factor (Brito 1995) and response for a class (RFC) (Chidamber 1994). Another possibility is to use cohesion metrics. However, Kabaili et al. (2001) investigated whether they could be used as changeability indicators and concluded that this is not the case, at least not with the cohesion metrics available at that time. More research is needed to find out whether metrics other than cohesion and coupling can be used to detect evolution-sensitive parts of the software.

Retrospective analysis of software

By comparing two releases of the same software system, Mens states that it is possible to extract information about how the software system has evolved and see what kind of evolution has occurred. As an example, he mentions a retrospective empirical study by Gall et al. (1998) where coupling metrics were used on multiple releases of a large telecommunication switching system. The results could successfully be used to predict expected future maintenance activities more accurately.

Another study was performed by Demeyer et al.
(1999), in which various size and inheritance metrics were used on three releases of a medium-sized object-oriented framework. “From the framework documentation one can deduce that the transition from the first release (1.0) to the second release (2.0) was mostly restructuring, while the transition from the second (2.0) to the third release (2.5) was mainly extension. This restructuring and extension was confirmed by the measurements. During the restructuring phase, a substantial number of classes changed their hierarchy nesting level (i.e. the number of superclasses) and the number of methods defined. This implies that most of the changes were in the middle of a class hierarchy which is indeed typical for a major restructuring. Yet, during the extension phase none of the classes changed their hierarchy nesting level, but a significant amount increased or decreased the number of children. Thus, all changes were made to the leaves of the inheritance hierarchy which is indeed typical for extensions. Consequently, the 1.0 ~ 2.0 restructuring did improve the inheritance structure since the subsequent 2.0 ~ 2.5 transition really exploited the inheritance hierarchy” (Mens 2001).

Comparing two releases can give information such as classes added, classes removed, increases and decreases in the number of methods, and changes in the class hierarchy. This can provide significant help in understanding the evolution that took place. For example, a stable number of classes combined with an increase in methods can mean that there has been an extension in the form of added functionality. Also, new functionality usually means changes in classes at the leaves of the class hierarchy. On the other hand, if changes happen in the middle or top of the class hierarchy, it is more likely that restructuring has been done.

3.3 Change Types

Mockus et al.
(2000) have identified three specific causes of software change by analyzing historic data from legacy software systems: adding new features (adaptive), correcting faults (corrective) and restructuring code to accommodate future changes (perfective). These categories are commonly used in empirical studies. Through analysis of the textual descriptions of changes, Mockus et al. found a list of words commonly associated with the different change types; this can be used to determine the change type from a textual description more quickly or even automatically. New code development (adaptive) uses keywords such as add, new, modify, update; fault fixes (corrective) use keywords such as fix, problem, incorrect, correct; and code improvements (perfective) use keywords such as cleanup, unneeded, remove, rework.

In an open source context, including usage of Bugzilla (described in chapter 5), the difference between adaptive and corrective changes can be unclear. New features are often developed in small, incomplete increments, and new functionality (adaptive changes) is often introduced as an incomplete “quick” addition followed by reports on suggested improvements formulated as fixes for mistakes. These incremental improvements of newly introduced features are all considered adaptive changes as long as the developer knew about the limitations of the feature. If, on the other hand, the introduction of new functionality causes an unforeseen problem, or if there were unintended errors in the code, the correction is considered a corrective change.

3.4 Origins for Data

Before data can be analyzed and hypotheses tested, data has to be found. This can be trivial or it can be tricky. It is important to find a software system that is relevant to the research question or hypothesis in mind, and it is important to know the context of the data to be analyzed. With permission, a commercial software system can be a good source.
Commercial development usually has a relatively stable workforce, a well-defined development strategy and clear business goals. Although such systems are a good and well-defined source for answering questions, companies are often reluctant to give away data on the development of their software systems since it is critical to their business. It can be hard to find a suitable commercial software system where the company is willing to share its data.

In this case, open source projects can be of great value. There are a large number of open source projects available, with a lot of data collected during development to choose from. However, they often hold a large number of undocumented and uncontrolled variables. A varying number of developers, lack of schedules and differing or unclear goals are common. Requiring a formal project can limit the number of suitable open source projects, although such projects do exist.

Whatever data origin is chosen, it is important to make sure that it is generalizable enough to cover the questions of interest and that relevant data is being or has been collected during development.

Interesting data from a software system can be extracted from several data origins. Two sources are of particular interest for software evolution: change requests and change reports. Various attributes of the source code are usually also very useful. A large number of different systems, standards and routines are used in the process of creating the data. Change requests can be reported using a physical paper form or through a separate software system designed to manage a vast number of requests. For archiving the changes, there are several different configuration management and version control systems. Some provide easy extraction of data; others require manual work.

4. Research Questions and Hypotheses

This report looks at which factors can explain evolution in software architecture and whether it is possible to foresee that evolution will occur.
The research questions and hypotheses described in this chapter are constructed for use in the context of an open source project called Portage, described further in chapter 5.

4.1 Basis for Formulation

The background for the research questions and hypotheses is the Goal-Question-Metric (GQM) paradigm (Basili 1994). Simply put, the idea is that first a goal is formulated, then one or more concrete, measurable questions that help achieve the goal are decided, and finally metrics that answer these questions are defined. GQM states that data collection should proceed in a top-down rather than a bottom-up fashion. However, Mohagheghi (2004) gives three reasons why bottom-up studies are useful. Firstly, most data is collected in repositories, not according to the GQM paradigm. Secondly, companies that start to use GQM might want to use older data and relate this data to goals (reverse GQM). Thirdly, although a company might measure according to defined goals, the measuring practice itself needs improvement from bottom-up studies. Consequently, a bottom-up approach is used here.

4.2 Why Changes Occur

Ideally, at the beginning of software development, when the architecture is created, all concerns about future changes in the system's requirements are considered and accounted for. However, this is rarely the case. Future requirements are not always known, and unpredicted new requirements, tight deadlines and changes in the goals and purpose of the software system are common. Software evolution is considered unavoidable and often important, as when the business model is extended and a new market is entered. Although software evolution is often considered unwanted, it should often be considered natural and a necessity.
We distinguish between changes intended to add new functionality and changes intended to improve quality attributes. New functionality can be support for payment by credit card, print support, a new graph that provides some information, or similar. Improved quality can be improved overall performance, increased uptime, increased security, or similar.

But changes cannot merely be traced back to a specific action; often it is important to consider the environment. Obviously, the development process and practices play an important role in preventing or encouraging changes to the software system. Andy Hunt et al. (Venners 2003) note the importance of fixing small known errors. The point is that if a development team does not pay careful attention to sustaining a stable high quality, it will accumulate “technical debt”, meaning that known technical problems are left unfixed. Postponing the fixing of these errors often leads to more technical debt and often to abdication of responsibility. The software therefore ends up in a spiral of increasing technical debt and decay.

The background for Hunt's opinions is the Broken Window Theory, based on an experiment done to see what causes neighborhoods with similar demography to evolve in different directions with respect to crime. Hunt explains the study like this: “The researchers did a test. They took a nice car, like a Jaguar, and parked it in the South Bronx in New York. They retreated back to a duck blind, and watched to see what would happen. They left the car parked there for something like four days, and nothing happened. It was not touched. So they went up and broke a little window on the side, and went back to the blind. In something like four hours, the car was turned upside down, torched, and stripped—the whole works.” From this, a theory was developed that says that one broken window leads to more broken windows and to progressively worse criminal acts in an exponential fashion.
4.3 Metrics Used in this Report

The following metrics are used in the research questions or for the hypotheses:

· File size: Data from the CVS code repository will be used to determine the total lines of code (including comments).
· Change effort: Based on the values extracted from the CVS-log, the number of lines added to a file plus the number of lines removed from it will be used to determine the effort of that change.
· Change types: All changes to the software system studied will be categorized into one of the three change types (adaptive, corrective and perfective) described in chapter 3. There will also be a category called none, used for changes that are unclear. A change is defined as an entry in the system's ChangeLog-file (described in chapter 5), and the corresponding textual description is used as the basis for deciding the correct change type. The classification process will be manual and partially based on the algorithm for automatic evaluation described by Mockus (2000), explained in chapter 3.

4.4 Research Questions

Motivation for RQ1: The tasks at hand and the developers' focus change through the lifetime of a software system. As explained earlier, Mikkonen (2001) states that development focus shifts between adding new functionality and restructuring the architecture. A similar pattern also exists between many software releases. New functionality is added at the early stage, and when approaching a release, concerns for stability, security and user experience become more important.

· RQ1: What evolution trend happens in the lifetime of a software system?

Metric for RQ1: This question will be answered exploratively, and various attributes will be measured over time. File size, change effort and change types (all explained earlier in this chapter) will be used.

Motivation for RQ2: Studies (Mockus 2000) have looked at the share of different change types for various systems.
It would therefore be interesting to look at the distribution of those same categories on another particular system.

· RQ2: What is the share of different change types?

Metric for RQ2: The metric for change types (explained earlier in this chapter) will be used.

4.5 Hypotheses

Motivation for H1: Exploring the nature of different change types can be used to foresee them and prepare for their consequences.

· H1: Perfective changes modify more files than corrective and adaptive changes.

Metric for H1: To test this, the metric for change types (explained earlier in this chapter) will be used to separate perfective changes from other changes. Data from the ChangeLog-file will then be used to see how many files were modified by a change. Three categories will be defined: changes that report modifications to one specified file fall into the first category; changes that report modifications to two, three or four specified files fall into the second category; and changes that report modifications to more than four files fall into the third category, together with any change that reports its modified files using a “*”. Changes that do not report any modified files will be ignored. To satisfy the hypothesis, perfective changes are expected to have a larger share than corrective and adaptive changes in both the second and third categories.

Motivation for H2: Based on the statements of Mikkonen (2001), we can say that the need for restructuring increases through development until a restructuring takes place. For a system not involved in any major restructuring, the share of perfective changes, made to improve the code for further addition of features, should therefore increase over time.

· H2: In the absence of restructuring focus, the share of perfective changes increases during the lifecycle of software systems.

Metric for H2: The metric for change types (explained earlier in this chapter) will be used.
This data will then be used to outline the relative share of perfective changes compared with corrective and adaptive changes over time. For this, a case where restructuring has not been a conscious focus is required. The data will then be tested to see whether there is an increasing share of perfective changes. The share will be measured as the average over three months to account for variation in development activity.

Motivation for H3: It has been argued that there is an optimal file size for making changes, where both an increase and a decrease in size cause the effort of change to increase. Data on logical effort is not available for the study in this report, but data on physical effort is. We can therefore examine whether file size affects the effort of change with the hypothesis below.

· H3: The effort of changes is stable, independent of file size.

Metric for H3: The metrics for file size and change effort (both explained earlier in this chapter) will be used on the most important files. To compare the effort with the file size, the effort is measured as the average over three months, for the same reasons as with H2.

5. Research Context

A lot of data is collected during the development of a software system. In order for developers to cooperate smoothly, information is collected to prevent them from stepping on each other's toes and to communicate and monitor activities.

Portage, the project used in this study, contains about 20 kloc of Python code in about 50 files. In addition, there are several shell scripts that wrap various system tools. 1173 change reports on the system are analysed. The two main files in Portage, emerge (3484 loc) and portage.py (6554 loc), have been analysed more closely and contain 364 and 527 modifications respectively. Data for the three years from October 2001 to October 2004 was used; the modification data for emerge and portage.py goes further back, to August 2001.
5.1 Study Objects

Portage, the package management system for the Linux distribution Gentoo Linux, is used. Characteristic of this system is that its target users are developers, and it should therefore get more and better feedback from its users. It is based on voluntary work and is consequently centered around a few stable core developers assisted by a larger developer base. Portage performs critical tasks for the operating system (Gentoo Linux) and therefore has a high demand for stability and quick updates in case of faults.

The development history is special in that public development has lasted three years without large restructuring of the code (although a major restructuring is under way, separately from the development studied here). A simple fundamental architecture for the process of the system exists and has been stable since development started, but there has never been any architecture for the internal parts of the system. Neither has there been any formal architecture in the form of documentation, diagrams or similar.

5.1.1 Gentoo Linux and Portage

Gentoo Linux

Gentoo Linux is a Linux distribution based on voluntary effort. It is a collection of open source programs packaged together to form a fully working operating system. It is unique in that it does not distribute compiled binary versions of the system components, but instead lets each system compile and build the programs itself, optimized for the hardware and with user-specified parameters. The vast majority of the development is done by third-party developers, while about 200 Gentoo developers make sure that all the programs get integrated with the system. A large part of this work is done by writing ebuilds, which are scripts with instructions on how to find the newest version of the software, build it and finally merge (integrate) it with the system. A major part of the development focuses on making new ebuilds and improving existing ones.
An important characteristic of the Gentoo project is that it focuses on being highly developer friendly in order to attract highly skilled developers as users who help improve the system. The system is therefore driven by the open source philosophy of “scratching where it itches”, meaning that developers fix and improve the parts where they themselves see a need as users. Combined with having highly skilled developers, this creates a strong community for improving the system.

Portage

The Gentoo web page states that “Portage is the heart of Gentoo Linux, and performs many key functions.” (Gentoo 2004). It is basically where most of the in-house development in Gentoo takes place, not including ebuilds. It is the program that takes care of distributing, building and keeping track of all the “packages” of the programs that make up the Gentoo Linux operating system. Portage contains about 20 kloc of Python code and several utilities written in shell script, and has about 5-10 main developers. It has a recorded public development history starting in August 2001.

Figure 4 Architecture of Portage and the process of installing a package

The process of merging (integrating) a new program using Portage can be described as follows:

1. Search PortDB for the newest ebuild, or a given version, for a specified package. An ebuild is a script with guidelines for emerge on how to perform the installation.
2. Parse the ebuild to determine dependencies, that is, packages required before installing this particular ebuild.
3. Download the source code needed for the package from a remote server over HTTP and, if necessary, download and apply associated patches.
4. Build the source code in a temporary location with the specified values from make.conf.
5. Perform post-build actions.
6. Merge the newly built package with the existing system.
7. Create a new entry in VarDB with the ebuild, the parameters from make.conf and the list of files added to the system.

Ebuilds are central in the Gentoo project.
They are ordered in a three-level hierarchy of category, package and version. A number of categories are defined, each containing several packages. For each package, different versions are also specified. Usually the newest version is preferred, but occasionally an older version of a package is needed, so different versions of a package are available to the system administrator.

The PortDB is a database of ebuilds for packages available to the user, either installed or with the potential to be installed or upgraded. These scripts contain instructions for how to build the content of the package, usually a program. The PortDB is updated manually by the user from a remote database to make sure the system knows of updates and security fixes.

When a system administrator wants to upgrade a part of the system or install a new package, he or she runs the emerge program, which checks PortDB to find and run the necessary ebuild(s). It downloads the associated source code for that package from a remote server. Next, emerge compiles the source code for the package in a temporary location based on the configuration in the files make.globals and make.conf and performs any specified post-build tasks, like adding users to the system. Then emerge merges the package with the system by copying the files from the temporary location to the correct location. Finally, emerge copies the script and a list of the new files just copied for the package to VarDB to keep track of the state of the system.

5.1.2 Overview of Releases

Figure 5 Timeline of important Portage releases

The timeline gives an overview of the releases recorded in the project's ChangeLog that are considered important. They were considered important mostly based on their release number. There were several releases between 2.0 and 2.0.47 (2.0.1, 2.0.2, ..., 2.0.46), but they were so frequent that they should not be considered important, since each of them does not represent a lot of change.
The releases are used for RQ1.

5.1.3 Context and Structure of the Data Sources

The Portage system uses the same development technology as the majority of open source projects. Figure 6 describes this system. There are two main parts, Bugzilla and CVS. Bugzilla keeps track of the state of the system (like reported faults), while CVS keeps track of changes to the code base.

Figure 6 The process of a fault being reported and fixed

As shown in Figure 6, a change request is reported by a user to Bugzilla, a web-based system that keeps track of the state of bugs. If the bug report is accepted by the responsible Bugzilla administrator, it is published as an open task ready to be performed by an unassigned or assigned developer. A developer, either the developer responsible for that part of the system or a voluntary developer, makes the necessary modifications to a local version of the code recently updated from CVS. When the developer is done, the modified code is submitted to CVS, a code repository that keeps track of changes to the code. For any modification, the developer is asked to write a description of the change. Although not forced by the system, the developer also reports the changes made, with a description, in a file called ChangeLog at the root of the project.

Although Bugzilla is created for the purpose described above, it is also commonly used for tracking requests for new features in the same way as requests for bug fixes. The process for this is the same as described above.

This gives two sources for data mining of the Portage system: the ChangeLog-file, which is a historical log with comments on all changes made to the system, and the CVS-logs, which log all changes made to a single file in a more detailed form than the ChangeLog. The Bugzilla system also provides valuable data for empirical studies, but is not used here.

ChangeLog

A changelog is a log of recorded changes made to a project.
It is commonly used in open source projects with the intention of keeping an overview of changes so that developers can follow what has been done by others. In Portage, as in most open source projects, it comes in the form of a single text file. Although there is no formal standard for the layout of a ChangeLog-file, a change made by a developer to a file in a project is usually accompanied, possibly automatically, by a paragraph added to the top of the ChangeLog-file with the date of the change, the person who made the change, the files affected and a description of the change. When the maintainer makes a release, he or she adds a new line at the top of the file saying that a new release has been made public. It is therefore also possible to track the version in which a given change was included.

Following is a sample text from the ChangeLog-file in the Portage project:

    *portage-2.0.51 (20 Oct 2004): Everyone loves stable!

    20 Oct 2004; Jason Stubbs repoman: Added check for digest entries that aren't used within the corresponding ebuild's SRC_URI.

    20 Oct 2004; Jason Stubbs emerge: Added support for EMERGE_WARNING_DELAY defaulting it to 10. Changed all the hardcoded delays to use it. Needed for the catalyst guys as it includes a number of unmerges of system packages.

    19 Oct 2004; Nicholas Jones portage.5: patch included to fix a few typos.

The data was extracted from the ChangeLog into a table with one change per row, containing information on the related release, the date of the change, the developer who made the change, the files affected and a description of the change. This was done by writing a Java program that reads through the ChangeLog-file and converts the change entries to a structured format in a new file. The structure of the new file could then be recognized and converted into a spreadsheet by Microsoft Excel/OpenOffice.org. The Java program is included in the report in Appendix A.

CVS-logs

CVS is a commonly used and simple version control system.
It tracks the history of files by logging the differences (added and removed lines) between a file before and after a change, together with comments from the developer. It stores which lines were added where and which lines were removed, the time at which the change was made, who made it, and the comments. It stores the information for each file separately and thus does not record which files were changed at the same time as part of a single submitted change to the system. An important strength of using CVS and similar systems for data mining is that it provides consistent data over the duration of development, regardless of changes in the development process or in the active developers.

Data from CVS can be presented through a web-based interface listing a summary of each change. Following is a sample of a change summary displayed in a web-based CVS-interface:

Revision 1.530 - (download), view (text) (markup) (annotate) - [select for diffs]
Mon Oct 25 11:20:46 2004 UTC (4 days ago) by jstubbs
Branch: MAIN
Changes since 1.529: +16 -8 lines
Diff to previous 1.529
Converted config.pkeywordsdict from {atom:[keyword]} to {cp:{atom:[keyword]}} to prevent a lot of unnecessary calculation.

The first line shows the revision number of the file. This is generated automatically by CVS and has no relation to any release number. The second line shows the date and time of the change and the username of the developer who committed it. The third line shows the branch, used in case it is necessary to keep track of two separate versions of the same file at the same time. The fourth line shows the revision the changes were made against, the number of lines added and the number of lines removed. At the end of the summary, a textual description of the change is shown.

The data from the CVS-logs was extracted by saving the HTML code from the web-based interface and converting the data to a structured format by search-and-replace in an ordinary text editor.
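The search-and-replace step could also be automated. The following is a minimal sketch in Java, not the procedure used in this study, of how one change summary in the layout shown above might be parsed with regular expressions. The class name CvsSummary and the three patterns are assumptions made for illustration; they rely on the summary text looking like the sample and would need adjusting for other CVS web interfaces.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the tooling used in the study.
class CvsSummary {
    final String revision;
    final String author;
    final int added;
    final int removed;

    // Patterns assume the summary layout of the sample shown in the text.
    private static final Pattern REVISION = Pattern.compile("Revision\\s+(\\d+(?:\\.\\d+)+)");
    private static final Pattern AUTHOR = Pattern.compile("\\bby\\s+(\\w+)");
    private static final Pattern LINES = Pattern.compile("\\+(\\d+)\\s+-(\\d+)\\s+lines");

    CvsSummary(String revision, String author, int added, int removed) {
        this.revision = revision;
        this.author = author;
        this.added = added;
        this.removed = removed;
    }

    // Extracts revision number, committer and line counts from one summary.
    static CvsSummary parse(String text) {
        Matcher rev = REVISION.matcher(text);
        Matcher by = AUTHOR.matcher(text);
        Matcher lines = LINES.matcher(text);
        if (!rev.find() || !by.find() || !lines.find()) {
            throw new IllegalArgumentException("Unrecognized change summary: " + text);
        }
        return new CvsSummary(rev.group(1), by.group(1),
                Integer.parseInt(lines.group(1)), Integer.parseInt(lines.group(2)));
    }

    // Number of lines added plus number of lines removed.
    int changedLines() {
        return added + removed;
    }

    public static void main(String[] args) {
        String sample = "Revision 1.530 -(download), view (text) (markup) (annotate) "
                + "-[select for diffs] Mon Oct 25 11:20:46 2004 UTC (4 days ago) by jstubbs "
                + "Branch: MAIN Changes since 1.529: +16 -8 lines Diff to previous 1.529";
        CvsSummary s = parse(sample);
        System.out.println(s.revision + " " + s.author + " +" + s.added + " -" + s.removed);
    }
}
```

Run on the sample summary, the sketch prints "1.530 jstubbs +16 -8". One row per revision in this form can be written to a delimited text file and opened in a spreadsheet.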
The structured data could then be imported into a spreadsheet and analyzed further.

Change categorization

The changes extracted from the ChangeLog-file were further categorized according to change type. Based on the textual description, each of the 1173 entries was given one of three labels according to the kind of modification made. The three change types, corrective, adaptive and perfective, are described in chapter 3. In addition, a category called “None” was used for changes whose intention was unclear. Although Mockus (2000) outlines a good method for assigning these labels automatically, in this study the labeling was done manually, because that was considered substantially easier and helped ensure accuracy.

6. Results

No statistical tests have been used on the research questions and hypotheses of this study.

· RQ1: What evolution trend happens in the lifetime of a software system?

[Line chart: share of changes (0 % to 70 %) per quarter, 2001Q4 to 2004Q4; series: Perfective, Corrective, Adaptive]
Figure 7 Share of change types (average for three months)

Figure 7 shows how the share of change types changes over time. The share of perfective changes increases slightly from the start towards the end. The shares of corrective and adaptive changes vary more. Corrective changes have two peaks, one in the period between April and June 2002 and another between January and March 2004. Naturally, the share of adaptive changes increases when the share of corrective changes decreases, and vice versa.
[Line chart: share of changes (0 % to 70 %) per release, October 2001 to October 2004; series: Perfective, Corrective, Adaptive]
Figure 8 Share of change types (average on releases)

Figure 8 shows how the share of change types changes between releases within the same timeframe as Figure 7. No apparent trend between releases can be seen in the figure.

From the graphs in Figure 7 and Figure 8, we can see an evolution trend in the share of the different change types over time, but not between releases. The share of change types seems to swing between increases in corrective changes and increases in adaptive changes over time. No other clear trends were found in the data used.

· RQ2: What is the share of different change types?

Change type    Share
None            4.9 %
Perfective     14.7 %
Corrective     39.1 %
Adaptive       41.3 %

Figure 9 Share of different change types in Portage

The data extracted from the ChangeLog-file using the metrics described in chapter 4 is presented in Figure 9. It is based on three years of recorded development, from October 2001 to October 2004, and a total of 1173 changes. The criteria for the different categories are described in more detail in chapter 3.

[Stacked chart: share of changes (0 % to 100 %) per quarter, 2001Q4 to 2004Q4; series: Perfective, Corrective, Adaptive]
Figure 10 Share of change types in Portage over time

Figure 10 shows the same data on a time scale: the share of change types for every three-month period over three years.

· H1: Preventive changes modify more files than corrective and adaptive changes.
[Accepted]

[Stacked bar chart: share of changes (0 % to 100 %) for Perfective, Corrective and Adaptive changes; segments: 1 File, 2-4 Files, Multiple files]
Figure 11 Files modified for different change types

Figure 11 shows how many files were modified for the different change types. Perfective changes have a larger share of changes that modify 2-4 files or multiple files (meaning an unknown number larger than 1; anything noted by ‘*’). 28 (2.4 %) of the changes did not specify which files were modified and were left out of the data. As the hypothesis states, perfective changes modify more files than corrective and adaptive changes.

· H2: In absence of restructuring focus, the share of preventive changes increases during the lifecycle of software systems. [Accepted]

[Line chart with trend line: share of perfective changes (0 % to 40 %) per month, October 2001 to October 2004]
Figure 12 Share of perfective changes

Figure 12 shows how the share of perfective changes varies from month to month. A trend line shows that the share increases over time. The current development version of Portage has had no large restructuring. As stated in the hypothesis, in the absence of restructuring focus, the share of preventive changes therefore increases during the lifecycle of the software system.

· H3: The effort of changes is stable independently of file size. [Not accepted]

[Line chart: file size (0 to 8000) per month, August 2001 to October 2004; series: emerge, portage.py]
Figure 13 Size of emerge and portage.py

Figure 13 shows changes in the size of the files emerge and portage.py, the two major files in Portage.
[Line chart: average change effort (0 to 450) per month, August 2001 to October 2004; series: emerge, portage.py]
Figure 14 Change effort on emerge and portage.py (average on month)

Figure 14 shows how the average effort for a change varies over time. Effort is measured as the sum of the number of lines added and the number of lines removed. (So if eight lines are added and five lines are removed, the effort is measured as 13.)

We see from the two graphs above that the measured effort is not stable over time, and we can therefore not conclude that it is stable independently of file size. It could still be that file size does not affect change effort, if the instability in change effort is caused by a few very large or very small changes having an artificially high impact that would have “disappeared” in a larger sample. More likely though, change effort is affected by one or more variables other than file size.

7. Discussion

7.1 Research validity

The four threats to validity (conclusion, internal, construct and external) outlined by Cook and Campbell (1979) are used in this section to describe the validity of the study.

7.1.1 Conclusion validity

Conclusion validity concerns issues that affect the ability to draw the correct conclusion about relations between the treatment and the outcome, that is, whether the statistical significance is strong enough. The study answered the questions and hypotheses by presenting the data in figures, and no statistical tests were used. However, it should be noted that 1173 change entries from the ChangeLog-file, and 364 and 527 modifications to the files emerge and portage.py respectively, were used, covering a timeframe of three years (plus two extra months for the data from emerge and portage.py).

7.1.2 Internal validity

Internal validity concerns issues that may indicate a causal relationship, although there is none.
That is, factors unaccounted for may have affected the results. The following factors may have affected the data:

· Unclear textual descriptions in change entries can cause wrong categorization.
· Multiple changes reported in one change entry are registered as a single change.
· 4.9 % of the change entries lacked a clear description and were ignored.
· Special events during the development may have influenced the results.

The first two factors cause only random errors and do not bias the results. A large sample therefore limits this source of error. The third factor, however, could affect the results if a certain change type is more likely than others to be left without a clear description. The fourth factor probably has an effect on the data. Fluctuation in activity (described later in this chapter), due to exams etc., makes certain months more vulnerable to statistical uncertainty than others; this has been partly accounted for by calculating averages quarterly instead of monthly. Significant organizational changes in the Gentoo Linux project may also have affected the data.

7.1.3 Construct validity

Construct validity refers to the extent to which the setting actually reflects the construct under study, that is, whether poor measures have been used.

Generally, this empirical study measured the system by physical attributes instead of logical ones, although the latter would have been more appropriate. This was due to limitations in data, tools and time, and it limits the generalizability of the study. The metrics used in the empirical study have weaknesses:

· File size: Lines of code are preferred over bytes etc. because they give a better impression of the development work. Excluding comment lines could be considered, but measuring work, not program code, was better suited to the hypothesis (H3).
· Change types: The change categories are grounded in data through the work of Mockus (2000), who analyzed 33171 modification requests, and through several other studies. There are however concerns about whether manual categorization, aided by observations of words frequently used with the different change types, can be transferred from the commercial, company-based software development studied by Mockus to the non-commercial open source development of Portage. Two factors have to be considered. Firstly, the language might be different. Modification requests in companies use a user-to-developer language, while change entries in open source development use a developer-to-developer language. Secondly, differences in development structure play a role. Although Mockus’ study shows that the word “add” is frequently used when describing adaptive changes, experience showed that in Portage it is also often used for corrective changes, as in “added patch to fix …”.
· Effort: Using measures of physical change to measure effort has limitations, as demonstrated by the example below:

Revision 1.531 - (download), view (text) (markup) (annotate) - [select for diffs]
Mon Oct 25 14:35:34 2004 UTC (3 days, 20 hours ago) by jstubbs
Branch: MAIN
Changes since 1.530: +141 -139 lines
Diff to previous 1.530
Wrapped entire lock-holding section of fetch() in a try-finally block to ensure that unlockfile is always called.

The developer simply added a line above and below a block of code. The log says that 141 lines were added and 139 lines were removed, since all 139 lines between the two added lines, presumably re-indented, appeared to be affected.

7.1.4 External validity

External validity concerns the ability to generalize results outside the setting used. The correct subject, environment and timing are necessary. There is a clear difference between company-based and open source software development.
However, the research questions and hypotheses used in this study are not affected by most of these differences. H1 and H2 should be generalizable to most projects, while RQ2 should only be generalized to projects of similar size and purpose. The results of RQ1 and H3 are too weak to consider generalization.

7.2 Lessons learned

The empirical study gave valuable first-hand experience that can be used when continuing the work in a larger study. Pitfalls and mistakes could be encountered in a small context, where the resulting loss of work could be kept low.

7.2.1 Change Categorization

The textual descriptions of changes in the ChangeLog-file provided a good basis for categorizing changes. The descriptions were often complete, yet short. Due to the gift culture found in open source, giving a good description of the work done seems to be considered important, and peers expect it.

The process of actually getting from the text file to structured data was however somewhat more complicated. The entries were parsed into a structured form using a Java program. Although all the change entries in the ChangeLog-file had a similar “look”, occasional small differences caused unpredictable errors that required manual fixing. The structured data parsed from the ChangeLog-file was then converted to a spreadsheet using Microsoft Excel/OpenOffice.org, with columns for developer, date, files affected and textual description. Approximately 2 % (or about 20) of the entries were incorrect after the parsing and were fixed manually. Further, columns for none, perfective, corrective and adaptive were added to categorize the changes.

The work of Mockus (2000) on finding frequently used words for the different change types was very useful; textual descriptions starting with added, fixed, removed etc. gave a good idea of the probable change type. However, there were three concerns.
Firstly, some descriptions reported multiple changes in one change entry. Although these changes are usually related, they cause problems because they describe changes of different types. The first change mentioned was given the most weight, since it was assumed to be the most important. Secondly, although the word “add” is often associated with adaptive changes, as reported by Mockus, the open source context is different: core developers often receive modifications from external developers in the form of “patches” containing a list of changes. The textual description of a change where a patch is involved often starts with “Added patch from … to …” and should not automatically be taken to be an adaptive change. Thirdly, several textual descriptions used the words “change”, “rewrite” and “move”, which are not included in Mockus’ work. Textual descriptions with these words therefore required a closer look.

Also worth noting is that several of the textual descriptions contained a bug number, e.g. they ended with “This should close #12345.”, #12345 referring to a specific entry in Bugzilla. Although Bugzilla is designed as a system for reporting faults, it is commonly used by open source projects for registering features requested by users. A bug number therefore often refers to a feature request that results in an adaptive change.

7.2.2 Variation in Activity

When plotting data on development activity, there was a clear fluctuation in activity between different months. May and November seem to have relatively little activity, and August and February seem to have relatively high activity. Although there is no hard evidence of it, activity seems to be high at the beginning of a school semester and lower at the end. Open source projects, especially projects similar to Gentoo, often attract college students, which could explain this observation. Grouping changes quarterly instead of monthly seemed to account for this fluctuation.
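The quarterly grouping described above can be sketched in a few lines of Java. This is an illustration only, not the spreadsheet procedure used in the study; the class QuarterlyShares, its method names and the four sample entries with their labels are hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: groups labeled changes by quarter and computes type shares.
class QuarterlyShares {

    // Quarter label such as "2004Q4", in the style of the quarter labels in the figures.
    static String quarterOf(LocalDate date) {
        return date.getYear() + "Q" + date.get(IsoFields.QUARTER_OF_YEAR);
    }

    // Counts changes per quarter and change type: quarter -> (type -> count).
    static Map<String, Map<String, Integer>> countByQuarter(LocalDate[] dates, String[] types) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (int i = 0; i < dates.length; i++) {
            counts.computeIfAbsent(quarterOf(dates[i]), q -> new TreeMap<>())
                  .merge(types[i], 1, Integer::sum);
        }
        return counts;
    }

    // Share (between 0 and 1) of one change type within one quarter's counts.
    static double share(Map<String, Integer> quarterCounts, String type) {
        int total = 0;
        for (int c : quarterCounts.values()) {
            total += c;
        }
        return total == 0 ? 0.0 : quarterCounts.getOrDefault(type, 0) / (double) total;
    }

    public static void main(String[] args) {
        // Made-up sample entries; the dates and labels are for illustration only.
        LocalDate[] dates = {
            LocalDate.of(2004, 10, 20), LocalDate.of(2004, 10, 20),
            LocalDate.of(2004, 10, 19), LocalDate.of(2004, 8, 3)
        };
        String[] types = { "adaptive", "adaptive", "corrective", "perfective" };

        Map<String, Map<String, Integer>> counts = countByQuarter(dates, types);
        for (Map.Entry<String, Map<String, Integer>> e : counts.entrySet()) {
            System.out.println(e.getKey() + " adaptive share: " + share(e.getValue(), "adaptive"));
        }
    }
}
```

Grouping by quarter pools the counts of three months before the share is computed, so a month with very few changes has less influence on the plotted value than it would in a monthly plot.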
7.2.3 Analyzing the Data

The data was analyzed by converting it to a spreadsheet format and then working with it in the spreadsheet. This provided an easy way to manipulate data and a quick means of making various graphs. However, it did create a lot of manual labor in moving and summing up data that could easily have been done with an SQL query or similar. As an example, the author wanted to sort all 1173 entries from the ChangeLog-file, grouped by the 37 months they were distributed over. Although trivial to do in SQL, in the spreadsheet it was necessary to group each month manually. Other solutions were possible, but not easily available. A more flexible solution for analyzing historical data could therefore be an SQL database or perhaps a data warehouse.

8. Conclusion and Future Work

This report has looked at research on software evolution and software architecture. It also looked at how to measure software changes, and formulated a set of research questions and hypotheses based on this. To attempt to answer the research questions and test the hypotheses, an empirical study was performed on the open source project Portage, developed by the Gentoo Linux project.

The empirical study showed the share of change types made to the software system (RQ2), how preventive changes modify more files than corrective and adaptive changes (H1) and how absence of restructuring focus causes the share of perfective changes to increase (H2). Although no statistical tests were used, it is the author's belief that the results are clear. The study also tried to look for evolution trends over the development lifetime (RQ1) and to see whether file size affected change effort (H3), but had limited results.

The data collected was limited by measuring physical and not logical changes to the software system. In order to get better and more detailed information on changes, data on logical changes should be used.
The study also showed that various open source projects, not only the best known, can be used as data sources for studying software evolution, as they provide a large amount of relevant data for answering research questions and testing hypotheses.

There are a number of possible follow-up studies. A tool like Pythius (Pythius 2004) can be used to apply different metrics to projects written in Python, e.g. Portage. Pythius can measure values like lines of code, non-commented lines of code and the size of all classes in a file. This can be used to get better data for analyzing logical changes, e.g. an increase in the size of a class, and to get a better idea of the kind of evolution that has happened during development. Data from Bugzilla can also be used as a source of change requests, to see what kinds of changes are requested and which of them are made. More advanced metrics for evaluating loose/tight coupling and hierarchical structures in object-oriented software systems can also be used for analyzing logical change and architectural decay.

9. References

Ball, T. and S. G. Eick. Software visualization in the large. IEEE Computer, 29(4), April 1996.

Basili, V.R., Caldiera, G., Rombach, H.D., Goal Question Metric Paradigm, In: Marciniak, J.J. (ed.): Encyclopedia of Software Engineering. New York, Wiley, 1994, pp. 528-532.

Bass, L., Clements, P., Kazman, R. Software Architecture in Practice. Addison-Wesley Publishing Co.: Reading, MA. 2nd edition. 2003.

Brito e Abreu, F., M. Goulão, and R. Esteves. Toward the design quality evaluation of object-oriented software systems. In Proc. 5th Int'l Conf. Software Quality, pages 44-57, October 1995.

Chidamber, S. R. and C. F. Kemerer. A metrics suite for object-oriented design. IEEE Trans. Software Engineering, 20(6):476-493, June 1994.

Cook, T.D. and Campbell, D.T., Quasi-Experimentation – Design and Analysis Issues for Field Settings, Houghton Mifflin Company, 1979.
Cooper, D.R., Schindler, P.S., Business Research Methods, McGraw-Hill International edition, seventh edition, 2001.

Davey, J., Burd, E., Clustering and Concept Analysis for Software Evolution, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Demeyer, S. and S. Ducasse. Metrics: Do they really help? In Proc. Langages et Modèles à Objets, pages 69-82. Hermes Science Publications, 1999.

Demeyer, S., Mens, T., Wermelinger, M., Towards a Software Evolution Benchmark, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Evans, H., Why Is Distributed System Evolution Not Better Supported?, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Fenton, N., S. L. Pfleeger and R.L. Glass, Science and Substance: a Challenge to Software Engineers. IEEE Software 11(4), July 1994

Fenton, N. and S. L. Pfleeger. Software Metrics: A Rigorous and Practical Approach. International Thomson Computer Press, London, UK, second edition, 1997.

Gall, H., K. Hajek, and M. Jazayeri. Detection of logical coupling based on product release history. In International Conference on Software Maintenance (ICSM '98). IEEE Computer Society Press, November 1998.

Jazayeri, M., C. Riva, and H. Gall. Visualizing software release histories: The use of color and third dimension. In H. Yang and L. White, editors, Proc. Int'l Conf. Software Maintenance (ICSM '99). IEEE Computer Society, 1999.

Jun Home page, http://www.sra.co.jp/people/aoki/Jun/Main_e.htm

Kabaili, H., R. K. Keller, and F. Lustman. Cohesion as changeability indicator in object-oriented systems. In P. Sousa and J. Ebert, editors, Proc. 5th European Conf. Software Maintenance and Reengineering, pages 39-46. IEEE Computer Society Press, 2001.

Knight, C., Munro, M., Organisational Trails through Software Evolution, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Lanza, M. The evolution matrix: Recovering software evolution using software visualization techniques. In Proc. Int'l Workshop on Principles of Software Evolution (IWPSE 2001), 2001.

Lehman, M. M., Programs, Cities, Students--Limits to Growth, Imp. Col. 1974, Inaug. Lect. Series, Vol. 9, 1970-1974, pp. 211-229; also in Gries, 1978

Lehman, M. M., J. F. Ramil, Evolution in software and related areas, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Lehman, M. M., An Approach to a Theory of Software Evolution, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Lientz, B. P., Swanson, E. B., Software Maintenance Management, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1980

McKee, J. R. Maintenance as a function of design. In Proc. 1984 AFIPS National Computer Conference, pages 187-93, 1984.

Mens, T. and Demeyer, S., Evolution Metrics, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Mikkonen, T., Pruuden, P., Practical Perspectives on Software Evolution and Architectures, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Mohagheghi, Parastoo and Conradi, Reidar: "Exploring Industrial Data Repositories: Where Software Development Approaches Meet", Proc. of the 8th ECOOP Workshop on Quantitative Approaches in Object-Oriented Software Engineering (QAOOSE’04), 15 June 2004, Oslo, Norway, Coral Calero, Fernando Brito e Abreu, Geert Poels and Houari A. Sahraoui (Eds.), pp. 61-77. Affiliated with 18th European Conference on Object-Oriented Programming (ECOOP 2004), 14-18 June 2004, Oslo.

Monteiro, Eric; Østerlie, Thomas; Rolland, Knut and Røyrvik, Emil. Keeping it going: The everyday practices of open source software, 2004, submitted for reviewing.

Nakatani, T., Quantitative observations on object evolution, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Perry, D. E. and A. L. Wolf. Foundations for the study of software architecture. ACM SIGSOFT Software Engineering Notes, 17:40--52, October 1992.

Pfleeger, S. L. Software Engineering: The Production of Quality Software. Macmillan Publishing Company, 2nd edition, 1991.

Pigoski, T. M., Practical Software Maintenance, Wiley, 1996, pp. 384

Pythius Homepage, http://pythius.sourceforge.net/

Rajlich, V., Role of Concepts in Software Evolution, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Sjøberg, D.I.K, Arisholm, E., Jørgensen, M., Conducting Experiments on Software Evolution, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Sneed, H. M., Recycling Software Components extracted from Legacy Programs, Proceedings of the 4th international workshop on Principles of software evolution, September 10-11, 2001, Vienna, Austria

Svahnberg, M., Bosch, J., Evolution in software product lines: Two cases, Journal of Software Maintenance: Research and Practice, v.11 n.6, p.391-422, Nov. 1999

Tzerpos V., Holt R.C., MoJo: A Distance Metric for Software Clusterings, Proceedings of the Sixth Working Conference on Reverse Engineering, pp 187-193, 1999.
Venners, B., Don't Live with Broken Windows, Interview with Andy Hunt and Dave Thomas, http://www.artima.com/intv/fixitP.html

Appendix A

The following is the source code used for extracting change data from the ChangeLog-file:

    package no.athagli.ChangeLogAnalyser;

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.io.PrintStream;

    /**
     * Converts the Portage ChangeLog into two '@'-separated files:
     * releases.csv (version, date) and
     * revisions.csv (user, date, files, description).
     *
     * @author Andreas Tørå Hagli
     */
    public class Main {

        private static final String SEPARATOR = "@";

        /** Changes English month abbreviations to Norwegian. */
        private static String toNorwegianMonths(String date) {
            return date.replace("May", "Mai")
                       .replace("Oct", "Okt")
                       .replace("Dec", "Des");
        }

        public static void main(String[] args) {
            // try-with-resources closes all three files, also when an error occurs
            try (BufferedReader file = new BufferedReader(new FileReader("portageChangeLog.txt"));
                 PrintStream pReleases = new PrintStream("releases.csv");
                 PrintStream pRevisions = new PrintStream("revisions.csv")) {

                // Read a line of text
                String s = file.readLine();
                while (s != null) {

                    if (s.startsWith("*")) { // A release
                        String version = s.substring(1, s.indexOf("(") - 1);
                        String date = s.substring(s.indexOf('(') + 1, s.indexOf(')'));

                        if (version.startsWith(" "))
                            version = version.substring(1);

                        // Normalize the different spellings of the release name
                        version = version.replaceFirst("Portage-", "");
                        version = version.replaceFirst("portage-", "");
                        version = version.replaceFirst("Portage ", "");
                        version = version.replaceFirst("portage ", "");
                        version = "portage-" + version;

                        date = toNorwegianMonths(date);

                        System.out.println("Version: " + version);
                        System.out.println("Date: " + date);
                        pReleases.println(version + SEPARATOR + date);
                    }
                    else if (s.indexOf(";") == 13) { // A change entry
                        String date = s.substring(2, 13);
                        String user = s.substring(15, s.indexOf(">") + 1);
                        String files = "";
                        String desc = "";

                        // Remove a colon directly after the e-mail address
                        if (s.indexOf(">") + 1 == s.indexOf(":")) {
                            s = s.replaceAll(">:", ">");
                        }

                        // Is there a colon (end of the file list) after the e-mail address?
                        if (s.indexOf(">") + 1 < s.indexOf(":")) {
                            files = s.substring(s.indexOf(">") + 2, s.indexOf(":"));
                            desc = s.substring(s.indexOf(":") + 1);
                            s = file.readLine();
                            while (s != null && s.startsWith(" ") && s.length() > 4) {
                                desc += s.substring(1);
                                s = file.readLine();
                            }
                        }
                        else {
                            // Is there more text after the e-mail address?
                            if (s.indexOf(">") + 2 < s.length())
                                files = s.substring(s.indexOf(">") + 2);

                            s = file.readLine();
                            // Does the file list end with a colon on the second line?
                            // (The null check avoids a crash at the end of the file.)
                            if (s != null && s.indexOf(":") != -1) {
                                files += s.substring(0, s.indexOf(":"));
                                desc = s.substring(s.indexOf(":") + 1);
                                s = file.readLine();
                            }
                            else {
                                // No files mentioned: the text read so far is the description.
                                // The second line is kept and appended below instead of being dropped.
                                desc = files;
                                files = "";
                            }
                            while (s != null && s.startsWith(" ")) {
                                desc += s.substring(1);
                                s = file.readLine();
                            }
                        }

                        files = files.replaceAll(",", " ");
                        files = files.replaceAll(" +", " "); // Collapse runs of spaces
                        if (files.endsWith(" "))
                            files = files.substring(0, files.length() - 1);
                        if (desc.startsWith(" "))
                            desc = desc.substring(1);

                        date = toNorwegianMonths(date);

                        // '@' is the column separator, so mask it in the data
                        user = user.replace("@", "{AT}");
                        files = files.replace("@", "{AT}");
                        desc = desc.replace("@", "{AT}");

                        System.out.println("Date: " + date);
                        System.out.println("User: " + user);
                        System.out.println("Files: " + files);
                        System.out.println("Desc: " + desc);
                        pRevisions.println(user + SEPARATOR + date + SEPARATOR
                                + files + SEPARATOR + desc);

                        // s already holds the line that ended this entry;
                        // let the main loop examine it instead of skipping it
                        continue;
                    }

                    s = file.readLine();
                }
            }
            // Catches any error conditions
            catch (IOException e) {
                System.err.println("Unable to read from file");
                e.printStackTrace();
                System.exit(-1);
            }
        }
    }