Special Issue on ICIT 2009 Conference - Bioinformatics and Image
Volume: Bioinformatics and Image. Publishing Date: 7/30/2009

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the copyright law 1965, in its current version, and permission for use must always be obtained from UBICC Publishers. Violations are liable to prosecution under the copyright law.

UBICC Journal is a part of UBICC Publishers, www.ubicc.org
© UBICC Journal
Typesetting: Camera-ready by author, data conversion by UBICC Publishing Services
UBICC Journal
Ubiquitous Computing and Communication Journal
Volume 4 · Number 3 · July 2009 · ISSN 1992-8424
Special Issue on ICIT 2009 Conference – Bioinformatics and Image
UBICC Publishers © 2009 Ubiquitous Computing and Communication Journal
Co-Editor: Dr. AL-Dahoud Ali
Book: 2009 Volume 4. Publishing Date: 07-30-2009. Proceedings ISSN 1992-8424. Printed in South Korea.

Guest Editor’s Biography

Dr. Al-Dahoud is an associate professor at Al-Zaytoonah University, Amman, Jordan. He received his PhD from La Sabianza/Italy and Kiev Polytechnic/Ukraine in 1996 and has worked at Al-Zaytoonah University since then. He has been a visiting professor at many universities in Jordan and the Middle East, and a supervisor of master's and PhD degrees in computer science. He established ICIT in 2003 and has been its program chair ever since. He was the Vice President of the IT committee in the ministry of youth/Jordan in 2005 and 2006. Al-Dahoud was the General Chair of ICITST2008, June 23–28, 2008, Dublin, Ireland (www.icitst.org).
He has directed and led many projects sponsored by NUFFIC/Netherlands: the Tailor-made Training 2007 and "On-Line Learning & Learning in an Integrated Virtual Environment" 2008. His hobby is conference organization, so he participates in the following conferences as general chair, program chair, session organizer or member of the publicity committee: ICITs, ICITST, ICITNS, DepCos, ICTA, ACITs, IMCL, WSEAS, and AICCSA.

Journal activities: Al-Dahoud has worked as editor in chief, guest editor or member of the editorial board of the following journals: Journal of Digital Information Management, IAJIT, Journal of Computer Science, Int. J. Internet Technology and Secured Transactions, and UBICC. He has published many books and journal papers, and has participated as a speaker in many conferences worldwide.

UBICC Journal
Volume 4, Number 3, July 2009

SPECIAL ISSUE ON ICIT 2009 CONFERENCE: BIOINFORMATICS AND IMAGE

Testing of program correctness in formal theory
Ivana Berkovic, Branko Markoski, Jovan Setrajcic, Vladimir Brtka, Dalibor Dobrilovic

Detecting metamorphic viruses by using arbitrary length of control flow graphs and nodes alignment
Essam Al Daoud, Ahid Al-Shbail, Adnan Al-Smadi

Reliability optimization using adapted ant colony algorithm under criticality and cost constraints
Belal Ahmad Ayyoub, Asim Elsheikh

A comprehensive quality evaluation system for PACS
Dinu Dragan, Dragan Ivetic

A multi-level method for criticality evaluation to provide fault tolerance in multi-agent systems
Mounira BOUZAHZAH, Ramdane MAAMRI

A modified partition fusion technique of multifocus image for improved image quality
Dheeraj Agrawal, Al-Dahoud Ali, J.Singhai

Integrating biomedical ontological ontologies – OBR – SCOLIO Ontology
Vanja Luković, Danijela Milošević, Goran Devedžić

TESTING OF PROGRAM CORRECTNESS IN FORMAL THEORY

Ivana Berkovic
University of Novi Sad, Technical Faculty “Mihajlo Pupin”, Zrenjanin, Serbia
berkovic@tf.zr.ac.yu

Branko Markoski
University of Novi Sad, Technical Faculty “Mihajlo Pupin”, Zrenjanin, Serbia
markoni@uns.ns.ac.yu

Jovan Setrajcic
University of Novi Sad, Faculty of Sciences, Novi Sad, Serbia
bora@if.ns.ac.yu

Vladimir Brtka
University of Novi Sad, Technical Faculty “Mihajlo Pupin”, Zrenjanin, Serbia
vbrtka@tf.zr.ac.yu

Dalibor Dobrilovic
University of Novi Sad, Technical Faculty “Mihajlo Pupin”, Zrenjanin, Serbia
ddobrilo@tf.zr.ac.yu

ABSTRACT
Program testing is a very important part of the software life cycle, since the fulfillment of specification requirements, design and application must be verified. All definitions of program testing share the same aim: to answer the question of whether the program behaves in the requested way. One of the oldest and best-known methods for constructive testing of smaller programs is symbolic program execution: one way to show that a given program is written correctly is to execute it symbolically. A branching program may be translated into declarative form, i.e. into a sequence of clauses, and this translation may be automated. The method consists of a transformation part and a resolution part. This work describes a general framework for investigating the problem of program correctness using the method of resolution refutation. It shows how the rules of program logic can be used in an automatic resolution procedure. Examples realized in the LP Prolog language are given (without limitation to Horn clauses and without final failure). The execution of a Pascal program in the LP system demonstrator is shown.

Keywords: program correctness, resolution, test information, testing programs

1. INTRODUCTION

Program testing is defined as the process of executing a program and comparing the observed behaviour with the requested behaviour.
The primary goal of testing is to find software flaws [1], and the secondary goal is to increase the testers' confidence when a test finds no errors. The conflict between these two goals becomes visible when a testing process finds no error: in the absence of other information, this may mean that the software is of either very high or very poor quality. Program testing is, in principle, a complicated process that must be carried out as systematically as possible in order to provide adequate assurance of reliability and quality.

Within the software life span, program testing is one of the most important activities, since the fulfillment of specification requirements, design and application must be checked. According to Mantos [2], big software producers spend about 40% of their time on program testing. To test large and complicated programs, testing must be as systematic as possible. Therefore, of all testing methods, the only one that must not be applied is ad hoc testing, since it cannot verify quality and correctness with respect to the specification, construction or application. Testing first certifies whether the program performs the job it was intended to do, and then how it behaves under different operating conditions. The key element in program testing is therefore the specification, since, by definition, testing must be based on it. A testing strategy includes a set of activities organized in a well-planned sequence of steps, which finally confirms (or refutes) that the required software quality has been achieved. Errors are made in all stages of software development and have a tendency to propagate. The number of errors introduced may rise during design and then increase several times during coding. According to [3], the program-testing stage costs three to five times more than any other stage in the software life span.
In large systems, many errors are found at the beginning of the testing process, and the error rate declines visibly as the errors in the software are fixed. There are several different approaches to program testing; one of our approaches is given in [4]. The result of testing cannot be predicted in advance, but on the basis of testing results one may estimate how many more errors are present in the software.

The usual approach to testing is based on requirements analysis: the specification is converted into test items. Apart from the fact that uncorrectable errors may occur in programs, specification requirements are written at a much higher level than testing standards. This means that, during testing, attention must be paid to many more details than are listed in the specification itself. Due to lack of time or money, often only parts of the software are tested, or only the parts listed in the specification.

Structural testing belongs to another testing strategy, the so-called "white box" approach (some authors call it transparent or glass box). The usual white-box criterion is to execute every executable statement during testing and to record every result in a testing log. The basic strength of all such testing is that the complete code is taken into account, which makes it easier to find errors even when software details are unclear or incomplete.

According to [5], testing may be descriptive or prescriptive. In descriptive testing, it is not necessary to test all test items. Instead, the testing log records whether the software is hard to test, whether it is stable, the number of bugs, and so on. Prescriptive testing establishes operative steps that help to control the software, e.g. dividing complex modules into several simpler ones. There are several measures of software complexity. An important criterion in selecting a measure is uniformity (harmony) of application.
Such uniformity is popular in commercial software applications because it guarantees the user a certain level of testing, or the possibility of so-called internal action [6]. There is a strong connection between complexity and testing, and the methodology of structural testing makes this connection explicit [6].

Firstly, complexity is the basic source of software errors. This holds in both an abstract and a concrete sense. In the abstract sense, complexity above a certain point exceeds the ability of the human mind to perform exact mathematical manipulation. Structured programming techniques may push this barrier further, but cannot remove it completely. Other factors, listed in [7], indicate that the more complex a module is, the more probable it is that it contains an error. In addition, above a certain complexity threshold, the probability of an error in the module rises progressively. On the basis of this information, many software purchasers define a limit on the number of cycles (the cyclomatic number of a software module, McCabe [8], see footnote 1) in order to increase overall reliability.

On the other hand, complexity may be used directly to allocate testing effort, by relating complexity to the number of errors, so that testing is aimed at finding the most probable errors ("lever" mechanism, [9]). In the structural testing methodology, this allocation means determining precisely the number of test paths needed for every software module being tested, which is exactly the cyclomatic complexity. Other common "white box" testing criteria have an important flaw: they may be satisfied with a small number of tests for a program of arbitrary complexity (under any possible meaning of "complexity") [10].

Demonstrating program correctness and programming correct programs are two similar theoretical problems that are very meaningful in practice [11].
The first is resolved within program analysis and the second within program synthesis, although, because of the connection between program analysis and program synthesis, the two processes influence each other. Nevertheless, when it comes to automatic methods for proving correctness and methods of automatic program synthesis, the difference between them is evident.

(Footnote 1: McCabe's measure is based on the number and structure of cycles.)

Reference [12] describes an initial possibility of automatic synthesis of simple programs using the resolution procedure of automatic theorem proving (ADT), more precisely the resolution procedure for deducing an answer to a request. A proof that a request of the form (∃x)W(x) is a logical consequence of the axioms that define the predicate W and the (elementary) program operators ensures that the variable x in the answer obtains a value that represents the requested composition of (elementary) operators, i.e. the requested program. The works of Z. Manna examine in detail the problems of program analysis and synthesis using the resolution procedure of proof and answer deduction.

A different line of research is the axiomatic definition of the semantics of the programming language Pascal in the form of specific deduction rules of program logic, described in [14,15]. Although the concepts of these two approaches differ, they share one characteristic: both are deductive systems over a predicate language. In fact, both are realized in a special predicate calculus that is based on deduction in a formal theory. In this way, the problem of program correctness is related to the automatic checking of (existing) proofs of mathematical theorems.
The two approaches mentioned above and their modifications are based on this concept.

2. DESCRIPTION OF THE METHOD FOR ONE-PASS SYMBOLIC PROGRAM TESTING

The method is based on transforming a given Pascal program into a sequence of Prolog clauses, which form the axiomatic base for the deductive resolution mechanism of the BASELOG system [10]. For a given Pascal program, a single pass through the resolution procedure of the BASELOG system yields all possible outputs of the program in symbolic form, together with the paths leading to each of them. Both parts, transformation and resolution, are completely automated and naturally connected to each other. When the resolution part has finished, a sequence of paths and symbolic outputs is printed for the given input Pascal program.

The transformation turns programming structures and programming operators into a sequence of clauses and is realized by models that depend on the concrete programming language. Automation covers the branching structures IF-THEN and IF-THEN-ELSE, as well as the cyclic structures WHILE-DO and REPEAT-UNTIL, which may be nested in each other. This paper reviews the possibilities of working with one-dimensional arrays and subprograms within a Pascal program. The number of passes through cyclic structures must be fixed in advance using a counter.

During the testing of a given (input) Pascal program, both parts, transformation and resolution, are involved in the following way. Depending on the input Pascal program, the transformation part
• ends by producing a sequence of clauses, or
• demands forced termination.
If the transformation part cannot generate a sequence of clauses, the Pascal program is not syntactically correct, i.e. there are mistakes in its syntax or logical structure (destructive testing). In this case, since the axiomatic base was not constructed, the resolution part is not activated and the user is prompted to correct the syntax of the Pascal program.
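The idea of obtaining every symbolic output together with the path that leads to it can also be illustrated outside BASELOG. The following Python sketch is not the transformation or resolution procedure of this paper; it is a minimal illustration that walks a toy program representation and collects, for each path, the branch conditions and the final symbolic values. The tuple format of the statements, the names `subst` and `run`, and the `bound` parameter are assumptions made for this example. Loops are unrolled at most `bound` times, mirroring the requirement that the number of passes through cyclic structures be fixed in advance.

```python
import re

def subst(expr, state):
    """Replace each variable in expr by its current symbolic value (if any)."""
    def repl(match):
        name = match.group()
        return f"({state[name]})" if name in state else name
    return re.sub(r"[A-Za-z_]\w*", repl, expr)

def run(block, state, path, bound):
    """Yield (path_conditions, symbolic_state) for every path through block.

    bound is the number of loop iterations still allowed on the current path
    (an assumption of this sketch, standing in for the loop counter).
    """
    if not block:
        yield path, state
        return
    stmt, rest = block[0], block[1:]
    if stmt[0] == "assign":                      # ("assign", var, expr)
        _, var, expr = stmt
        yield from run(rest, {**state, var: subst(expr, state)}, path, bound)
    elif stmt[0] == "if":                        # ("if", cond, then, else)
        _, cond, then_block, else_block = stmt
        c = subst(cond, state)
        yield from run(then_block + rest, state, path + [c], bound)
        yield from run(else_block + rest, state, path + [f"not ({c})"], bound)
    elif stmt[0] == "while":                     # ("while", cond, body)
        _, cond, body = stmt
        c = subst(cond, state)
        # Path that leaves the loop now.
        yield from run(rest, state, path + [f"not ({c})"], bound)
        if bound > 0:
            # Path that executes the body once more, then re-tests the loop.
            yield from run(body + [stmt] + rest, state, path + [c], bound - 1)

# p:=x; i:=0; while i<=n do begin i:=i+1; p:=p*i end
program = [
    ("assign", "p", "x"),
    ("assign", "i", "0"),
    ("while", "i<=n", [("assign", "i", "i+1"), ("assign", "p", "p*i")]),
]

paths = list(run(program, {}, [], bound=2))
for conditions, state in paths:
    print(" and ".join(conditions), "->", "p =", state["p"])
```

Each printed line pairs one route through the program with the symbolic value of `p` on that route; comparing such pairs with the specification is the step that the method leaves to the user.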
If the transformation part finishes by generating a sequence of clauses, the resolution part is activated, with the following possible outcomes:
Ra) it ends by giving a list of symbolic outputs and the corresponding routes through the Pascal program, or
Rb) it ends with a message that it could not generate the list of outputs and routes, or
Rc) it does not end and demands forced termination.

Ra) By comparing the symbolic outputs and routes with the specification, the user may
• declare the given Pascal program correct, if the outputs are in accordance with the specification (constructive testing), or
• conclude, if some symbolic expression disagrees with the specification, that there is a semantic error in the Pascal program (destructive testing) on the corresponding route.
Rb) The impossibility of generating a list of symbolic expressions in the resolution part means that there is a logical-structural error in the Pascal program (destructive testing).
Rc) Excessively long execution or an (unending) cycle means that there is a logical and/or semantic error in the Pascal program (destructive testing).

In this way, the user may be assured of the correctness of a Pascal program or of the presence of syntax, logical-structural or semantic errors. In contrast to existing methods of symbolic program testing, an important feature of this method is that it is single-pass, which is provided by a specific property of OL-resolution [11] with marked literals, on which the resolution module of the BASELOG system is founded.

3. DEDUCTION IN FORMAL THEORY AND PROGRAM CORRECTNESS

Program verification may rely on techniques for automatic theorem proving. These techniques embody principles of deductive reasoning, the same ones that programmers use during program design. Why not use the same principles in an automatic synthesis system, which may construct a program instead of merely proving its correctness?
Designing a program demands more originality and creativity than proving its correctness, but both tasks demand the same way of thinking [13]. Structured programming itself helped the automatic synthesis of computer programs in the beginning, by establishing principles of program development on the basis of a specification. These principles were meant as guidelines for programmers. In fact, advocates of structured programming were very pessimistic about the possibility of ever automating their techniques. Dijkstra went so far as to say that we should not automate programming even if we could, since this would deprive the job of all delight.

Proving program correctness is a theoretical problem of great practical importance, and it is done within program analysis. A related theoretical problem is the design of correct programs, which is solved in another way, within program synthesis. These processes are evidently intertwined, since program analysis and synthesis are closely related. Nevertheless, the two problems differ distinctly with respect to automatic methods of proving program correctness and automatic methods of program synthesis.

If we observe one program, the questions of termination and correctness arise; if we observe two programs, we have the question of their equivalence. An abstract, i.e. non-interpreted, program is defined using a directed graph. From such a program we may obtain a partially interpreted program by interpreting the function symbols, predicate symbols and constant symbols. If we also interpret the free variables of a partially interpreted program, a realized program is obtained. The function of such a program is observed through its execution sequence. A realized program, regarded as deterministic, has one execution sequence, or none if no such sequence exists. A partially interpreted program, on the other hand, has several execution sequences.
In a partially interpreted program, it is known for every interpreted predicate when it is true and when it is not, which means that different execution paths are possible depending on the input variables. For an abstract program, we conclude that it has only one execution sequence, in which it is not known whether a predicate P or its negation is true.

The basic assumptions of programming logic are given in [14]. The basic relation {P}S{Q} is a specification for program S with the following meaning: if predicate P holds at the input before the execution of program S, then predicate Q holds at the output after the execution of program S. To prove the correctness of program S, it is necessary to prove the relation {P}S{Q}, where the input values of the variables must satisfy predicate P and the output values must satisfy predicate Q. Since termination of S is not proven but only assumed, this defines the partial correctness of the program. If it is proven that S terminates and that the relation {P}S{Q} holds, we say that S is totally correct. This notion of correctness is used for program design.

The basic idea is that program design should be carried out simultaneously with proving the correctness of the program for the given specification [15,16]. First the specification {P}S{Q} is derived with the given precondition P and the given resulting postcondition Q, and then subspecifications of the type {Pi}Si{Qi} are derived for the components Si from which the program S is built. Special derivation rules provide a proof that the relation {P}S{Q} follows from the relations {Pi}Si{Qi} for the component programs Si.
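The meaning of the relation {P}S{Q} can be illustrated by checking it on concrete states. The Python sketch below is only a finite test, not a proof: it runs a program on every state from a small domain that satisfies the precondition and checks the postcondition afterwards. The function `check_triple`, the example programs `abs_program` and `buggy_program`, and the chosen domain are hypothetical names introduced for illustration. Since the check only returns once the program has finished on each state, it speaks about partial correctness on the tested states only.

```python
from itertools import product

def check_triple(pre, program, post, variables, domain):
    """Test {pre} program {post} on all states over `domain`.

    Returns (True, None) if no tested state violates the triple,
    otherwise (False, counterexample_state).
    """
    for values in product(domain, repeat=len(variables)):
        initial = dict(zip(variables, values))
        if not pre(initial):
            continue                      # precondition not met: nothing to check
        final = program(dict(initial))    # run S on a copy of the state
        if not post(initial, final):
            return False, initial
    return True, None

def abs_program(state):
    # S: if x < 0 then y := -x else y := x
    state["y"] = -state["x"] if state["x"] < 0 else state["x"]
    return state

def buggy_program(state):
    # S': y := x  (violates the postcondition for negative x)
    state["y"] = state["x"]
    return state

def pre(state):
    # P: true
    return True

def post(initial, final):
    # Q: y >= 0 and (y = x or y = -x)
    return final["y"] >= 0 and final["y"] in (initial["x"], -initial["x"])

print(check_triple(pre, abs_program, post, ["x"], range(-3, 4)))
print(check_triple(pre, buggy_program, post, ["x"], range(-3, 4)))
```

A failed check refutes the triple with a concrete counterexample, while a passed check only reports that no violation was found on the tested domain.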
Notice that the rules given in [9] are used for manual design and manual confirmation of program correctness, with no mention of the possibility of automatic (resolution) confirmation methods. As stated, proving the correctness of program S means proving the relation {P}S{Q}, where the input values of the variables must satisfy the formula P and the output values must satisfy the formula Q. This defines only partial correctness, since termination of S is assumed; S is totally correct if it is also proven to terminate. Design starts from the specification {P}S{Q} with the given precondition P and the given resulting postcondition Q.

The formula {P}S{Q} is written as K(P,S,Q), where K is a predicate symbol and P, S, Q are variables of the first-order predicate calculus. The assignment formula {P[y/z]} z := y {P}, where P[y/z] denotes P with y substituted for z, is written as K(t(P,Z,Y), d(Z,Y), P), where t, d are function symbols and P, Z, Y are variables.

Rules R(τ):

P1: from {P}S{R} and R⇒Q infer {P}S{Q}; we write
    K(P,S,R) ∧ Im(R,Q) ⇒ K(P,S,Q)
    where Im (implication) is a predicate symbol, and P, S, R, Q are variables.
P2: from R⇒P and {P}S{Q} infer {R}S{Q}; we write
    Im(R,P) ∧ K(P,S,Q) ⇒ K(R,S,Q)
P3: from {P}S1{R} and {R}S2{Q} infer {P} begin S1; S2 end {Q}; we write
    K(P,S1,R) ∧ K(R,S2,Q) ⇒ K(P,s(S1,S2),Q)
    where s is a function symbol, and P, S1, S2, R, Q are variables.
P4: from {P∧B}S1{Q} and {P∧~B}S2{Q} infer {P} if B then S1 else S2 {Q}; we write
    K(k(P,B),S1,Q) ∧ K(k(P,n(B)),S2,Q) ⇒ K(P,ife(B,S1,S2),Q)
    where k, n, ife are function symbols.
P5: from {P∧B}S{Q} and P∧~B ⇒ Q infer {P} if B then S {Q}; we write
    K(k(P,B),S,Q) ∧ Im(k(P,n(B)),Q) ⇒ K(P,if(B,S),Q)
    where k, n, if are function symbols.
P6: from {P∧B}S{P} infer {P} while B do S {P∧~B}; we write
    K(k(P,B),S,P) ⇒ K(P,wh(B,S),k(P,n(B)))
    where k, n, wh are function symbols.
P7: from {P}S{Q} and Q∧~B ⇒ P infer {P} repeat S until B {Q∧B}; we write
    K(P,S,Q) ∧ Im(k(Q,n(B)),P) ⇒ K(P,ru(S,B),k(Q,B))
    where k, n, ru are function symbols.

Transcription of other programming logic rules is also possible.

Axiom A(τ):

A1: K(t(P,Z,Y), d(Z,Y), P)    (assignment axiom)

The formal theory τ is given by (α(τ), F(τ), A(τ), R(τ)), where α is the set of symbols (alphabet) of theory τ, F is the set of formulae (correct words over the alphabet α), A is the set of axioms of theory τ (A⊂F), and R is the set of derivation rules of theory τ. B is a theorem of theory τ if and only if B can be derived within the calculus κ from the set R(τ) ∪ A(τ) (κ is the first-order predicate calculus). Let S be a special predicate calculus (first-order theory) with its own axioms A(S) = R(τ) ∪ A(τ). This means that the derivation of a theorem B within theory τ can be replaced by a derivation within the special predicate calculus S.

We assume that s is a syntax unit whose (partial) correctness is being proven for a certain input predicate U and output predicate V. Within theory S we prove

    ⊢S (∃P)(∃Q) K(P,s,Q)

where s is a constant representing the given program. The program is written in functional notation with the symbols s (sequence), d (assignment), ife (if-then-else), if (if-then), wh (while), ru (repeat-until). The negation of this statement is added to the starting set of axioms A(S). The result of the refutation by the resolution procedure is

    /Im(Xθ,Yθ) ∨ Odgovor(Pθ,Qθ)

where Odgovor means "answer" and Xθ, Yθ, Pθ, Qθ are the values for which the refutation succeeded. This means that a proof has been found for these values, but it does not yet mean that the given program is partially correct. It is necessary to establish that the input and output predicates U, V are in accordance with Pθ, Qθ, and also that Im(Xθ,Yθ) really holds for the domain predicates and terms. Accordance means confirming that

    (U ⇒ Pθ) ∧ (Qθ ⇒ V) ∧ (Xθ ⇒ Yθ)

is valid. There are two ways to establish accordance: manually or by an automatic resolution procedure. Neither is possible within theory S, but both are possible within a new theory, which is defined by the predicates and terms that are part of the program s and of the input-output predicates U, V. Within this theory U, P, Q, V, X, Y are not variables, but formulae with domain variables, domain terms and domain predicates.

This method concerns derivation within a special predicate calculus based on deduction within a formal theory. Thus the problem of program correctness is associated with the automatic checking of (existing) proofs of mathematical theorems.

A deduction (proof) of a formula B in theory τ is a finite sequence B1, B2, ..., Bn (Bn is B) of formulas of this theory such that every element Bi of the sequence is either an axiom or is deduced by applying some deduction rule Ri∈R to some preceding elements of the sequence. B is then said to be a theorem of theory τ, and we write ⊢τ B [17]. Suppose the symbols of τ are the symbols of the predicate calculus and F(τ) is a set of formulas of the predicate calculus. In that case, the deduction rules R(τ) can be written in the form

    Bi1 ∧ Bi2 ∧ ... ∧ Bik ⇒ Bi    (Ri)

where Bi1, ..., Bik, Bi are formulas from F(τ).
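The definition of a deduction can be turned directly into a check of a given sequence. The sketch below assumes a simplified, hypothetical setting in which formulas are plain strings without variables and each rule is a pair (premises, conclusion); the name `is_deduction` and the toy theory are illustrations and not part of the LP system. It accepts a sequence only if every element is an axiom or follows by some rule from elements that precede it.

```python
def is_deduction(sequence, axioms, rules):
    """Return True if every formula is an axiom or follows from earlier ones."""
    earlier = []
    for formula in sequence:
        by_rule = any(
            conclusion == formula and all(p in earlier for p in premises)
            for premises, conclusion in rules
        )
        if formula not in axioms and not by_rule:
            return False
        earlier.append(formula)
    return True

# Toy theory: one axiom and two ground rules (hypothetical example).
axioms = {"a"}
rules = [
    (("a",), "b"),        # a => b
    (("a", "b"), "c"),    # a and b => c
]

print(is_deduction(["a", "b", "c"], axioms, rules))
print(is_deduction(["a", "c"], axioms, rules))
```

In the second call, c is still derivable in the toy theory, but the sequence a, c is not a deduction because the intermediate formula b is missing.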
Regarding the technique of automatic theorem proving, most investigations have been done on resolution rules of derivation. Resolution is a very important derivation rule with the completeness property. Programs for proving theorems were at first implemented only in the area of mathematics. When it was seen that other problems could be presented as possible theorems which need to be proven, applications were found in areas such as program correctness, program generation, query languages over relational databases, and electronic circuit design. The formal system in which a theorem is proven may be the propositional calculus, the first-order predicate calculus, or a higher-order logic. Theorems in the propositional calculus are simple for contemporary provers, but the propositional calculus is not expressive enough. Higher-order logic is extremely expressive, but it has a number of practical problems. Therefore the first-order predicate calculus is probably the most used one.

Suppose κ is the first-order predicate calculus; then

    R(τ), A(τ) ⊢κ B  if and only if  ⊢τ B    (1)

that is, B is a theorem of theory τ if and only if B is deducible in the calculus κ from the set R(τ) ∪ A(τ). Suppose S is a special predicate calculus (first-order theory) with its own axioms A(S) = R(τ) ∪ A(τ) (the deduction rules of S are the deduction rules of the calculus κ). Then A(S) ⊢κ B if and only if ⊢S B, so that (1) can be written:

    ⊢S B  if and only if  ⊢τ B    (2)

That means that the deduction of a theorem B in theory τ can be replaced by a deduction in the special predicate calculus S, which has its own axioms A(S) = R(τ) ∪ A(τ).

Now we can formulate the following task. A sequence of formulas B1, B2, ..., Bn (Bn is B, Bi different from B for i<n) of theory τ is given. Demonstrate that this sequence is a deduction of the formula B in theory τ.

One way of solving this problem is to verify that the given sequence corresponds to the definition of deduction in theory τ. The other way is to use (1), i.e. (2). If we demonstrate that

    R(τ), A(τ) ⊢κ B1 ∧ B2 ∧ ... ∧ Bn    (3)

that is sufficient for the conclusion that B1, B2, ..., Bn is a deduction in τ. It is also sufficient to demonstrate that R(τ), A(τ) ⊢κ Bi for i = 1,2,...,n, since this demonstrates (3). A demonstration of (3) can be obtained by resolution refutation of the set R(τ)∪A(τ)∪{~B1∨~B2∨ ... ∨~Bn}, or by n refutations of the sets R(τ)∪A(τ)∪{~Bi}.

Notice that for the conclusion that B1, B2, ..., Bn is a deduction in τ it is not enough to demonstrate R(τ), A(τ) ⊢κ (B1∧B2∧...∧Bn-1 ⇒ Bn), i.e. it is not enough to carry out a resolution refutation of the set R(τ)∪A(τ)∪{B1, B2, ... , Bn-1}∪{~Bn}, because this demonstrates only that Bn is deducible in τ under the supposition that B1∧B2∧...∧Bn-1 is deducible in τ. Whenever B1, B2, ..., Bn really is a deduction in τ, (B1∧B2∧...∧Bn-1 ⇒ Bn) will hold, but the converse is not always valid. It can happen that (B1∧B2∧...∧Bn-1 ⇒ Bn) is deducible in τ but B1∧B2∧...∧Bn-1 is not deducible in τ (see example 1’). Likewise, a demonstration of R(τ), A(τ) ⊢κ Bn, which can be realized by resolution refutation of the set R(τ)∪A(τ)∪{~Bn}, means that Bn is a theorem of τ, i.e. that Bn is deducible in τ, but this is not enough for the conclusion that B1, B2, ..., Bn is a deduction in τ (except in the case that a refutation of each Bi appears in the refutation, see example 1”). Finally, it is necessary to underline that unsatisfiability of the set R(τ)∪A(τ)∪{~Bn} does not mean unsatisfiability of the formula ~Bn by itself, but only in the presence of R(τ)∪A(τ).

Example 1.
Suppose A(τ) is {r(1,1), r(1,3)} and R(τ) contains three deduction rules:

α: r(m,n) ⇒ r(n,m)    (symmetry)
β: r(m,n) ⇒ r(m+1,n+1)    (correspondence with the next one)
γ: r(m,n) ∧ r(n,p) ⇒ r(m,p)    (transitivity)

Demonstrate that the sequence J: r(3,1), r(4,2), r(5,3), r(5,1), where r is a predicate symbol, is a correct deduction of the formula r(5,1) in theory τ. It is sufficient to demonstrate: {α,β,γ}, A(τ) ⊢κ J.

In the next demonstration x+1 is denoted by S(x), and axioms for ‘the next one’ are added:

1
~r(3,1)~r(4,2)~r(5,3)~r(5,1)&
9
r(1,1)&
r(3,1)&
~r(X1,Y1)r(Y1,X1)&
~r(X1,Y1)~=(S(X1),U1)~=(S(Y1),V1)r(U1,V1)&
~r(X1,Y1)~r(Y1,Z1)r(X1,Z1)&
=(S(1),2)&
=(S(2),3)&
=(S(3),4)&
=(S(4),5)&

Demonstration with invalidation:
number of generated resolvents = 934
maximum level = 10
DEMONSTRATION IS PRINTED
level on which empty composition is generated = 10
LEVEL=1; central composition: ~r(3,1)~r(4,2)~r(5,3)~r(5,1)&
5.side, 3.lateral: ~r(X1,Y1)~r(Y1,Z1)r(X1,Z1)&
LEVEL= 2; resolvent: ~r(3,1)~r(4,2)~r(5,3)/~r(5,1)~r(5,Y1)~r(Y1,1)&
2.side, 1.lateral: r(3,1)&
LEVEL= 3; resolvent: ~r(3,1)~r(4,2)~r(5,3)&
4.side, 4.lateral: ~r(X1,Y1)~=(S(X1),U1)~=(S(Y1),V1)r(U1,V1)&
LEVEL= 4; resolvent: ~r(3,1)~r(4,2)/~r(5,3)~r(X1,Y1)~=(S(X1),5)~=(S(Y1),3)&
7.side, 1.lateral: =(S(2),3)&
LEVEL= 5; resolvent: ~r(3,1)~r(4,2)/~r(5,3)~r(X1,2)~=(S(X1),5)&
9.side, 1.lateral: =(S(4),5)&
LEVEL= 6; resolvent: ~r(3,1)~r(4,2)&
4.side, 4.lateral: ~r(X1,Y1)~=(S(X1),U1)~=(S(Y1),V1)r(U1,V1)&
LEVEL= 7; resolvent: ~r(3,1)/~r(4,2)~r(X1,Y1)~=(S(X1),4)~=(S(Y1),2)&
6.side, 1.lateral: =(S(1),2)&
LEVEL= 8; resolvent: ~r(3,1)/~r(4,2)~r(X1,1)~=(S(X1),4)&
8.side, 1.lateral: =(S(3),4)&
LEVEL= 9; resolvent: ~r(3,1)&
2.side, 1.lateral: r(3,1)&
LEVEL= 10; resolvent: &
DEMONSTRATION IS PRINTED

Example 2

begin
  p:=x; i:=0;
  while i<=n do
  begin
    i:=i+1;
    p:=p*i;
  end;
end.

The given program is written as:

s(s(d(p,x),d(i,0)),w(i<=n,s(d(i,i+1),d(p,p*i))))

The constant b is a mark for the predicate i<=n, the constant t1 for the term i+1, and the constant t2 for the term p*i; thus we obtain

s(s(d(p,x),d(i,0)),w(b,s(d(i,t1),d(p,t2))))

1
/O(X1,V1)~K(X1,s(h,g),Y1)~K(Y1,w(b,s(d(i,t1),d(p,t2))),V1)&
8
~K(Y1,d(p,x),V1)K(Y1,h,V1)& /abbreviation h, used to shorten the notation
~K(Y1,d(i,0),V1)K(Y1,g,V1)& /abbreviation g, used to shorten the notation
~K(X1,Y1,U1)~K(U1,Y2,V1)K(X1,s(Y1,Y2),V1)& /sequence rule
~K(k(X1,V2),U0,X1)K(X1,w(V2,U0),k(X1,ng(V2)))& /rule for while
K(t(X1,Z1,Y1),d(Z1,Y1),X1)& /assignment axiom
~IM(X2,Y1)~K(Y1,U0,V1)K(X2,U0,V1)& /consequence rule
~IM(Y1,V1)~K(X1,U0,Y1)K(X1,U0,V1)& /consequence rule
~O(X1,V1)& /negation addition
0
0

The LP system generates the following refutation:

number of resolvents generated = 10
maximal obtained level = 11
DEMONSTRATION IS PRINTED
level where the empty item is generated = 11
LEVEL=1; central item: /O(X1,V1)~K(X1,s(h,g),Y1)~K(Y1,w(b,s(d(i,t1),d(p,t2))),V1)&
4.lateral, 2.literal: ~K(k(X1,V2),U0,X1)K(X1,w(V2,U0),k(X1,ng(V2)))&
LEVEL= 2; resolvent: /O(X1,k(X0,ng(b)))~K(X1,s(h,g),X0)/~K(X0,w(b,s(d(i,t1),d(p,t2))),k(X0,ng(b)))~K(k(X0,b),s(d(i,t1),d(p,t2)),X0)&
3.lateral, 3.literal: ~K(X1,Y1,U1)~K(U1,Y2,V1)K(X1,s(Y1,Y2),V1)&
LEVEL= 3; resolvent: /O(X1,k(V1,ng(b)))~K(X1,s(h,g),V1)/~K(V1,w(b,s(d(i,t1),d(p,t2))),k(V1,ng(b)))/~K(k(V1,b),s(d(i,t1),d(p,t2)),V1)~K(k(V1,b),d(i,t1),U1)~K(U1,d(p,t2),V1)&
5.lateral, 1.literal: K(t(X1,Z1,Y1),d(Z1,Y1),X1)&
LEVEL= 4; resolvent: /O(X1,k(X0,ng(b)))~K(X1,s(h,g),X0)/~K(X0,w(b,s(d(i,t1),d(p,t2))),k(X0,ng(b)))/~K(k(X0,b),s(d(i,t1),d(p,t2)),X0)~K(k(X0,b),d(i,t1),t(X0,p,t2))&
6.lateral, 3.literal: ~IM(X2,Y1)~K(Y1,U0,V1)K(X2,U0,V1)&
LEVEL= 5; resolvent: /O(X1,k(X0,ng(b)))~K(X1,s(h,g),X0)/~K(X0,w(b,s(d(i,t1),d(p,t2))),k(X0,ng(b)))/~K(k(X0,b),s(d(i,t1),d(p,t2)),X0)/~K(k(X0,b),d(i,t1),t(X0,p,t2))~IM(k(X0,b),Y1)~K(Y1,d(i,t1),t(X0,p,t2))&
5.lateral, 1.literal: K(t(X1,Z1,Y1),d(Z1,Y1),X1)&
LEVEL= 6; resolvent: /~IM(k(X0,b),t(t(X0,p,t2),i,t1))/O(X1,k(X0,ng(b)))~K(X1,s(h,g),X0)&
3.lateral, 3.literal: ~K(X1,Y1,U1)~K(U1,Y2,V1)K(X1,s(Y1,Y2),V1)&
LEVEL= 7; resolvent: /~IM(k(V1,b),t(t(V1,p,t2),i,t1))/O(X2,k(V1,ng(b)))/~K(X2,s(h,g),V1)~K(X2,h,U1)~K(U1,g,V1)&
2.lateral, 2.literal: ~K(Y1,d(i,0),V1)K(Y1,g,V1)&
LEVEL= 8; resolvent: /~IM(k(V0,b),t(t(V0,p,t2),i,t1))/O(X2,k(V0,ng(b)))/~K(X2,s(h,g),V0)~K(X2,h,Y1)/~K(Y1,g,V0)~K(Y1,d(i,0),V0)&
5.lateral, 1.literal: K(t(X1,Z1,Y1),d(Z1,Y1),X1)&
LEVEL= 9; resolvent: /~IM(k(X1,b),t(t(X1,p,t2),i,t1))/O(X2,k(X1,ng(b)))/~K(X2,s(h,g),X1)~K(X2,h,t(X1,i,0))&
1.lateral, 2.literal: ~K(Y1,d(p,x),V1)K(Y1,h,V1)&
LEVEL= 10; resolvent: /~IM(k(X1,b),t(t(X1,p,t2),i,t1))/O(Y1,k(X1,ng(b)))/~K(Y1,s(h,g),X1)/~K(Y1,h,t(X1,i,0))~K(Y1,d(p,x),t(X1,i,0))&
5.lateral, 1.literal: K(t(X1,Z1,Y1),d(Z1,Y1),X1)&
LEVEL= 11; resolvent: /O(Y1,k(X1,ng(b)))/~K(Y1,s(h,g),X1)/~K(Y1,h,t(X1,i,0))~K(Y1,d(p,x),t(X1,i,0))&
5.lateral, 1.literal: K(t(X1,Z1,Y1),d(Z1,Y1),X1)&
LEVEL= 11; resolvent:
DEMONSTRATION IS PRINTED

Now we need to prove compliance, i.e. that the following holds, where Y[T/Z] denotes Y with T substituted for Z:

    (Xθ ⇒ Yθ[Tθ/Zθ]) ∧ (U ⇒ Pθ) ∧ (Qθ ⇒ V)

that is, at

LEVEL= 12; resolvent: /IM(k(X1,b),t(t(X1,i,t1),p,t2))O(t(t(X1,i,0),p,x),k(X1,ng(b)))&

By bringing the marks to the domain level we obtain:

    (X1 ∧ (i<=n) ⇒ X1[i+1/i][p·i/p]) ∧ (U ⇒ X1[0/i][x/p]) ∧ (X1 ∧ ¬(i<=n) ⇒ V)

Putting X1: p = x · ∏_{j=0}^{i} (j − 1), we obtain the following correct implications:

    p = x · ∏_{j=0}^{i} (j − 1) ⇒
    p · i = x · i · ∏_{j=0}^{i} (j − 1) ⇒
    p · i = x · (i − 1 + 1) · ∏_{j=0}^{i} (j − 1) ⇒
    p · i = x · ∏_{j=0}^{i+1} (j − 1)

For i = 0 we obtain:

    p = x · ∏_{j=0}^{i} (j − 1) ⇒
    p = x · ∏_{j=0}^{0} (j − 1) ⇒
    p = x

By this the compliance is proven, which is enough to conclude that the given program is (partially) correct (up to termination).

5.
INTERPRETATION RELATED TO DEMONSTRATION OF PROGRAM CORRECTNESS Interpret the sequence J: B1, ... , Bn as program S. Interpret the elements A(τ) as initial elements for the composition of program S, and the elements R( τ) as rules for the composition of program constructions. Vice versa, if we consider program S as sequence J, initial elementary program operators as elements A(τ) and rules for composition of program structures as elements R(τ), with this the problem of verification of the correctness of the given program is related to demonstration of correctness of deduction in corresponding formal theory. It is necessary to represent axioms, rules and program with predicate formulas. With all that is mentioned above we defined the general frame for the composition of concrete proceedings for demonstration of program correctness with the deductive method. With the variety of choices regarding axioms, rules and predicate registration for the different composition proceedings are possible. 6. CONCLUSION Software testing is the important step in program development. Software producers would like to predict number of errors in software systems before the application, so they could estimate quality of product bought and difficulties in maintenance process [18]. Testing often takes 40% of time needed for development of software package, which is the best proof that it is a very complex process. Aim of testing is to establish whether software is behaving in the way envisaged by specification. Therefore, primary goal of software testing is to find errors. Nevertheless, not all errors are ever found, but there is a secondary goal in testing, that is to enable a person who performs testing (tester) to trust the software system [19]. From these reasons, it is very important to choose such a testing method that will, in given functions of software system, find those fatal errors that bring to highest hazards. 
In order to achieve this, one of the tasks given to programmers is to develop software that is easy to test ("software is designed for people, not for machines") [20]. Program testing is often equated with looking for any errors [20]. There is no point in testing for errors that probably do not exist. It is much more efficient to think thoroughly about the kinds of errors that are most probable (or most harmful) and then to choose testing methods that are able to find such errors. The success of a set of test items is equal to the successful execution of a detailed test program. One of the big issues in program testing is error reproduction (testers find errors and programmers remove bugs) [21]. It is obvious that there must be some coordination between testers and programmers. Error reproduction is the case when it would be best to run a problematic test again and to know exactly when and where the error occurred. Therefore, there is no ideal test, just as there is no ideal product [22]. Software producers would like to anticipate the number of errors in software systems before their application in order to estimate the quality of the acquired program and the difficulties in its maintenance. This work summarizes and describes the process of program testing, the problems that testers have to resolve and some solutions for the efficient elimination of errors [23]. The testing of big and complex programs is in general a complicated process that has to be carried out as systematically as possible, in order to provide adequate confidence and to confirm the quality of the given application [24]. Deductions in formal theories represent a general frame for the development of deductive methods for the verification of program correctness. This frame gives two basic methods (invalidation of an added predicate formula and usage of the rules of program logic) and their modifications.
Working with a formula that is added to the given program implies the presence of added axioms; without them, the invalidation cannot be realized. The added axioms describe the characteristics of the domain predicates and operations and represent the necessary knowledge that has to be communicated to the deductive system. The existing results described above presuppose that kind of knowledge, but providing it appears to be a notable difficulty in practice.

ACKNOWLEDGEMENTS
The work presented in the paper was developed within the IT Project "WEB portals for data analysis and consulting", No. 13013, supported by the government of the Republic of Serbia, 2008-2010.

7. REFERENCES
[1] Marks, David M. "Testing very big systems", New York: McGraw-Hill, 1992.
[2] Manthos A., Vasilis C., Kostis D. "Systematically Testing a Real-Time Operating System", IEEE Trans. Software Eng., 1995.
[3] Voas J., Miller W. "Software Testability: The New Verification", IEEE Software, 1995.
[4] Perry William E. "Year 2000 Software Testing", New York: John Wiley & Sons, 1999.
[5] Whittaker J.A., Whittaker, Agrawal K. "A case study in software reliability measurement", Proceedings of Quality Week, paper no. 2A2, San Francisco, USA, 1995.
[6] Zeller A. "Yesterday, my program worked, Today, it does not. Why?", Passau, Germany, 2000.
[7] Markoski B., Hotomski P., Malbaski D., Bogicevic N. "Testing the integration and the system", International ZEMAK symposium, Ohrid, FR Macedonia, 2004.
[8] McCabe, Thomas J., Butler, Charles W. "Design Complexity Measurement and Testing", Communications of the ACM 32, 1992.
[9] Markoski B., Hotomski P., Malbaski D. "Testing the complex software", International ZEMAK symposium, Ohrid, FR Macedonia, 2004.
[10] Chidamber, S. and C. Kemerer, "Towards a Metrics Suite for Object Oriented Design", Proceedings of OOPSLA, July 2001.
[11] J.A. Whittaker, "What is Software Testing? And Why Is It So Hard?", IEEE Software, vol. 17, no. 1, 2000.
[12] Nilsson N., "Problem-Solving Methods in Artificial Intelligence", McGraw-Hill, 1980.
[13] Manna Z., "Mathematical Theory of Computation", McGraw-Hill, 1978.
[14] Floyd R.W., "Assigning meanings to programs", In: Proc. Sym. in Applied Math. Vol. 19, Mathematical Aspects of Computer Science, Amer. Math. Soc., pp. 19-32, 1980.
[15] Hoare C.A.R. "Proof of a program: FIND", Communications of the ACM 14, pp. 39-45, 1971.
[16] Hoare C.A.R., Wirth N., "An axiomatic definition of the programming language Pascal", Acta Informatica 2, pp. 335-355, 1983.
[17] Markoski B., Hotomski P., Malbaski D., Obradovic D. "Resolution methods in proving the program correctness", YUGER, An international journal dealing with theoretical and computational aspects of operations research, systems science and management science, Beograd, Serbia, 2007.
[18] Myers G.J., "The Art of Software Testing", New York, Wiley, 1979.
[19] Chan, F., T. Chen, I. Mak and Y. Yu, "Proportional sampling strategy: Guidelines for software test practitioners", Information and Software Technology, Vol. 38, No. 12, pp. 775-782, 1996.
[20] K. Beck, "Test Driven Development: By Example", Addison-Wesley, 2003.
[21] P. Runeson, C. Andersson, and M. Höst, "Test Processes in Software Product Evolution - A Qualitative Survey on the State of Practice", J. Software Maintenance and Evolution, vol. 15, no. 1, 2003, pp. 41–59.
[22] G. Rothermel et al., "On Test Suite Composition and Cost-Effective Regression Testing", ACM Trans. Software Eng. and Methodology, vol. 13, no. 3, 2004, pp. 277–33
[23] N. Tillmann and W. Schulte, "Parameterized Unit Tests", Proc. 10th European Software Eng. Conf., ACM Press, 2005, pp. 253–262.
[24] Nathaniel Charlton, "Program verification with interacting analysis plugins", Formal Aspects of Computing, London, Aug 2007, Vol. 19, Iss. 3, p. 375.
Detecting Metamorphic Viruses by Using Arbitrary Length of Control Flow Graphs and Nodes Alignment

Essam Al daoud
Zarka Private University, Jordan
essamdz@zpu.edu.jo
Ahid Al-Shbail
Al al-bayt University, Jordan
ahid_shbail@yahoo.com
Adnan M. Al-Smadi
Al al-bayt University, Jordan
smadi98@aabu.edu.jo

ABSTRACT
Detection tools such as virus scanners have performed poorly, particularly when facing previously unknown viruses or novel variants of existing ones. This study proposes an efficient and novel method based on arbitrary length of control flow graphs (ALCFG) and the similarity of the aligned ALCFG matrices. The metamorphic viruses are generated by two tools, namely the next generation virus creation kit (NGVCK0.30) and the virus creation lab for Windows 32 (VCL32). The results show that all the generated metamorphic viruses can be detected by using the suggested approach, while less than 62% are detected by well-known antivirus software.
Keywords: metamorphic virus, antivirus, control flow graph, similarity measurement.

1. INTRODUCTION
Virus writers use ever better evasion techniques to transform their viruses and avoid detection. For example, polymorphic and metamorphic viruses are specifically designed to bypass detection tools. There is strong evidence that commercial antivirus products are susceptible to common evasion techniques used by virus writers [1]. A metamorphic virus can reprogram itself. It uses code obfuscation techniques to challenge deeper static analysis and can also beat dynamic analyzers by altering its behavior. It does this by translating its own code into a temporary representation, editing that temporary representation, and then writing itself back to normal code again. This procedure is applied to the whole virus, so the metamorphic engine itself also undergoes changes.
Metamorphic viruses use several metamorphic transformations, including instruction reordering, data reordering, inlining and outlining, register renaming, code permutation, code expansion, code shrinking, subroutine interleaving, and garbage code insertion. The altered code is then recompiled to create a virus executable that looks fundamentally different from the original. For example, the source code of the metamorphic virus Win32/Simile is approximately 14,000 lines of assembly code. The metamorphic engine itself takes up approximately 90% of the virus code, which makes it extremely powerful [2]. W32/Ghost contains many procedures and generates a huge number of metamorphic viruses; it can generate at least 10! = 3,628,800 variations [3].
In this paper, we develop a methodology for detecting metamorphic viruses in executables. We have initially focused our attention on viruses and simple entry point infection. However, our method is general and can be applied to any malware and any obfuscated entry point.

2. RELATED WORKS
Lakhotia, Kapoor, and Kumar believe that antivirus technologies could counterattack using the same techniques that metamorphic virus writers use, by identifying similar weak spots in metamorphic viruses [4]. Geometric detection is based on the modifications that a virus has made to the file structure. Peter Szor calls this method shape heuristics because it is far from exact and prone to false positives [5]. In 2005 Ando, Quynh, and Takefuji introduced a resolution-based technique for detecting metamorphic viruses. In their method, scattered and obfuscated code is resolved and simplified to several parts of malicious code. Their experiment showed that, compared with emulation, this technique is effective for metamorphic viruses which apply anti-heuristic techniques, such as register substitution or permutation methods [6]. In 2006 Rodelio and others used a code transformation method for undoing the previous transformations done by the virus. Code transformation is used to convert mutated instructions into their simplest form, where combinations of instructions are transformed to an equivalent but simpler form [7]. Mohamed and others use an engine-specific scoring procedure that scans a piece of code to determine the likelihood [8]. Bruschi, Martignoni, and Monga proposed a detection method based on control flow graph matching. Mutations are eliminated through code normalization, and the problem of detecting viral code inside an executable is reduced to a simpler problem [9]. Wong and Stamp experimented with hidden Markov models to detect metamorphic malware. They concluded that, in order to avoid detection, metamorphic viruses also need a degree of similarity with normal programs, which is very challenging for the virus writer [10].

3. THE PROPOSED METHOD
This section introduces new procedures to extract a partial control flow graph of any binary file. Two main points are considered in the suggested algorithms: the first is to reorder the flow of the code by handling "jmp" and "call" instructions, and the second is to use one symbol for all alternative and equivalent instructions. The output of Algorithm 1 is stored in the matrix ALCFG and contains an arbitrary number of nodes. Moreover, the sequence of the nodes is represented by symbols that are used in the similarity measurement.

Algorithm 1: Construction of Arbitrary Length of Control Flow Graph (ALCFG)
Input: Disassembled portable executable file (x), the number of the file lines (n), the start location (j), the required number of nodes (m).
Output: ALCFG m×m matrix and node sequence array NodeSeq containing m nodes
Steps:
1- Call Prepare op matrix (the size of the op matrix is n×4)
2- Call Prepare the matrices Labels and JumpTo (the sizes are c×2 and e×3)
3- Call Construct the matrix ALCFG

Algorithm 2: Prepare op matrix (the size of the op matrix is n×4)
Input: Disassembled portable executable file (x), the number of the file lines (n), the start location (j), the required number of nodes (m).
Output: op matrix of size n×4 (this matrix contains the jump instructions and the labels)
1- Load the matrix op[n][4] from the file x, where opcode i is stored in row i. The column op[i][1] is used to store the labels (for simplicity we consider each label as an opcode), the column op[i][2] is used to store the instructions (mov, jmp, add, …), the column op[i][3] is used to store the first operand, and the column op[i][4] is used to mark the rows that have been processed; its default value is 0.
2- Delete the rows that do not contain a label or a jump instruction (jump instructions such as call, ret, jmp, ja, jz, je, …). In this step a special action must be taken if the "ret" instruction is directly preceded by a push instruction: in this case "ret" is replaced by "jmp" and its operand is replaced by the value that was pushed.
3- Rename all the conditional jump instructions to the symbols in Table 1.
4- Add to the end of the matrix a row with op[n+1][2]="end"
5- Delete the rows that contain an inaccessible label (this means that op[i][3] does not equal this label for any i)
6- Delete the rows that contain an unreachable operand (this means that op[i][1] does not equal this operand for any i)

Algorithm 3: Prepare the matrices Labels and JumpTo
Input: op matrix of size n×4
Output: The matrix Labels of size c×2 and the matrix JumpTo of size e×3
Do the following while count <= m
  if op[j][4]=1 then
    stack2.pop j
    if j = -1 then stack1.pop j
    if j = -1 then break
  else if op[j][2]="call" then
    stack1.push j+1; j=z+1 where op[z][1]=op[j][3]
  else if op[j][2]="ret" then
    stack1.pop j
  else if op[j][2]="jmp" then
    j=z+1 where op[z][1]=op[j][3]
  else if op[j][2]="A", "N", .. or "L" then
    stack2.push z, where op[z][1]=op[j][3]
    JumpTo[e][1]=op[j][3]; JumpTo[e][2]=m; JumpTo[e][3]=op[j][2]
    m=m+1; e=e+1; j=j+1
  else if op[j][1] <> "null" then // label
    Labels[c][1]=op[j][1]; Labels[c][2]=m
    c=c+1; m=m+1; j=j+1
  else if op[j][2]="end" and m<=count then
    stack2.pop j
    if j = -1 then break

Algorithm 4: Construct the matrix ALCFG
Input: The matrix Labels of size c×2 and the matrix JumpTo of size e×3
Output: ALCFG represented as an m×m matrix and the node sequence NodeSeq containing m nodes
1- Fill the upper minor diagonal of the matrix ALCFG with 1
2- Fill the array NodeSeq with "K" // labels
3- for each row i in the matrix JumpTo
     x=JumpTo[i][2]; NodeSeq[x]=JumpTo[i][3]
     for each row j in the matrix Labels
       if JumpTo[i][1]=Labels[j][1] then
         y=Labels[j][2]; ALCFG[x][y]=1

Table 1: The instructions and corresponding symbols
Instructions | Symbol
JE, JZ | A
JP, JPE | R
JNE, JNZ | N
JNP, JPO | D
JA, JNBE, JG, JNLE | E
JAE, JNB, JNC, JGE, JNL | Q
JB, JNAE, JC, JL, JNGE | G
JBE, JNA, JLE, JNG | H
JO, JS | I
JNO, JNS, JCXZ, JECXZ | L
LOOP | P
LABEL | K
GAP | M

start:
  …..
  pop esi
  sub esi, $-1-start
  push esi
  ….
  jne tsr_complete
  shl edi, 9
  ….
  je tsr_complete
tsr:
  int 3
  call c000_rw
  pusha
  mov ecx, virsize
  call c000_ro
tsr_complete:
  out 80h, al
  ….
Figure 1: Part of Z0mbie III

Figure 2: The op matrix

JumpTo matrix of the first 20 nodes (operand label, node index, symbol):
tsr_complete | 1 | N
tsr_complete | 2 | A
__cycle_1 | 3 | H
__mz | 4 | E
__exit | 9 | A
restore_program | 10 | A
__exit | 11 | N
__exit | 12 | Q
__exit | 14 | G
__mz | 15 | A
__exit | 17 | N
__exit | 19 | I
__cycle_2_next | 20 | G

All the above algorithms are fast to implement and can be optimized further. The worst case of Algorithm 2 is 5n, where n is the number of lines in the disassembled file; the worst case of Algorithm 3 is n; and the worst case of Algorithm 4 is $(m/2)^2$, where m ≤ n. Therefore, the total complexity of Algorithm 1 is $O(n)+O(m^2)$.
Definition 1: A skeleton signature of a binary file is the node sequence NodeSeq together with the matrix ALCFG.
To illustrate the previous procedures, consider as input the virus Z0mbie III, where Figure 1 is part of the source code of Z0mbie III, Figure 2 is the op matrix, Figure 3 is the Labels matrix and Figure 4 is the JumpTo matrix of the first 20 nodes of the virus Z0mbie III.
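The construction in Algorithm 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_alcfg` is hypothetical, node indices are taken to be 1-based as in the figures, rows whose node index exceeds m are ignored, and the `LABELS` and `JUMP_TO` rows are one reading of the extracted Labels and JumpTo figures for Z0mbie III rather than verified data.

```python
# Assumed reading of the Labels matrix: (label, node index)
LABELS = [("tsr", 5), ("cf8_io", 6), ("tsr_complete", 7),
          ("restore_program", 8), ("__cycle_1", 13), ("__mz", 16),
          ("__cycle_2", 18)]

# Assumed reading of the JumpTo matrix: (operand label, node index, symbol)
JUMP_TO = [("tsr_complete", 1, "N"), ("tsr_complete", 2, "A"),
           ("__cycle_1", 3, "H"), ("__mz", 4, "E"), ("__exit", 9, "A"),
           ("restore_program", 10, "A"), ("__exit", 11, "N"),
           ("__exit", 12, "Q"), ("__exit", 14, "G"), ("__mz", 15, "A"),
           ("__exit", 17, "N"), ("__exit", 19, "I"),
           ("__cycle_2_next", 20, "G")]


def build_alcfg(labels, jump_to, m):
    """Sketch of Algorithm 4: return (ALCFG as an m x m 0/1 matrix, NodeSeq)."""
    alcfg = [[0] * m for _ in range(m)]
    # Step 1: ones on the upper minor diagonal (fall-through edges).
    for i in range(m - 1):
        alcfg[i][i + 1] = 1
    # Step 2: every node starts as a label node, symbol "K".
    node_seq = ["K"] * m
    label_node = dict(labels)
    # Step 3: record each jump symbol and connect it to its target label.
    for target, x, symbol in jump_to:
        if x > m:
            continue  # node lies outside the requested m nodes
        node_seq[x - 1] = symbol
        y = label_node.get(target)
        if y is not None and y <= m:
            alcfg[x - 1][y - 1] = 1
    return alcfg, "".join(node_seq)


alcfg, node_seq = build_alcfg(LABELS, JUMP_TO, 10)
print(node_seq)
for row in alcfg:
    print(" ".join(str(v) for v in row))
```

With m = 10 the sketch yields the node sequence NAHEKKKKAA and a matrix with twelve entries equal to 1: nine on the upper minor diagonal plus the three jump edges (1,7), (2,7) and (10,8).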
Figure 3: The Labels matrix (label, node index)
tsr | 5
cf8_io | 6
tsr_complete | 7
restore_program | 8
__cycle_1 | 13
__mz | 16
__cycle_2 | 18

Figure 4: The JumpTo matrix

The following is the skeleton signature of Z0mbie III, which consists of the sequence of the first 10 nodes NodeSeq and the matrix ALCFG:
NodeSeq = NAHEKKKKAA
ALCFG (10×10) = [0/1 matrix with twelve entries equal to 1]

4. SIMILARITY MEASURE FUNCTION
To detect metamorphic viruses that preserve their control flow graph during propagation, we can simply compare the ALCFG matrices, but if the control flow graph is changed during propagation then a similarity measure function must be used. Unfortunately, current similarity measures such as the Euclidean distance, the Canberra distance or even measures based on neural networks cannot be used; the reason is the random insertion and deletion in the node sequence of the generated control flow graph. In this section we propose a new similarity measure function to detect metamorphic viruses. Consider the following definitions:
Definition 2: The diagonal sub-block of size m×m of the matrix ALCFG of size n×n is the matrix A, denoted by A ≺ ALCFG, whose first row and column start at i+1<n and whose last row and column end at i+m<=n, where i is any integer less than n.
Definition 3: Let ALCFGp denote the ALCFG matrix of size n×n of the program P and ALCFGV denote the ALCFG matrix of size m×m of the virus V.
Definition 4: The matrices ALCFGS and ALCFGV are similar if the following conditions are satisfied:
1- Alignment(NodeSeqS, NodeSeqV) = c ≥ T
2- DelMis&Comp(ALCFGS, ALCFGV) = 1
We denote the similarity measure function by ϕ such that:
ϕ(ALCFGs, ALCFGv) = c if the conditions are satisfied, and 0 otherwise.
Definition 5: The program P is infected by the virus V if and only if ϕ(ALCFGs, ALCFGv) = c, where ALCFGs ≺ ALCFGp.
For simplicity we focus on viruses that use simple entry point infection, therefore i=0. However, our approach can be applied to any obfuscated entry point.

Algorithm 5: Check whether the program P is infected by the virus V or not.
Input: The program P, the matrix ALCFGV and a threshold T, where V is a virus in the database
Output: yes if the program is infected, no if it is not
1- Disassemble the program P (in this study the software IDA Pro 4.8 is used, but this process can be implemented and embedded in one piece of software)
2- Call Algorithm 1 to find ALCFGp and NodeSeqp (in this study the first sub-block is processed, which is equivalent to the simple entry point; to check all the possible entry points, all m×m sub-blocks of the matrix ALCFGp have to be processed)
3- Call Algorithm 6 to find the percentage c and the sequence A
4- If c ≥ T then
     Call Algorithm 7 to delete the mismatched nodes and compare the matrices
     If Algorithm 7 returns 1 then Return "Yes" Else Return "No"
   Else Return "No"

Algorithm 6: The alignment of two sequences. Alignment( , )
Input: The sequences NodeSeqS and NodeSeqV
Output: The percentage c and the sequence A, where c represents the percentage of matched nodes relative to the total number of nodes and A contains the indices of the mismatched nodes
1- Apply the Needleman-Wunsch-Sellers algorithm to the sequences NodeSeqS and NodeSeqV
2- Store the indices of the mismatched nodes in the array A
3- Find c = number of matched nodes*100 / total number of nodes

Algorithm 7: Delete the mismatched nodes and compare. DelMis&Comp( , )
Input: ALCFGS, ALCFGV and the mismatched sequence A.
Output: 0 or 1
1- If the mismatch is with a gap, then delete row i and column i from the matrix ALCFGS for all i in the mismatched nodes, and delete the last rows and columns from ALCFGV, where the number of deleted rows and columns equals the number of gaps
2- If the mismatch is with a symbol, then delete row i and column i from the matrices ALCFGS and ALCFGV for all i in the mismatched nodes.
3- Rename the matrices to ALCFGs d and ALCFGv d.
4- If ALCFGs d = ALCFGv d then Return 1 Else Return 0

The most expensive step in the previous algorithms is the Needleman-Wunsch-Sellers algorithm, which can be implemented in $m^2$ operations, and the total complexity of all procedures is $O(n)+O(m^2)$. Therefore the suggested method is much faster than the previous methods; for example, finding the isomorphic subgraph in [9] is a well-known NP-complete problem.
To illustrate the suggested similarity measure function, assume that we would like to check whether the program P is infected by the virus Z0mbie III or not. Assume that the threshold T=70 and m=10 (note that, to reduce false positives, the threshold and the number of processed nodes must be increased). The first 10 nodes extracted from P and the ALCFG matrix (the skeleton signature of P) are:
NodeSeq = NAHAEKKKKA
ALCFGs (10×10) = [0/1 matrix with twelve entries equal to 1]
By using Algorithm 6 the nodes of P are aligned with the nodes of Z0mbie III as follows:
P:          N A H A E K K K K A -
Z0mbie III: N A H - E K K K K A A
c = number of matched nodes*100 / total number of nodes = 9*100/10 = 90 > T.
The mismatches occur with gaps; therefore column 4 and row 4 must be deleted from ALCFGS, and column 10 and row 10 must be deleted from ALCFGV. Since the matrices after deletion are identical, we conclude that the program P is infected by a modified version of Z0mbie III and ϕ(ALCFGs, ALCFGv) = 90%.

5. IMPLEMENTATION
The metamorphic viruses are taken from the VX Heavens search engine and generated by two tools, namely the Next Generation Virus Creation Kit (NGVCK0.30) and the Virus Creation Lab for Windows 32 (VCL32) [11]. Since the output of the kits was already in asm format, we used Turbo Assembler (TASM 5.0) for compiling and linking the files to generate executables, which were later disassembled using IDA Pro 4.9 Freeware Version. Algorithm 4 is implemented by using MATLAB 7.0. NGVCK0.30 has an advanced assembly source-morphing engine, and all variants of the viruses generated by NGVCK have the same functionality but different signatures. In this study, 100 metamorphic viruses are generated by using NGVCK; 40 viruses are used for analysis and 60 viruses are used for testing. Let us call the first group A1 and the second group T1. After applying the suggested procedures to A1 we note that all the viruses in A1 have just seven different skeleton signatures when T=100 and m=20, four different skeletons when T=80 and m=20, and three different skeletons when T=70 and m=20.
The T1 group is tested by using 7 antivirus programs; the results are obtained by using the on-line service [12]. 100% of the generated viruses are recognized by the proposed method and by McAfee, but none of the viruses are detected by the rest of the software. Another 100 viruses are generated by using VCL32, and all of them are obfuscated manually by inserting dead code, transposing the code, reassigning the registers and substituting the instructions. The generated viruses are divided into two groups, A2 and T2; A2 contains 40 viruses for analysis and T2 contains 60 viruses for testing. Again 100% of the generated viruses are detected by the proposed method, 84% are detected by Norman, 23% are detected by McAfee and 0% are detected by the rest of the software. Figure 5 shows the average detection percentage of the metamorphic viruses in T1 and T2.

Figure 5: The average percentage of the detected viruses from groups T1 and T2.

6. CONCLUSION
Antivirus software tries to detect viruses by using various static and dynamic methods. However, none of the existing methods is adequate. To develop new reliable antivirus software some problems must be fixed. This paper suggested new procedures to detect metamorphic viruses by using arbitrary length of control flow graphs and nodes alignment. The suspected files are disassembled, the opcodes are encoded, the control flow is analyzed, and the similarity of the matrices is measured by using a new similarity measure. The implementation of the suggested approach shows that all the generated metamorphic viruses can be detected, while less than 62% are detected by other well-known antivirus software.

REFERENCES
[1] M. Christodorescu, J. Kinder, S. Jha, S. Katzenbeisser, and H. Veith: Malware Normalization, Technical Report # 1539 at the Department of Computer Sciences, University of Wisconsin, Madison, (2005).
[2] F. Perriot: Striking Similarities: Win32/Simile and Metamorphic Virus Code, Symantec Corporation (2003).
[3] E. Konstantinou: Metamorphic Virus: Analysis and Detection, Technical Report RHUL-MA2008-02, Department of Mathematics, Royal Holloway, University of London, (2008).
[4] A. Lakhotia, A. Kapoor, and E. U. Kumar: Are metamorphic computer viruses really invisible?, part 1, Virus Bulletin, pp. 5-7, (2004).
[5] P. Szor: The Art of Computer Virus Research and Defense, Addison Wesley Professional, 1 edition, pp. 10-33 (2005).
[6] R. Ando, N. A. Quynh, and Y. Takefuji: Resolution based metamorphic computer virus detection using redundancy control strategy, In WSEAS Conference, Tenerife, Canary Islands, Spain, Dec. 16-18, (2005).
[7] R. G. Finones and R. T. Fernandez: Solving the metamorphic puzzle, Virus Bulletin, pp. 14-19, (2006).
[8] M. R. Chouchane and A. Lakhotia: Using engine signature to detect metamorphic malware, In WORM '06: Proceedings of the 4th ACM workshop on Recurring malcode, New York, NY, USA, pp. 73-78, (2006).
[9] D. Bruschi, L. Martignoni, and M. Monga: Detecting self-mutating malware using control flow graph matching, In DIMVA, pp. 129-143, (2006).
[10] W. Wong and M. Stamp: Hunting for metamorphic engines, Journal in Computer Virology, vol 2 (3), pp. 211-229, (2006).
[11] http://vx.netlux.org/ last access March (2009).
[12] http://www.virustotal.com/ last access March (2009).

RELIABILITY OPTIMIZATION USING ADAPTED ANT COLONY ALGORITHM UNDER CRITICALITY AND COST CONSTRAINTS

Belal Ayyoub
Al-Balqa'a Applied University - FET, Computer Engineering Dep, Jordan
belal_ayyoub@hotmail.com

ABSTRACT
Reliability designers often try to achieve a high reliability level of systems. This paper considers the problem of system reliability optimization for complex systems. System reliability maximization subject to component criticality and cost constraints is introduced as the reliability optimization problem (ROP). A procedure which determines the maximal reliability of non-series, non-parallel system topologies is proposed. In this procedure, the system components to be maximized are chosen according to their criticalities.
To evaluate the systems reliability, an adapting approach is used by the ant colony algorithm (ACA) to determine the optimal system reliability. The algorithm has been thoroughly tested on bench mark problems from literature. Our numerical experiences show that our approach is promising especially for complex systems. The proposed model proves to be robust with respect to its parameters. Key Words: System reliability, Complex system, Ant colony, Component‟s criticality. Asim El-Sheikh Arab Academy for Banking and Financial Sciences (AABFS) a.elsheikh@aabfs.org 1 INTRODUCTION System reliability can be defined as the probability that a system will perform its intended function for a specified period of time under stated conditions [1]. Many modern systems, both hardware and software, are characterized by a high degree of complexity. To enhance the reliability of such systems, it is vital to define techniques and models aimed at optimizing the design of the system itself. This paper presents a new metaheuristicbased algorithm aimed at tackling the general system reliability problem, where one wants to identify the system configuration that maximizes the overall system reliability, while taking into account a set of resource constraints. Estimating system reliability is an important and challenging problem for system engineers. [2]. It is also challenging since current estimation techniques require a high level of background in system reliability analysis, and thus familiarity with the system. Traditionally, engineers estimate reliability by understanding how the different components in a system interact to guarantee system success. Typically, based on this understanding, a graphical model (usually in the form of a fault tree, a reliability block diagram or a network graph) is used to represent how component interaction affects system functioning. 
Once the graphical model is obtained, different analysis methods [3-5] (minimal cut sets, minimal path sets, Boolean truth tables, etc.) can be used to quantitatively represent system reliability. Finally, the reliability characteristics of the components in the system are introduced into the mathematical representation in order to obtain a system-level reliability estimate. This traditional perspective aims to provide accurate predictions about the system reliability using historical or test data. This approach is valid whenever the system success or failure behavior is well understood. Yinong Chen, Zhongshi He, and Yufang Tian [6] classified system reliability into topological and flow reliability. They considered, in general, that the system consists of a set of computing nodes and a set of components between nodes. They assumed that components are reliable while nodes may fail with a certain probability; in this paper, however, we consider components that are subject to failure, in the topological reliability setting. Ideally, one would like to generate system design algorithms that take as input the characteristics of system components as well as system criteria, and produce as output an optimal system design. This is known as system synthesis [7], and it is very difficult to achieve. Instead, we consider a system that is already designed and try to improve this design by maximizing the component reliabilities, which in turn maximizes the overall system reliability. In most theoretical reliability problems, the two basic methods of improving the reliability of systems are improving the reliability of each component or adding redundant components [8]. Of course, the second method is more expensive than the first. Our paper considers the first method. The aim of this paper is to obtain the optimal system reliability design under the following constraints:

1. The basic linear cost-reliability relation used for each component [7].
2. The criticality of components [9]. The designer should take this into account before building a reliable system; according to component criticality, reliability increases are directed toward the most critical components.

A component's criticality can be derived from the effect of its failure on system failure. The position of a component plays an important role in its criticality, which we call the index of criticality.

2 SYSTEM RELIABILITY PROBLEM

2.1 Literature review

Many methods have been reported to improve system reliability. Tillman, Hwang, and Kuo [10] provide a survey of optimal system reliability. They divided optimal system reliability models into series, parallel, series-parallel, parallel-series, standby, and complex classes. They also categorized optimization methods into integer programming, dynamic programming, linear programming, geometric programming, generalized Lagrangian functions, and heuristic approaches. The authors concluded that many algorithms have been proposed but only a few have been demonstrated to be effective when applied to large-scale nonlinear programming problems. Also, none has proven to be generally superior. Fyffe, Hines, and Lee [11] provide a dynamic programming algorithm for solving the system reliability allocation problem. As the number of constraints in a given reliability problem increases, the computation required for solving the problem increases exponentially. In order to overcome these computational difficulties, the authors introduce the Lagrange multiplier to reduce the dimensionality of the problem. To illustrate their computational procedure, the authors use a hypothetical system reliability allocation problem, which consists of fourteen functional units connected in series.
While their formulation provides a selection of components, the search space is restricted to consider only solutions where the same component type is used in parallel. Nakagawa and Miyazaki [12] proposed a more efficient algorithm. In their algorithm, the authors use surrogate constraints obtained by combining multiple constraints into one constraint. In order to demonstrate the efficiency of their algorithm, they also solve 33 variations of the Fyffe problem. Of the 33 problems, their algorithm produces optimal solutions for 30 of them. Misra and Sharma [13] presented a simple and efficient technique for solving integer-programming problems such as the system reliability design problem. The algorithm is based on function evaluations and a search limited to the boundary of resources. In the nonlinear programming approach, Hwang, Tillman and Kuo [14] use the generalized Lagrangian function method and the generalized reduced gradient method to solve nonlinear optimization problems for reliability of a complex system. They first maximize complex-system reliability with a tangent cost-function and then minimize the cost with a minimum system reliability. The same authors also present a mixed integer programming approach to solve the reliability problem [15]. They maximize the system reliability as a function of component reliability level and the number of components at each stage. Using a genetic algorithm (GA) approach, Coit and Smith [16], [17], [18] provide a competitive and robust algorithm to solve the system reliability problem. The authors use a penalty guided algorithm which searches over feasible and infeasible regions to identify a final, feasible optimal, or near optimal, solution. The penalty function is adaptive and responds to the search history. The GA performs very well on two types of problems: redundancy allocation as originally proposed by Fyffe, et al., and randomly generated problems with more complex configurations. 
For a fixed design configuration and known incremental decreases in component failure rates and their associated costs, Painton and Campbell [19] also used a GA-based algorithm to find a maximum reliability solution that satisfies specific cost constraints. They formulate a flexible algorithm to optimize the 5th percentile of the mean time-between-failure distribution. In this paper, ant colony optimization is modified and adapted so that the measure of criticality guides the ants: the ranking of critical components is taken into account when choosing the components whose reliability is then improved until the optimal component reliability values of the system are reached.

2.2 Ant colony optimization approach

The ant colony optimization (ACO) algorithm [20, 21], which imitates the foraging behavior of real-life ants, is a cooperative population-based search algorithm. While traveling, ants deposit an amount of pheromone (a chemical substance). When other ants find pheromone trails, they decide to follow the trail with more pheromone, and while following a specific trail, their own pheromone reinforces the followed trail. Therefore, the continuous deposit of pheromone on a trail increases the probability that the following ants select that trail. Moreover, ants that use short paths to the food source return to the nest sooner and therefore mark their paths twice before other ants return. As more ants complete shorter paths, pheromone accumulates faster on shorter paths, and longer paths are less reinforced. Pheromone evaporation is a process of decreasing the intensities of pheromone trails over time. This process is used to avoid local convergence (the strong influence of old pheromone is avoided to prevent premature solution stagnation), to explore more of the search space, and to decrease the probability of using longer paths.
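The deposit and evaporation mechanism described above can be sketched in a few lines of Python. This is only a toy illustration, not the adapted algorithm of this paper: the two named paths, the heuristic `1 / length`, the deposit rule `q / length`, and all parameter values (`alpha`, `beta`, `rho`, `q`, `n_ants`) are assumptions chosen for the example.

```python
import random


def choose_path(paths, pheromone, alpha=1.0, beta=1.0, rng=random):
    """Roulette-wheel selection with weight tau**alpha * eta**beta.

    eta (the heuristic) is taken as 1 / path length, an illustrative choice.
    """
    names = list(paths)
    weights = [
        (pheromone[name] ** alpha) * ((1.0 / paths[name]) ** beta)
        for name in names
    ]
    pick = rng.random() * sum(weights)
    running = 0.0
    for name, weight in zip(names, weights):
        running += weight
        if pick <= running:
            return name
    return names[-1]  # guard against floating-point round-off


def evaporate(pheromone, rho):
    """Keep only the fraction rho (trail persistence) of every trail."""
    return {name: rho * tau for name, tau in pheromone.items()}


def deposit(pheromone, name, amount):
    """Return a copy of the trails with `amount` added to trail `name`."""
    updated = dict(pheromone)
    updated[name] = updated.get(name, 0.0) + amount
    return updated


def simulate(paths, iterations=50, n_ants=10, rho=0.8, q=1.0, seed=0):
    """Run a toy colony and return the final pheromone on each path."""
    rng = random.Random(seed)
    pheromone = {name: 1.0 for name in paths}
    for _ in range(iterations):
        choices = [choose_path(paths, pheromone, rng=rng) for _ in range(n_ants)]
        pheromone = evaporate(pheromone, rho)
        for name in choices:
            # Shorter paths receive a larger deposit per ant (q / length).
            pheromone = deposit(pheromone, name, q / paths[name])
    return pheromone


if __name__ == "__main__":
    # Hypothetical two-path example: path lengths are 2 and 4.
    print(simulate({"short": 2, "long": 4}))
```

In this sketch the shorter path is reinforced more per ant, so its trail tends to dominate after a number of iterations, while evaporation keeps old deposits from fixing the choice permanently.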
Because ACO has been used to solve many optimization problems [22], [23], we propose to adapt this algorithm to optimize system reliability, especially for complex systems.

3 METHODOLOGY

3.1 Problem definition

3.1.1 Notation

In this section, we define all parameters used in our model.

Rs : Reliability of the system.
Pi : Reliability of component i.
qi : Probability of failure of component i.
Qn : Probability of failure of the system.
n : Total number of components.
ICRi : Index of criticality measure.
ICRp : Index of criticality for a path to the destination.
ISTi : Index of structure measure.
Ct : Total cost of components.
Ci : Cost of component i.
Cc : Cost for improvement.
P(i)min : Minimum accepted reliability value.

ACO:
i : start node for the ant.
j : next node chosen.
τi : initial pheromone trail intensity.
τi(old) : pheromone trail intensity of a combination before update.
τi(new) : pheromone trail intensity of a combination after update.
ηij : problem-specific heuristic of a combination.
α : relative importance of the pheromone trail intensity.
β : relative importance of the problem-specific heuristic for the global solution.
ρ : trail persistence for the local solution.
 : index for component choices from set AC.
 : number of best solutions chosen for the offline pheromone update.

3.1.2 Assumption

In this section, we present the assumptions under which our model is formulated.

1. Many different methods are used to derive the expression for the total reliability of a complex system, each for a certain system topology. We state our system expressions according to the methods of papers [3-5].
2. We use a cost-reliability curve [7] to derive an equation that expresses the cost of each component as a function of its reliability; the total system cost is then additive over the components. See Fig. (1).

[Figure 1: Cost-reliability curve.]

As shown in Fig. 1
and by equating the slopes of the two triangles, we can derive equation (1) as follows:

\[ C_c = \frac{p_1 - p_{(1)\min}}{1 - p_{(1)\min}}\, C_t + \frac{p_2 - p_{(2)\min}}{1 - p_{(2)\min}}\, C_t + \dots + \frac{p_n - p_{(n)\min}}{1 - p_{(n)\min}}\, C_t \qquad (1) \]

3. In [9], the calculation of ICRi and ISTi is derived for each component from its structural measure, giving equations (2) and (3):
(2)
where
(3)
4. Every ICRi must be lower than an initial value ai. This value is the minimum accepted level of the criticality measure for every component.
5. After the complex system is represented mathematically, a set of paths is available from the specified source to the destination. These paths are ranked according to the criticalities of their components.

3.2 Formulation of the problem

The objective function, in general, has the form:

Maximize Rs = f(P1, P2, P3, ..., Pn)

subject to the following constraints:

1. \( ICR_i < a_i, \quad i = 1, 2, \dots, n \)
2. To ensure that the total cost of components is not more than the proposed cost value, equation (4) can be used:

\[ \sum_{i=1}^{n} C_i(P_i) \le \text{proposed cost value}, \qquad P_{i(\min)} > 0 \qquad (4) \]

Note that this set of constraints permits only positive component costs.

4 MODEL CONSTRUCTION

The algorithm uses an ACO technique with the criticality approach to ensure global convergence from any starting point. The algorithm is iterative. At each iteration, the set of ants is identified using some indicator matrices. The main steps of our proposed model, illustrated in Fig. 2, are:

1. The ant colony parameters are initialized.
2. The criticality of the components is calculated according to the derived reliability equation, and the components are then ranked according to their criticality values.
3. The ant equation, equation (5), is used:
(5)
The probability of choosing the next node is estimated after a random number is generated, and this is repeated until the destination node is reached. The nodes are selected according to the criticality of the components along the path.
4. Eq. (6): the pheromone is updated according to the criticality measure, which can be calculated as the product of the criticality values of the components:
(6)
The update equation becomes:
(7)
5. New reliabilities are generated.
6. The steps are repeated until the best solution is reached and all ants have moved, achieving the maximum reliability of the system with minimum cost.

[Figure 2: Flow diagram of the adapted ant system. Boxes: input the system reliability equation; randomly initialize Pi and minimum values and generate a random number; choose n ants; evaluate ICRi for the components and rank the components; generate new Pi; ant move (if the random number test is met); repeat until the destination, then select the path; update pheromone; all ants reached the destination? (no: repeat; yes: get the optimized values).]

5 EXPERIMENTAL RESULTS

In the following examples, we use benchmark system configurations such as the bridge and the delta.

5.1 Bridge problem

[Figure 3: Bridge system, with components 1 to 5 between source S and destination D.]

To find the polynomial of a complex system, we note that it is always given at a certain time for transmission from source (S) to destination (D); see Fig. 3. The objective function to be maximized has the form:

Rs = 1 - (q1 + q4.q5.p1 + q3.q4.p1.p5 + q2.q4.p1.p5.p3)

Subject to:

1. The ICRi constraint, with ICRi calculated for i = 1, 2, ..., 5.
2. \( \sum_{i=1}^{5} C_i(p_i) \le 45 \)
   We use the values in Fig. 3 as initial values of the component reliabilities to improve the system: P(1)min = 0.9, P(2)min = 0.9, P(3)min = 0.8, P(4)min = 0.7, P(5)min = 0.8.
3. We choose the cost-reliability curve to permit distribution of the cost depending on the ranking of the components according to their criticality.

The model was built in such a way as to reduce the failure of the most critical components. This is done by increasing the reliability of the most critical components, which tends to maximize the overall reliability, which is our goal. We summarize our results in Table 1 and Table 2, with the initial values of the ant colony algorithm as in Table 3.

Table 1: Reliabilities of the Bridge system.
Reliability   New value   ICRi rank
p1            0.9998      1
p2            0.9         3
p3            0.8         4
p4            0.9998      2
p5            0.8         5
Rs            0.9999

Table 2: Costs of the Bridge system.
Cost   Value in units
C1     9.9988
C2     8.8888
C3     7.7777
C4     9.9978
C5     7.7777
Ct     44.441

Table 3: ACO initial values.
?    ?    ?     ?    Q    Ants
2    3    0.2   1    10   10

5.1.1 Comments on results

As shown in Tables 1 and 2, the results indicate that improvement follows the criticality of the components: the more critical a component is, the greater its chance of being improved. This strongly improves the system reliability at minimal cost, which is better than increasing component reliabilities randomly. It is also clear that the best path from S to D is through component 1 and component 4. If more budget is available, the reliability of the other components is increased according to their criticality ranking. Finally, if all components have the same initial reliability values, the path through components 1 and 4 has the same chance as the path through components 2 and 3, and the algorithm, which depends on the topological reliability, improves the more critical component according to its position in the system.

5.2 Delta problem

[Figure 4: Delta system, with components 1 to 3 between source S and destination T.]

Using the same procedure as in the bridge problem, we obtain the following optimization problem for the delta system given in Fig. 4:

Max Rs = P1 + P2.P3 - P1.P2.P3

Subject to:

1. ICRi calculated for i = 1, 2, 3.
2. \( \sum_{i=1}^{3} C_i(P_i) \le 4.5 \), with P(i)min = 0.7, i = 1, 2, 3.

Tables 4 and 5 summarize the results.

Table 4: Reliabilities of the Delta system.
Reliability   Computed value   ICRi rank
P1            0.9999           1
P2            0.7              2
P3            0.7              3
Rs            0.9999

Table 5: Costs of the Delta system.
Cost   Value
C1     0.9998
C2     0.4
C3     0.4
Ct     1.799

In addition to the comments noted for the bridge system, the delta system has two paths from S to T, as shown in Fig. 4.
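The two objective functions of this section can be checked numerically with a short Python sketch. The function names are hypothetical. The bridge expression is taken as written in Section 5.1. The delta expression assumes the two-path reading stated in the text (component 1 alone, or components 2 and 3 in series) combined by inclusion-exclusion, and `linear_cost` assumes the linear cost-reliability relation of Eq. (1) with an illustrative upper cost `c_max`.

```python
def bridge_reliability(p):
    """Bridge objective of Section 5.1; p maps component number to reliability."""
    q = {i: 1.0 - p_i for i, p_i in p.items()}
    return 1.0 - (
        q[1]
        + q[4] * q[5] * p[1]
        + q[3] * q[4] * p[1] * p[5]
        + q[2] * q[4] * p[1] * p[5] * p[3]
    )


def two_path_reliability(p):
    """Delta system read as two parallel paths: {1} and {2, 3} in series."""
    path_a = p[1]
    path_b = p[2] * p[3]
    # Inclusion-exclusion for two independent paths.
    return path_a + path_b - path_a * path_b


def linear_cost(p_i, p_min, c_max):
    """Linear cost-reliability relation: 0 at p_min, c_max at reliability 1."""
    return c_max * (p_i - p_min) / (1.0 - p_min)


if __name__ == "__main__":
    # Initial (minimum) reliabilities of the bridge components from Section 5.1.
    initial_bridge = {1: 0.9, 2: 0.9, 3: 0.8, 4: 0.7, 5: 0.8}
    print(bridge_reliability(initial_bridge))

    # Reliabilities reported for the delta system in Table 4.
    delta = {1: 0.9999, 2: 0.7, 3: 0.7}
    print(two_path_reliability(delta))
```

Under the two-path assumption, the delta reliabilities 0.9999, 0.7 and 0.7 give a system reliability of about 0.99995, which agrees with the Rs value of 0.9999 reported for the delta system.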
The results show that it is preferable to increase the reliability of component 1 rather than the others, for two reasons: it has the highest criticality value, and the pheromone value is biased toward the path with the lower number of components (Path 1 = P1) according to the equation:

5.4 Mesh Problem

[Figure 5: Mesh system, with components 1 to 7 between source S and destination T.]

This system has more components and is larger. The objective function for the mesh system is:

Max Rs = (p6*p7)
 + (p1*p2*p3*(1-p6))
 + (p1*p2*p3*p6*(1-p7))
 + (p1*p4*p7*(1-p2)*(1-p6))
 + (p1*p4*p7*p2*(1-p6)*(1-p3))
 + (p3*p5*p6*(1-p7)*(1-p1))
 + (p3*p5*p6*p1*(1-p7)*(1-p2))
 - (p1*p2*p5*p7*(1-p3)*(1-p4)*(1-p6))
 - (p2*p3*p4*p6*(1-p1)*(1-p5)*(1-p7))
 + (p1*p3*p4*p5*(1-p2)*(1-p6)*(1-p7))

Subject to:

1. ICRi calculated for i = 1, 2, ..., n.
2. \( \sum_{i=1}^{7} C_i(P_i) \le 6.6 \), with P(i)min = 0.5.

Table 6: Reliabilities of the Mesh system.
Reliability   New value   ICRi rank
P1            0.5         5
P2            0.5         4
P3            0.5         3
P4            0.5         7
P5            0.5         6
P6            0.9999      1
P7            0.9999      2
Rs            0.9997

Table 7: Costs of the Mesh system.
Cost   Value in units
C1     0.4444
C2     0.4444
C3     0.4444
C4     0.4444
C5     0.4444
C6     0.9998
C7     0.9997
Ct     4.22

As the results in Tables 6 and 7 show, components 6 and 7 have the highest reliability values, in line with their criticality, and the chosen path goes through components 6 and 7. To achieve minimal cost, the system uses only 4.22 units, which meets our objectives.

5.5 Important Comments

To study the effect of modifying ant parameters, such as the initial pheromone, in the delta case with a bias toward component 2, the results become as shown in Table 8. The reliabilities of the components were P1 = 0.2, P2 = 0.3 and P3 = 0.3, and the three ant colony parameter values were 10, 2 and 10.

Table 8: Effects of Ant colony parameters.
Reliability   Computed value   ICRi rank
P1            0.3              1
P2            0.9999           2
P3            0.9999           3
Rs            0.9999

Cost   Value
C1     0.7777
C2     0.9997
C3     0.999
Ct     14.777

It is clear that the solution is biased toward the path through components 2 and 3 rather than component 1, because of their initial pheromone values.

6 CONCLUSION

We propose a new, effective algorithm for the general reliability optimization problem using an ant colony. The ant colony algorithm is a promising heuristic method for solving complex combinatorial problems. To solve a complex system design problem:

1. We must formulate a system that correctly represents the real system, with all paths from source to destination, by choosing an efficient reliability estimation method.
2. To best maximize the total reliability and minimize the total cost of a system, the components are considered according to their criticality, and the most critical components are then improved gradually.
3. The index of criticality achieves maximum system reliability with minimum cost according to the reliability of the system topology.
4. Solving the model without the index of criticality can also give maximum reliability and minimum cost, but this method ignores the topology of the system.
5. The ant colony algorithm is improved by the previous experience given by the index of criticality, which guides the ants in depositing pheromone on a trail and thereby increases the probability that the following ants select that trail. Moreover, the ants use more reliable paths.

Our numerical experience shows that our approach is promising, especially for complex systems.

7 REFERENCES

[1] A. Lisnianski, H. Ben-Haim, and D. Elmakis: "Multistate System Reliability optimization: an Application", Levitin, Gregory book, USA, pp. 1-20. ISBN 9812383069. (2004).
[2] S. Krishnamurthy, A. P. Mathur: On the estimation of reliability of a software system using reliabilities of its components. In: Proceedings of the ninth international symposium on software reliability engineering (ISSRE '97), Albuquerque, p. 146. (1997).
[3] T. Coyle, R. G. Arno, P. S. Hale: Application of the minimal cut set reliability analysis methodology to the gold book standard network. In: the industrial and commercial power systems technical conference, pp. 82-93. (2002).
[4] K. Fant, S. Brandt: Null convention logic, a complete and consistent logic for asynchronous digital circuit synthesis. In: the international conference on application specific systems, architectures, and processors (ASAP '96), pp. 261-273. (1996).
[5] C. Gopal H, Nader A.: A new approach to system reliability. IEEE Trans Reliab; 50(1):75-84. (2001).
[6] Y. Chen, Z. Hongshi: "Bounds on the Reliability of Systems With Unreliable Nodes & Components". IEEE Trans. on Reliability, vol. 53, no. 2, June (2004).
[7] B. A. Ayyoub: "An application of reliability engineering in computer networks communication", AAST and MT Thesis, p. 17, Sep. (1999).
[8] S. Magdy, R. Schinzinger: "On Measures of computer systems Reliability and Critical Components", IEEE Trans. on Reliability, (1988).
[9] B. A. Ayyoub, M. Baith Mohamed, ElAlem: "An Application of Reliability Engineering in Complex Computer System and Its Solution Using Trust Region Method", WSES, Software and Hardware Engineering for 21st Century book, pp. 261, (1999).
[10] A. Tillman, C. Hwang, K. Way: "Optimization Techniques for System Reliability with Redundancy, A Review", IEEE Transactions on Reliability, vol. R-26, no. 3, pp. 148-155. August (1977).
[11] D. E. Fyffe, W. W. Hines, N. K. Lee: "System Reliability Allocation And a Computational Algorithm", IEEE Transactions on Reliability, vol. R-17, no. 2, pp. 64-69. June (1968).
[12] Y. Nakagawa, S. Miyazaki: "Surrogate Constraints Algorithm for Reliability Optimization Problems with Two Constraints", IEEE Transactions on Reliability, vol. R-30, no. 2, pp. 175-180. June (1981).
[13] K. Behari Misra, U. Sharma: "An Efficient Algorithm to Solve Integer-Programming Problems Arising in System-Reliability Design", IEEE Transactions on Reliability, vol. 40, no. 1, pp. 81-91. April (1991).
[14] C. Lai Hwang, A. Frank Tillman, W. Kuo: "Reliability Optimization by Generalized Lagrangian-Function and Reduced-Gradient Methods", IEEE Transactions on Reliability, vol. R-28, no. 4, pp. 316-319. October (1979).
[15] A. Frank Tillman, C. Hwang, W. Kuo: "Determining Component Reliability and Redundancy for Optimum System Reliability", IEEE Transactions on Reliability, vol. R-26, no. 3, pp. 162-165. August (1977).
[16] D. Coit, Alice E. Smith: "Reliability Optimization of Series-Parallel Systems Using a Genetic Algorithm", IEEE Transactions on Reliability, vol. 45, no. 2, pp. 254-260. June (1996).
[17] W. David Coit, Alice E. Smith: "Penalty Guided Genetic Search for Reliability Design Optimization", Computers and Industrial Engineering, vol. 30, no. 4, pp. 95-904. (1996).
[18] W. David Coit, E. Alice Smith, M. David Tate: "Adaptive Penalty Methods for Genetic Optimization of Constrained Combinatorial Problems", INFORMS Journal on Computing, vol. 8, no. 2, Spring, pp. 173-182. (1996).
[19] L. Painton, C. James: "Genetic Algorithms in Optimization of System Reliability", IEEE Transactions on Reliability, vol. 44, no. 2, pp. 172-178. June (1995).
[20] N. Demirel, M. Toksar: Optimization of the quadratic assignment problem using an ant colony algorithm, Applied Mathematics and Computation, Vol. 183.
[21] Y. Feng, L. Yu, G. Zhang: Ant colony pattern search algorithms for unconstrained and bound constrained optimization, Applied Mathematics and Computation, Vol. 191, pp. 42-56 (2007).
[22] M. Dorigo, L. M. Gambardella: "Ant Colony System: A Cooperative Learning Approach to the Travelling Salesman Problem", IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 53-66. April (1997).
[23] B. Bullnheimer, F. Richard, H. Christine Strauss: "Applying the Ant System to the Vehicle Routing Problem", 2nd Metaheuristics International Conference (MIC97), Sophia-Antipolis, France, pp. 21-24. July (1997).

A COMPREHENSIVE QUALITY EVALUATION SYSTEM FOR PACS

Dinu Dragan, Dragan Ivetic
Department for Computing and Automatics, Republic of Serbia
dinud@uns.as.rs, ivetic@uns.ac.rs

ABSTRACT

The imposing number of lossy compression techniques used in medicine represents a challenge for the developers of a Picture Archiving and Communication System (PACS). How does one choose an appropriate lossy medical image compression technique for a PACS? The question is no longer whether to compress medical images in a lossless or lossy way, but rather which type of lossy compression to use. The number of quality evaluations and criteria used for the evaluation of a lossy compression technique is enormous. The mainstream quality evaluations and criteria can be broadly divided into two categories: objective and subjective. They evaluate the presentation (display) quality of a lossy compressed medical image. There are also a few quality evaluations that measure the technical characteristics of a lossy compression technique. In our opinion, technical evaluations represent an independent and invaluable category of quality evaluations. The conclusion is that quality evaluations from each category measure only one quality aspect of a medical image compression technique.
Therefore, it is necessary to apply one or more representatives of each group to acquire a complete evaluation of a lossy medical image compression technique for a PACS. Furthermore, a correlation function between the quality evaluation categories would simplify the overall evaluation of compression techniques. This would enable the use of medical images of the highest quality while engaging optimal processing, storage, and presentation resources. The paper represents preliminary work, an introduction to future research and work aimed at developing a comprehensive quality evaluation system.

Keywords: medical image quality metrics, medical image compression, PACS

1 INTRODUCTION

A Picture Archiving and Communication System (PACS) represents an integral part of modern hospitals. It enables communication, storage, processing, and presentation of digital medical images and corresponding data [1]. Digital medical images tend to occupy an enormous amount of storage space [2, 3]. The complete annual volume of medical images in a modern hospital easily reaches hundreds of petabytes and is still on the rise [4]. The increased demand for digital medical images introduced still image compression for medical imaging [5], which relaxes the storage and network requirements of a PACS and reduces the overall cost of the system [3]. In general, all compressed medical images can be placed in two groups: lossless and lossy. The first group is more appealing to physicians, because decompression restores the image completely, without data loss. It achieves modest results, with a maximum compression ratio of 3:1 [6, 7, 8]. Several studies [9, 10] showed that this is not suitable for PACS, and that a compression ratio of at least 10:1 has to be achieved.

The second group of compression techniques achieves greater compression ratios, but with data distortion in the restored image [6, 7, 8]. Lossy compression provoked serious doubts and opposition from medical staff. The opposition arose from the fact that the loss of data can influence medical image interpretation and can lead to serious errors in the treatment of a patient. Therefore, the main research area for lossy compression of medical images is finding the greatest compression ratio that still maintains diagnostically important information. The degree of lossy compression of medical images that produces no visual distortion under normal medical viewing conditions is called "visually lossless" compression [10]. Several studies [8, 11, 12] and standards [13] established the clinical acceptability of lossy compression of medical images, as long as the modality of the image, the nature of the imaged pathology, and the image anatomy are taken into account during lossy compression. The medical organization involved has to approve and adopt the lossy compression of medical images applied in its PACS. Therefore, it is necessary to provide a quality evaluation of different compression techniques from the PACS point of view.

During our work on a PACS for a lung hospital, we tried to adopt an image compression for medical images that achieves the highest compression ratio with minimal distortion within the decompressed image. We also needed image compression suitable for telemedicine purposes. We consulted the technical studies in search of a quality evaluation of image compression techniques. The sheer number of studies is overwhelming [14, 15]. There is no unique quality evaluation that is suitable for various compression techniques and different applications of image compression [16, 17]. In most cases, the studies are focused only on the presentation (display) quality of the lossy compressed medical image. Technical features of the compression technique are usually ignored. This paper represents preliminary research.
Its purpose is to identify all the elements needed to evaluate the quality of a compression technique for PACS. We identified three categories of quality evaluations and criteria: presentation-objective, presentation-subjective, and technical-objective. An overview of the technical studies led us to the conclusion that quality evaluations from each category measure only one quality aspect of an image compression technique. To perform a complete evaluation of a medical image compression technique for PACS, it is necessary to apply a representative of each category. A correlation function between the representatives of each category would simplify the overall evaluation of compression techniques. The 3D evaluation space introduced by the paper is a 3D space defined by this correlation function and the quality evaluations used. Our goal is to develop an evaluation tool based on the 3D evaluation space, which is expected in 2011. All the elements of the quality evaluation system are identified in the paper. The organization of the paper is as follows: section 2 gives a short overview of the lossy compression techniques used in the medical domain; section 3 describes the quality evaluations used to measure the quality of compression techniques; the 3D evaluation space is discussed in section 4; section 5 concludes the paper.

2 LOSSY COMPRESSION OF MEDICAL IMAGES

Over the past decades, an imposing number of lossy compression techniques have been tested and used in the medical domain. Industry-approved standards have been used as often as proprietary compressions. Based on the part of the image affected, they can be categorized in two groups:

1. medical image regions of interest (ROI) are compressed losslessly while the rest of the image (the background) is compressed lossily,
2. the entire medical image is compressed lossily, targeting the "visually lossless" threshold.

The first group offers selective lossy compression of medical images.
Parts of the image containing diagnostically crucial information (ROI) are compressed in a lossless way, whereas the rest of the image, containing unimportant data, is compressed lossily. This approach enables a considerably higher compression ratio than ordinary lossy compression [18, 19], because the larger regions of the medical image contain unimportant data which can be compressed at higher rates [19]. The downside of this approach is computational complexity (an element of technical-objective evaluation). Each ROI has to be marked before compression. Even for images of the same modality, ROIs are rarely in the same place. ROIs are identified either manually by a qualified medical specialist or automatically by a region-detection algorithm [20]. The goal is to find a perfect combination of automated ROI detection algorithms and a selective compression technique. Over the years various solutions for ROI compression of medical images have emerged, which differ in the image modalities used, ROI definitions, coding schemes, and compression goals [20]. Some of them are: a ROI-based compression technique with two multi-resolution coding schemes reported by Strom [19], a block-based JPEG ROI compression and an importance scheme coding based on wavelets reported by Bruckmann [18], a motion-compensated ROI coding for colon CT images reported by Bokturk [21], a region-based discrete wavelet transform reported by Penedo [22], and a JPEG2000 ROI coding reported by Anastassopoulos [23]. The second group of lossy compression techniques applies lossy compression over the entire medical image. Considerable efforts have been made in finding and applying the visually lossless threshold. Over the years various solutions have emerged, which differ in the goals imposed on a compression technique (for a particular medical modality or for a group of modalities) and in the compression techniques used (industry standards or proprietary compression techniques).
Some of the solutions presented over the years are: a compression using predictive pruned tree-structured vector quantization reported by Cosman [17], a wavelet coder based on Set Partitioning in Hierarchical Trees (SPIHT) reported by Lu [24], a wavelet coder exploiting the Human Visual System reported by Kai [25], a JPEG coder and wavelet-based trellis-coded quantization (WTCQ) reported by Slone [10], and a JPEG2000 coder reported by Bilgin [26]. Although substantial effort has been made to develop selective lossy compression of medical images, the industry standards that apply lossy compression to the entire medical image are the ones commonly used in PACS.

3 QUALITY EVALUATIONS

Significant effort has been made to solve the problem of measuring digital image quality, with a limited amount of success [13, 14]. Various studies tried to develop new metrics or to adapt existing ones for medical imaging [5, 6, 7, 8, 17]. The quality evaluations used can be broadly categorized as [5, 17]:
• objective quality evaluations – based on a mathematical or a statistical model, which is easy to compute and rate,
• subjective quality evaluations – based on a subjective observer evaluation of the restored image, or on questionnaires with numerical ratings.
These categories can be further sub-categorized, but this falls outside the scope of the paper [5, 17]. The quality evaluations proposed measure the presentation (display) quality of the lossy compressed medical image. Therefore, they can be categorized as presentation-objective and presentation-subjective quality evaluations. Although these quality evaluations have been devised for image quality measurement, they can also be used for the evaluation of lossy compression techniques. The quality of the reconstructed image should not be the only criterion for the adoption of a compression technique for PACS.
The quality evaluation of medical image compression for PACS is inseparable from the technical aspects of the system. Lossy compression can uphold remarkable presentation quality (objective and subjective) of medical images, but with high technical demands. In some cases these technical demands are not achievable, and in most cases they are too expensive. In many countries this will make the price of a PACS too high. Evaluations measuring image compression quality from the technical point of view can be categorized as technical-objective quality evaluations.

3.1 Presentation-objective evaluations

Presentation-objective evaluations represent the most desirable way to measure image quality. They are based on a mathematical model and are usually easy to compute. Their main advantage is objectivity [27]. Numerical distortion evaluations such as mean squared error (MSE), Eq. (1), signal-to-noise ratio (SNR), Eq. (2), or peak signal-to-noise ratio (PSNR), Eq. (3), are commonly used [6].

$$MSE = \frac{1}{m \cdot n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[ f(i,j) - f'(i,j) \right]^2 \qquad (1)$$

$$SNR = 10 \log_{10} \frac{\sigma_x^2}{MSE}; \quad \sigma_x^2 = \frac{1}{m \cdot n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( f(i,j) - \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} f(i,j)}{m \cdot n} \right)^2 \qquad (2)$$

$$PSNR = 10 \log_{10} \frac{(2^b - 1)^2}{MSE} \qquad (3)$$

Here f and f′ are the original and the reconstructed image of m × n pixels, and b is the number of bits per pixel. These measures fail to capture local degradations and do not provide precise descriptions of image degradations [5, 27]. Still, many studies use these quality evaluations to rate their implementations of lossy medical image compression techniques. The quality of the lossy compressions studied in [9, 24, 25, 26, 28] was measured by these numerical distortion evaluations. For example, Chen [9] used PSNR to evaluate a proprietary DCT-based SPIHT compression, the original SPIHT, and JPEG2000. The DCT-based compression achieved the highest PSNR values for the tested medical images, which indicated that it is more suitable for medical imaging than the other two compression techniques. Besides scalar numerical evaluations, graphical evaluations such as Hosaka plots and Eskicioglu charts, and evaluations based on a human visual system (HVS) model, have been used [14, 15, 29]. Their applicability in the medical domain has been reported in [6, 27]. Hybrid presentation-objective metrics have also been studied for the medical domain. Przelaskowski [27] proposed a vector quality measure reflecting diagnostic accuracy, Eq. (4).

$$HVM = \sum_{i=1}^{6} \alpha_i V_i \qquad (4)$$

Each value V_i represents one presentation-objective measure. The vector measure was designed to include the formation of a diagnostic quality pattern based on the subjective ratings of local image features. This quality measure represents a way of combining presentation-objective and presentation-subjective evaluations. An evaluation of lossy JPEG2000 compressed medical images found that a compression ratio of 20:1 is diagnostically acceptable.

3.2 Presentation-subjective evaluations

Presentation-subjective evaluations have been used to evaluate lossy compressed medical images more often than presentation-objective ones [30]. Presentation-subjective evaluations are based on the observer’s subjective perception of reconstructed image quality [5]. The subjective quality of a reconstructed medical image can be rated in many ways [5]. In some studies, the observer is presented with several reconstructed versions of the same image. The observer has to guess the image compression level and to order the sample images from the least compressed to the most compressed [5, 31]. If the difference between the original image and the reconstructed image at some level of compression is not distinguishable, then that level of compression is diagnostically acceptable [32]. Other studies used qualified observers to interpret reconstructed medical images compressed at various levels. The compression levels at which the results were the same as for the original image have been rated as acceptable [5].
Also, some studies used qualified technicians to define a “just noticeable” difference, used to select the point at which a compression level is no longer diagnostically usable. The observers were presented with a series of images, each compressed at a higher level. They simply had to define the point at which changes became obvious. The studies were based on the presumption that one can perceive “changes” in the image long before the image is degraded enough to lose its diagnostic value [5]. When subjectively evaluating medical images, it is not sufficient to say that an image looks good. It should be proved that the image did not lose essential information and that it has at least the same diagnostic value as the original medical image [6]. Therefore, besides purely subjective evaluations, semi-subjective evaluations of a reconstructed medical image, which measure diagnostic accuracy, have been used. Observers often rated the presented images on a scale of 1 to 5 [10, 17]. The collected data were then statistically analyzed, highlighting averages and other trends. The quality of reconstructed medical images is most often measured by a semi-subjective evaluation based on Receiver Operating Characteristic (ROC) analysis, which has its origins in the theory of signal detection [6, 7, 27, 33]. A filtered version of the signal plus Gaussian noise is sampled and compared to a threshold. If it exceeds the threshold, the signal is declared to be present. As the threshold varies, so does the probability of erroneously declaring the signal present or absent. ROC analyses are based on ROC curves (see Fig. 1), which are a simple, complete empirical description of this decision threshold effect, indicating all possible combinations of the relative frequencies of the various kinds of correct and incorrect decisions [6]. The plot is a summary of the trade-off between the true positive rate (sensitivity) and the false positive rate (the complement of specificity).
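The threshold sweep described above can be sketched in a few lines. The ratings and ground truth below are hypothetical and serve only to show the mechanics; the function names are likewise illustrative. Each distinct rating is used once as the decision threshold, giving one (false positive rate, true positive rate) point, and the trapezoidal area under the resulting points condenses the curve into one number.

```python
def roc_points(ratings, truth):
    """Return (FPR, TPR) pairs obtained by sweeping a threshold over the ratings.

    ratings: observer confidence that an abnormality is present (e.g. 1..5).
    truth:   True where the abnormality is really present.
    """
    positives = sum(truth)
    negatives = len(truth) - positives
    points = [(0.0, 0.0)]
    # Lowering the threshold step by step declares more cases "abnormal".
    for threshold in sorted(set(ratings), reverse=True):
        tp = sum(1 for r, t in zip(ratings, truth) if r >= threshold and t)
        fp = sum(1 for r, t in zip(ratings, truth) if r >= threshold and not t)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical ratings for ten images (illustrative data, not from any study).
ratings = [5, 4, 4, 3, 2, 5, 3, 2, 1, 1]
truth = [True, True, True, True, True, False, False, False, False, False]

points = roc_points(ratings, truth)
auc = area_under_curve(points)
print(points)
print(round(auc, 3))
```

An area of 0.5 corresponds to guessing and 1.0 to perfect detection, so comparing the areas obtained for original and reconstructed images is one way to read such a study.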
The area under the curve can be used to summarize the overall quality or the efficiency of the detection process [6, 7]. ROC curves are not applied directly to medical imaging. The decision threshold is based on diagnostic accuracy and the physician’s judgment. Reconstructed medical images, which either did or did not contain an abnormality, were presented to qualified specialists. Observers had to provide a binary decision on whether an abnormality is present, along with a quantitative value for their degree of certainty (a number from 1 to 5). The subjective confidence rating of diagnoses is then used as if it were a threshold to adjust for detection accuracy [6]. The resulting diagnostic accuracy is compared with that of the original image and used to define an acceptable compression level. The results for different compressions or compression levels could be used for quality evaluation of compression techniques.

Figure 1. Example of ROC curve

The success of ROC analysis depends on the number of test images and observers included in the study. Therefore, ROC analyses tend to be expensive and time consuming. For example, a typical ROC study would require over 300 images to obtain a reasonable statistical confidence level, five or more radiologists to view these images, and a full-time statistician to coordinate and analyze the data [6]. Various results were obtained for image quality by presentation-subjective evaluations. Smith [31] reported that lossy JPEG compression of chest radiographs can be set at levels as high as 30:1. Perlmutter [7] reported that lossy wavelet compression of digital mammograms can achieve a compression ratio of 80:1 with no influence on diagnostic accuracy. Przelaskowski [33] reported even better results for JPEG2000 compression of digital mammograms, with a compression ratio of 140:1. The study [33] was based on ROC analysis.
3.3 Technical-objective evaluations

PACS, as a part of a modern hospital, is becoming a highly interactive environment that forms a ubiquitous computing environment for medical work [34, 35]. It is not limited to only one medical facility or to a group of closely spaced facilities. A PACS often spreads over vast areas, including not only the most prominent and richest medical facilities, but also facilities in rural and less developed areas [3]. The best, and most expensive, devices are not available to these facilities. Also, it is unrealistic to expect a 100 Mbit connection (the minimum for efficient PACS communication [3]) to all sites of such a sparse system. As a part of mobile health, devices with lower storage, processing, and display capabilities are also a common part of a PACS. These devices can process only a limited number of medical images, and only images of limited size [2]. Also, these devices usually use wireless networks, whose capabilities are far below those of wired ones [2]. Therefore, to view a medical image on these devices it is necessary to have images scaled to the display size of the mobile device. This could have a negative impact on PACS storage space [36], but the impact is minimized when the scaled medical images are acquired from the same image codestream as the original-sized image, i.e. when streaming of medical images is used [37, 38]. Image streaming is a process of gradual buildup of an image by resolution or by pixel accuracy [28]. It enables extraction of a lower-resolution image from the codestream. The architecture of a modern PACS is shown in Fig. 2. Besides high-class hospitals, the system contains less well-equipped hospitals in rural areas and mobile medical devices. These are all reasons for adopting lossy compression of medical images for a PACS, but they are also restrictions which anyone developing a PACS should consider.
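To give a feel for how link capacity and compression ratio interact in such a distributed PACS, the following back-of-the-envelope sketch estimates the transfer time of a single image. The 10 MB image size and the 2 Mbit/s wireless rate are assumptions chosen for illustration, and protocol overhead is ignored; only the 100 Mbit figure and the 3:1 and 10:1 ratios come from the text.

```python
def transfer_seconds(image_bytes, compression_ratio, link_mbit_per_s):
    """Time to move one compressed image over a link, ignoring protocol overhead."""
    compressed_bits = image_bytes * 8 / compression_ratio
    return compressed_bits / (link_mbit_per_s * 1_000_000)

IMAGE_BYTES = 10 * 1024 * 1024  # assumed size of one uncompressed image
LINKS = {"100 Mbit/s wired": 100, "2 Mbit/s wireless (assumed)": 2}
RATIOS = {"uncompressed": 1, "lossless 3:1": 3, "lossy 10:1": 10}

for link_name, mbit in LINKS.items():
    for ratio_name, ratio in RATIOS.items():
        seconds = transfer_seconds(IMAGE_BYTES, ratio, mbit)
        print(f"{link_name:28s} {ratio_name:13s} {seconds:7.2f} s")
```

Under these assumptions the slow link benefits most in absolute terms: every increase in compression ratio removes tens of seconds there, but only fractions of a second on the wired link.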
These reasons and restrictions represent technical-objective criteria for evaluating a medical image compression technique for a PACS. The parameters of the criteria are the overall cost of the system equipment, storage and network requirements, the cost of implementing the compression technique, compression/decompression speed, the streaming capability of the compression technique, the image modalities suitable for the compression, and the compression ratio achieved under a certain quality assumption. Technical studies comparing different compression techniques evaluated several things [39, 40, 41, 42]:
• Compression speed [39, 40, 41, 42]. The studies measured the time elapsed while the sample image was compressed to the target compression ratio, and the time elapsed during decompression. This time has an impact on the overall performance of the system because it can cause data transmission delay. Better PACS performance is achieved if decompression time is minimized, because decompression occurs more often than compression. Therefore, retrieval-oriented compression techniques are common in medical imaging.
• Memory and processor power used [40]. The study measured the amount of memory and processor power used during the compression and decompression process. The measured values indicate the overall complexity of the compression technique, which influences the overall cost of the system. Higher requirements lead to higher cost.
• Compression ratio [40]. The influence of a compression technique on storage requirements is expressed as the achievable compression ratio. It is measured with respect to image presentation quality, using for example the numerical distortion measures of Section 3.1. Storage requirements influence the overall cost of a PACS.
• Functionalities of a compression technique [41]. Most applications require other features besides the quality and coding efficiency of the compression technique. The technical-objective quality evaluations in the consulted studies did not evaluate the functionalities of a compression technique numerically.
Rather, they use some method of description. Santa-Cruz [41] provided a functionality matrix that indicates the features supported by each compression technique and an appreciation of how well they are fulfilled, Table 1. They compared the JPEG2000, JPEG-LS, JPEG, MPEG-4 VTC, and PNG compression techniques. A set of features (functionalities) is included in Table 1. A “+” mark indicates that the functionality is supported. The more “+” marks, the more efficiently or better the functionality is supported by the compression technique.
• Error resilience [40, 41]. It is important to measure the error resilience of compressed images sent over network transmission channels. This is tested by transmitting the compressed data over a simulated noisy channel.

Figure 2. Architecture of a Modern PACS

Table 1: Functionality matrix – various functionalities of different compression techniques are compared [41].

Functionality                    | JPEG2000 | JPEG-LS | JPEG  | MPEG-4 VTC | PNG
lossless compression performance | +++      | ++++    | +     |            | +++
lossy compression performance    | +++++    | +       | +++   | ++++       |
progressive bitstreams           | +++++    |         | ++    | +++        | +
Region of Interest (ROI) coding  | +++      |         |       | +          |
arbitrary shaped object          |          |         |       | ++         |
random access                    | ++       |         |       |            |
low complexity                   | ++       | +++++   | +++++ | +          | +++
error resilience                 | +++      | ++      | ++    | +++        | +
non-iterative rate control       | +++      |         |       | +          |
generality                       | +++      | +++     | ++    | ++         | +++

The quality evaluation of medical image streaming has not been studied in the consulted literature. Streaming of medical images is an important issue for a PACS trying to achieve mobile health (and also ubiquitous healthcare), and it should be considered during quality evaluation. Because it is supported by a limited number of compression techniques, the quality evaluation should indicate whether streaming is supported or not. If a compression technique supports image streaming, the quality of the extracted low-resolution images should be evaluated.
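The compression speed and compression ratio criteria from the list above can be measured with a small timing harness. The sketch below is only an illustration: `zlib` is used as a lossless stand-in for a real image codec, the function name `benchmark` and the synthetic sample are assumptions, and absolute timings will differ from machine to machine.

```python
import time
import zlib

def benchmark(compress, decompress, data, repeats=5):
    """Return compression ratio and best-of-N compress/decompress times in seconds."""
    best_c = best_d = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        packed = compress(data)
        best_c = min(best_c, time.perf_counter() - start)

        start = time.perf_counter()
        unpacked = decompress(packed)
        best_d = min(best_d, time.perf_counter() - start)
    return {
        "ratio": len(data) / len(packed),
        "compress_s": best_c,
        "decompress_s": best_d,
        "lossless": unpacked == data,
    }

# Stand-in "image": 256 rows of a repeating 0..255 ramp (illustrative data only).
sample = bytes(range(256)) * 256
result = benchmark(lambda d: zlib.compress(d, 6), zlib.decompress, sample)
print(result)
```

Reporting decompression time separately matters here, since in a retrieval-oriented system images are decompressed far more often than they are compressed.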
An important issue concerning the technical aspects of medical image compression techniques is whether to use industry-wide standards or to develop a proprietary compression technique [43]. The second approach could lead to more efficient compression techniques, but in the long term it would prove more costly. It could compromise PACS communication with equipment and networks that do not support the proprietary compression technique [43]. The long-term archives of medical images could be compromised if the system migrates to another compression technique. The use of industry-approved standards can reduce the cost and risk of using compression.

4 3D QUALITY EVALUATION SPACE

Quality evaluations of lossy compression techniques differ in many ways. They differ in whether they consider the application for which the compressed image is used. Some quality evaluations measure only the performance of the compression technique, while others measure only the presentation quality of the restored image. Overall, there is no quality evaluation which measures all the elements of a medical image compression technique. When evaluating medical image compression techniques it is important to measure the quality of the restored images. Presentation evaluations (objective and subjective) measure the presentation (perceptual) quality of a restored image. The values obtained are used to compare the quality of the compressed images and to observe which compression technique achieves a higher compression ratio under the same quality assumption. The presentation-objective measure, which is usually presented as a scalar or a vector, is easier to compute. These values are comparable, and it is easy to determine which compression technique is better: the one having the bigger value. However, they fail to measure precise (local) characteristics of the restored image, i.e. they do not consider the medical application of the compression technique.
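Both points, ease of computation and blindness to local characteristics, can be seen in a short sketch of the measures from Eqs. (1) to (3) of Section 3.1. The 4×4 pixel values are invented for illustration, and PSNR is computed with the common peak value of 2^b − 1 for b-bit pixels, which is an assumption about the intended definition. A distortion spread evenly over the image and a distortion concentrated in a single pixel receive exactly the same score.

```python
import math

def mse(original, restored):
    """Mean squared error over all pixels (flat lists of equal length)."""
    return sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)

def snr_db(original, restored):
    """10*log10(variance of the original / MSE)."""
    mean = sum(original) / len(original)
    variance = sum((a - mean) ** 2 for a in original) / len(original)
    return 10 * math.log10(variance / mse(original, restored))

def psnr_db(original, restored, bits=8):
    """10*log10((2^b - 1)^2 / MSE) for b-bit pixels."""
    peak = (2 ** bits - 1) ** 2
    return 10 * math.log10(peak / mse(original, restored))

# 4x4 test image (illustrative values) and two distorted versions.
original = [10, 20, 30, 40, 50, 60, 70, 80,
            90, 100, 110, 120, 130, 140, 150, 160]
spread = [p + 2 for p in original]  # every pixel off by 2
local = list(original)
local[5] += 8                       # a single pixel off by 8

print(psnr_db(original, spread), psnr_db(original, local))
print(snr_db(original, spread), snr_db(original, local))
```

If the single altered pixel happened to lie inside a diagnostically important region, the two images could differ greatly in clinical value while remaining indistinguishable by these scalar measures.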
In contrast to presentation-objective evaluations, the presentation-subjective evaluations consider the medical application of the compression technique, but they are harder to obtain and cost more. The presentation-subjective evaluations are also harder to interpret and compare, and they depend on the observer’s knowledge, experience, and perception. The advantage of presentation-subjective evaluations is that they are recommended by official medical organizations which consider compression of medical images (such as the Canadian Association of Radiologists, CAR). The presentation quality evaluations fail to measure the technical aspects of a compression technique. Besides restored image quality, it is necessary to obtain technical information about the compression technique, such as efficiency (compression speed, achievable ratio, transmission possibilities), error resilience, features (image streaming), and implementation and maintenance cost. The technical-objective quality evaluations measure the technical elements of a compression technique. There are several important technical-objective evaluations measuring different features of a compression technique. The issue is how to correlate them into one value.

Figure 3. The 3D quality evaluation space for compression of medical images

To obtain a complete quality evaluation of a medical image compression technique, it is necessary to use all three previously described quality evaluations: presentation-objective, presentation-subjective, and technical-objective. Only then will the observers adopting a medical image compression for PACS have a complete insight into a given compression technique. This will present the medical staff with the highest quality medical images while engaging optimal processing, storage, and presentation resources.
The complete quality evaluation could be improved if there were a correlation function between the quality evaluations used, such as the one described by Eq. (5).

$$ev = f(a \cdot po, b \cdot ps, c \cdot to) \qquad (5)$$

Variables po, ps, and to represent the values obtained by applying presentation-objective, presentation-subjective, and technical-objective quality evaluations. Factors a, b, and c are weighting factors ranging from 0 to 1, used to define the influence of a particular quality evaluation. A value of 0 cancels the influence of a particular evaluation. Ideally, the result of the correlation function should be a scalar which defines the quality of a compression technique in a simple and comparable way. A higher value indicates better quality. Unfortunately, it is more realistic to expect that the result of the correlation function would be a vector which defines the quality of a compression technique in a space defined by presentation-objective, presentation-subjective, and technical-objective evaluations, Fig. 3. Higher vector intensity indicates better quality. One possible combination for the 3D quality evaluation space would be the use of PSNR [6] (as it is the one most often used), Eq. (3), or the Przelaskowski [27] vector measure (because it combines several objective measures), Eq. (4), for presentation-objective evaluation, ROC analysis [6] (as it is the subjective measure used most often) for presentation-subjective evaluation, and the functionality matrix [41] (being the most comprehensive technical evaluation) for technical-objective evaluation. The proposed combination is still under review.

5 CONCLUSION

This paper represents preliminary research. We identified three categories of quality evaluations and criteria: presentation-objective, presentation-subjective, and technical-objective. To obtain a complete quality evaluation of a medical image compression technique, it is necessary to use all three categories of quality evaluations.
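How the three categories might be combined in the sense of Eq. (5) can be sketched as follows. The paper leaves the function f open, so everything specific here is an assumption: the three scores are taken to be already normalized to the range 0 to 1, f simply returns the weighted triple, "vector intensity" is read as the Euclidean length, and the scores of the two techniques are invented.

```python
import math

def evaluation_vector(po, ps, to, a=1.0, b=1.0, c=1.0):
    """Weighted point in the 3D evaluation space; inputs assumed scaled to 0..1."""
    if not all(0.0 <= weight <= 1.0 for weight in (a, b, c)):
        raise ValueError("weights a, b, c must lie in [0, 1]")
    return (a * po, b * ps, c * to)

def intensity(vector):
    """Euclidean length of the evaluation vector (assumed meaning of intensity)."""
    return math.sqrt(sum(component ** 2 for component in vector))

# Hypothetical normalized scores for two techniques (not measured values).
candidates = {
    "technique A": evaluation_vector(po=0.9, ps=0.7, to=0.4),
    "technique B": evaluation_vector(po=0.7, ps=0.7, to=0.8),
}
best = max(candidates, key=lambda name: intensity(candidates[name]))
print(best, candidates[best])
```

Setting a weight to 0, for example `c=0.0`, removes that category from the comparison, which matches the role of the weighting factors described for Eq. (5).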
The development of a comprehensive evaluation of all the aspects of a compression technique would ease the task of adopting a medical image compression for a PACS. Our future research will include devising a technical-objective correlation function which will uniformly present the results of technical-objective quality evaluations. The major focus of our future research will be devising a correlation function between all the groups of quality evaluations. We strive to achieve a quality evaluation space like the one shown in Fig. 3, which would represent an environment for simple and comprehensive evaluation of medical image compression techniques for PACS. In the case of the PACS for the lung hospital we did not have time to wait for the development of the 3D quality evaluation space. Therefore, we adopted the compression technique that in our opinion (drawn from numerous technical studies) offered the most: JPEG2000 compression [36, 37, 38, 39, 40, 41, 42]. It would be interesting to see whether this decision correlates with the results in the 3D quality evaluation space for compression of medical images.

ACKNOWLEDGEMENT

This research has been conducted under the IT project “WEB portals for data analysis and consulting”, No. 13013, financed by the Government of the Republic of Serbia.

6 REFERENCES

[1] H.K. Huang: Enterprise PACS and image distribution, Computerized Medical Imaging and Graphics, Vol. 27, No. 2-3, pp. 241-253 (2003).
[2] A.N. Skodras: The JPEG2000 Image Compression Standard in Mobile Health, M-Health: Emerging Mobile Health Systems, SpringerLink, pp. 313-327 (2006).
[3] M.K. Choong, R. Logeswaran, and M. Bister: Cost-effective handling of digital medical images in the telemedicine environment, International Journal of Medical Informatics, Vol. 76, No. 9, pp. 646-654 (2007).
[4] A. N. Belbachir, and P. M.
Goebel: Medical Image Compression: Study of the Influence of Noise on the JPEG 2000 Compression Performance, The 18th International Conference on Pattern Recognition, Vol. 3, No. 3, pp. 893-896 (2006).
[5] B.J. Erickson: Irreversible compression of medical images, Journal of Digital Imaging, Vol. 15, No. 1, pp. 5-14 (2002).
[6] D. Smutek: Quality measurement of lossy compression in medical imaging, Prague Medical Report, Vol. 106, No. 1, pp. 5-26 (2005).
[7] S.M. Perlmutter, P.C. Cosman, R.M. Gray, R.A. Olshen, D. Ikeda, C.N. Adams, B.J. Betts, M.B. Williams, K.O. Perlmutter, J. Li, A. Aiyer, L. Fajardo, R. Birdwell, B.L. Daniel: Image quality in lossy compressed digital mammograms, Signal Processing, Vol. 59, No. 2, pp. 189-210 (1997).
[8] E. Seeram: Irreversible compression in digital radiology. A literature review, Radiography, Vol. 12, No. 1, pp. 45-59 (2006).
[9] Y.Y. Chen: Medical image compression using DCT-based subband decomposition and modified SPIHT data organization, International Journal of Medical Informatics, Vol. 76, No. 10, pp. 717-725 (2007).
[10] R.M. Slone, D.H. Foos, B.R. Whiting, E. Muka, D.A. Rubin, T.K. Pilgram, K.S. Kohm, S.S. Young, P. Ho, D.D. Hendrickson: Assessment of visually lossless irreversible image compression: comparison of three methods by using an image-comparison workstation, Radiology, Vol. 215, No. 2, pp. 543-553 (2000).
[11] D.A. Koff and H. Shulman: An overview of digital compression of medical images: can we use lossy image compression in radiology?, Canadian Association of Radiologists Journal, Vol. 57, No. 4, pp. 211-217 (2006).
[12] P.R.G. Bak: Will the use of irreversible compression become a standard of practice?, SCAR News Winter 2006 [Online], Vol. 18, pp. 1–11. Available at: www.siimweb.org/assets/6D50192B-239D-413E-96CE-04933B9C17F0.pdf
[13] Group of authors (2008, Jun.). CAR Standards for Irreversible Compression in Digital Diagnostic Imaging within Radiology. The Canadian Association of Radiologists.
Ottawa, Ontario, Canada [Online]. Available at: www.car.ca/Files/%5CLossy_Compression.pdf
[14] M.P. Eckert and A.P. Bradley: Perceptual quality metrics applied to still image compression, Signal Processing, Vol. 70, No. 3, pp. 177-200 (1998).
[15] A.M. Eskicioglu: Quality measurement for monochrome compressed images in the past 25 years, Proceedings of the Acoustics, Speech, and Signal Processing – ICASSP, pp. 1907-1910 (2000).
[16] Z. Wang, A.C. Bovik, and L. Lu: Why is image quality assessment so difficult?, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing – ICASSP’02, pp. 3313-3316 (2002).
[17] P.C. Cosman, R.M. Gray, and R.A. Olshen: Evaluating quality of compressed medical images: SNR, subjective rating, and diagnostic accuracy, Proceedings of the IEEE, pp. 919-932 (1994).
[18] A. Bruckmann and A. Uhl: Selective medical image compression techniques for telemedical and archiving applications, Computers in Biology and Medicine, Vol. 30, No. 3, pp. 153-169 (2000).
[19] J. Strom and P. Cosman: Medical image compression with lossless regions of interest, Signal Processing, Vol. 59, No. 2, pp. 155-171 (1997).
[20] X. Bai, J.S. Jin, and D. Feng: Segmentation-based multilayer diagnosis lossless medical image compression, Proceedings of the Pan-Sydney Area Workshop on Visual Information Processing, pp. 9-14 (2004).
[21] S. Bokturk, C. Tomasi, B. Girod, and C. Beaulieu: Medical image compression based on region of interest with application to colon CT images, Proceedings of the 23rd Annual International Conference of the IEEE on medical and Biomedical Engineering, pp. 2453-2456 (2001).
[22] M. Penedo, W.A. Pearlman, P.G. Tahoces, M. Souto, and J.J. Vidal: Region-based wavelet coding methods for digital mammography, IEEE Transactions on Medical Imaging, Vol. 22, No. 10, pp. 1288-1296 (2003).
[23] G. K. Anastassopoulos and A.
Skodras: JPEG2000 ROI coding in medical imaging applications, Proceedings of the 2nd IASTED International Conference on Visualisation, Imaging and Image Processing – VIIP2002, pp. 783-788 (2002).
[24] Z. Lu, D.Y. Kim, and W.A. Pearlman: Wavelet compression of ECG signals by the set partitioning in hierarchical trees (SPIHT) algorithm, IEEE Transactions on Biomedical Engineering, Vol. 47, No. 7, pp. 849-856 (2000).
[25] X. Kai, Y. Jie, Z.Y. Min, and L.X. Liang: HVS-based medical image compression, European Journal of Radiology, Vol. 55, No. 1, pp. 139-145 (2005).
[26] A. Bilgin, M.W. Marcellin, and M.I. Altbach: Wavelet Compression of ECG Signals by JPEG2000, in Proceedings of the Conference on Data Compression – DCC, pp. 527 (2004).
[27] A. Przelaskowski: Vector quality measure of lossy compressed medical images, Computers in Biology and Medicine, Vol. 34, No. 3, pp. 193-207 (2004).
[28] M. Eyadat and I. Muhi: Compression Standards Roles in Image Processing: Case Study, Proceedings of the International Conference on Information Technology: Coding and Computing – ITCC’05, pp. 135-140 (2005).
[29] H.R. Sheikh and A.C. Bovik: Image Information and Visual Quality, IEEE Transactions on Image Processing, Vol. 15, No. 2, pp. 430-444 (2006).
[30] M. Aanestad, B. Edwin, and R. Marvik: Medical image quality as a socio-technical phenomenon, Methods of Information in Medicine, Vol. 42, No. 4, pp. 302-306 (2003).
[31] I. Smith, A. Roszkowski, R. Slaughter, and D. Sterling: Acceptable levels of digital image compression in chest radiology, Australasian Radiology, Vol. 44, No. 1, pp. 32-35 (2000).
[32] T.J. Kim, K.H. Lee, B. Kim, K.J. Kim, E.J. Chun, V. Bajpai, Y.H. Kim, S. Han, and K.W. Lee: Regional variance of visually lossless threshold in compressed chest CT images: Lung versus mediastinum and chest wall, European Journal of Radiology, Vol. 63, No. 3, pp. 483-488 (2009).
[33] A.
Przelaskowski: Compression of mammograms for medical practice, Proceedings of the 2004 ACM Symposium on Applied Computing – SAC '04, pp. 249-253 (2004).
[34] J.E. Bardram: Hospitals of the future – ubiquitous computing support for medical work in hospitals, The 2nd international workshop on ubiquitous computing for pervasive healthcare applications – UbiHealth 2003, Seattle, Washington, USA (2003).
[35] C. Atkinson, B. Kaplan, K. Larson, H.M.G. Martins, J. Lundell, and M. Harris: Ubiquitous Computing for Health and Medicine, Designing Ubiquitous Information Environments: Sociotechnical Issues and Challenges, London: Kluwer Academic Publishers, pp. 355-358 (2005).
[36] D. Dragan and D. Ivetic: An Approach to DICOM Extension for Medical Image Streaming, Proceedings of 19th DAAAM International Symposium 2008, "Intelligent Manufacturing & Automation", Trnava, Slovakia, pp. 215 (2008).
[37] J. Mirkovic, D. Ivetic, and D. Dragan: Presentation of Medical Images Extracted From DICOM Objects on Mobile Devices, The 9th International Symposium of Interdisciplinary Regional Research “ISIRR 2007” Hungary – Serbia – Romania, Novi Sad, Serbia (2007).
[38] D. Dragan and D. Ivetic: Chapter 3: DICOM/JPEG2000 Client/Server Implementation, "Environmental, Health, and Humanity Issues in Down Danubian Region, Multidisciplinary Approaches", edited by Dragutin Mihailović & Mirjana Vojinović Miloradov, ISBN: 978-981-283-439-3, World Scientific Publishing Co. Pte. Ltd., pp. 25-34 (2009).
[39] L. Chen, C. Lian, K. Chen, and H. Chen: Analysis and Architecture Design of JPEG2000, Proceedings of the ICME, pp. 277-280 (2001).
[40] H. Man, A. Docef, and F. Kossentini: Performance Analysis of the JPEG2000 Image Coding Standard, Multimedia Tools and Applications, Vol. 26, pp. 27-57 (2005).
[41] D. Santa-Cruz, T. Ebrahimi, J. Askelof, M. Larsson, and C.A.
Christopoulos: JPEG 2000 still image coding versus other standards, Proceedings of the SPIE's 45th annual meeting, Applications of Digital Image Processing XXIII, pp. 446-454 (2000).
[42] M.D. Adams, H. Man, F. Kossentini, and T. Ebrahimi: JPEG 2000: The Next Generation Still Image Compression Standard, Contribution to ISO/IEC JTC 1/SC 29/WG 1 N 1734 (2000).
[43] D.A. Clunie: Lossless Compression of Grayscale Medical Images - Effectiveness of Traditional and State of the Art Approaches, Proceedings of the SPIE 2000, pp. 74-84 (2000).

UbiCC Journal – Volume 4 No. 3 650 Special Issue on ICIT 2009 Conference - Bioinformatics and Image

A MULTI-LEVEL METHOD FOR CRITICALITY EVALUATION TO PROVIDE FAULT TOLERANCE IN MULTI-AGENT SYSTEMS

Mounira BOUZAHZAH, Ramdane MAAMRI
Lire Laboratory, Mentouri University, Constantine, Algeria
mbouzahzah@yahoo.fr

ABSTRACT
The possibility of failure is a fundamental characteristic of distributed applications. The fault tolerance research community has developed several solutions, mainly based on the concept of replication. In this paper, we propose a hybrid fault-tolerant approach for multi-agent systems. Our strategy is based on two main concepts: replication and teamwork. We calculate the criticality of each agent and then divide the system into two groups that use two different replication strategies (active and passive). In order to determine the agent criticality, we introduce a multi-level method for criticality evaluation using agent plans and dependence relations between agents.

Keywords: agent local criticality, agent external criticality, hybrid approach, the decision agent, the action criticality.

1 INTRODUCTION

Multi-agent systems offer a decentralized and cooperative view of problem solving, so they are particularly well adapted to dynamic distributed problems. However, they are prone to the same failures that can occur in any distributed software system. System faults fall into two main classes:
• Software faults: caused by bugs in the agent program or in the supporting environment.
• Hardware faults: related to material failures such as a machine crash or a communication breakdown.

Several research efforts address fault tolerance in multi-agent systems using different strategies. The most important ones are based on the concept of replication. There are different strategies for applying replication. The static strategy decides and applies replication at design time, as in [1], [2] and [3]. The dynamic strategy applies replication during processing time; it introduces the notion of agent criticality and is used by [4] and [5]. According to the relation between the agent and its replicas, there are two types of replication. Passive replication is defined as the existence of one active replica that processes all input messages and periodically transmits its current state to the other replicas, in order to maintain coherence and to constitute a recovery point in case of failure [6]. Active replication is defined as the existence of several replicas that concurrently process all input messages [7].

This article introduces an approach for fault tolerance in dynamic multi-agent systems. Our approach uses the agent's plan to determine the agent local criticality, and the interdependence relations between agents to calculate the agent external criticality. According to their criticalities, agents are assigned to one of two groups. The critical group is managed by an agent called the supervisor and uses the active replication strategy. The other group uses the passive replication strategy and is managed by an agent called the controller. The whole system is controlled by the decision agent, which triggers the agents' criticality evaluation and decides which agents are the most critical. Our approach is general because it is hybrid (it uses the passive and the active replication strategies at the same time) and because it uses two levels of criticality evaluation (the local level and the external level). Through this approach the agent criticality is calculated dynamically.

The rest of this paper is organized as follows. Section 2 covers related work in the field of fault tolerance. Sections 3 and 4 describe the proposed approach, based on dynamic replication, and its criticality evaluation. Section 5 describes the general architecture of the system. Finally, Section 6 gives an insight into our future directions and concludes the paper.

2 RELATED WORKS

Here we review some important works dealing with fault tolerance in multi-agent systems.

Hagg [2] proposes a strategy for fault tolerance using sentinels. The sentinel agents listen to all broadcast communications, interact with other agents, and use timers to detect agent crashes and communication link failures. Sentinels are thus guardian agents that protect the multi-agent system from falling into undesirable states. They have the authority to monitor communications in order to react to faults. The main problem with this approach is that sentinels are themselves subject to faults.

Kumar et al. [1] introduce a strategy based on the Adaptive Agent Architecture. This strategy uses teamwork to recover a multi-agent system from broker failures. This approach does not deal completely with agent failures, since only some agents (the brokers) or part of them can be replicated.

A strategy based on transparent replication is proposed by [3]. All messages going to and from a replicated group are funneled through the replicate group message proxy. This work uses the passive replication strategy.

These approaches apply the replication mechanism according to the static strategy, which fixes replication at design time. However, recent applications, mainly those that use multi-agent systems, are very dynamic, which makes it too difficult to determine the critical agents at design time. Other works use the dynamic replication strategy instead.

Guessoum et al. [4] introduce an automatic and dynamic replication mechanism. They determine the criticality of an agent using various data, such as processing time and the role taken by the agent in the system. This mechanism is specified for adaptive multi-agent systems, and their work targets the DIMA platform [8].

Almeida et al. [9] propose a method to calculate the criticality of an agent in a cooperative system. They use the agent plan as the basic concept for determining critical agents. This work uses the DARX framework [10].

These two works use dynamic replication, which applies replication at processing time. This strategy requires the calculation of criticality. The agent criticality is defined as the impact of a local failure of an agent on the whole system [11]. The dynamic strategy is more suitable than the static one for dynamic applications, but it must use a mechanism able to determine when it is necessary to duplicate agents.

3 THE HYBRID APPROACH

Agents are subject to failures that can cause the failure of the whole system. We propose an approach that introduces fault tolerance in dynamic multi-agent systems through two main concepts: replication and teamwork. Our approach uses both replication strategies (active and passive). Since we deal with dynamic multi-agent systems, we use dynamic replication, which means that agents are not all duplicated at the same time or in the same manner. The question that arises, therefore, is: which agents should be replicated?
4 THE CRITICALITY EVALUATION

The agent criticality, denoted CX, is defined as the impact of a local failure of the agent X on the dysfunction of the whole system. An agent whose failure causes a total failure of the system has a strong criticality. In our approach, the criticality evaluation is carried out at two main levels:
• The local level: here we determine the agent criticality using its plan of actions.
• The external level: in order to achieve its current goal, the agent does not only use its own data but also relies on other agents. So, we evaluate the agent external criticality using the relations between agents.

4.1 Agent Local Criticality

In order to calculate the agent local criticality, we define an agent according to the model proposed by [12]. Each agent is composed of the following elements:
• Goals: the goals an agent wants to achieve.
• Actions: the actions the agent is able to perform.
• Resources: the resources an agent has control over.
• Plans: a plan represents the sequence of actions that the agent has to execute in order to achieve a certain goal.

4.1.1 Agent Plan

We consider that each agent knows the sequence of actions that it has to execute in order to achieve its current goal. Therefore, we propose the use of a graph, called the agent's plan, to represent this sequence of actions. These plans are established for the short term because the environment considered is dynamic. The graph that we use in this work is inspired by the one proposed in [9]. The agent plan is represented by a graph where the nodes represent actions and the edges represent relations between actions. These relations are the logical functions AND and OR. A node n connected to k other nodes (n1, n2, ..., nk) by AND edges represents an action that is achieved only if all its following actions are executed. A node n connected to its k followers by OR edges represents an action that is achieved if only one following action is executed. The work proposed in [5] uses a different description of the agent plan and distinguishes internal and external actions. We, however, are interested in the actions that are executed by the agent itself (local actions). Thus, according to our description, an agent X is represented as follows (Figure 1):

[Figure 1. Agent X plan: action A is connected by AND edges to B1, B2, ..., Bk; action B2 is connected by OR edges to C1, C2, ..., Cn.]

4.1.2 Action Criticality

In this paper we propose the use of two types of action criticality: the action initial criticality, given by the designer, and the action dynamic criticality, calculated according to the agent plan. Thus, the criticality of an action A, denoted CA, is calculated as follows:

CA = initial criticality + dynamic criticality
CA = CIA + CDA

4.1.3 Action Initial Criticality

We assume that a critical agent is one that executes critical actions, and we propose the following criteria to define the initial criticality of an action:
• An action that can be done by several agents can be regarded as not too critical, whereas an action that can be done by only a few agents is regarded as critical.
• The number of resources required for the execution of an action can also be a factor in determining its initial criticality. An action that requires many resources to be executed has a strong criticality.
• Hardware data also influence the action initial criticality.
• Finally, according to the application field, the designer can provide semantic information that defines the initial criticality of an action.

Thus, at design time each action A has a value called the initial criticality, denoted CIA.

4.1.4 Action Dynamic Criticality

The dynamic criticality of an action, denoted CD, is defined as the value attributed to an action according to its position in the agent plan. One factor influences it: the set of the action's following actions.

We use the function MULTIPLICATION to represent the influence of the following actions on the considered action when they are connected by AND edges. As indicated above, when an action A is connected to its followers (B1, B2, ..., Bk) by AND edges, the achievement of A implies that all its following actions are achieved. In terms of sets, we have A = (B1 ∩ B2 ∩ ... ∩ Bk), and:

CA = CIA + (CB1 * CB2 * ... * CBk)

The function SUM is used to represent the case where an action is connected to its followers by OR edges. If we consider action B2 (Figure 1), connected to its followers (C1, C2, ..., Cn) by OR edges, in terms of sets we have B2 = (C1 ∪ C2 ∪ ... ∪ Cn). Thus, the criticality of B2 is calculated as follows:

CB2 = CIB2 + (CC1 + CC2 + ... + CCn)

An action that has no follower is called a terminal action. The dynamic criticality of a terminal action equals 0, which means that the criticality of a terminal action equals its initial criticality.

4.1.5 Agent Local Criticality Calculation

In order to determine the agent local criticality, we assume that each agent knows at an instant t the sequence of actions that it has to execute to achieve its current goal. The local criticality of an agent, CL agent, is calculated as follows:

CL agent = Sum (Caction1 + ... + Caction n)

This criticality calculation is made directly by the agent.

Example: let us calculate the agent local criticality for the following agent plan (Figure 2):

[Figure 2. Agent X plan: action A is connected by AND edges to B and C; action C is connected by OR edges to D and E.]

Table 1. The actions' initial criticalities.
CIA   CIB   CIC   CID   CIE
2     1     3     5     10

CA = CIA + (CB * CC)
CB = CIB = 1 (B is a terminal action)
CC = CIC + (CD + CE)
CD = CID = 5 (D is a terminal action)
CE = CIE = 10 (E is a terminal action)
CC = 3 + (5 + 10) = 18
CA = 2 + (1 * 18) = 20

The local criticality of agent X is: CLX = (CA + CB + CC + CD + CE) = 54.

4.2 Agent External Criticality

According to the agent definition given in the previous section, the agent possesses a set of plans. Each plan is formed of a sequence of actions that the agent has to execute in order to achieve its current goal. These actions do not necessarily belong to the agent's own set of actions; therefore, an agent may depend on other agents to carry out certain plans. There are six different dependence situations identified by [12]. In this work we are interested in two main dependence relations:
• The cooperative relation: an agent infers that it and other agents depend on each other to realize the same current goal.
• The adoptive relation: an agent infers that it and other agents depend on each other to realize different current goals.

The relation between agents is defined in our model using the following set: Set = {T, P, N}
T: the relation type; it can be cooperative or adoptive.
P: the relation weight; it is the sum of the initial criticalities of the actions that are executed using this relation: P = Sum CI of the actions executed using the relation.
N: the number of agents having the same current goal.

The external criticality is calculated as follows:

Cex agent = P/N

In the adoptive case, N = 1.

4.3 Agent Criticality

The agent criticality, denoted Cagent, is considered an agent property. It is calculated directly by the agent using the following relation:

Cagent = CL agent + Cex agent

4.4 Determine the Most Critical Agents

Each agent must pass the criticality calculated at the instant t to another agent called the decision agent. The latter uses these values to determine the most critical agents. The mean value of N numbers gives a threshold that divides a set into two parts. The decision agent uses the following algorithm to determine the two groups of agents.

Algorithm: decision
Begin
  Sumcriticalities ← 0
  For each agent i do
    Read Cagent i    /* Cagent i: the criticality of the agent i */
    Sumcriticalities ← Sumcriticalities + Cagent i    /* calculation of the sum of the agents' criticalities */
  For each agent i do
    If (Cagent i >= Sumcriticalities / number of the agents) Then GT = 1 Else GT = 2
    /* GT is an agent property: if GT = 1 the agent is assigned to the critical group, else it is in the other group */
End.

Finally, agents are assigned to two different groups.

4.5 Criticality Re-Evaluation

The criticality calculated in the previous sections is determined at the instant t; it must be updated throughout the execution since our system is dynamic. We propose a solution based on two strategies:
• Time strategy: the decision agent has a clock that raises alarms to re-evaluate the agents' criticalities at each fixed time interval Δt.
• Event strategy: many events act on the system and cause a criticality revision, such as an agent failure or a machine failure.

4.6 Determine the Agents Groups

The concept of teamwork is used by different approaches such as [1] and [2]. In our approach, the criticality calculation leads to the creation of two groups of agents. This stage makes it possible to determine a fault tolerance strategy for each group.

The critical agents' group uses active replication. Each critical agent has exactly one active replica, called the follower.
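The criticality evaluation of Sections 4.1 to 4.4 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the dictionary encoding of the plan and the names `PLAN_X`, `action_criticality`, `local_criticality`, `external_criticality` and `assign_groups` are assumptions made for this example. Only the AND/OR rules, the Figure 2 values and the mean threshold come from the text.

```python
from math import prod

# Hypothetical plan encoding (not from the paper): each action maps to
# (initial criticality, edge type, followers). Values follow Figure 2 / Table 1.
PLAN_X = {
    "A": (2, "AND", ["B", "C"]),
    "B": (1, None, []),
    "C": (3, "OR", ["D", "E"]),
    "D": (5, None, []),
    "E": (10, None, []),
}


def action_criticality(plan, action):
    """C = CI + CD, where CD is a product (AND), a sum (OR) or 0 (terminal)."""
    initial, edge, followers = plan[action]
    if not followers:  # terminal action: dynamic criticality is 0
        return initial
    values = [action_criticality(plan, f) for f in followers]
    dynamic = prod(values) if edge == "AND" else sum(values)
    return initial + dynamic


def local_criticality(plan):
    """CL agent: sum of the criticalities of all actions in the plan."""
    return sum(action_criticality(plan, a) for a in plan)


def external_criticality(weight, n_agents):
    """Cex agent = P / N (N = 1 in the adoptive case)."""
    return weight / n_agents


def assign_groups(criticalities):
    """Decision rule: GT = 1 if C >= mean criticality, else GT = 2."""
    mean = sum(criticalities.values()) / len(criticalities)
    return {agent: 1 if c >= mean else 2 for agent, c in criticalities.items()}


print(action_criticality(PLAN_X, "A"))  # 20
print(local_criticality(PLAN_X))        # 54
```

With the Figure 2 plan the sketch reproduces CA = 20 and a local criticality of 54. Calling `assign_groups` on a mapping from agent names to total criticalities returns the GT value of each agent.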
The follower is an agent that has the same plan and executes the same action as the critical agent, but only after receiving a permission message sent by the supervisor. The supervisor is an agent that manages the critical group.

The non-critical agents' group uses the passive replication strategy. Each non-critical agent has exactly one passive replica. The non-critical agent executes all the actions and transmits its current state. If the active agent is lost, its replica is activated by another agent, called the controller, which is the group's manager.

The criticality revision is done by the decision agent according to two factors: a time-driven factor and an event-driven factor. When an agent is considered critical at a given time t, it establishes a contract with the supervisor agent, so the agent has an active replica. If, at the instant t + Δt, the re-evaluation of the criticality finds the same agent to be non-critical, its contract is deleted and another contract is established with the controller.

5 SYSTEM ARCHITECTURE

In order to guarantee fault tolerance in dynamic multi-agent systems, we have added three agents that allow error detection and data recovery. The general architecture of the system is given by the following diagram (Figure 3):

[Figure 3. The system's architecture. DA: The Decision Agent. SUP: The Supervisor. CONT: The Controller. SA: The System's Agents. CG: Critical Group. NCG: Non Critical Group.]

The system consists of the dynamic multi-agent system and the three added agents: the decision agent, which controls the whole system; the supervisor, which manages the critical group; and the controller, which manages the non-critical group.

5.1 The Decision Agent

This agent offers two fundamental services. First, it determines the critical agents, which allows the division of the whole system into two main groups. Second, it triggers the agents' criticality re-evaluation, following the dynamicity of the system. We use a sequence diagram [13] to represent the decision agent's role as follows (Figure 4).

[Figure 4. The sequence diagram for the decision agent. DA: The Decision Agent. SA: The System's Agent. SUP: The Supervisor. CONT: The Controller. 1: The Criticality Evaluation. 2: Pass the Criticality C. 3: Decision. 4: GT = 1. 5: Establish contract with the Supervisor. 6: GT = 2. 7: Establish contract with the Controller.]

5.2 The Supervisor

This agent enables active replication. During execution, the critical agent periodically transmits its current state to the supervisor, and the latter issues permission messages in order to validate the replica's execution. The supervisor also provides failure detection. This service makes it possible to detect whether an agent is still alive, given that the system does not operate in a synchronous environment [14]. The supervisor provides this service by using a clock that triggers the control messages sent to the critical agents. Each activated agent (critical agent or replica) has a failure timer that gives the maximum time the agent may take to answer. If the agent does not answer within that time, a failure is detected. Once a failure is detected, the follower takes over from the failed agent and the supervisor creates a new replica. The supervisor's services are represented by the following diagram (Figure 5).

[Figure 5. The sequence diagram for the supervisor. Participants: the supervisor, the critical agent, the active replica. 1: Establish contract. 2: Active replication process. 3: Current state's message. 4: Permission message. 5: Controlling message. 6: Yes. 7: Answer. 8: No. 9: T > Max Time. 10: Agent recovering.]

5.3 The Controller

The controller is the manager of the non-critical agents' group. It enables agent replication using the passive strategy. This agent verifies and detects failures among its group's agents using the same technique employed by the supervisor. Once a failure is detected, the passive replica becomes active and another passive replica is added. The controller's sequence diagram is represented as follows (Figure 6):

[Figure 6. The sequence diagram for the controller. Participants: the controller, the non-critical agent, the passive replica. 1: Establish contract. 2: Passive replication process. 3: Current state's message. 4: Controlling message. 5: Yes. 6: Answer. 7: No. 8: T > Max Time. 9: Replica activated + agent recovering.]

6 CONCLUSION

This article proposes an approach for fault tolerance in dynamic multi-agent systems based on replication and teamwork. We use the two strategies (active and passive) with a single replica per agent at any one time, which reduces overhead. In order to guarantee failure detection and system control, three other agents are added. In further work, we intend to propose a more formal model for criticality calculation and to validate our approach through implementation.

7 REFERENCES

[1] S. Kumar, P.R. Cohen, H.J. Levesque: The adaptive agent architecture: achieving fault-tolerance using persistent broker teams, The Fourth International Conference on Multi-Agent Systems (ICMAS 2000), Boston, MA, USA, July 7-12, 2000.
[2] S. Hagg: A Sentinel Approach to Fault Handling in Multi-Agent Systems, Proceedings of the Second Australian Workshop on Distributed AI, Cairns, Australia, August 27, 1996.
[3] A. Fedoruk, R. Deters: Improving fault-tolerance by replicating agents, Proceedings AAMAS-02, Bologna, Italy, pp. 144-148.
[4] Z. Guessoum, J-P. Briot, N. Faci, O.
Marin: Un mécanisme de réplication adaptative pour des SMA tolérants aux pannes, JFSMA, 2004.
[5] A. Almeida, S. Aknine, et al.: Méthode de réplication basée sur les plans pour la tolérance aux pannes des systèmes multiagents, JFSMA, 2005.
[6] M. Wiesmann, F. Pedone, A. Schiper, et al.: Database replication techniques: a three parameter classification, Proceedings of 19th IEEE Symposium on Reliable Distributed Systems (SRDS2000), Nürnberg, Germany, October 2000, IEEE Computer Society.
[7] O. Marin: Tolérance aux Fautes, Laboratoire d'Informatique de Paris 6, Université Pierre & Marie Curie.
[8] N. Faci, Z. Guessoum, O. Marin: DIMAX: A Fault Tolerant Multi-Agent Platform, SELMAS'06.
[9] A. Almeida et al.: Plan-Based Replication for Fault Tolerant Multi-Agent Systems, IEEE, 2006.
[10] O. Marin, P. Sens: DARX: A Framework For Tolerant Support Of Agent Software, Proceedings of the 14th International Symposium on Software Reliability Engineering, IEEE, 2003.
[11] A. Almeida, S. Aknine, et al.: A Predictive Method for Providing Fault Tolerance in Multi-Agent Systems, Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT'06).
[12] J.S. Sichman, R. Conte, et al.: A Social Reasoning Mechanism Based On Dependence Networks, ECAI 94, 11th European Conference on Artificial Intelligence, 1994.
[13] M. Jaton: Modélisation Objet avec UML, cours, chapitre 13. http://www.iict.ch/Tcom/Cours/OOP/Livre/Livre OOPTDM.html.
[14] M. Fischer, N. Lynch, M. Paterson: Impossibility of distributed consensus with one faulty process, JACM, 1985.

A MODIFIED PARTITION FUSION TECHNIQUE OF MULTIFOCUS IMAGES FOR IMPROVED IMAGE QUALITY

Dheeraj Agrawal1, Dr. Al-Dahoud Ali2, Dr. J. Singhai3
1,3 Department of Electronics and Communication Engineering, MANIT, Bhopal (M.P.), INDIA
2 Faculty of Science and Information Technology, Al-Zaytoonah University of Amman, Jordan
1 dheerajagrawal@manit.ac.in, 2 aldahoud@alzaytoonah.edu.jo, 3 j_singhai@rediffmail.com

ABSTRACT
This paper presents a modified partition fusion technique for multifocus images that improves image quality. In the conventional partition fusion technique, image sub-blocks are selected for the fused image based on their clarity measures. The clarity measure of an image sub-block can be determined from the second-order derivative of the sub-image. The performance of these clarity measures is insufficient in a noisy environment. In the modified technique, before the image is divided into sub-images, it is filtered through a linear-phase 2-D FIR low-pass digital filter to overcome the effect of noise. The modified technique uses the choose-max selection rule to select the clearer image block from the differently focused source images. The performance of the modified technique is tested by calculating the RMSE. It is found that, when used as the clarity measure in the modified partition fusion technique, EOL gives the lowest RMSE with unequal block sizes while SF gives the lowest RMSE with equal block sizes.

Keywords: EOL, RMSE, MI, FIR.

1. INTRODUCTION

Images are the real description of objects. When images are taken with a camera, the camera system imposes some limitations. One of them is the limited depth of focus. Because of this, an image cannot be captured in such a way that all of its objects are well focused. Only the objects within the depth of field of the camera are in focus, and the remaining ones are blurred. To get an image that is well focused everywhere, we need to fuse images taken from the same viewpoint with different focus settings. The term image fusion is used for practical methods of merging images from various sensors to provide a composite image that can be used to better identify natural and man-made objects.
In recent research, various techniques have been used for multi-resolution image fusion and multi-focus image fusion. Li et al. (2001-2002) introduced a method based on the selection of clearer image blocks from the source images [8,9]. In this method, the image is first partitioned into blocks, and a focus measure is then used as the activity level measurement. Based on the activity level, the best image block is selected for the fused image by choosing the block with the maximum activity value. The advantage of this method is that it avoids the shift-variance problem caused by the DWT. Also, according to the analysis of the image block selection method, the implementation is computationally simple and can be used in real time. The limitation of this method is its lack of robustness to noise: it does not perform quite well for noisy images. To overcome this limitation, the image is preprocessed with a low-pass filter. The measure of clarity plays an important role in this kind of fusion method, and a better measure results in superior fusion performance. However, little work has been done on image clarity measures in the field of multi-focus image fusion, whereas image clarity measures, namely focus measures, have been studied in depth in the field of autofocusing. The paper also takes into account the fact that background information lies in the low-frequency component of the image; so, when different focusing parameters are used, the proposed method is able to extract the features of the background information once the image has been passed through a low-pass filter.

This paper is organized as follows. A brief description of focus measures is given in Section 2. The proposed modified technique for obtaining a low-RMSE fused image is discussed in Section 3, and Section 4 presents the results of the proposed method in comparison with existing methods.

2. FOCUS MEASURES

A value that can be used to measure the depth of field from the acquired images can be used as a focus measure.
Depth of field is maximum for the best focused image and generally decreases as the defocus increases.

A typical focus measure satisfies the following requirements:
1. It is independent of image content.
2. It is monotonic with respect to blur.
3. It is unimodal, that is, it has one and only one maximum value.
4. It shows a large variation in value with respect to the degree of blurring.
5. It has minimal computational complexity.
6. It is robust to noise.

The conventional focus measures used to measure the clarity of images are variance, EOG, EOL, and SF. These focus measures are expressed as follows for an M × N image, with f(x, y) the gray level intensity of pixel (x, y).

1. Variance: The simplest focus measure is the variance of the image gray levels. The expression for the M × N image f(x, y) is:

$$\text{variance} = \frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=1}^{N} \left( f(x, y) - \mu \right)^2$$

where µ is the mean value, given as

$$\mu = \frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=1}^{N} f(x, y)$$

2. Energy of image gradient (EOG): This focus measure is computed as:

$$\text{EOG} = \sum_{x=1}^{M-1} \sum_{y=1}^{N-1} \left( f_x^2 + f_y^2 \right)$$

where

$$f_x = f(x+1, y) - f(x, y), \qquad f_y = f(x, y+1) - f(x, y)$$

3. Energy of Laplacian of the image (EOL): The Laplacian operator is used for analyzing the high spatial frequencies associated with image border sharpness.

$$\text{EOL} = \sum_{x=2}^{M-1} \sum_{y=2}^{N-1} \left( f_{xx} + f_{yy} \right)^2$$

where

$$f_{xx} + f_{yy} = -f(x-1, y-1) - 4f(x-1, y) - f(x-1, y+1) - 4f(x, y-1) + 20f(x, y) - 4f(x, y+1) - f(x+1, y-1) - 4f(x+1, y) - f(x+1, y+1)$$

4. Spatial frequency (SF): Strictly speaking, spatial frequency is not a focus measure. It is a modified version of the energy of image gradient (EOG). Spatial frequency is defined as:

$$\text{SF} = \sqrt{RF^2 + CF^2}$$

where RF and CF are the row and column frequencies respectively:

$$RF = \sqrt{\frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=2}^{N} \left( f(x, y) - f(x, y-1) \right)^2}$$

and

$$CF = \sqrt{\frac{1}{M \times N} \sum_{x=2}^{M} \sum_{y=1}^{N} \left( f(x, y) - f(x-1, y) \right)^2}$$

5. Visibility (VI): This focus measure is inspired by the human visual system and is defined as

$$VI = \sum_{m=1}^{M} \sum_{n=1}^{N} \frac{\left| f(m, n) - \mu \right|}{\mu^{\alpha + 1}}$$

where µ is the mean intensity value of the image and α is a visual constant ranging from 0.6 to 0.7.

3. MODIFIED TECHNIQUE FOR LOW RMSE

Most focus measures are based on the idea of emphasizing the high-frequency content of the image and measuring its quantity. This comes from the idea that blurring suppresses high frequencies regardless of the particular point spread function [13]. Considering the performance of the various focus measures, EOL was found to be the best among all [8]. The Laplacian of an image is determined by the second-order derivative of the image. The performance of the second-order derivative decreases if noise is present in the source images, as shown in Fig-1.

[Fig-1. The performance of the second-order derivative in the presence of various degrees of noise. Panels (A)-(D): ramp-edge images and their gray level profiles. Panels (E)-(H): the corresponding second derivatives.]

Fig-1(A) shows the ramp edge profile of an image separating a black region and a white region. The entire transition from black to white represents a single edge. In Fig-1(A) the image is free of noise and its gray level profile is sharp and smooth. Fig-1(B-D) are corrupted by additive Gaussian noise with zero mean and standard deviations of 0.1, 1.0 and 10.0 intensity levels respectively, and their gray level profiles show the noise added on the ramp as ripples. The images in the second column are the second derivatives of the images on the left. Fig-1(E) shows two impulses representing the presence of an edge in the image. Fig-1(F-H) show that, as the noise in the image increases, detecting the impulses becomes difficult, making it nearly impossible to detect the edge in the image. This shows that a focus measure using the second-order derivative also fails to decide on the best focused image in a noisy environment.
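The focus measures listed in Section 2 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: images are assumed to be lists of rows of gray levels, the function names are invented for this example, and the 1-based sums of the text are mapped to 0-based indices.

```python
from math import sqrt


def variance(f):
    """Variance of the gray levels of an M x N image."""
    m, n = len(f), len(f[0])
    mu = sum(map(sum, f)) / (m * n)
    return sum((v - mu) ** 2 for row in f for v in row) / (m * n)


def eog(f):
    """Energy of image gradient: sum of fx^2 + fy^2 over forward differences."""
    m, n = len(f), len(f[0])
    total = 0
    for x in range(m - 1):
        for y in range(n - 1):
            fx = f[x + 1][y] - f[x][y]
            fy = f[x][y + 1] - f[x][y]
            total += fx * fx + fy * fy
    return total


def eol(f):
    """Energy of Laplacian, using the 3 x 3 mask given in the text."""
    m, n = len(f), len(f[0])
    total = 0
    for x in range(1, m - 1):  # interior pixels only
        for y in range(1, n - 1):
            lap = (-f[x - 1][y - 1] - 4 * f[x - 1][y] - f[x - 1][y + 1]
                   - 4 * f[x][y - 1] + 20 * f[x][y] - 4 * f[x][y + 1]
                   - f[x + 1][y - 1] - 4 * f[x + 1][y] - f[x + 1][y + 1])
            total += lap * lap
    return total


def spatial_frequency(f):
    """SF = sqrt(RF^2 + CF^2), from row and column first differences."""
    m, n = len(f), len(f[0])
    rf2 = sum((f[x][y] - f[x][y - 1]) ** 2
              for x in range(m) for y in range(1, n)) / (m * n)
    cf2 = sum((f[x][y] - f[x - 1][y]) ** 2
              for x in range(1, m) for y in range(n)) / (m * n)
    return sqrt(rf2 + cf2)


# A high-contrast test pattern scores higher than a flat image on every measure.
sharp = [[0, 9, 0, 9], [9, 0, 9, 0], [0, 9, 0, 9], [9, 0, 9, 0]]
flat = [[4] * 4 for _ in range(4)]
print(variance(sharp), eog(sharp), eol(sharp))  # 20.25 1458 82944
print(variance(flat), eog(flat), eol(flat))     # 0.0 0 0
```

The flat image stands in for a completely defocused block. All four measures are zero for it and positive for the textured block, which is the behaviour the choose-max rule relies on.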
Thus, to select the best focused image, noise must be removed before the fusion technique is applied. The proposed focusing technique uses a linear-phase 2-D FIR low pass digital filter to remove the noise from the differently focused images. The filter is designed with the Parks-McClellan algorithm [19], [20]. The Parks-McClellan algorithm uses an equiripple approach over sub-bands of the frequency range, together with Chebyshev approximation theory, to design filters with an optimal fit between the desired and actual frequency responses. The filters are optimal in the sense that the maximum error between the desired frequency response and the actual frequency response is minimized. Filters designed this way exhibit an equiripple behavior in their frequency responses and are sometimes called equiripple filters. Because of this equiripple nature, such a filter exhibits discontinuities at the head and tail of its impulse response. These filters are used in the existing fusion algorithm before partitioning the image, as shown in Fig-3. The source images are passed through a 2-D FIR low pass filter of order 4 with the characteristic shown in Fig-2. For these low pass filtered images, conventional focus measures such as variance, energy of gradient, energy of Laplacian and spatial frequency are computed.

Fig.2: Perspective plot of the linear phase 2-D FIR low pass digital filter

Setup for proposed algorithms

A schematic diagram of the proposed image fusion method is shown in Fig-3. The paper proposes a modification for obtaining the best focus measure in a noisy environment by using a filter at Step 2 of the existing algorithm of Li et al. [8]. The fusion method consists of the following steps:

Step 1. Decompose the differently focused source images into blocks. Denote the ith image block of the source images by $A_i$ and $B_i$ respectively.
Step 2. Filter the images through a 2-D FIR low pass filter to remove noise.
Step 3. Compute the focus measure of each block, and denote the results for $A_i$ and $B_i$ by $M_i^A$ and $M_i^B$ respectively.
Step 4. Compare the focus measures of the two corresponding blocks $A_i$ and $B_i$, and construct the ith block $D_i$ of the composite image as

$$D_i = \begin{cases} A_i, & M_i^A > M_i^B \\ B_i, & M_i^B > M_i^A \end{cases}$$

Step 5. Compute the root mean square error (RMSE) of the composite image with respect to a reference image.

Fig.3: Schematic diagram for evaluating the proposed focusing technique in multi-focus image fusion (source images A and B, FIR LPF, partitioned images, activity level measure, combining by choose max, fused image)

4. RESULTS:

The experiment is performed on a toy image of size 512×512. The multifocus images used for fusion are left focused, right focused and middle focused, as shown in Fig 4, 5 and 6 respectively. These multifocus images are filtered through the linear phase 2-D FIR low pass digital filter to reduce high-frequency noise, and the filtered images are then fused using Li's algorithm for various focus measures. The performance of the existing and the modified algorithm is compared quantitatively by calculating the RMSE of the fused images. RMSE is defined as:

$$\text{RMSE} = \sqrt{\frac{\sum_{x=1}^{M} \sum_{y=1}^{N} \left\{ R(x, y) - F(x, y) \right\}^2}{M \times N}}$$

where R and F are the reference image and the composite image respectively, each of size M × N pixels. Table-1 shows the RMSE of the fused images for different focus measures and equal block sizes. Table-2 shows the RMSE of the fused images for unequal block sizes of the source images. Table-1 shows that fused images using SF as the focus measure give the lowest RMSE values, and Table-2 shows that for unequal block sizes EOL performs better than the other clarity measures when used in the modified partition fusion technique. The analysis of Table-1 shows that the RMSE of the fused image decreases as the block size of the sub-image increases only with SF.
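The fusion steps and the RMSE definition above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: images are plain lists of rows, spatial frequency is the focus measure, a 3x3 mean filter stands in for the Parks-McClellan 2-D FIR low pass filter (whose coefficients the paper does not list), whole images are filtered before being split into blocks, fused blocks are copied from the filtered images, and ties between focus measures go to image A. The names `mean_filter_3x3`, `crop` and `fuse` are hypothetical.

```python
import math


def spatial_frequency(block):
    """SF = sqrt(RF^2 + CF^2) for a block given as a list of rows."""
    m, n = len(block), len(block[0])
    rf2 = sum((block[x][y] - block[x][y - 1]) ** 2
              for x in range(m) for y in range(1, n)) / (m * n)
    cf2 = sum((block[x][y] - block[x - 1][y]) ** 2
              for x in range(1, m) for y in range(n)) / (m * n)
    return math.sqrt(rf2 + cf2)


def mean_filter_3x3(image):
    """Stand-in low pass filter (assumption): 3x3 mean with edge replication."""
    m, n = len(image), len(image[0])
    out = []
    for x in range(m):
        row = []
        for y in range(n):
            total = 0.0
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    xx = min(max(x + dx, 0), m - 1)
                    yy = min(max(y + dy, 0), n - 1)
                    total += image[xx][yy]
            row.append(total / 9.0)
        out.append(row)
    return out


def crop(image, x0, y0, bh, bw):
    """Return the bh x bw block whose top-left corner is (x0, y0)."""
    return [row[y0:y0 + bw] for row in image[x0:x0 + bh]]


def fuse(img_a, img_b, bh, bw, measure=spatial_frequency,
         prefilter=mean_filter_3x3):
    """Block-wise choose-max fusion of two equally sized images."""
    fa, fb = prefilter(img_a), prefilter(img_b)
    m, n = len(fa), len(fa[0])
    fused = [[0.0] * n for _ in range(m)]
    for x0 in range(0, m, bh):
        for y0 in range(0, n, bw):
            block_a = crop(fa, x0, y0, bh, bw)
            block_b = crop(fb, x0, y0, bh, bw)
            # Ties go to image A (assumption; Step 4 leaves ties open).
            chosen = block_a if measure(block_a) >= measure(block_b) else block_b
            for dx, row in enumerate(chosen):
                fused[x0 + dx][y0:y0 + len(row)] = row
    return fused


def rmse(reference, image):
    """Root mean square error between two equally sized images."""
    m, n = len(reference), len(reference[0])
    sq = sum((reference[x][y] - image[x][y]) ** 2
             for x in range(m) for y in range(n))
    return math.sqrt(sq / (m * n))
```

A call such as `fuse(a, b, 32, 32)` followed by `rmse(reference, fused)` mirrors Steps 1 to 5 for one block size. Passing a different `measure` function swaps the focus measure without changing the fusion loop.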
Analysis of Table-2 shows that the RMSE of the fused image decreases as the block size of the sub-image increases for all clarity measures, because a larger image block gives more information for measuring the clarity of the block. However, a block size that is too large is undesirable, because a larger block may contain two or more objects at different distances from the camera and will consequently lead to a less clear image. The experimental results in Table-1 and Table-2 show that the proposed method improves performance for all the focus measures, reducing the RMSE to roughly one quarter of that of the existing algorithm.

Visual analysis is shown from Fig-4 to Fig-14. Fig-7 is the reference image, taken with all parts in focus. Fig 8 to Fig 11 show the fused images obtained with different focus measures using the existing partition fusion method. Fig 12 to Fig 14 show the fused images obtained with different focus measures using the proposed modified partition fusion algorithm with a 2-D low pass filter.

5. CONCLUSION:

In this paper a modified method of image fusion was used, considering the ability of various focus measures to distinguish clear image blocks from blurred image blocks. Experimental results show that the modified method, which preprocesses the images with a 2-D FIR low pass filter, performs better in terms of RMSE than the previous methods of information fusion. The results also show that the performance of the image fusion method depends on the block size chosen when partitioning the source images. The experiment shows that EOL gives low RMSE with unequal block sizes, while SF gives low RMSE with equal block sizes. Adaptive methods for choosing the image block size will be investigated in future work.
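The "roughly one quarter" statement can be checked against the published numbers. The sketch below divides each modified-method RMSE by the corresponding existing-method RMSE for the equal block sizes of Table-1. It assumes the values are transcribed correctly from the table, with the modified-method columns read row by row; the dictionary and function names are illustrative.

```python
# RMSE values transcribed from Table-1 (rows: 4x4, 8x8, 16x16, 32x32, 64x64).
existing = {
    "Variance": [4.5814, 4.3658, 4.7037, 4.4221, 4.6588],
    "EOG": [3.9383, 4.0264, 4.7720, 4.6485, 4.6000],
    "EOL": [3.6437, 3.1466, 3.4659, 3.0888, 3.8727],
    "SF": [3.9383, 3.9292, 4.0517, 3.8506, 3.8927],
}
modified = {
    "Variance": [0.9514, 0.9606, 1.1872, 1.2043, 1.2194],
    "EOG": [0.9514, 0.9373, 1.1820, 1.1382, 1.2248],
    "EOL": [0.9301, 0.8686, 1.1561, 1.1531, 1.2013],
    "SF": [0.9514, 0.9373, 0.8827, 0.8949, 0.8744],
}


def rmse_ratios(existing, modified):
    """Ratio modified / existing RMSE per focus measure and block size."""
    return {name: [m / e for m, e in zip(modified[name], existing[name])]
            for name in existing}


ratios = rmse_ratios(existing, modified)
for name, values in ratios.items():
    print(name, [round(v, 3) for v in values])
```

Under this reading of the table, every ratio lies between 0.2 and 0.4, so "one quarter" is a fair summary for most entries and somewhat optimistic for EOL at the larger block sizes.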
Table-1: Evaluation of different focus measures with equal block sizes on the basis of RMSE

Partition fusion method
Block size   Variance   EOG      EOL      SF       VI
4×4          4.5814     3.9383   3.6437   3.9383   4.1383
8×8          4.3658     4.0264   3.1466   3.9292   4.2110
16×16        4.7037     4.7720   3.4659   4.0517   3.9574
32×32        4.4221     4.6485   3.0888   3.8506   3.6183
64×64        4.6588     4.6000   3.8727   3.8927   3.4368

Modified partition fusion method (focus measure computed on LPF images)
Block size   Variance   EOG      EOL      SF
4×4          0.9514     0.9514   0.9301   0.9514
8×8          0.9606     0.9373   0.8686   0.9373
16×16        1.1872     1.1820   1.1561   0.8827
32×32        1.2043     1.1382   1.1531   0.8949
64×64        1.2194     1.2248   1.2013   0.8744

Numbers in bold and italic indicate the lowest RMSE obtained over different block sizes.

Table-2: Evaluation of different focus measures with unequal block sizes on the basis of RMSE

Partition fusion method
Block size   Variance   EOG      EOL      SF       VI
4×8          4.5447     4.0106   3.3199   4.0118   4.2340
8×16         4.4089     4.0035   3.1806   4.0407   4.1160
16×32        4.6329     4.0159   3.8220   3.9351   3.9399
32×64        4.2559     3.9797   3.5020   3.8944   3.5630

Modified partition fusion method (focus measure computed on LPF images)
Block size   Variance   EOG      EOL      SF
4×8          0.9626     0.9626   0.9073   0.9626
8×16         0.9346     0.9284   0.8843   0.9255
16×32        0.9119     0.8923   0.8776   0.8889
32×64        0.9066     0.8893   0.8715   0.8874

Numbers in bold and italic indicate the lowest RMSE obtained over different block sizes.

REFERENCES:
[1] Burt, P.J., Adelson, E.H., 1983. The Laplacian pyramid as a compact image code. IEEE Trans. Commun. 31, 532-540.
[2] Burt, P.J., Kolczynski, R.J., 1993. Enhanced image capture through fusion. In: Proc. 4th Int. Conf. on Computer Vision, Berlin, Germany, pp. 173-182.
[3] Eltoukhy, H.A., Kavusi, S., 2003. A Computationally Efficient Algorithm for Multi-Focus Image Reconstruction. In: Proc. of SPIE Electronic Imaging, pp. 332-341.
[4] Eskicioglu, A.M., Fisher, P.S., 1995. Image quality measures and their performance. IEEE Trans. Commun. 43(12), 2959-2965.
[5] Hill, P., Canagarajah, N., Bull, D., 2002. Image Fusion using Complex Wavelets. In: Proc. 13th British Machine Vision Conf., University of Cardiff, UK, pp. 487-496.
[6] Krotkov, E., 1987. Focusing. Int. J. Comput. Vis. 1, 223-237.
[7] Li, H., Manjunath, B.S., Mitra, S.K., 1995. Multisensor image fusion using wavelet transform. Graph. Models Image Process. 57(3), 235-245.
[8] Li, S., Kwok, J.T., Wang, Y., 2001. Combination of images with diverse focuses using the spatial frequency. Inf. Fusion 2, 169-176.
[9] Li, S., Kwok, J.T., Wang, Y., 2002. Multi focus image fusion using Artificial Neural Networks. Pattern Recognit. Lett. 23, 985-997.
[10] Ligthart, G., Groen, F., 1982. A Comparison of different Autofocus Algorithms. In: Proc. Int. Conf. on Pattern Recognition, pp. 597-600.
[11] Miao, Q., Wang, B., 2005. A Novel Adaptive Multi-focus Image Fusion Algorithm Based on PCNN and sharpness. In: Proc. of SPIE, Vol. 5778, pp. 704-712.
[12] Nayar, S.K., Nakagawa, Y., 1994. Shape from focus. IEEE Trans. Pattern Anal. Mach. Intell. 16(8), 824-831.
[13] Subbarao, M., Choi, T., Nikzad, A., 1992. Focusing Techniques. In: Proc. SPIE Int. Soc. Opt. Eng., 163-174.
[14] Toet, A., Van Ruyven, L.J., Valeton, J.M., 1989. Merging thermal and visual images by a contrast pyramid. Opt. Eng. 28(7), 789-792.
[15] Unser, M., 1995. Texture classification and segmentation using wavelet frames. IEEE Trans. Image Process. 4(11), 1549-1560.
[16] Yeo, T., Ong, S., Jayasooriah, S.R., 1993. Autofocussing for tissue microscopy. Image Vision Comput. 11, 629-639.
[17] Huang, W., Jing, Z., 2006. Evaluation of Focus Measures in Multi-focus image fusion. Pattern Recognit. Lett. 28 (2007), 493-500.

Fig. 4: Left focused image
Fig. 5: Right focused image
Fig. 6: Middle focused image
Fig. 7: All focused image (reference image)
Fig. 8: Fused images formed from variance
Fig. 9: Fused images formed from EOG (16×32)
Fig. 10: Fused images formed from EOL (32×64)
Fig. 11: Fused images formed from SF (32×64)
Fig. 12: Fused images formed from LPF and SF (32×32)
Fig. 13: Fused images formed from LPF and EOG (16×32)
Fig. 14: Fused images formed from LPF and EOL (64×64)

INTEGRATING BIOMEDICAL ONTOLOGIES – OBR-SCOLIO ONTOLOGY

Vanja Luković, Information technology
Danijela Milošević, Information technology
Goran Devedžić, Information technology

ABSTRACT
This paper analyses a broad scope of research papers dealing with the process of integrating biomedical ontologies with the FMA reference ontology. Specifically, we investigate whether this process can be applied in developing the OBR-Scolio application ontology for the pathology domain of the spine, more precisely the scoliosis domain. Such an ontology is one of the many objectives in the realization of the project named “Ontological modeling in bioengineering” in the domain of orthopedics and physical medicine.

Keywords: Biomedical ontology, Formal ontology, Reference ontology, Application ontology

1 INTRODUCTION
Biomedical ontologies are being developed in ever growing numbers, but too little attention is paid to ontology alignment and integration, which would allow results already obtained by one terminology-based application ontology to be reused in other similar application ontologies. Horizontal integration between two application ontologies does not by itself yield scientific advance; vertical integration between ontologies in all categories is also needed [1]. In this way formal, top-level ontologies should provide the validated framework for reference ontologies, which represent the domains of reality studied by the basic biomedical sciences. The latter should then in turn provide the scientifically tested framework for a variety of terminology-based ontologies developed for specific application purposes.
In this paper, following [1], we describe how the vertical integration of the FMA (Foundational Model of Anatomy) reference ontology [5] with the BFO (Basic Formal Ontology) top-level ontology [3] can support the horizontal integration of two reference ontologies, PRO (Physiology Reference Ontology) [8] and PathRO (Pathology Reference Ontology), thereby forming the new reference ontology OBR (Ontology of Biomedical Reality). OBR is therefore a federation of three independent reference ontologies, which range over the domains of anatomy, physiology and pathology. Moreover, following [9], we describe the vertical integration of the RadLex radiology terminology with the FMA reference ontology, which yields the FMA-RadLex application ontology for radiology. The described process is then used to form the OBR-Scolio application ontology for the pathology domain of the spine (scoliosis domain) from the OBR reference ontology.

2 BFO ONTOLOGY
BFO [3] is a formal, top-level ontology based on tested principles for ontology construction. BFO consists of the SPAN and SNAP ontologies. The SPAN ontology relates to occurrents, that is, process entities (events, actions, procedures, happenings) which unfold over an interval of time. The complementary SNAP ontology relates to continuants, the participants in such processes, which are entities that endure through time during the period of their existence. Anatomy is a science that studies biological continuants, while physiology studies biological occurrents. Pathology, on the other hand, is concerned with structural alterations of biological continuants and with perturbations of biological occurrents, which together are manifested as diseases. Moreover, BFO also distinguishes between instances and universals and specifies the relations which link them.

3. FMA ONTOLOGY
The FMA [5] is a reference ontology for anatomy which, according to independent evaluations, satisfies the fundamental requirements for ontological representation of human anatomy [6, 7]. The domain of the FMA is the anatomy of the idealized human body. The FMA uses a hierarchy of classes of anatomical entities (anatomical universals) which exist in reality through their instances. The root of the FMA's anatomy taxonomy (AT) is Anatomical entity and its dominant class is Anatomical structure. Anatomical structure is defined as a material entity which has its own inherent 3D shape and which has been generated by the coordinated expression of the organism's own structural genes. This class includes material objects that range in size and complexity from biological macromolecules to whole organisms. The dominant role of Anatomical structure is reflected by the fact that non-material physical anatomical entities (spaces, surfaces, lines and points) and body are conceptualized in the FMA in terms of their relationship to anatomical structures.

4. OBR ONTOLOGY

Figure 1. Ontology of Biomedical Reality OBR

The root of OBR is the universal Biological entity (Fig. 1). A distinction is then drawn between the classes Biological continuant and Biological occurrent, the definitions of which are inherited from BFO [3]. The class Biological continuant is subdivided into the classes Organismal continuant, which includes entities that range over single organisms and their parts, and Extra-organismal continuant, which includes entities that range over aggregates of organisms. Accordingly, the class Biological occurrent is subdivided into the classes Organismal occurrent and Extra-organismal occurrent, which include processes associated with single organisms and their parts and processes associated with aggregates of organisms, respectively.
The class Organismal continuant is subdivided into the classes Independent organismal continuant and Dependent organismal continuant. Extrapolating from the FMA's principles, Independent organismal continuants have mass and are material, whereas Dependent organismal continuants are immaterial and do not have mass. The OBR ontology distinguishes anatomical (normal) from pathological (abnormal) material entities. Accordingly, the class Independent organismal continuant is subdivided into the classes Material anatomical entity and Material pathological entity. The class Material anatomical entity is subdivided into the classes Anatomical structure and Portion of canonical body substance, on the basis of the possession or non-possession of inherent 3D shape. Within the class Anatomical structure, the OBR ontology distinguishes canonical anatomical structures, which exist in the idealized organism, from variant anatomical structures, which result from an altered expression pattern of normal structural genes without health-related consequences for the organism. The class Material pathological entity is likewise subdivided into the classes Pathological structure and Portion of pathological body substance, on the basis of the possession or non-possession of inherent 3D shape. Pathological structures result from an altered expression pattern of normal structural genes, with negative health consequences for the organism. The class Dependent organismal continuant is subdivided into the classes Immaterial anatomical continuant, Immaterial pathological continuant and Physiological continuant. Immaterial anatomical and pathological spaces and surfaces, and anatomical lines and points, are dependent continuants because their existence depends on corresponding independent continuant entities.
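The part of the OBR class hierarchy described so far can be written down as a small data structure. The sketch below is only an illustration, not an official OBR serialization: it stores each is_a link as a child-to-parent entry in a Python dictionary, using the class names from the text, and the helper names `ancestors` and `is_a` are hypothetical.

```python
# Child -> parent (is_a) links transcribed from the OBR description in the text.
OBR_PARENT = {
    "Biological continuant": "Biological entity",
    "Biological occurrent": "Biological entity",
    "Organismal continuant": "Biological continuant",
    "Extra-organismal continuant": "Biological continuant",
    "Organismal occurrent": "Biological occurrent",
    "Extra-organismal occurrent": "Biological occurrent",
    "Independent organismal continuant": "Organismal continuant",
    "Dependent organismal continuant": "Organismal continuant",
    "Material anatomical entity": "Independent organismal continuant",
    "Material pathological entity": "Independent organismal continuant",
    "Anatomical structure": "Material anatomical entity",
    "Portion of canonical body substance": "Material anatomical entity",
    "Pathological structure": "Material pathological entity",
    "Portion of pathological body substance": "Material pathological entity",
    "Immaterial anatomical continuant": "Dependent organismal continuant",
    "Immaterial pathological continuant": "Dependent organismal continuant",
    "Physiological continuant": "Dependent organismal continuant",
}


def ancestors(cls, parent=OBR_PARENT):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in parent:
        cls = parent[cls]
        chain.append(cls)
    return chain


def is_a(cls, ancestor, parent=OBR_PARENT):
    """True if cls equals ancestor or inherits from it through is_a links."""
    return cls == ancestor or ancestor in ancestors(cls, parent)


if __name__ == "__main__":
    print(" -> ".join(["Pathological structure"] + ancestors("Pathological structure")))
```

A single-parent dictionary is enough here because the text describes a strict tree; an ontology with multiple inheritance would need a mapping from each class to a set of parents.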
Besides these, the classes Function, Physiological state and Physiological role, and the classes Malfunction, Pathological state and Pathological role, also belong to Dependent organismal continuant, because their entities do not exist without corresponding independent continuant entities. Functions are certain sorts of potentials of independent anatomical continuants for engagement and participation in one or more processes through which the potential becomes realized. A function is a continuant, since it endures through time and exists even during those times when it is not being realized. Whether or not a function becomes realized depends on the physiological or pathological state of the associated independent anatomical continuant. Here, a physiological or pathological state is a certain enduring constellation of values of an independent continuant's aggregate physical properties. These physical properties are represented in the Ontology of Physical Attributes (OPA), which provides the values for the physical properties of organismal continuants. The states of these continuants can thus be specified in terms of specific ranges of attribute values. The independent continuants that participate in a physiological or pathological process may play different roles in the process (e.g. as agent, cofactor, catalyst, etc.). Such a process may transform one state into another (for example a physiological state into another physiological state, or into a pathological state). The class Organismal occurrent is subdivided into the classes Physiological process and Pathological process. A physiological process causes the transformation of one physiological state into another physiological state, whereas a pathological process causes the transformation of a physiological state into a pathological state, or of one pathological state into another pathological state.
The relative balance of these processes results either in the maintenance of health or in the pathogenesis of material pathological entities, and thus in the establishment and progression of diseases. The transformation of a pathological state into a physiological state, manifest as healing or recovery from a disease, comes about through physiological processes that successfully compete with and ultimately replace pathological processes, so that function is restored. Processes are extended not only in time but also in space, by virtue of the nature of their participants.

5. RADLEX TERMINOLOGY
The Radiological Society of North America (RSNA) developed a publicly available terminology, RadLex [12], to provide a uniform standard for all radiology-related information. The RadLex terminology is organized into a hierarchy (Fig. 2) and subsumes over 7400 terms organized in 9 main categories or types, with RadLex term as the root. However, the RadLex terminology does not yet have a principled ontological framework [14], for these three reasons:
1) being term-oriented, RadLex currently ignores the entities to which its terms project;
2) the lack of a taxonomy grounded in biomedical reality;
3) the ambiguity and mixing of relations (such as is_a, part_of, contained_in) represented by the links between the nodes of the term hierarchy (Fig. 2).

Figure 2. RadLex hierarchy in Protégé

The next section describes, following [9], how a portion of a reference ontology such as the FMA can be adapted to yield an application ontology in which all the challenges mentioned above are resolved.

6. DERIVATION OF THE FMA-RADLEX APPLICATION ONTOLOGY

Figure 3: FMA-RadLex (right) derived from the FMA (left)

Terms relating to anatomy are represented in the RadLex terminology category Anatomic location, which corresponds to the category Anatomical entity used by other disciplines of biomedicine. This is not a radiology image entity, but an entity that exists in reality. Anatomic location is therefore renamed to the FMA root term Anatomical entity (Fig. 3). For the image findings, which represent radiology image entities, a separate ontology should be created. An application ontology can be derived from the FMA in either of two ways:
1. Obtaining an entire copy of the FMA and pruning the ontology down to the required specifications (de novo construction).
2. Mapping the existing terminology project to the FMA, carving out the ontology around the mappings and finally incorporating the derivatives into the existing terminology project.

The latter method was applied in constructing the anatomy application ontology for RadLex [9]. High level RadLex terms are first mapped to the corresponding FMA terms, and then their corresponding FMA super-types are imported into the RadLex taxonomy. After that, other terms at different levels of the RadLex tree are mapped to the corresponding FMA terms, and then their corresponding FMA super-types are imported into the RadLex taxonomy super-types. The highest-level parents of the imported FMA super-types are incorporated into the RadLex anatomy taxonomy as well: Anatomical structure, which subsumes 3-D objects that have inherent shape, e.g. body, organ system and organ, and Immaterial anatomical entity, which encompasses types that have no mass property, such as anatomical space, anatomical surface, anatomical line and anatomical point.

Hence it can be concluded that constructing the same ontology via the de novo approach would involve a series of deletions and additions of links (Figure 3, left) in the FMA reference ontology. For example, the is_a link of the class Anatomical structure is deleted from Material anatomical entity and then added directly to Anatomical entity. Both Physical anatomical entity and Material anatomical entity are then deleted from the FMA taxonomy.
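The link operations of the de novo approach can be illustrated on a toy taxonomy. The sketch below is a hedged example, not the FMA itself: the fragment `fma` is a hypothetical handful of classes, the placement of Immaterial anatomical entity under Physical anatomical entity and of Organ under Anatomical structure is an assumption made for the example, and `delete_class` re-attaches any remaining children to the deleted class's parent, which is one possible design choice rather than something the text prescribes.

```python
def reparent(parent, cls, new_parent):
    """Delete the is_a link of cls and add a new one pointing to new_parent."""
    parent[cls] = new_parent


def delete_class(parent, cls):
    """Remove cls from the taxonomy.

    Children still pointing at cls are attached to its former parent
    (assumption: the text does not say how orphans are handled).
    """
    grandparent = parent.pop(cls)
    for child, p in parent.items():
        if p == cls:
            parent[child] = grandparent


# Hypothetical fragment of the upper FMA taxonomy (child -> parent).
fma = {
    "Physical anatomical entity": "Anatomical entity",
    "Material anatomical entity": "Physical anatomical entity",
    "Immaterial anatomical entity": "Physical anatomical entity",
    "Anatomical structure": "Material anatomical entity",
    "Organ": "Anatomical structure",
}

# The operations named in the text, in order.
reparent(fma, "Anatomical structure", "Anatomical entity")
delete_class(fma, "Material anatomical entity")
delete_class(fma, "Physical anatomical entity")

for child, p in sorted(fma.items()):
    print(f"{child} is_a {p}")
```

After these three operations, Anatomical structure and Immaterial anatomical entity both sit directly under Anatomical entity, which matches the top of the RadLex anatomy taxonomy described above.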
In addition, FMA types representing microscopic entities that are not relevant to radiology, such as Cell, Cardinal cell part, Biological macromolecule and Cardinal tissue part, are also deleted from Anatomical structure. These operations can be carried out at all levels of the hierarchical tree.

7. DERIVATION OF THE OBR-SCOLIO APPLICATION ONTOLOGY
In constructing the OBR-Scolio application ontology for the pathology domain of the spine (scoliosis domain) from the OBR reference ontology, the de novo method was applied (Figure 4). All the classes which are not relevant to the pathology domain of the spine, such as Material anatomical entity, Immaterial anatomical continuant, Physiological continuant, Extra-organismal continuant and Biological occurrent, together with their subclasses, are deleted from the hierarchical tree of the OBR reference ontology.

Figure 4: OBR-Scolio application ontology derived from the OBR reference ontology

The hierarchical tree of the OBR ontology class Pathological structure and of its subclasses Subdivision of pathological organ system, Subdivision of pathological skeletal system and Subdivision of pathological axial skeletal system, from which all subclasses not relevant to the pathological domain of the spine have been deleted, is illustrated in Fig. 5, Fig. 6, Fig. 7 and Fig. 8.

Figure 5: Subclasses of the Pathological structure class
Figure 6: Subclasses of the Subdivision of pathological organ system class
Figure 7: Subclasses of the Subdivision of pathological skeletal system class
Figure 8: Subclasses of the Subdivision of pathological axial skeletal system class

Finally, Fig. 9 illustrates all subclasses of the Pathological vertebral column class.

Figure 9: Subclasses of the Subdivision of pathological vertebral column class

ACKNOWLEDGEMENT
The vertical integration of the FMA reference ontology with the BFO top-level ontology supports the horizontal integration of the two reference ontologies PRO and PathRO, thereby forming the new reference ontology OBR, which ranges over the domains of anatomy, physiology and pathology. This ontology can be successfully applied in the development of the OBR-Scolio application ontology in the pathology domain of the spine (scoliosis), which is one of the many objectives in the realization of the project named “Ontological modeling in bioengineering”¹ in the domain of orthopedics and physical medicine.

¹ “Ontological modeling in bioengineering”, project funded by the national Ministry of Science, Faculty of Mechanical Engineering, University of Kragujevac, Serbia (2008-2010)

8. REFERENCES
[1] Cornelius Rosse, MD, DSc, Anand Kumar, MD, PhD, Jose LV Mejino Jr, MD, Daniel L Cook, MD, PhD, Landon T Detwiler, Barry Smith, PhD: A Strategy for Improving and Integrating Biomedical Ontologies, AMIA Annu Symp Proc. (2005), pp. 639-643.
[2] http://www.loucnr.it/DOLCE.html
[3] Grenon P, Smith B, Goldberg L: Biodynamic ontology: applying BFO in the biomedical domain, In DM Pisanelli (ed.), Ontologies in Medicine, Amsterdam: IOS Press, 2004, 20-38.
[4] Open Biomedical Ontologies: http://obo.sourceforge.net/
[5] Rosse C, Mejino JLV Jr.: A reference ontology for biomedical informatics: the Foundational Model of Anatomy, J Biomed Inform, 2003 Dec; 36(6):478-500.
[6] Smith B, Köhler J, Kumar A.: On the application of formal principles to life science data: A case study in the Gene Ontology, DILS 2004: Data Integration in the Life Sciences. 2004; 124-139.
[7] Zhang S, Bodenreider O.: Law and order: Assessing and enforcing compliance with ontological modelling principles, Computers in Biology and Medicine 2005: in press.
[8] Cook DL, Mejino JLV, Rosse C.: Evolution of a Foundational Model of Physiology: Symbolic representation for functional bioinformatics, Medinfo 2004; 2004:336-340.
[9] Jose L.V. Mejino Jr, Daniel L. Rubin, and James F. Brinkley: FMA-RadLex: An Application Ontology of Radiological Anatomy derived from the Foundational Model of Anatomy Reference Ontology, Proceedings AMIA Symposium 2008: Page 465-469.
[10] Rosse C, Mejino JLV 2007: The Foundational Model of Anatomy Ontology, in: Burger A, Davidson D, Baldock R. (eds.), Anatomy Ontologies for Bioinformatics: Principles and Practice, pp 59-117, New York: Springer.
[11] Online available at: http://www.rsna.org/radlex.
[12] Langlotz CP: RadLex: a new method for indexing online educational materials, Radiographics 2006; 26:1595–1597.
[13] Rubin DL 2007: Creating and curating a terminology for Radiology: Ontology Modeling and Analysis, J Digit Imaging.
[14] Marwede D, Fielding M and Kahn T. 2007: RadiO: A Prototype Application Ontology for Radiology Reporting Tasks, Proc AMIA 2007, Chicago, IL, pp 513-517.
[15] FMA. Online available at: http://fma.biostr.washington.edu.