bookkeeping software

European Laboratory for Particle Physics
Laboratoire Européen pour la Physique des Particules
CH-1211 Genève 23 - Suisse

Atlas Offline Software
Application Metadata
Requirements

Document Version: 1
Document Date: 18 July 2002 12:35
Document Status: DRAFT
Document Author: Solveig ALBRAND, Jerome FULACHIER
Version/Issue: 1/1

Abstract

The functions which will be provided for the user interfaces of the Atlas Offline Software Application Metadata catalogue are described.

Table 1 Document Change Record

Title: Atlas Offline Software Application Metadata Requirements
ID: [Document ID]
Version: 1
Date: 2002-07-16
Originator: S. Albrand
Approved By:

Page    Paragraph    Reason for Change

1 Introduction

This document is a list of the requirements for an Atlas Application Metadata Catalogue, and the interfaces used to access the catalogue.

An application metadata base or metadata catalogue is also known as a bookkeeping database. It could also be called a data “warehouse” because in practice, several different databases will be manipulated, and the evolution of their structure must be managed.

Its purpose is to

• contain a logical description of the data produced in the various processing steps which may be necessary in the analysis of physics data. This data may be simulated data (Monte Carlo data) or come from real detector output (raw data).
• provide a set of user interfaces to access and to manage the metadata.

1.1 Purpose of this document

This document attempts to list a complete set of user requirements, in the sense that all user requirements should be included, even if some requirements cannot be completely defined.

User requirements should be testable. The list of user requirements can be considered as a list of tests which can be used to accept or reject a design and implementation of an application. The requirements list is used to construct the design of the application.
No details of design or implementation are discussed in the present document. The requirements document is not based on, or related to, an existing system.

1.2 Structure of the document

Section 2 contains a definition of the terms used in the document, and a list of documents used in its preparation. Section 3 discusses the context, constraints and dependencies of the application. Section 4 lists the specific constraints, assumptions and dependencies, and section 5 lists use cases and requirements.

TBD Some items are followed by a paragraph with this format, which marks an item to be discussed.

2 Definition of terms. References

No Atlas-wide reference document currently exists which defines the terms which should be used to describe the entities which the bookkeeping application must manage. This situation leads to ambiguity, as each application is obliged to define its own terms. Below is the glossary of terms which are used in the present document.

2.1 Glossary

event: The ensemble of data for a particular beam crossing, or a subset of this data. Event data may be “real”, directly recorded from the detector for a particular set of trigger conditions, or simulated, using Monte Carlo techniques.
eventID: A tag, which could just be an integer, which defines the event, either within the dataset, or uniquely within all Atlas.
dataset: A collection of events.
datasetNumber: A tag, which could just be an integer, which is assigned to a dataset. We will use this term to mean that part of the total identification of data which is given to the dataset at its creation. This means that for real data, the datasetNumber is the same as the run number assigned by the DAQ.
partition: A file which contains a part of a dataset. Datasets have to be divided into partitions because of file size limitations.
partitionNumber: An integer, from 1 to N, where N is the number of partitions created for a given dataset.
project: A set of datasets which have been created with the same physics or computing purpose. Each project has a project name, for example “dc0”, “dc1”.
processingStep: A dataset, once created, may undergo a sequence of different processes. We refer to each process in the sequence as a processingStep. Each processingStep has a name, for example “simul”. Different projects may choose to define different sets of processing steps. A processing step maps to a particular algorithm or sequence of algorithms.
passNumber: A dataset may undergo the same processingStep several times with different parameters. The passNumber distinguishes these passes.
datasetID: A datasetID (or dataset name) is a combination of other terms which is unique within Atlas. For example datasetNumber.processingStep.passNumber.
attribute: An attribute is a named property of a dataset or a partition. Each project and processing step pair is associated with a set of attributes, and the set of relations between these attributes.
logical file name: A tag which completely identifies a partition. It must be unique within the Atlas collaboration. It consists of, at least, the dataset name and the partition number.

2.1.1 Acronyms and abbreviations

AM: Application Metadata
AMB: Application Metadata Base
MC: Metadata Catalogue (synonym for AMB)
AMI: Application Metadata Interface
LFN: Logical File Name

2.2 References

[1] Use cases for LAr Bookkeeping. 200-06-19. http://a.home.cern.ch/a/albrand/www/bookkeeping/index.html
[2] Logical File Names for DC0. http://atlasinfo.cern.ch/Atlas/GROUPS/SOFTWARE/DC/doc/LogicalFileNamesforDC0.pdf
[3] Application Metadata Base for DC0. 2001-11-13. http://a.home.cern.ch/a/albrand/www/AMBforDC0.pdf
[4] Hybrid Event Store. Ed.
David Adams. 2002-02-28. http://www.usatlas.bnl.gov/~dladams/hybrid/hybrid.pdf
[5] Replica Selection in the Globus Data Grid. S. Vazhkudai et al. Proceedings of the 1st IEEE/ACM International Conference on Cluster Computing and the Grid. IEEE Computer Society Press, May 2001. http://www.globus.org/research/papers/repsel.pdf
[6] Job Configuration, Data Production, Bookkeeping. LHCb data management working group. 2001-11-20. http://lhcb-comp.web.cern.ch/lhcb-comp/Frameworks/DataManagement/Documents/Use_Cases_and_Requirements.pdf
[7] ATLAS TDAQ/DCS Online Software, Online Bookkeeper Requirements. A. Amorim et al. 2002-02-21.
[8] Virtual data catalogue. P. Nevski. Talk given at Atlas Software week, 2002-05-30. http://doc.cern.ch/archive/electronic/other/agenda/a02248/a02248s10t2/transparencies/atlasVDC.pdf
[9] EUDG WP1: L&B Advanced Queries Extensions. Ales KRENEK, Ludek MATYSKA, Zdenek SALVET. http://edmsoraweb.cern.ch:8001/cedar/doc.info?document_id=345842&version=1.7&p_tab=
[10] The GriPhyN Virtual Data System: Technical Report GriPhyN-2002-02. Jens-S. Vöckler, Mike Wilde, Ian Foster. http://www.griphyn.org/documents/document_server/uploaded_documents/doc--151--VDS1.V8.020118.pdf
[11] The Raw Data Flow in Atlas. Atlas EDM Group. 2002-06-01. http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO/architecture/EventDataModel/RawDataFlow.pdf
[12] Athena framework and Grid Architecture. C.E. Tull. Talk given at Atlas Software week, 2002-05-30. http://documents.cern.ch/cgi-bin/setlink?base=agenda&categ=a02248&id=a02248s16t3/transparencies

3 General Description; context, constraints, assumptions and dependencies.

In this section we give a general description of the application and of the external systems with which the application must interact.

3.1 Context

The system described in this document is to be applied to Atlas offline software. It must therefore be compliant with the general architecture of Atlas offline software where necessary. Compliance means conforming to interfaces defined in the Athena framework, and to the conventions and definition of terms adopted by the collaboration in the Event Data Model, or in the database components (references [11], [12]).

Since Atlas offline software itself aims to be “grid-capable”, Atlas Bookkeeping must also be grid-capable.

The application described in this document applies particularly to Monte Carlo simulation but it should be adaptable to real data. In particular, it should be able to communicate in the future with the on line bookkeeping [7].

3.2 Required capabilities of the system

The bookkeeping application is a database of application metadata. This means that it provides a way of storing data about data. The real physics data consists of large binary files written on mass storage devices which are relatively slow to access. A cataloguing system should provide a rapid way of determining the physics content of a data file.

The metadata base must be reliable, robust and secure.

The bookkeeping application must provide mechanisms for input and output of application metadata, with interfaces adapted to each different group of users. The most important of these interfaces is perhaps that which permits the users to query the catalogue using diverse search criteria.

The potential users of the system are widely distributed geographically, so all functions must also be available in a distributed manner.

It is not possible to know at the outset the exact set of attributes which will be needed to describe the physics data throughout the lifetime of the application.
Therefore any implementation must be flexible, and able to evolve gracefully.

The desired functionality can be divided into four groups:

1. Database management.
2. Structure - obtaining information about the state of the catalogue.
3. Input - inserting and updating information in the catalogue database.
4. Output - querying the catalogue.

3.3 General constraints

A constraint is something that affects the way in which requirements are met. It imposes restrictions on the design of the system that do not affect the external behaviour of the system, but must be fulfilled to meet technical or project obligations. In this section we must consider such factors as time, money, technology, and interaction with already existing systems.

3.3.1 Time

The development of the application should keep pace with the general development of Atlas offline software. In particular, the bookkeeping application should always be able to meet the needs of the Atlas data challenges.

3.3.2 Money and manpower

Since software projects in particle physics are in general not richly endowed with either money or manpower, the application should make use of low cost software components wherever feasible.

The design should take into account the limitations of manpower available for the project. This implies that existing tools should be reused where possible. It also implies that all stages of the project shall be well documented so that maximum use can be made of collaborators who can participate only for a limited time.

3.3.3 Technology

The choice of technology is determined by five factors.

• The technology must be adapted to the requirements.
• As in all software projects, it is dangerous to become too dependent on a particular technology, as available technologies are in rapid evolution.
• Since manpower is limited, it is useful to choose technology according to the competence and experience of those who work on the project.
• The large number of potential users, and their geographic distribution, implies that any technology chosen must be fairly ubiquitous.
• Certain sites may impose a particular technology, or configuration.

3.3.4 Interaction with existing systems

Interaction with the Atlas Athena framework implies that the bookkeeping shall provide a C++ interface which complies with the Athena “Service” architecture.

Interaction with the datagrid implies that the application must be aware of the datagrid architecture. The application metadata catalogue is not a part of the grid; it contains information which has no relevance to the grid mechanisms. However some parts of the grid may need to query the catalogue. These are the replica selection service (reference [5]), and the virtual data system (references [8] and [10]). The job submission components of the grid may be involved in input of data to the application metadata catalogue (references [9] and [12]).

Interaction with datagrid tools will imply that certain specific grid compliant interfaces must be provided.

3.4 General Assumptions and Dependencies

In establishing a list of requirements we may be obliged to make some assumptions about the external systems with which the bookkeeping application shall interact. These assumptions may become constraints on the external system. The application may be dependent on external systems, for example to supply input to the application in a specific way.

The biggest assumptions concern the Atlas offline collaboration itself. The bookkeeping application is dependent on clear definitions of the entities which it must manage. It is unthinkable for the bookkeeping application to define the way in which an event is identified, or the algorithms which can be used in a particular processing step.
On the other hand the bookkeeping cannot efficiently manage metadata unless these clear definitions exist.

Efficient searching mechanisms rely on an organization of the data to be searched. It will be necessary to establish some constraints for users. For example, after detailed requirements gathering, the bookkeeping may establish valid sets of values for a dataset attribute.

For the offline bookkeeping application we need make no assumptions about interactions with hardware components. For example we have no need to consider interactions with messages from DAQ crates.

We assume that interfaces with other software components can always be defined using standard and agreed formats, such as CSV or XML. In the specific case of datagrid replica managers, the interface will be based on the logical file name, which is by definition unique.

3.5 Users

3.5.1 Database Administrators

A group of 2 or 3 people who have complete access to all tables.

3.5.2 Project Managers

Project managers work with database administrators to ensure that the correct schema is available for their project. This involves the definition of processing steps to be used by the project, and establishing the set of attributes which must be catalogued for each step. Project managers may also pre-populate the catalogue, as a way of informing site production physicists which work is assigned to their group.

3.5.3 Site Production Managers

Each site should have at least one person who has the power to edit and delete information in a subset of the bookkeeping catalogue. The site production manager is responsible for ensuring that the correct metadata from his site is uploaded to the AMB.

3.5.4 Physicists

Physicists can query the database using any of the interfaces provided. They may have write access on a subset of the bookkeeping catalogue.

3.5.5 Framework and Grid Components

These are processes which may query the databases. We expect that framework components will use the C++ interface, whereas Grid components may require some special interface development. Some Grid components may provide information to be input to the AMB.

4 Specific Constraints, Assumptions and Dependencies

In this section we give numbered lists of specific items. Each item should be unambiguous.

4.1 Constraints

These are interactions with already existing systems. They impose restrictions which the AM implementation is not free to alter.

4.1.1 Dataset Identification

CO01 The AMB will conform to the dataset identification scheme decided by the Atlas Collaboration.

4.1.2 Event Numbering Scheme

CO02 The AMB will conform to the event numbering scheme decided by the Atlas Collaboration.

4.1.3 Logical File Names

CO03 The AMB will conform to the Logical File Name scheme decided by the Atlas Collaboration.

4.1.4 Job submission

CO04 The AMB Interface must provide a mechanism for both job submission scripts and GUI programs to input and output application metadata.

Job submission may entail using the AMI to query the AMB to obtain suitable input to a new job. It is also this stage which will permit the definition of a new dataset. Job submission may be by using a “classic” batch script, a special “Grid aware” job submission script such as the EUDG WP1 Job Description Language, or by a GUI program which is not yet defined. This means that the AMI must be ready to support several I/O formats.

CO05 The AMI must support the interfaces required by grid job submission tools which wish to input information to the catalogue.
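
As an illustration of what "several I/O formats" in CO04 could mean in practice, the following minimal Python sketch reads the same metadata record from CSV and from XML into one common in-memory form. The function names, the XML layout (a `datasets` element containing `dataset` elements with attributes) and the sample values are assumptions made for this example only; no format has been decided by this document.

```python
import csv
import io
import xml.etree.ElementTree as ET


def records_from_csv(text):
    """Parse CSV text (header row plus data rows) into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def records_from_xml(text):
    """Parse XML of the assumed form <datasets><dataset .../></datasets>."""
    root = ET.fromstring(text)
    return [dict(elem.attrib) for elem in root.iter("dataset")]


def records_to_csv(records, fieldnames):
    """Write records back out as CSV text, for a user who prefers plain text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


# Hypothetical sample input: the same record in both formats.
CSV_INPUT = "datasetNumber,processingStep,passNumber\n2001,simul,1\n"
XML_INPUT = (
    "<datasets>"
    '<dataset datasetNumber="2001" processingStep="simul" passNumber="1"/>'
    "</datasets>"
)

csv_records = records_from_csv(CSV_INPUT)
xml_records = records_from_xml(XML_INPUT)
```

Both parsers return the same list of attribute dictionaries, so whatever sits behind the interface would not need to know which format a job submission script or GUI used.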

4.1.5 Grid resource brokers, replica services and catalogues.

CO06 The AMI must provide a mechanism for the communication of a Logical File Name, or a list of Logical File Names, resulting from a query based on attributes of a dataset.

4.1.6 Grid security.

CO07 The AMI must conform to the security requirements of the grid architecture.

4.2 Assumptions

AS01 The dataset number will be allocated by a separate mechanism, and known to the user before any information is added to the AM.

We anticipate that physicists responsible for the Monte Carlo data generation will obtain a dataset number, or a set of dataset numbers, from a specific server, or from a person designated by ATLAS who will distribute dataset numbers. In the case of real data, the dataset number corresponds to the run number allocated by the DAQ.

The dataset number will stay with the dataset through all the processing steps which the dataset passes. It could be used to tag the events which belong to the dataset.

TBD Will users want to merge events from different datasets into a new dataset? If so will a new number be given to this new dataset? How? Perhaps datasets formed in this way will be collections of event tags, i.e. references to events, and not events themselves. See section 10 of reference [4].

AS02 The datasetID is unique for all Atlas production.

The datasetID consists of several parts. One of the parts is the datasetNumber. The datasetID will be unique within Atlas. It could be used as part of the logical file name.

AS03 We assume that the events within a dataset are numbered consecutively, and that the eventID is unique only within the dataset.

An alternative would be that every event in Atlas has a unique eventID, for example a time stamp.

TBD Can we assume that the first event generated in a dataset will always be numbered “1”?

AS03 An event is uniquely identified by two tags: the dataset name of the event collection to which it belongs, and the eventID.

AS04 Logical File Names will be unique by construction.

Nevertheless, since the AM database is able to check the uniqueness of LFNs, a mechanism should be provided to do it.

AS05 There will be a published schema for Atlas LFN.

The schema will evolve as we progress through the different Atlas data challenges. There is a schema published for DC0 [2].

AS06 In the absence of a clear understanding of the details of interactions with Grid tools, we will assume that LFNs are constructed by the job submission mechanism, which will inform both the AMB and the Grid Replica catalogues.

An alternative would be that the LFN is attributed by the Replica catalogue itself, in which case it would follow global Grid rules, and not be a specifically Atlas defined name.

AS07 We assume that the Grid Virtual Data Service will not interact directly with the metadata catalogue. Communication will be through the replica service, and will use the Logical File Name.

4.3 Dependencies

This is the list of components whose behaviour may be affected by interaction with the AMB.

4.3.1 Framework Persistency Service

DE01 When a new file of events is written, the Framework persistency service will be required to inform the AMB.

Even if the job submission mechanism declares that a new partition will be written by a job, it is only when the job terminates successfully that the file can really be considered to exist. See reference [12] for two scenarios.

4.3.2 On line bookkeeping

DE02 There should be a possibility of exchange of information between the two bookkeeping catalogues.

The requirements of the on line bookkeeping are given in reference [7].
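
The identification assumptions of section 4.2 (a composite datasetID, events identified by dataset name plus eventID, and an LFN uniqueness check) can be sketched as follows. This is only an illustration: the dotted datasetID follows the glossary example datasetNumber.processingStep.passNumber, but the LFN layout (datasetID followed by a zero-padded partition number), the class and function names, and the sample values are all hypothetical, since the real schemes are to be decided by the collaboration (CO01 to CO03).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetID:
    """Composite identifier, after the glossary example
    datasetNumber.processingStep.passNumber."""
    dataset_number: int
    processing_step: str
    pass_number: int

    def __str__(self):
        return f"{self.dataset_number}.{self.processing_step}.{self.pass_number}"


def logical_file_name(dataset_id, partition_number):
    """Build an LFN from the dataset name and partition number.
    The separator and zero padding are assumptions for this sketch."""
    if partition_number < 1:
        raise ValueError("partitionNumber runs from 1 to N")
    return f"{dataset_id}.{partition_number:04d}"


def event_key(dataset_id, event_id):
    """An event is identified by the dataset name plus its eventID."""
    return (str(dataset_id), event_id)


class LfnRegistry:
    """Uniqueness check: refuse to register the same LFN twice."""

    def __init__(self):
        self._known = set()

    def register(self, lfn):
        if lfn in self._known:
            raise ValueError(f"duplicate LFN: {lfn}")
        self._known.add(lfn)


ds = DatasetID(2001, "simul", 1)
lfn = logical_file_name(ds, 3)
registry = LfnRegistry()
registry.register(lfn)
```

Registering the same LFN a second time raises an error, which is the behaviour the uniqueness assumption asks the catalogue to be able to enforce even though LFNs should be unique by construction.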

5 Use Cases and Requirements.

5.1 Sources of Use Cases

Use cases come from references [1], [4] and [6], and also from private communications to the authors.

5.2 List of Use Cases

UC01 Retrieval of datasets for physics analysis.

The physicist wants to access the datasets which contain the information he asks for. He wants to select datasets according to several criteria, which are:

• The type of event, selected from a list of known event types such as “B-> J/Psi”.
• A set of generator parameters.

He may also want to restrict his selection to datasets where certain job configuration parameters have a certain value. Via a Web display program, he is able to make selections on different channels and to get a list of the data found. At the same time he also wants to retrieve basic information about the resulting datasets, such as the total number of events.

UC02 Retrieve additional information about a dataset.

A data analysis job has produced results which cannot be explained. To understand the differences from expectation, the physicist suspects that, for example, the Monte Carlo event generation was performed with incorrect parameters. He wants to inspect the parameter set which was used to produce the dataset in question. Using a display program (e.g. a WWW browser), he is able to retrieve all relevant information from the individual processing steps which were used to produce this dataset:

• Code configuration.
• Parameter configuration.
• Input datasets.
• Log files.

UC03 Access to event data for application tests.

Program developers have needs very similar to those of physicists performing data analysis. Their selection is often smaller than for the data analysis, and they sometimes want to look at some details (such as log files). They would be happy to be able to do their selections on the
bookkeeping database directly from Gaudi/Athena. This feature is typically used just to check that programs basically work. For this purpose they would like to access event data by specifying a statement like “10 events of type B->pi pi”.

UC04 Updating the Bookkeeping database

A subsystem application (production, analysis, ...) has an output (data collection) to write to the persistent event store. The Bookkeeping must be informed. The producer of the output can supplement the information written by the Bookkeeper by adding a text comment.

Precondition

The application has permission to write in the persistent data store. A message service exists between the bookkeeping and the event store.

Flow of Events

An application has produced a data collection which it wishes to put in the persistent store. A request is sent to the Event Persistency Service.

The request is examined. The Event Persistency Service requires information on the origin of the new data collection in order to allocate a name (or maybe just its pre-assigned logical name).

If the Event Persistency Service is successful in storing the data collection, a new data collection name is assigned which will give access to the newly stored data collection.

The Event Persistency Service informs the Bookkeeping of the existence of the new object and sends all the information about it.

The new name is returned to the application.

The application may now use this new name to write additional information on the new dataset, including a text comment, to the Bookkeeping.

Later, another application, or a physicist using a direct interface to the bookkeeping, may update the information and add other text comments about the data collection, accessing it by its name.

UC05 Monitoring of production by the production manager

A production manager has put in place a chain of processing steps to be performed on several datasets. Each dataset consists of several thousand events.
The work has been distributed over a large number of sites and physicists. Each site manager completes the work assigned, and updates the bookkeeping database. The production manager can query the bookkeeping to determine how many sites have completed their work, or how many events have passed each processing step of the chain.

5.3 List of Functional Requirements.

Functional requirements describe the behaviour which the system should have.

5.3.1 Main Functions Required

UR01 The metadata catalogue will provide information about real or simulated physics data for a given logical file name, or for a given set of attributes which define a logical file.

UR02 The metadata catalogue will provide information about real or simulated physics data for a given dataset name, or for a given set of attributes which define a dataset.

UR03 The metadata catalogue will permit retrieval of a set of logical file names on the basis of other information provided by the user.

UR04 The information with which the metadata catalogue is concerned is the physics metadata. This information should permit physicists to completely determine the contents of a data file without having to actually read the file.

TBD How are we going to test this?

UR05 In addition to the physics metadata described in UR04, the metadata catalogue should contain any information which the users consider necessary for retrieval purposes.

An example is what can be considered “sociological” information, such as the name of the physicist who ran the process, or the production site.
There seems to be no reason why the AMB should not have a certain amount of overlap with other relevant database applications, such as the replica catalogue, or the virtual data catalogue.

5.3.2 Organization of Metadata

UR06 The user should not be required to know the schema of the databases in order to use the set of interfaces to it.

Interfaces should be able to hide the implementation details of databases.

UR07 The organization of metadata (schema) is expected to evolve during the lifetime of the project. Reorganization should be transparent for the user.

This means that it is important that the interfaces to the database should be generic.

5.3.3 Metadata Set of Attributes

UR08 The set of attributes which describe the contents of data files produced by particular processing steps is likely to evolve over the lifetime of the project. Therefore the system must manage evolution of sets of attributes.

UR09 Since it is not possible to foresee the attributes of a dataset, the bookkeeping must allow users to add extra attributes to a particular dataset.

UR010 Different datasets may have different sets of attributes.

UR011 Since a particular logical file may be processed (used as input) several times, it should be possible for any physicist to attach a text comment to a logical file.

UR012 It should be possible to associate each attribute with a comment which explains its meaning.

5.3.4 Metadata Integrity

UR013 The metadata shall be subject to a certain number of “business” rules which ensure that it is coherent.

numeric data: Should always have its units specified. An application needs to make sure that numeric data is entered in the correct units.
relational data: Min <= Max
text: Fields which are not free comments should be regarded as case sensitive.

5.3.5 Metadata Acquisition

UR014 Metadata acquisition must be possible in several formats.

Some physicists like command line interfaces, and some like GUI. Some like plain text and some like XML. Some know how to use spreadsheets, others detest them.

UR015 Metadata acquisition must be available in a distributed way.

This means that it should be possible to input to the application metadata catalogue from many different sites.

UR016 One of the formats of data input to the AMB should be close to that proposed by EU Grid WP1.

This will probably be XML.

UR017 Insertion and update of data into the AMB must be possible from the Athena Framework.

An Athena Bookkeeping service is required. It remains to be seen whether the user will call it directly, or whether it will be called by a persistency service. For the present purpose, the answer does not matter.

UR018 Insertion and update of data into the AMB must be possible from a command line interface.

This of course facilitates communication with other applications, such as the replica catalogue.

UR019 Physicists should be able to keep local copies of the data that they have sent to the AMB.

This is evident when using a command line interface. The requirement implies that any GUI developed should produce log files.

UR020 Messages should be sent to the users confirming the database update.

UR021 If an error condition prevents the database update the user must be informed.

5.3.6 Accessing Metadata

UR022 The AMB will be available to all users for read access.

UR023 Physicists will have write access to a subset of the AMB.

UR024 Site managers will have delete access to a subset of the AMB.

UR025 To facilitate the interactive access to the data a web interface must be provided.

UR026 Search masks must be provided for the most common searches.

This should be both from the web, and from a command line.

UR027 It should be possible to refine the result of a query.

UR028
One of the standard queries should allow the selection of a list of logical file names defined by attribute values.

Even if the query was to find a particular dataset, the result always maps to a list of LFN.

UR029 One of the standard queries should return an LFN which contains a particular event.

UR030 One of the standard queries should allow the display of attribute values for a given logical file name.

UR031 One of the standard queries should allow the display of the history of a logical file.

This means that it must be possible to see which LFN are ancestors of a particular LFN.

UR032 The result of a query which has selected a set of logical file names should display the total number of events selected.

UR033 It must be possible to convert the result of a query which has selected a set of logical file names into input which can be understood by an analysis program.

UR034 A mechanism should be provided to allow users to specify their own queries.

UR035 Users should be able to save their private queries for later reuse.

UR036 An Athena service must be provided for AMB access.

5.3.7 Communication with the Replica Catalogue

The Replica Catalogue contains the physical description of data.

UR037 The base parameter of the interface with the replica catalogue must be the LFN.

It will be necessary to communicate with the replica catalogue to satisfy requirement UR033.

5.4 List of Non Functional Requirements.

Non functional requirements determine the manner in which the application satisfies the functional requirements.
5.4.1 Choice of Technology

UR038 The design of the AMB shall not depend on any particular platform.

UR039 The design of the AMB must not depend on any particular technology.

UR040 The design of the AMB shall not depend on any particular implementation of a particular technology.

5.4.2 Scalability

UR041 The AMB design must enable the management of an as yet undetermined amount of data.

TBD This is a tricky question - how to estimate the amount of data - in terms of the size of the database itself, and in terms of the number of records that the database may be expected to manage? It is probably possible to estimate the number of datasets which Atlas will produce, but the number of records is strongly dependent on the size of partitions, for example.

5.4.3 Response Time

UR042 To make sure that the response time of a query remains on an interactive scale (of the order of a few minutes), some kind of limiting mechanism should be included.

This is both a technical and a psychological question. A very long query could potentially block access to other users. Also, after a certain time users lose confidence and decide that the application does not work. Either they go away disgusted, or they start launching new queries, which will probably have the effect of making the situation worse. The limiting mechanism could take the form of a warning, or an upper limit on the number of records which a query response can contain.

5.4.4 Conviviality

UR043 AMI will include user help facilities, and user documentation.

UR044 AMI should remain stable and homogeneous, even if the implementation changes.

If we change technology, the commands, the Athena service and the web interface should remain unchanged for the user.

5.4.5 Distribution

UR045 The design of the AMB should permit its implementation in a distributed environment.
This is probably the best way of ensuring scalability and availability.

5.4.6 Availability

UR046 The context of an international collaboration requires availability of the database 365 days a year, and 24 hours a day.

The design should ensure that the application does not depend on the availability of a unique server.

5.4.7 Security and Robustness

Robustness means that the application does not break if a particular resource is not available, or if a user performs an unexpected action. Security means that the system should prevent unauthorised access to the data.

UR047 An unexpected or inappropriate input from a user should not allow data corruption. It should be signalled by an error message indicating the source of the problem.

UR048 Write access will be managed by passwords which limit access to subsets of the AMB.

UR049 The AMB will implement an interface which ensures compliance with grid certification authorisation.

5.4.8 Reliability (Back-Up)

This section is about not losing data.

UR050 Only database administrators have delete privileges on data which is used for searching.

This implies that there is a buffering between the acquisition of data - a phase during which site managers have delete privileges on a particular subset of the data - and the central databases which are available for all the collaboration.

UR051 All AMB servers should be backed up nightly.

UR052 In addition to the regular nightly back-up of the disk image, the AMB must be saved in such a way that the database can be recreated at another site.

This facilitates changing of servers.

UR053 It should be possible to dump all the data of the AMB into text files which are accessible without the database engine.

This facilitates changing technology.
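
The dump to text files asked for in UR053 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the catalogue is held in an SQLite database reachable through Python's standard sqlite3 module; this document does not prescribe any database engine, and the function name dump_to_text_files is hypothetical. Each table is written to its own CSV file, with the column names on the first line, so that the content can be read back without the database engine.

```python
import csv
import sqlite3
from pathlib import Path


def dump_to_text_files(conn, out_dir):
    """Write every user table of an SQLite database to its own CSV file.

    Hypothetical helper, not part of any AMB design. Returns a dict that
    maps each table name to the number of data rows written.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    # List the user tables, skipping SQLite's internal bookkeeping tables.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]

    counts = {}
    for table in tables:
        # Table names come from the schema itself, not from user input;
        # embedded double quotes are doubled so unusual names stay valid.
        quoted = '"' + table.replace('"', '""') + '"'
        cursor = conn.execute(f"SELECT * FROM {quoted}")
        header = [column[0] for column in cursor.description]

        target = out_path / f"{table}.csv"
        with open(target, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)  # first line: column names
            written = 0
            for record in cursor:
                writer.writerow(record)
                written += 1
        counts[table] = written
    return counts
```

A site manager might call dump_to_text_files(sqlite3.connect("amb.db"), "amb_dump"), where both the file name and the directory name are placeholders. The sketch covers table contents only; constraints and indexes would need a separate schema dump before the database could be recreated elsewhere, as UR052 requires.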
