Report on the Bookkeeping for the LHCb
Metadata Catalogue (Draft v0.2)
The ARDA Group
Editor: Wei-Long Ueng
Description of Bookkeeping Service
The Bookkeeping service is one of the components of the Distributed Infrastructure
with Remote Agent Control (DIRAC), the LHCb Monte Carlo production system.
DIRAC has a client/server architecture based on: compute elements distributed among
the collaborating institutes; databases for production management, bookkeeping (the
metadata catalogue) and software configuration; and monitoring and cataloguing services for
updating and accessing the databases. The purpose of the Bookkeeping service is to
allow storage and retrieval of data. The main focus of the Bookkeeping database is to
store, select and retrieve data files, although many other data are also stored in it.
When new datasets are produced they are registered by sending an XML dataset
description to the Bookkeeping service. Here all the dataset descriptions are stored in a
cache before they are checked by the production manager. After the check is done, the
dataset metadata information is passed to the LHCb Bookkeeping database. The
Bookkeeping database is hosted by the CERN Oracle server. It can be interrogated by
users using a dedicated web page with specialized forms. The output of the queries can be
used directly in the user analysis jobs to specify input data.
The Bookkeeping services are hosted by a central server that handles both web
pages and XMLRPC services. The Bookkeeping server is based on a piece of Python
code providing both an XMLRPC server and a web server. This code was initially used in
the Production system and was reused without change in the Bookkeeping. It is
implemented in file python/gaudiweb.py of the Bookkeeping package. The server is then
customized in file python/BookkeepingServer.py of the Bookkeeping package.
The server handles essentially three things: web pages, servlets and XMLRPC
services. The web pages and XMLRPC services just need to be registered to become
available, as can be seen in the methods startBookkeepingWeb and startRPCServices. Most
of the functionality is actually provided by the underlying gaudiweb server.
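The registration idea can be illustrated with a minimal sketch. It does not use gaudiweb, whose API is not shown in this report; it assumes the Python 3 standard library instead, and the class BookkeepingStub, its toy catalogue and the function start_rpc_service are hypothetical names introduced only for illustration. Only the path RPC/BookkeepingSvc is taken from the report.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCRequestHandler, SimpleXMLRPCServer


class BookkeepingStub:
    """Hypothetical stand-in for the Bookkeeping API (not the real service)."""

    def __init__(self):
        # Toy in-memory catalogue: file name -> list of replica locations.
        self._replicas = {"example.dst": ["site-A", "site-B"]}

    def replicas(self, name):
        # Return the known replicas, or an empty list for unknown files.
        return self._replicas.get(name, [])


class BookkeepingHandler(SimpleXMLRPCRequestHandler):
    # Serve XML-RPC only under the path used in this report.
    rpc_paths = ("/RPC/BookkeepingSvc",)


def start_rpc_service(host="127.0.0.1", port=0):
    """Register the stub and serve it from a background thread."""
    server = SimpleXMLRPCServer(
        (host, port), requestHandler=BookkeepingHandler, logRequests=False
    )
    server.register_instance(BookkeepingStub())
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    srv = start_rpc_service()
    port = srv.server_address[1]  # port 0 lets the OS pick a free port
    client = ServerProxy("http://127.0.0.1:%d/RPC/BookkeepingSvc" % port)
    print(client.replicas("example.dst"))
    srv.shutdown()
    srv.server_close()
```

Once the instance is registered, every public method of the object becomes callable by clients, which mirrors the "register to make available" behaviour described above.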
For the servlets, the situation is a bit more complex, since this concept is Java
specific and thus not available in Python. For this reason, the Bookkeeping server
cannot run on a regular Python implementation; it needs Jython, a Java
implementation of Python. On top of that, dedicated classes were defined to handle
servlets correctly, reusing the gaudiweb.Service mechanism.
This servlet support accounts for 70% of the code of BookkeepingServer.py. It uses a
home-made, dummy implementation of a servlet engine provided in the classes
DummyServletConfig, DummyServletResponse and DummyServletRequest. However,
the servlet engine could be dropped as soon as the servlets are hosted by an external web
server, such as Apache or the Oracle one.
To conclude with the Bookkeeping server, here is the list of entry points in the
current server as well as their type and usage.
Main : A servlet that builds the main bookkeeping web page.
EvtTypes : A servlet that builds a list of existing event types. Types for which no
events can be found are not listed.
EvTypeInfo : A servlet that handles the selection of data and builds a web page with
the results.
DataSets Select : A servlet that allows browsing the database.
NewConfirm : A servlet for entering data into the database.
DisplayBookFile : A servlet that displays details of database modification requests.
Manager : A Python servlet that allows shutting down the server from a web interface.
Bookkeeping : A directory containing files used by the generated HTML pages,
essentially JavaScript files.
RPC/BookkeepingSvc : An XMLRPC service providing the Bookkeeping API.
RPC/BookkeepingQuery : An XMLRPC service providing the BookkeepingQuery
API. This API allows selecting files from the Bookkeeping.
RPC/GetLogFiles : An XMLRPC service providing easy access to log files.
Measuring the performance
The goal of this study is to measure the performance of the Bookkeeping services.
The main objectives are to find out how many concurrent connections/requests
the Bookkeeping service can handle in an acceptable time, and to evaluate the
resiliency of the service under extremely high load. We would like to create sensors to
monitor the network and the CPU. The performance testing items are listed below:
• Bookkeeping Web Server and XMLRPC Service stress tests
• Database I/O Sensor (??)
• Server Host
  – Report server host stats:
    • CPU Load, Network, Process time
  – Web applications performance tests
    • 1, 2, 5, 10, … virtual users
• Client Host
  – Report client host stats:
    • CPU Load, Network, Process time
• Network monitor: perform SNMP queries to a network device, e.g. a router or
  switch. (??)
Performance tests
Bookkeeping Web Server stress tests
The client at CERN was a P4 machine with 512MB of memory, running a
Ruby script that started several processes, which simultaneously sent requests to the
Bookkeeping server. The time to complete these requests was then measured.
The Web Server URL is http://lbnts3.cern.ch:8100/
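The Ruby script itself is not reproduced in this report. As an illustration only, the measurement it describes could be sketched in Python as follows; the sketch uses threads rather than processes, and the names timed_call, run_concurrent and fake_request are hypothetical. The request is a placeholder, so no real server is contacted.

```python
import threading
import time


def timed_call(request, results, index):
    """Run one request and record (succeeded, elapsed seconds)."""
    start = time.perf_counter()
    try:
        request()
        ok = True
    except Exception:
        ok = False  # a failed request is counted, not raised
    results[index] = (ok, time.perf_counter() - start)


def run_concurrent(request, n_clients):
    """Send n_clients simultaneous requests.

    Returns (number of failures, mean time of the successful requests);
    the mean is None when every request failed.
    """
    results = [None] * n_clients
    threads = [
        threading.Thread(target=timed_call, args=(request, results, i))
        for i in range(n_clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    times = [elapsed for ok, elapsed in results if ok]
    failures = n_clients - len(times)
    mean = sum(times) / len(times) if times else None
    return failures, mean


if __name__ == "__main__":
    # Placeholder request; a real test would fetch the page under study,
    # for example with urllib.request.urlopen(url).read().
    def fake_request():
        time.sleep(0.01)

    for n in (1, 2, 5, 10):
        print(n, run_concurrent(fake_request, n))
```

Counting failures separately from timings matters here, because a failing request can return quickly and would otherwise make the average look better than it is.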
Figure 1: The Search for datasets webpage. We used default values to get results.
[Plot: response time versus the number of clients (0 to 9); annotated "> 7 Clients are failed".]
Figure 2: The time to complete the request, depending on the number of records
selected from the database, as a function of the number of clients.
The result is shown in Figure 2. For the default request, the requests failed when
more than 6 clients ran concurrently. The limiting factor was the number of
concurrent connections that the web server can handle. A typical error message from the server is:
*** Exception caught when preparing Statement.ORA-01000: maximum open
cursors exceeded ***
We also ran a Windows program called “WAPT” (Web Applications
Testing V3.0), which simultaneously sent requests to the Bookkeeping server. The client
was a PIII machine with 512MB of memory in Taiwan.
[Plot: response time (s) versus the number of clients; annotated "> 7 Clients are failed".]
Figure 3: Average response time of web transaction data of 7 clients.
Figure 3 shows the response time of the Bookkeeping server as a function of the number of
concurrent connections.
Figures 4 and 5 present the average web transaction and the average bandwidth.
[Plot: average web transaction versus the number of clients; annotated "> 7 Clients are failed".]
Figure 4: Average web transaction of 7 clients (Taiwan-CERN).
[Plot: average bandwidth (KBites/Sec) versus the number of clients.]
Figure 5: Average bandwidth of 7 clients (Taiwan-CERN).
XMLRPC Service tests
The client was a PIII machine with 512MB of memory, running a Python script.
Figure 6: Bookkeeping XML-RPC service conversation
The usage of the XMLRPC interface is standard. One just needs the name of
the service (RPC/BookkeepingSvc) and the machine and port where to find it.
Listing 1 shows some code displaying the replicas of a given file.
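from xmlrpclib import Server  # Python 2 / Jython; Python 3 uses xmlrpc.client.ServerProxy

# Connect to the BookkeepingSvc XML-RPC endpoint on the Bookkeeping server
server = Server('http://lbnts3.cern.ch:8100/RPC/BookkeepingSvc')
# Look up the file entry, then print its replicas
f = server.file("00000154_00000072_6.oodst")
print server.replicas(f)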
Listing 1: XMLRPC code sample
Clients   Response Time (s)
1         0.54
2         0.65
3         0.76
4         0.81
5         0.90
10        1.10
20        1.73
30        2.38
40        3.05
50        3.79
100       8.51
150       11.18
200       14.99
300       22.95
400       33.17
500       40.66
Listing 2: Average response time as a function of the number of clients
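To summarize how the response time grows with load, one can fit a straight line to the measurements in the table above. The following is a minimal sketch of an ordinary least-squares fit; the choice of a linear model and the name least_squares are assumptions made for illustration and are not part of the original study. The data values are copied from the table.

```python
# (clients, average response time in seconds), copied from the table above.
data = [
    (1, 0.54), (2, 0.65), (3, 0.76), (4, 0.81), (5, 0.9),
    (10, 1.1), (20, 1.73), (30, 2.38), (40, 3.05), (50, 3.79),
    (100, 8.51), (150, 11.18), (200, 14.99), (300, 22.95),
    (400, 33.17), (500, 40.66),
]


def least_squares(points):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


if __name__ == "__main__":
    slope, intercept = least_squares(data)
    print("seconds per additional client: %.3f" % slope)
    print("extrapolated time at zero clients: %.2f s" % intercept)
```

With these data the fitted slope is roughly 0.08 s per additional client, which suggests that the response time grows approximately linearly over the measured range rather than collapsing at a threshold.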
[Plot: response time (s), 0 to 50, versus the number of clients, 0 to 550.]
Figure 7: Average response time as a function of the number of clients
The average CPU utilization is 1~2% per client. Figure 8 presents the CPU
utilization of the client.
[Plot: CPU utilization (%), 0 to 7, versus time (sec), 1 to 51.]
Figure 8: The CPU utilization of the client
Conclusions
The response times were measured on the Bookkeeping development server.
Even so, the results of the stress tests are not acceptable in the current state: the
web server failed with fewer than 10 concurrent clients, whereas the service needs
to serve more than 20 users concurrently.
Appendix : Installing LHCb Bookkeeping services
These steps must be executed on an lxplus.cern machine. In the LHCb domain all
software is managed by CMT, so everything is configured to work with this package.
Getpack is a script that checks out packages from the repository and does some
preliminary CMT configuration.
1. install the CMT package. Refer to
http://lhcb-comp.web.cern.ch/lhcb-comp/Support/CMT/cmt.htm
2. cd to the directory where the Bookkeeping is going to be installed
3. set CMTPATH to this directory: “setenv CMTPATH `pwd`”
4. do “alias getpack '/afs/cern.ch/lhcb/scripts/getpack'”
5. “getpack ExternalLibs v4r4”
6. “getpack JAVA/JAVACore v1r141”
7. “getpack JAVA/JYTHON v2r2”
8. “getpack JAVA/XERCES_J v2r2”
9. “getpack JAVA/XALAN_J v2r4”
10. “getpack DataMgmt/Bookkeeping v3r0”
a. “cd DataMgmt/Bookkeeping/v3r0/cmt”
b. “emacs requirements” change “use JYTHON v2r1” with “use
JYTHON v2r2” testing head
c. “cmt config”
d. “setenv SITEROOT /afs/cern.ch”
e. “source setup.csh”
f. “make”
g. “cd ../cmds”
h. “emacs startBookkeeping”
i. replace the first line “#!/usr/bin/sh” with “#!/bin/sh”
ii. comment out the lines:
BookkeepingJDBCDriver="oracle.jdbc.driver.OracleDriver"
ConnectionString="jdbc:oracle:thin:@oradev:10521:D"
UserName="sponce"
Password="ch4nge1t"
HomePage=http://lhcboracle.cern.ch:9000/pls/lhcb_bookkeeping_prod/bookkeep2_procedures.general?procname=create_page
iii. add the lines:
BookkeepingJDBCDriver="oracle.jdbc.driver.OracleDriver"
ConnectionString='jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=oradev9.cern.ch)(PORT=1521))(CONNECT_DATA=(SID=D9)))'
UserName='cioffi'
Password='ARDAtesting'
i. “setenv CLASSPATH ${CLASSPATH}:/afs/cern.ch/project/oracle/jdbc/9i/1.3_1.2/classes12.zip”
j. save the content of PYTHONPATH in a file
k. “source ../../../../JAVA/JYTHON/v2r2/cmt/setup.csh” this command will
change the content of PYTHONPATH
l. run the “startBookkeeping” script. This script starts all the Bookkeeping
services: the servlet service and the XML-RPC service
m. reset the content of PYTHONPATH to the value saved
To restart the services after logging in:
1. cd to the base directory of Bookkeeping installation
2. “setenv CMTPATH `pwd`” (as during the installation process)
3. “cd DataMgmt/Bookkeeping/v3r0/cmt”
4. “setenv SITEROOT /afs/cern.ch”
5. “source setup.csh”
6. save the content of PYTHONPATH in a file
7. “source ../../../../JAVA/JYTHON/v2r2/cmt/setup.csh”
8. “setenv CLASSPATH ${CLASSPATH}:/afs/cern.ch/project/oracle/jdbc/9i/1.3_1.2/classes12.zip”
9. “cd ../cmds”
10. setenv JAVA_HOME /afs/cern.ch/sw/java/i386_redhat73/jdk/sun-1.4.2
11. setenv JDK_HOME /afs/cern.ch/sw/java/i386_redhat73/jdk/sun-1.4.2
12. setenv PATH $JAVA_HOME/bin:$PATH
13. run “startBookkeeping”
14. reset the content of PYTHONPATH to the value saved
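The "save PYTHONPATH, change it, restore it" pattern in the steps above can also be expressed programmatically. The following is a minimal Python sketch of the same idea, not a replacement for the csh procedure: it only affects the Python process itself and any subprocess it launches. The name preserved_env and the example path are hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def preserved_env(name):
    """Save an environment variable on entry and restore it on exit."""
    saved = os.environ.get(name)  # None if the variable was not set
    try:
        yield saved
    finally:
        if saved is None:
            os.environ.pop(name, None)  # it was unset before: unset it again
        else:
            os.environ[name] = saved


if __name__ == "__main__":
    before = os.environ.get("PYTHONPATH")
    with preserved_env("PYTHONPATH"):
        # Inside the block the variable can be changed freely, for example
        # before launching the start script as a subprocess.
        os.environ["PYTHONPATH"] = "/hypothetical/jython/Lib"
    # After the block the original value (or absence) is back.
    print(os.environ.get("PYTHONPATH") == before)
```

Restoring the value in a finally clause means the original setting comes back even if the code inside the block raises an error, which the manual procedure cannot guarantee.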