Working Group: DECIQ
1st WG meeting
Friday, December the 1st
start of works
14.00 – Dr. Elda Rossi Working Group status and perspective for the new COST
Dr. Antonio Monari The new version of the Q5cost library and its first
Prof. Kim Baldridge Gemstone Distributed workflow environment
Dr. Attila Tajiti State of the art for Columbus Wrapper
Prof Valerie Vallet The possible use of Q5Cost and Qcml in EPCISO
Prof M. Kallay The possible use of Q5Cost and Qcml in MRCC
20 Dinner together - end of first day
Saturday, December the 2nd
9.00- Prof. Kenneth Ruud Dalton file system and possible integration with Q5Cost
Dr. Antony Semama The “Toulouse” chain and its migration towards the
Q5Cost file format
Prof. Gian Luigi Bendazzoli Is it possible to manage the various existing orbital
orderings inside the q5cost library to save both portability
Dr. Celestino Angeli The Wave Function Problem: The best way to store WF in
The current format for quantum chemistry data has been described. It is based on XML (for small
ASCII data) and HDF5 (for large binary data).
Several chemistry programs to be integrated into the system have been presented, in particular:
FCI (Prof. Bendazzoli)
FerraraChain (C. Angeli)
ToulouseChain (A. Semama)
Dalton (K. Ruud)
EPCISO (Valerie Vallet)
Gamess (K. Baldridge)
MRCC (M. Kallay)
A very sophisticated environment for workflow execution in the QC context has been presented by
Prof. Baldridge. The environment can be tested by downloading it (installation is quite simple). Send
all comments to firstname.lastname@example.org.
The final discussion addressed two questions:
1) Is the proposed data format complete and well suited for the reference community?
2) What are the priorities for integrating new codes into the framework?
In the following a short discussion is reported, together with the names of the partners that took the
responsibility to further study the problem and to present a possible solution to be discussed in the
XML vs HDF (K. Baldridge, V. Vallet)
At present the XML part exists only in theory (we are using “namelist” to input the small data).
We have to decide whether to continue to support and develop both of them or to store all the needed
info in the HDF file.
PRO for two files:
XML is widely used
We should take into account other ongoing activities such as CML or CCP1
Note: it has been identified by other data interoperability workshops that CML likely is not
extensive enough for QM, but certainly we can provide an interface (WSDL) to
PRO for HDF only:
One file is better than two (but we could add two-way links, or pack the XML file into the HDF file)
We would need two different libraries for managing XML and HDF; again, one library is better
than two, but if necessary, we could.
Normalization - AO order (V.Vallet)
There is a problem related to the order of the AO orbitals. Even if two programs use the same basis
sets, they may have different implicit orbital orderings, and this matters in certain cases.
Another problem to be solved, related to this one, is that of Orbitals Normalisation.
We need answers to these problems:
How can we define an order for the AOs?
Is this related to the input of the basis set?
Is it possible to use the AO labels to this end by defining a standard code?
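The idea of using AO labels to reconcile two orderings can be sketched as follows. This is only an illustration, not part of the q5cost library: the label strings and the function names are hypothetical, and it assumes that both programs can attach a unique label to every AO of the same basis set.

```python
def permutation_from_labels(producer_labels, consumer_labels):
    """Return perm such that consumer_labels[i] == producer_labels[perm[i]]."""
    if len(set(producer_labels)) != len(producer_labels):
        raise ValueError("AO labels must be unique")
    if sorted(producer_labels) != sorted(consumer_labels):
        raise ValueError("the two programs do not describe the same set of AOs")
    position = {label: i for i, label in enumerate(producer_labels)}
    return [position[label] for label in consumer_labels]


def reorder_matrix(matrix, perm):
    """Reorder rows and columns of an AO matrix from producer to consumer order."""
    return [[matrix[p][q] for q in perm] for p in perm]


# Hypothetical labels: the same three p functions, listed in two different orders.
producer = ["C1 2px", "C1 2py", "C1 2pz"]
consumer = ["C1 2pz", "C1 2px", "C1 2py"]

perm = permutation_from_labels(producer, consumer)
matrix_in_producer_order = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix_in_consumer_order = reorder_matrix(matrix_in_producer_order, perm)
```

A sketch of this kind only addresses the ordering; differences in normalisation would need a separate scaling factor per AO.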
Implicit order in the integral lists (no indices stored) (S.Evangelisti,
At present all integrals are stored with their indices. This allows us to ignore the “order” problem.
In fact the data are written in the natural order of the “producer” (generating the indices if needed)
and are reordered by the “consumer” if a different order is needed. In any case all the required info
This could be a problem for large systems, since storing the indices is space consuming.
The indices can be avoided if a conventional order is chosen (like the “standard order”).
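As an illustration of how a conventional order removes the need for stored indices, the sketch below uses the common triangular compound index for two-electron integrals over real orbitals (8-fold permutational symmetry). Whether this is the “standard order” meant in the discussion is an assumption, and the function names are hypothetical.

```python
def pair_index(i, j):
    """Position of the unordered pair (i, j) in a packed lower triangle (0-based)."""
    if i < j:
        i, j = j, i
    return i * (i + 1) // 2 + j


def integral_index(i, j, k, l):
    """Position of (ij|kl) in a list holding each symmetry-unique integral once."""
    return pair_index(pair_index(i, j), pair_index(k, l))


def number_of_unique_integrals(n):
    """Length of the packed list for n orbitals."""
    npair = n * (n + 1) // 2
    return npair * (npair + 1) // 2
```

With such a rule the consumer recomputes the position of any integral from its four indices, so the file only needs to hold the values and the name of the order that was used.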
The idea is to define how many “orders” are foreseen in the format and to code them (perhaps together
with specific routines for sorting the data) in a specific metadata field. One of the values (“proprietary”)
could be reserved for proprietary orders that are not well defined (of course this could prevent the use of
those data by general consumers).
Are there a limited number of well defined orders (standard order, ...) to code into the format?
What about writing ad-hoc routines in the library for the coded translations?
For flexibility we could add a “proprietary” keyword for special cases
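A possible shape for coded orders with library-side translation routines is sketched below, using a packed symmetric matrix as a small stand-in for the integral lists. The order codes, the registry and the function names are all hypothetical; the format defines none of them yet.

```python
def lower_triangle(n):
    """Index pairs (i, j) with j <= i, row by row."""
    return [(i, j) for i in range(n) for j in range(i + 1)]


def upper_triangle(n):
    """Index pairs (i, j) with j >= i, row by row."""
    return [(i, j) for i in range(n) for j in range(i, n)]


# Hypothetical order codes; "proprietary" deliberately has no generator.
INDEX_GENERATORS = {
    "lower_triangle": lower_triangle,
    "upper_triangle": upper_triangle,
}


def generator_for(order):
    """Look up the index generator for an order code stored in the metadata."""
    if order == "proprietary":
        raise ValueError("proprietary order: only the producer can interpret these data")
    if order not in INDEX_GENERATORS:
        raise ValueError(f"unknown order code: {order!r}")
    return INDEX_GENERATORS[order]


def convert(values, n, source, target):
    """Rewrite a packed symmetric n x n matrix from one coded order to another."""
    source_indices = generator_for(source)(n)
    if len(source_indices) != len(values):
        raise ValueError("number of values does not match the declared order")
    by_pair = {frozenset(pair): value for pair, value in zip(source_indices, values)}
    return [by_pair[frozenset(pair)] for pair in generator_for(target)(n)]


converted = convert([1, 2, 3, 4, 5, 6], 3, "lower_triangle", "upper_triangle")
```

Data tagged “proprietary” are rejected by a general consumer here, which matches the limitation noted in the discussion.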
The HDF format comes with special routines for data compression. They have never been checked
A test must be done
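A first rough estimate could be obtained without HDF at all, as in the sketch below. It uses Python's zlib module, which implements the same deflate algorithm as HDF5's gzip filter, on synthetic data. The arrays are invented stand-ins (one mostly zero, one dense), so the numbers only indicate a trend; a real test would have to be run on actual integral files through the HDF library itself.

```python
import random
import struct
import zlib


def compression_ratio(values, level=6):
    """Ratio of raw size to deflate-compressed size for a list of doubles."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return len(raw) / len(zlib.compress(raw, level))


random.seed(0)
# Synthetic stand-ins: about 90% exact zeros versus fully dense random values.
sparse = [0.0 if random.random() < 0.9 else random.gauss(0.0, 1.0) for _ in range(10000)]
dense = [random.gauss(0.0, 1.0) for _ in range(10000)]

sparse_ratio = compression_ratio(sparse)
dense_ratio = compression_ratio(dense)
```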
Wavefunction (C.Angeli, K.Ruud)
At present the wavefunction has not been defined in the q5 format. Some groups require it.
A clear definition is needed
The Zero-order wf is an input data required by several programs
It can be based on the determinants definition and/or indices defining contracted wavefunctions
The determinants could be described in a Hole/Particle strategy (does not depend on basis set
and perturbation type)
Other strategies: String products with default order; List of occupied spinorbitals
Possibly an implicit order?
If more than one way is possible and the translation is not easy, maybe the different ways should
be considered as different objects and we could code all of them.
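Two of the strategies mentioned above, the list of occupied spinorbitals and the hole/particle description, can be translated into each other as in the sketch below. It assumes spinorbitals are numbered by integers and a reference determinant is agreed upon. The function names are hypothetical, and the phase (sign) convention is deliberately left out.

```python
def to_hole_particle(reference, determinant):
    """Describe a determinant by the spinorbitals removed from and added to the reference."""
    ref, det = set(reference), set(determinant)
    if len(ref) != len(det):
        raise ValueError("determinant and reference must have the same number of electrons")
    holes = sorted(ref - det)
    particles = sorted(det - ref)
    return holes, particles


def from_hole_particle(reference, holes, particles):
    """Rebuild the sorted list of occupied spinorbitals from holes and particles."""
    occupied = set(reference)
    if not set(holes) <= occupied:
        raise ValueError("every hole must be occupied in the reference")
    if occupied & set(particles):
        raise ValueError("every particle must be empty in the reference")
    return sorted((occupied - set(holes)) | set(particles))


reference = [0, 1, 2, 3]
holes, particles = to_hole_particle(reference, [0, 1, 4, 5])
rebuilt = from_hole_particle(reference, holes, particles)
```

In this simple case the translation is cheap in both directions, which suggests these two descriptions need not be stored as separate objects. Contracted wavefunctions would need more than this.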
Tools for collaborative working (E. Rossi)
The new WEB site is based on Plone and wiki. At present the address is
http://manage2.zope.cineca.it/abigrid but it will be moved to http://abigrid.cineca.it in a few days
A common username on CINECA’s machines (costch01)
o sp.sp5.cineca.it (IBM SP5/512)
o cl.clx.cineca.it (IBM CLX/1024)
A CVS archive for Q5cost and wrappers
(all the information will be posted in the web site as soon as possible)
Wrappers to be implemented:
MOLCAS – Lund and A.Monari, Vallet, Antony Semama
DALTON – Celestino Angeli, Kenneth Ruud, Antonio Monari
Columbus / MRCC - Attila and M.Kallay
GAMESS – Antony Semama, Zurich
Program history (output, scripts, ...)
It would be nice to keep the history of the saved data, for example the text output of the program
that generated them, for future reference.
The problem is that there are different types of data, possibly created by different programs or by
different combinations of programs.
Different conventions (indices, orders, WF, ...)
In order to guarantee the maximum flexibility, several conventions should be allowed for each data
type, provided that all the information needed to fully interpret the data is present.
Each producer code writes in its own convention; the translation is the responsibility of the consumer
The consumer must have all the info for translating the data
Two solutions: define a limited number of conventions (well-defined name and characteristics);
or explicitly give the algorithm for the translation