


									                                  COST D37/0004/0006
                                 Working Group: DECIQ

1st WG meeting
Final Program

Friday, December 1st
14.00 – 19.00   Start of works
          Dr. Elda Rossi          Working Group status and perspective for the new COST
                                  action.
          Dr. Antonio Monari      The new version of the Q5cost library and its first
          Prof. Kim Baldridge     Gemstone distributed workflow environment.
          Dr. Attila Tajiti       State of the art for the Columbus wrapper
          Prof. Valerie Vallet    The possible use of Q5Cost and Qcml in EPCISO
          Prof. M. Kallay         The possible use of Q5Cost and Qcml in MRCC
20.00     Dinner together - end of first day

Saturday, December 2nd

9.00-     Prof. Kenneth Ruud      Dalton file system and possible integration with Q5Cost
          Dr. Antony Semama       The “Toulouse” chain and its migration towards the
                                  Q5Cost file format
          Prof. Gian Luigi        Is it possible to manage the various existing orbital
          Bendazzoli              orderings inside the q5cost library to preserve both portability
                                  and efficiency?
          Dr. Celestino Angeli    The Wave Function Problem: The best way to store WF in
11.00-    Discussion

Meeting report
The current format for quantum chemistry data has been described. It is based on XML (for small ASCII data)
and HDF5 (for large binary data).

Several chemistry programs to be integrated into the system were presented, in particular:
• FCI (Prof. Bendazzoli)
• FerraraChain (C. Angeli)
• ToulouseChain (A. Semama)
• Dalton (K. Ruud)
• EPCISO (Valerie Vallet)
• Gamess (K. Baldridge)
• MRCC (M. Kallay)
A very sophisticated environment for workflow execution in the QC context was presented by
Prof. Baldridge. The environment can be tested by downloading it (installation is quite simple). Send
all comments to §

The final discussion addressed two questions:
        1) Is the proposed data format complete and well suited for the reference community?
        2) What are the priorities for integrating new codes into the framework?
A short summary of the discussion follows, together with the names of the partners who took
responsibility for studying each problem further and for presenting a possible solution to be discussed on the
web site.

XML vs HDF (K. Baldridge, V. Vallet)
At present the XML part exists only in theory (we are using “namelist” to input the small data).
We have to decide whether to continue to support and develop both formats or to store all the needed
information in the HDF file.
PRO for two files:
• XML is widely used
• We should take into account other ongoing activities such as CML or CCP1
     Note: other data interoperability workshops have found that CML is likely not
         extensive enough for QM, but we can certainly provide an interface (WSDL) to
PRO for HDF only:
• One file is better than two (but we could add two-way links, or pack the XML file into the HDF file)
• We would need two different libraries for managing XML and HDF; again, one library is better
    than two, but we could use two if necessary.

Normalization - AO order (V.Vallet)
There is a problem related to the order of the atomic orbitals (AOs). Even if two programs use the same basis
sets, they may use different implicit orbital orderings, and this matters in some cases.
A related problem still to be solved is that of orbital normalisation.
We need answers to these questions:
• How can we define an order for the AOs?
• Is this related to the input of the basis set?
• Is it possible to use the AO labels to this end by defining a standard code?
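
As an illustration of the last question, the sketch below shows how AO labels could be used to translate between two orderings. It assumes that each AO carries a unique text label and that producer and consumer agree on the label spelling. The labels, function names and coefficient values are hypothetical and are not part of the Q5cost library. Normalisation is not addressed here.

```python
def permutation(producer_labels, consumer_labels):
    """Return perm such that consumer_labels[i] == producer_labels[perm[i]]."""
    position = {label: n for n, label in enumerate(producer_labels)}
    if len(position) != len(producer_labels):
        raise ValueError("duplicate AO labels in producer list")
    if sorted(producer_labels) != sorted(consumer_labels):
        raise ValueError("producer and consumer use different AO label sets")
    return [position[label] for label in consumer_labels]


def reorder(values, perm):
    """Reorder a vector written in producer order into consumer order."""
    return [values[p] for p in perm]


# Hypothetical labels: same basis set, different implicit ordering.
producer = ["1s", "2px", "2py", "2pz"]
consumer = ["1s", "2pz", "2px", "2py"]
coefficients = [0.9, 0.1, 0.2, 0.3]   # one MO, written in producer order

perm = permutation(producer, consumer)
reordered = reorder(coefficients, perm)
print(perm, reordered)
```

With such labels stored next to the data, the consumer can rebuild the permutation without knowing how the producer orders its basis functions internally.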

Implicit order in the integral lists (no indices stored) (S.Evangelisti,
At present all integrals are stored with their indices. This allows us to ignore the “order” problem:
the data are written in the natural order of the “producer” (which generates the indices if necessary)
and are reordered by the “consumer” if a different order is needed. In any case all the required information
is present.
This could be a problem for large systems, since storing the indices consumes space.
The indices can be avoided if a conventional order is chosen (like the “standard order”).
The idea is to define how many “orders” the format provides for and to code them (perhaps together
with specific routines for sorting the data) in a specific metadata item. One of the values (“proprietary”)
could be reserved for proprietary orders that are not well defined (of course this could prevent the use of that
data by general consumers).
• Is there a limited number of well defined orders (standard order, ...) to code into the format?
• What about writing ad-hoc routines in the library for the coded translations?
• For flexibility we could add a “proprietary” keyword for special cases.
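
To make the idea of an implicit order concrete, here is a minimal sketch of one possible conventional order for two-electron integrals over real orbitals: the triangular compound index that exploits the eight-fold permutational symmetry. Whether this coincides with the “standard order” mentioned above is an assumption. The function names and sample values are hypothetical, and indices are zero-based.

```python
def pair_index(i, j):
    """Triangular index of an unordered pair (i, j), zero-based."""
    if i < j:
        i, j = j, i
    return i * (i + 1) // 2 + j


def canonical_index(i, j, k, l):
    """Position of the integral (ij|kl) in the conventional order."""
    return pair_index(pair_index(i, j), pair_index(k, l))


def unique_count(n):
    """Number of symmetry-unique integrals for n orbitals."""
    npair = n * (n + 1) // 2
    return npair * (npair + 1) // 2


def pack(n, indexed_integrals):
    """Turn (indices, value) records into an index-free list."""
    values = [0.0] * unique_count(n)
    for (i, j, k, l), value in indexed_integrals:
        values[canonical_index(i, j, k, l)] = value
    return values


# Producer side: records stored with explicit indices (made-up values).
indexed = [((0, 0, 0, 0), 1.0), ((1, 0, 0, 0), 0.5), ((1, 1, 1, 1), 0.8)]
packed = pack(2, indexed)
print(packed)
```

The packed list needs no indices, but it also stores the zero integrals explicitly, so the saving depends on how sparse the integral list is.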

Compression (A.Monari)
The HDF format comes with special routines for data compression. They have never been tested
until now.
A test must be done.
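
A first impression of what such a test could measure is sketched below. It does not call HDF5: it uses the zlib module of the Python standard library, which implements the same deflate algorithm as the HDF5 gzip filter, and compares a dense and a mostly-zero list of double precision values. The data are random numbers, not real integrals, and all names are hypothetical.

```python
import random
import struct
import zlib


def compression_ratio(values, level=6):
    """Return raw size / compressed size for a list of doubles."""
    raw = struct.pack(f"<{len(values)}d", *values)
    packed = zlib.compress(raw, level)
    if zlib.decompress(packed) != raw:
        raise RuntimeError("compression round trip failed")
    return len(raw) / len(packed)


rng = random.Random(0)  # fixed seed so the test is repeatable
dense = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
# Keep one value in fifty, set the others to zero (a crude "sparse" list).
sparse = [v if n % 50 == 0 else 0.0 for n, v in enumerate(dense)]

dense_ratio = compression_ratio(dense)
sparse_ratio = compression_ratio(sparse)
print(f"dense: {dense_ratio:.2f}  sparse: {sparse_ratio:.2f}")
```

A real test should of course be run on actual integral files through the HDF5 routines, and should also record the time spent compressing and decompressing.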

Wavefunction (C.Angeli, K.Ruud)
At present the wavefunction has not been defined in the q5 format, but some groups require it.
A clear definition is needed:
• The zero-order wavefunction is an input required by several programs
• It can be based on the definition of the determinants and/or on indices defining contracted wavefunctions
• The determinants could be described with a hole/particle strategy (which does not depend on the basis set
    or the perturbation type)
• Other strategies: string products with a default order; list of occupied spin orbitals
• Possibly an implicit order?
• If more than one way is possible and the translation is not easy, maybe the different ways should
    be considered as different objects, and we could code all of them.
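
The relation between two of the strategies listed above (list of occupied spin orbitals and hole/particle description) can be sketched as follows. Spin orbitals are identified by plain integers, the reference determinant is given explicitly, and the sign (phase) of the determinant is ignored. These choices and all names are assumptions made for illustration only.

```python
def to_hole_particle(reference, determinant):
    """Describe a determinant by its holes and particles relative to a reference."""
    ref, det = set(reference), set(determinant)
    if len(ref) != len(det):
        raise ValueError("determinant and reference differ in electron count")
    holes = sorted(ref - det)        # occupied in the reference, empty here
    particles = sorted(det - ref)    # empty in the reference, occupied here
    return holes, particles


def from_hole_particle(reference, holes, particles):
    """Rebuild the list of occupied spin orbitals."""
    return sorted((set(reference) - set(holes)) | set(particles))


reference = [0, 1, 2, 3]             # hypothetical closed-shell reference
single_excitation = [0, 1, 2, 5]

holes, particles = to_hole_particle(reference, single_excitation)
rebuilt = from_hole_particle(reference, holes, particles)
print(holes, particles, rebuilt)
```

The hole/particle form stays short for low excitation levels whatever the number of electrons, while the occupied list grows with the system size.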

Tools for collaborative working (E. Rossi)
    The new web site is based on Plone and wiki. At present the address is but it will be moved in a few days
• A common username on CINECA’s machines (costch01)
       o (IBM SP5/512)
       o (IBM CLX/1024)
• A CVS archive for Q5cost and wrappers
(All the information will be posted on the web site as soon as possible.)

Wrappers to be implemented:
•   MOLCAS – Lund and A.Monari, Vallet, Antony Semama
•   DALTON – Celestino Angeli, Kenneth Ruud, Antonio Monari
•   Columbus / MRCC – Attila and M.Kallay
•   GAMESS – Antony Semama, Zurich

Program history (output, scripts, ...)
It would be useful to keep the history of the saved data, for example the text output of the program
that generated them, for future reference.
The problem is that there are different types of data, possibly created by different programs or by
different combinations of programs.

Different conventions (indices, orders, WF, ...)
To guarantee maximum flexibility, several conventions should be allowed for each data
type, provided that all the information needed to fully interpret the data is present.
• Each producer code writes in its own convention; the translation is the responsibility of the consumer
• The consumer must have all the information needed to translate the data
• Two solutions: define a limited number of conventions (with well defined names and characteristics), or
    explicitly give the algorithm for the translation
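
The first solution (a limited number of named conventions) could work along the lines of the sketch below: the producer tags the data with the name of its convention, and the consumer looks up a translation routine. The convention names “one_based” and “zero_based”, the dictionary layout and the function names are hypothetical. Only the “proprietary” keyword comes from the discussion above.

```python
TRANSLATORS = {}   # (source convention, target convention) -> routine


def translator(source, target):
    """Register a routine that converts records from source to target."""
    def decorate(routine):
        TRANSLATORS[(source, target)] = routine
        return routine
    return decorate


@translator("one_based", "zero_based")
def shift_indices_down(records):
    return [(tuple(n - 1 for n in idx), value) for idx, value in records]


def read_as(dataset, wanted):
    """Consumer side: return the records in the wanted convention."""
    have = dataset["convention"]
    if have == wanted:
        return dataset["records"]
    if have == "proprietary":
        raise ValueError("proprietary convention: no general translation")
    if (have, wanted) not in TRANSLATORS:
        raise ValueError(f"no translation from {have} to {wanted}")
    return TRANSLATORS[(have, wanted)](dataset["records"])


# Producer writes in its own convention and says which one it is.
dataset = {"convention": "one_based", "records": [((1, 2), 0.5)]}
translated = read_as(dataset, "zero_based")
print(translated)
```

The second solution would replace the convention name by enough metadata for the consumer to build the translation routine itself.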
