C++ Templates Loops if statements Function calls Member functions by alllona


2D1263 : Scientific Computing                       Lecture 22




                       C++ Templates
        Templates can be used to write classes for
        general data types. The actual C++ code is
        generated at compile time.
        We could e.g. define objects of class
        Matrix<double> and Matrix<int>.
         + High level
         ± Optimisation
         – Template support is compiler-dependent
         – Large object files & compile time

        Templates have not been used extensively for
        computation, although there are exceptions
        (Blitz++).

        Standard Template Library (STL)
         • Common computer science classes, e.g.
           linked lists, hash tables and queues.
         • Several different implementations with
           different functionality


                   Loops & if statements
        Function for computing x_i^k, where k = 2, 3:

        #include <cmath>

        // Selection inside the loop: k is tested for every element.
        void f1(int n, double *x, int k) {
          for (int i=0 ; i<n ; i++)
             if (k==2) x[i] = pow(x[i],2);
             else x[i] = pow(x[i],3);
        }
        // Selection outside the loop: k is tested once.
        void f2(int n, double *x, int k) {
          if (k==2)
            for (int i=0 ; i<n ; i++)
                x[i] = pow(x[i], 2);
          else
            for (int i=0 ; i<n ; i++)
               x[i] = pow(x[i], 3);
        }

        f1 and f2 perform the same calculations.

        Selection statements inside a loop make
        optimisation more difficult.

        Applied to a vector of 10^4 elements, f2 is 8%
        faster, averaged over one million applications
        (g++ -O).
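The idea behind f2 (make the selection once, outside the loop) can also be expressed with a template, so that the two loops need not be written out by hand. The following is only a sketch and is not part of the lecture: the names power_loop and f3 are hypothetical, and whether it runs faster than f1 or f2 depends on the compiler and its options.

```cpp
#include <cassert>
#include <cmath>

// The exponent K is a compile-time constant, so the loop body
// contains no selection statement.
template <int K>
void power_loop(int n, double *x) {
    for (int i = 0; i < n; i++)
        x[i] = std::pow(x[i], K);
}

// Hypothetical f3: same interface as f1 and f2. The test on k is
// made once, and the matching loop is generated from the template.
void f3(int n, double *x, int k) {
    assert(k == 2 || k == 3);
    if (k == 2)
        power_loop<2>(n, x);
    else
        power_loop<3>(n, x);
}
```

A call such as f3(n, x, 2) is intended to give the same result as f1(n, x, 2) and f2(n, x, 2).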

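To make the Matrix<double> and Matrix<int> remark under C++ Templates concrete, here is a minimal sketch of what such a class template might look like. It is an assumption for illustration only (row-major storage in a std::vector, element access through operator()), not the Matrix class used in the course.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative class template: the element type T is chosen by the user,
// and the compiler generates a separate class for each T that is used.
template <typename T>
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, T()) {}

    // Element access, row-major storage (an assumption of this sketch).
    T &operator()(std::size_t i, std::size_t j) {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }
    const T &operator()(std::size_t i, std::size_t j) const {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<T> data_;
};
```

Declaring Matrix<double> A(20, 10); and Matrix<int> B(3, 3); then instantiates two independent classes from the same source text, which is one reason for the larger object files and compile times noted on the slide.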
                                                Marco Kupiainen
                                                Michael Hanke
 NADA








                          Function calls
        Rules of thumb:
         • inline small functions since the cost of the call
           may be significant compared to the
           computational work
         • pass variables by (constant) reference
           instead of value to avoid data duplication

        Virtual functions
        It is preferable to avoid short virtual functions
        which are called often.
        The actual function to call is determined at run
        time:
         • inlining is generally not possible
         • more expensive than a normal function call
        Trade-off between convenience and efficiency
        depending on purpose of code.


                         Member functions
        Some inefficiencies can usually be avoided by
        changing interfaces to member functions.
        From previous lecture, computing C = A · B

        Matrix A(20,10), B(10, 30), C(1,1);
        // Set elements of A & B
        C = A*B;

        Inside “*” a temporary object is allocated, and
        its data is copied to C via the assignment operator.
        Consider instead

        void Matrix::mult(Matrix &B, Matrix &C);

        which multiplies B from the left, and stores the
        result in C.
         + No temporary storage
            – smaller storage requirements
            – less data to copy
         – difficult to write M = A · B · C · D compactly.
         • Why not implement both?
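A minimal sketch of such a mult member function is given below. The Matrix class here is a stand-in written for this example (row-major storage, double elements); the const qualifiers and the resizing of C inside mult are assumptions, since the lecture only gives the signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    double &operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // C = (*this) * B. The product is written directly into C,
    // so no temporary Matrix is created and nothing is copied afterwards.
    void mult(const Matrix &B, Matrix &C) const {
        assert(cols_ == B.rows_);
        assert(&C != this && &C != &B);  // this sketch does not handle aliasing
        C.rows_ = rows_;
        C.cols_ = B.cols_;
        C.data_.assign(rows_ * B.cols_, 0.0);
        for (std::size_t i = 0; i < rows_; i++)
            for (std::size_t k = 0; k < cols_; k++)
                for (std::size_t j = 0; j < B.cols_; j++)
                    C(i, j) += (*this)(i, k) * B(k, j);
    }

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};
```

With A, B and C declared as in the slide, the call A.mult(B, C); replaces C = A*B;. A product of several matrices then needs explicitly named intermediate results, which is the loss of compactness mentioned above.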

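The two rules of thumb under Function calls can be illustrated as follows. The function names are hypothetical and chosen for this example; note also that inline is only a request, and the compiler decides whether the call is actually expanded in place.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Small function: the work (one multiplication) is comparable to the
// cost of a call, so it is a natural candidate for inlining.
inline double square(double v) { return v * v; }

// Pass by value: the whole vector is copied every time the function is called.
double sum_squares_by_value(std::vector<double> x) {
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); i++) s += square(x[i]);
    return s;
}

// Pass by constant reference: no copy is made, and the function
// cannot modify the caller's data.
double sum_squares(const std::vector<double> &x) {
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); i++) s += square(x[i]);
    return s;
}

// Mean of the squared elements; requires a non-empty vector.
double mean_square(const std::vector<double> &x) {
    assert(!x.empty());
    return sum_squares(x) / static_cast<double>(x.size());
}
```

Both sum functions return the same value; they differ only in whether the argument is duplicated on each call.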

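One common way to follow the advice under Virtual functions is to move the loop inside the virtual function, so that the run-time dispatch happens once per array rather than once per element. The sketch below is an illustration under that assumption; the classes Function and Square and the function apply_per_element are hypothetical names, not code from the course.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Function {
public:
    virtual ~Function() {}
    // Short virtual function: if called in a loop, there is one
    // run-time dispatch per element.
    virtual double eval(double x) const = 0;
    // Coarser interface: one dispatch per array, loop inside the derived class.
    virtual void eval_all(const std::vector<double> &x,
                          std::vector<double> &y) const = 0;
};

class Square : public Function {
public:
    double eval(double x) const override { return x * x; }
    void eval_all(const std::vector<double> &x,
                  std::vector<double> &y) const override {
        assert(x.size() == y.size());
        for (std::size_t i = 0; i < x.size(); i++)
            y[i] = x[i] * x[i];  // no virtual call inside the loop
    }
};

// Calls the short virtual function once per element.
void apply_per_element(const Function &f, const std::vector<double> &x,
                       std::vector<double> &y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); i++)
        y[i] = f.eval(x[i]);
}
```

Both variants compute the same values. The second interface is less convenient, since each derived class must write its own loop, which is the trade-off between convenience and efficiency noted on the slide.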




                     Member functions
        New definition of Matrix::operator*

        Matrix Matrix::operator*(Matrix &M) {
          Matrix tmp(1,1);         // receives the product
          (*this).mult(M, tmp);    // tmp = (*this) * M
          return tmp;
        }


                          Friends
        It is sometimes useful to relax the data
        encapsulation in order to increase efficiency.
        By defining classes or functions as friends, they
        are allowed to access all members.

        class a {
          friend class b;
        public:       // any class
        protected:    // derived classes
        private:      // classes a and b only
        };

        Objects of class b can more efficiently use
        members of a (no accessors).
        Friends introduce codependencies between
        classes which are not obvious from the class
        hierarchy.
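As a sketch of how a friend declaration can be used, the example below lets a Matrix class read and write the private storage of a Vector class directly in a matrix-vector product, while all other code must go through the checked accessors. Both classes and their member names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Vector {
    friend class Matrix;  // Matrix may access data_ directly
public:
    explicit Vector(std::size_t n) : data_(n, 0.0) {}
    // Checked accessors for all other classes.
    double get(std::size_t i) const { assert(i < data_.size()); return data_[i]; }
    void set(std::size_t i, double v) { assert(i < data_.size()); data_[i] = v; }

private:
    std::vector<double> data_;
};

class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}
    void set(std::size_t i, std::size_t j, double v) {
        assert(i < rows_ && j < cols_);
        data_[i * cols_ + j] = v;
    }

    // y = A * x, using the private storage of x and y (no accessor calls).
    void apply(const Vector &x, Vector &y) const {
        assert(&x != &y);
        assert(x.data_.size() == cols_ && y.data_.size() == rows_);
        for (std::size_t i = 0; i < rows_; i++) {
            double s = 0.0;
            for (std::size_t j = 0; j < cols_; j++)
                s += data_[i * cols_ + j] * x.data_[j];
            y.data_[i] = s;
        }
    }

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};
```

The price is the hidden dependency mentioned on the slide: changing the representation of Vector now also requires changes in Matrix, although neither class is derived from the other.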






                        Class hierarchy
        The implementation of an object-oriented code
        could proceed roughly according to
         1. Determine which problem you wish to solve
            now, and possible extensions.
         2. What concepts/abstractions are needed?
            Iterate:
            • Try to associate each abstraction with a
              well-defined class.
            • Determine relations between classes.
            • Determine interfaces
         3. Implement if the class hierarchy is suitable
            for possible extensions.
         4. Optimisation and/or generalisation


              C++ vs other languages – an example
        Solution of 3D Euler equations on Sun Ultra (pp. 73–85 lecture
        notes).

        Performance (Mflops)

        Code                              30^3 points   10^3 points
        f77 (update in several loops)         22            28
        f77 (update in one loop)              31            33
        f90                                   16            16
        C++ (“f77 style”)                     20            20
        C++ (arc class)                        9            10
        C++ (A++ library)                      1.7           1.7

        Results are likely compiler dependent.




                               C++ vs other languages
        The Blitz++ project has shown that C++ can actually be made
        more efficient than Fortran 77.
        Based on templates with sophisticated compile-time
        cache-optimisation of computational kernels.
        Blitz++ home page: http://www.oonumerics.org/blitz/
        Problem: Compiles only with certain compilers!
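        One template technique associated with Blitz++ is known as expression templates: an arithmetic expression is encoded in a type at compile time and evaluated in a single loop, without temporary objects. The sketch below only illustrates the principle for vector addition. It is not Blitz++ code, it performs no cache optimisation, and the names Vec and Sum are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Node representing l + r. Nothing is computed until an element is requested.
template <typename L, typename R>
struct Sum {
    const L &l;
    const R &r;
    double operator[](std::size_t i) const { return l[i] + r[i]; }
    std::size_t size() const { return l.size(); }
};

class Vec {
public:
    explicit Vec(std::size_t n, double v = 0.0) : d_(n, v) {}
    double operator[](std::size_t i) const { return d_[i]; }
    double &operator[](std::size_t i) { return d_[i]; }
    std::size_t size() const { return d_.size(); }

    // Evaluates the whole expression in one loop, writing directly into *this.
    template <typename L, typename R>
    Vec &operator=(const Sum<L, R> &e) {
        assert(e.size() == d_.size());
        for (std::size_t i = 0; i < d_.size(); i++) d_[i] = e[i];
        return *this;
    }

private:
    std::vector<double> d_;
};

// operator+ only builds the expression node; no element is added here.
inline Sum<Vec, Vec> operator+(const Vec &a, const Vec &b) {
    Sum<Vec, Vec> s = {a, b};
    return s;
}
template <typename L, typename R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R> &a, const Vec &b) {
    Sum<Sum<L, R>, Vec> s = {a, b};
    return s;
}
```

With Vec objects a, b, c and r of equal length, the statement r = a + b + c; is then compiled into one loop over the elements, with no intermediate Vec. The expression nodes hold references, so they must be consumed within the same statement, as done here.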


        The linear algebra routines implemented for Diffpack have efficiency
        comparable to optimised Fortran 77 code (sometimes even better on
        Sun).



								