Common Analysis Services Design Mistakes and How to .._1_ by pptfiles


									Common Analysis Services Design
Mistakes and How to Avoid Them
          Chris Webb
                Who Am I?
• Chris Webb
• Independent Analysis Services and MDX
  consultant and trainer
• SQL Server MVP
• Blogger:
•   Why good cube design is a Good Thing
•   Using built-in best practices in BIDS
•   ETL in your DSV
•   User-unfriendly names
•   Unnecessary attributes
•   Parent/child pain
•   One cube or many?
•   Over-reliance on MDX
•   Unused and/or unprocessed aggregations
  Why Good Design is Important!
• As if you needed reasons…?
• Good design =       good performance
                  =   faster initial development
                  =   easy further development
                  =   simple maintenance
• This is not an exhaustive list, but a selection of
  design problems and mistakes I’ve seen on
  consultancy engagements
         Best Practices in BIDS
• Don’t ignore the blue squiggly lines in BIDS!
  – They sometimes make useful recommendations
    about what you’re doing
• Actively dismissing them, with comments, is a
  useful addition to documentation
• As always, official ‘best practices’ aren’t
  always best practices in all situations
      Common Design Mistakes
• Three questions need to be asked:
  – What’s the problem?
  – What bad things will happen as a result?
  – What can I do to fix it (especially after I’ve gone
    into production)?
• This is not a name-and-shame session!
        Problem: ETL in your DSV
• It’s very likely, when you are working in SSAS, that
  you need changes to the underlying relational
  structures and data
   – Eg you need a new column in a table
• You then have two options:
   – Go back to the relational database and/or ETL and
     make the change
   – Hack something together in the DSV using named
     queries and named calculations
• The DSV is the easy option, but…
  Consequences: ETL in your DSV
• It could slow down processing performance
  – No way to influence the SQL that SSAS generates
  – Expensive calculations/joins are better done once
    then persisted in the warehouse; you may need
    to process more than once
• It makes maintenance much harder
  – DSV UI is not great for writing SQL
  – Your DBA or warehouse developer certainly won’t
    be looking at it
           Fix: ETL in your DSV
• Bite the bullet and either:
  – Do the necessary work in the underlying tables or
    ETL packages
  – Create a layer of views instead of using named
    queries and calculations
• Use the Replace Table With option to point
  the table in the DSV at your new view/table
• No impact on the rest of the cube!
     Problem: Unfriendly Names
• Cubes, dimensions and hierarchies need to
  have user-friendly names
• However names are often user-unfriendly
  – Unchanged from what the wizard suggests, or
  – Use some kind of database naming convention
• Designing a cube is like designing a UI
• Who wants a dimension called something like
  “Dim Product”….?
 Consequences: Unfriendly Names
• Unfriendly names put users off using the cube
  – These are the names that users will see in their
    reports, so they must be ‘report ready’
  – Users need to understand what they’re selecting
• Also encourage users to export data out of
  cube to ‘fix’ the names
  – And so you end up with stale data, multiple
    versions of the truth etc etc etc
         Fix: Unfriendly Names
• You can rename objects easily, but:
  – This can break calculations on the cube
  – It can also break existing queries and reports,
    which will need rewriting/rebuilding
  – IDs will not change, which makes working with
    XMLA confusing
• You should agree the naming of objects with
  end users before you build them!
 Problem: Unnecessary Attributes
• Wizards often generate attributes on
  dimensions that users don’t want or need
• Classic example is an attribute built from a
  surrogate key column
  – Who wants to show a surrogate key in a report?
Consequences: Unnecessary Attributes
• The more attributes you have:
  – The more cluttered and less useable your UI
  – The slower your dimension processing
  – The harder it is to come up with an effective
    aggregation design
     Fix: Unnecessary Attributes
• Delete any attributes that your users will
  never use
• Merge attributes based on key and name
  columns into a single attribute
• Set AttributeHierarchyEnabled to false for
  ‘property’ attributes like email addresses
• Remember that deleting attributes that are
  used in reports or calculations can cause more
 Problem: Parent Child Hierarchies
• Parent Child hierarchies are the only way to
  model hierarchies where you don’t know the
  number of levels in advance
• They are also very flexible, leading some
  people to use them more often than they
     Consequences: Parent Child
• Parent Child hierarchies can lead to slow
  query performance
  – No aggregations can be built at levels inside the
  – Slow anyway
• They can also be a nightmare for
  – Scoping advanced MDX calculations
  – Dimension security
               Fix: Parent Child
• If you know, or can assume, the maximum depth
  of your hierarchy, there’s an alternative
• Normal user hierarchies can be made ‘Ragged’
  with the HideMemberIf property
   – Hides members if their parent has no name, or the
     same name as them
• Still has performance issues, but less than
• You can use the BIDS Helper “parent/child
  naturaliser” to convert the underlying relational
  table to a level-based structure
   Problem: One Cube or Many?
• When you have multiple fact tables do you
  – One cube with multiple measure groups?
  – Multiple cubes with one measure group?
• Each has its own pros and cons that need to
  be understood
      Consequences: One Cube
• Monster cubes containing everything can be
  intimidating and confusing for users
• Also tricky to develop, maintain and test
  – Often changing one thing breaks another
  – Making changes may take the whole cube offline
• Securing individual measure groups is a pain
• If there are few common dimensions between
  measure groups and many calculations, query
  performance can suffer
   Consequences: Multiple Cubes
• If you need to analyse data from many cubes
  in one query, options are very limited
• A single cube is the only way to go if you do
  need to do this
• Even if you don’t think you need to do it now,
  you probably will do in the future!
      Fix: One Cube to Multiple
• If you have Enterprise Edition, Perspectives
  can help overcome usability issues
• Linked measure groups/dimensions can also
  be used to split out more cubes for security
• If you have one cube, you probably don’t want
  to split it up though
     Fix: Multiple Cubes to One
• Start again from scratch!
• LookUpCube() is really bad for performance
• Linked measure groups and dimensions have
  their own problems:
  – Duplicate MDX code
  – Structural changes require linked dimensions to
    be deleted and recreated
  Problem: Over-reliance on MDX
• As with the DSV, it can be tempting to use
  MDX calculations instead of making structural
  changes to cubes and dimensions
• A simple example is to create a ‘grouping’
  calculated member instead of creating a new
• Other examples include pivoting measures
  into a dimension, or doing m2m in MDX
Consequences: Over-reliance on MDX
• MDX should always be your last resort:
• Pure MDX calculations are always going to be
  the slowest option for query performance
• They are also the least-easily maintainable
  part of a cube
• The more complex calculations you have, the
  more difficult it is to make other calculations
      Fix: Over-reliance on MDX
• Redesigning your cube is a radical option but
  can pay big dividends in terms of performance
• Risks breaking existing reports and queries but
  your users may be ok with this to get more
  Problem: Unused Aggregations
• Aggregations are the most important SSAS
  feature for performance
• Most people know they need to build some
  and run the Aggregation Design Wizard…
• …but don’t know whether they’re being used
  or not
 Consequences: Unused Aggregations
• Slow queries!
• If you haven’t built the right aggregations,
  then your queries won’t get any performance
• You’ll waste time processing these
  aggregations, and waste disk space storing
      Fix: Unused Aggregations
• Design some aggregations!
• Rerun the Aggregation Design Wizard and set
  the Aggregation Usage property appropriately
• Perform Usage-Based Optimisation
• Design aggregations manually for queries that
  are still slow and could benefit from
 Problem: Unprocessed Aggregations
• Even if you’ve designed aggregations that are
  useful for your queries, you need to ensure
  they’re processed
• Running a Process Update on a dimension will
  drop all Flexible aggregations
      Consequences: Unprocessed
• Slow queries! (Again)
   Fix: Unprocessed Aggregations
• Run a Process Default or a Process Index on
  your cube after you have run a Process Update
  on any dimensions
• Note that this will result in:
  – Longer processing times overall
  – More disk space used
• But it will at least mean that your queries run
                         Coming up…
The Developer Side of the Microsoft Business Intelligence stack
Sascha Lorenz
Understanding SARGability (to make your queries run faster)
Rob Farley
Notes from the field: High Performance storage for SQL Server
Justin Langford
Service Broker: Message in a bottle
Klaus Aschenbrenner
Save the Pies for Lunch - Data Visualisation Techniques with SSRS 2008
Tim Kent


To top