Common Analysis Services Design Mistakes and How to Avoid Them
Chris Webb
www.crossjoin.co.uk
Who Am I?
• Chris Webb (chris@crossjoin.co.uk)
• Independent Analysis Services and MDX consultant and trainer
• SQL Server MVP
• Blogger: http://cwebbbi.spaces.live.com
Agenda
• Why good cube design is a Good Thing
• Using built-in best practices in BIDS
• ETL in your DSV
• User-unfriendly names
• Unnecessary attributes
• Parent/child pain
• One cube or many?
• Over-reliance on MDX
• Unused and/or unprocessed aggregations
Why Good Design is Important!
• As if you needed reasons…?
• Good design means good performance, faster initial development, easier further development and simpler maintenance
• This is not an exhaustive list, but a selection of design problems and mistakes I’ve seen on consultancy engagements
Best Practices in BIDS
• Don’t ignore the blue squiggly lines in BIDS!
  – They sometimes make useful recommendations about what you’re doing
• Actively dismissing them, with comments, is a useful addition to your documentation
• As always, official ‘best practices’ aren’t always best practices in all situations
Common Design Mistakes
• Three questions need to be asked:
  – What’s the problem?
  – What bad things will happen as a result?
  – What can I do to fix it (especially after I’ve gone into production)?
• This is not a name-and-shame session!
Problem: ETL in your DSV
• It’s very likely, when you are working in SSAS, that you’ll need changes to the underlying relational structures and data
  – E.g. you need a new column in a table
• You then have two options:
  – Go back to the relational database and/or ETL and make the change
  – Hack something together in the DSV using named queries and named calculations
• The DSV is the easy option, but…
Consequences: ETL in your DSV
• It can slow down processing performance
  – There is no way to influence the SQL that SSAS generates
  – Expensive calculations/joins are better done once and then persisted in the warehouse; you may need to process more than once
• It makes maintenance much harder
  – The DSV UI is not great for writing SQL
  – Your DBA or warehouse developer certainly won’t be looking at it
Fix: ETL in your DSV
• Bite the bullet and either:
  – Do the necessary work in the underlying tables or ETL packages, or
  – Create a layer of views instead of using named queries and calculations (see the sketch below)
• Use the Replace Table With option to point the table in the DSV at your new view/table
• There is no impact on the rest of the cube!
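
To make the view-layer option concrete, here is a minimal T-SQL sketch; the FactSales table, its columns and the derived ProfitAmount column are all hypothetical, but the pattern is the point: the calculation is done once, in the warehouse, where your DBA can see it.

  -- A derived column defined in a view, not in a DSV named calculation
  CREATE VIEW dbo.vwFactSales
  AS
  SELECT
      fs.ProductKey,
      fs.OrderDateKey,
      fs.SalesAmount,
      fs.TotalProductCost,
      -- the ‘new column’, visible to the DBA and reusable by other tools
      fs.SalesAmount - fs.TotalProductCost AS ProfitAmount
  FROM dbo.FactSales AS fs;

Use Replace Table With to swap dbo.FactSales for dbo.vwFactSales in the DSV and nothing downstream in the cube needs to change.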
Problem: Unfriendly Names
• Cubes, dimensions and hierarchies need to have user-friendly names
• However, names are often user-unfriendly
  – Left unchanged from what the wizard suggests, or
  – Following some kind of database naming convention
• Designing a cube is like designing a UI
• Who wants a dimension called something like “Dim Product”…?
Consequences: Unfriendly Names
• Unfriendly names put users off using the cube
  – These are the names that users will see in their reports, so they must be ‘report ready’
  – Users need to understand what they’re selecting
• They also encourage users to export data out of the cube to ‘fix’ the names
  – And so you end up with stale data, multiple versions of the truth, and so on
Fix: Unfriendly Names
• You can rename objects easily, but:
  – Renaming can break calculations on the cube
  – It can also break existing queries and reports, which will need rewriting or rebuilding
  – IDs will not change when names do, which makes working with XMLA confusing
• You should agree the naming of objects with end users before you build them!
Problem: Unnecessary Attributes
• Wizards often generate attributes on dimensions that users don’t want or need
• The classic example is an attribute built from a surrogate key column
  – Who wants to show a surrogate key in a report?
Consequences: Unnecessary Attributes
• The more attributes you have:
  – The more cluttered and less usable your UI
  – The slower your dimension processing
  – The harder it is to come up with an effective aggregation design
Fix: Unnecessary Attributes
• Delete any attributes that your users will never use
• Merge attributes based on key and name columns into a single attribute
• Set AttributeHierarchyEnabled to false for ‘property’ attributes like email addresses (see the fragment below)
• Remember that deleting attributes that are used in reports or calculations can cause more problems
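
In the dimension’s XMLA definition this corresponds to the AttributeHierarchyEnabled element on the attribute; a fragment with hypothetical names, shown only to make the setting concrete (the attribute remains available as a member property):

  <Attribute>
    <ID>Email Address</ID>
    <Name>Email Address</Name>
    <!-- no attribute hierarchy is built: nothing to clutter the UI,
         nothing extra to process or to consider in aggregation design -->
    <AttributeHierarchyEnabled>false</AttributeHierarchyEnabled>
  </Attribute>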
Problem: Parent/Child Hierarchies
• Parent/child hierarchies are the only way to model hierarchies where you don’t know the number of levels in advance
• They are also very flexible, which leads some people to use them more often than they should
Consequences: Parent/Child
• Parent/child hierarchies can lead to slow query performance
  – No aggregations can be built at levels inside the hierarchy
  – They are slow even without that restriction
• They can also be a nightmare for:
  – Scoping advanced MDX calculations
  – Dimension security
Fix: Parent/Child
• If you know, or can assume, the maximum depth of your hierarchy, there’s an alternative
• Normal user hierarchies can be made ragged with the HideMemberIf property
  – This hides members if their parent has no name, or the same name as them
• This still has performance issues, but fewer than parent/child
• You can use the BIDS Helper “parent/child naturaliser” to convert the underlying relational table to a level-based structure, as sketched below
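
To show what naturalising means, here is a minimal T-SQL sketch; the DimEmployee table, its EmployeeKey/ParentEmployeeKey/EmployeeName columns and the three-level maximum depth are all assumptions for illustration (BIDS Helper generates the equivalent for you):

  -- Flatten a parent/child table into fixed level columns via self-joins.
  -- COALESCE repeats the deepest real member down the remaining levels,
  -- so HideMemberIf = ParentName can hide the padding members.
  SELECT
      l1.EmployeeKey  AS Level1Key,
      l1.EmployeeName AS Level1Name,
      COALESCE(l2.EmployeeKey,  l1.EmployeeKey)  AS Level2Key,
      COALESCE(l2.EmployeeName, l1.EmployeeName) AS Level2Name,
      COALESCE(l3.EmployeeKey,  l2.EmployeeKey,  l1.EmployeeKey)  AS Level3Key,
      COALESCE(l3.EmployeeName, l2.EmployeeName, l1.EmployeeName) AS Level3Name
  FROM dbo.DimEmployee AS l1
  LEFT JOIN dbo.DimEmployee AS l2 ON l2.ParentEmployeeKey = l1.EmployeeKey
  LEFT JOIN dbo.DimEmployee AS l3 ON l3.ParentEmployeeKey = l2.EmployeeKey
  WHERE l1.ParentEmployeeKey IS NULL;

Because the last real key is repeated down to the bottom level, the fact table joins to the lowest level’s key column.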
Problem: One Cube or Many?
• When you have multiple fact tables, do you create:
  – One cube with multiple measure groups, or
  – Multiple cubes with one measure group each?
• Each approach has its own pros and cons that need to be understood
Consequences: One Cube
• Monster cubes containing everything can be intimidating and confusing for users
• They are also tricky to develop, maintain and test
  – Often changing one thing breaks another
  – Making changes may take the whole cube offline
• Securing individual measure groups is a pain
• If there are few common dimensions between measure groups and many calculations, query performance can suffer
Consequences: Multiple Cubes
• If you need to analyse data from many cubes in one query, your options are very limited
• A single cube is the only way to go if you do need to do this
• Even if you don’t think you need to do it now, you probably will in the future!
Fix: One Cube to Multiple
• If you have Enterprise Edition, Perspectives can help overcome the usability issues
• Linked measure groups and dimensions can also be used to split out separate cubes for security purposes
• If you already have one cube, though, you probably don’t want to split it up
Fix: Multiple Cubes to One
• Start again from scratch!
• LookupCube() is really bad for performance (see the example below)
• Linked measure groups and dimensions have their own problems:
  – Duplicated MDX code
  – Structural changes require linked dimensions to be deleted and recreated
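
To illustrate why LookupCube() is the wrong way to combine cubes, here is a hypothetical MDX calculated measure with invented cube and measure names; treat it as an anti-pattern sketch, not a recommendation:

  // A calculated measure that pulls a value from a second cube.
  // LookupCube() issues a separate internal query for every cell it is
  // evaluated for, so a large report can trigger thousands of lookups.
  CREATE MEMBER CURRENTCUBE.[Measures].[Warehouse Cost] AS
      LookupCube("Warehouse", "[Measures].[Total Cost]");

Real uses are worse still, because the current query context has to be spliced into the second argument as a string.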
Problem: Over-reliance on MDX
• As with the DSV, it can be tempting to use MDX calculations instead of making structural changes to cubes and dimensions
• A simple example is creating a ‘grouping’ calculated member instead of creating a new attribute (see the sketch below)
• Other examples include pivoting measures into a dimension, or doing many-to-many relationships in MDX
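
For illustration, a hypothetical grouping calculated member of the kind described; the member keys and names are invented, and a real grouping attribute on the Product dimension would perform far better:

  // Anti-pattern: grouping two categories by aggregating them in MDX
  // instead of adding a proper grouping attribute to the dimension
  CREATE MEMBER CURRENTCUBE.[Product].[Category].[All].[Bikes and Accessories] AS
      Aggregate({ [Product].[Category].&[1], [Product].[Category].&[4] });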
Consequences: Over-reliance on MDX
• MDX should always be your last resort:
  – Pure MDX calculations are always going to be the slowest option for query performance
  – They are also the least easily maintainable part of a cube
  – The more complex calculations you have, the more difficult it is to make other calculations work
Fix: Over-reliance on MDX
• Redesigning your cube is a radical option, but it can pay big dividends in terms of performance
• It risks breaking existing reports and queries, but your users may be OK with that in exchange for more speed
Problem: Unused Aggregations
• Aggregations are the most important SSAS feature for query performance
• Most people know they need to build some and run the Aggregation Design Wizard…
• …but they don’t know whether the resulting aggregations are being used or not
Consequences: Unused Aggregations
• Slow queries!
• If you haven’t built the right aggregations, your queries won’t get any performance benefit
• You’ll waste time processing these aggregations, and waste disk space storing them
Fix: Unused Aggregations
• Design some aggregations!
• Rerun the Aggregation Design Wizard and set the AggregationUsage property appropriately
• Perform Usage-Based Optimisation (see the query sketch below)
• Design aggregations manually for queries that are still slow and could benefit from aggregations
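
Usage-Based Optimisation reads from the server’s query log; assuming the QueryLog server properties are enabled and the default OlapQueryLog table has been populating, a T-SQL sketch like this (default table and column names, but verify against your own setup) shows which Storage Engine requests are frequent and slow:

  -- Find the most expensive request patterns; these Dataset strings are
  -- what the Usage-Based Optimisation wizard designs aggregations for
  SELECT TOP 20
      Dataset,
      COUNT(*)      AS Requests,
      AVG(Duration) AS AvgDurationMs
  FROM dbo.OlapQueryLog
  GROUP BY Dataset
  ORDER BY AVG(Duration) DESC;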
Problem: Unprocessed Aggregations
• Even if you’ve designed aggregations that are useful for your queries, you need to ensure they’re processed
• Running a Process Update on a dimension will drop all flexible aggregations
Consequences: Unprocessed Aggregations
• Slow queries! (Again)
Fix: Unprocessed Aggregations
• Run a Process Default or a Process Index on your cube after you have run a Process Update on any dimensions (see the XMLA sketch below)
• Note that this will result in:
  – Longer processing times overall
  – More disk space used
• But it will at least mean that your queries run faster
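
A minimal XMLA sketch of that processing order; the database, dimension and cube IDs are hypothetical:

  <Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
    <!-- Process Update picks up dimension changes but drops flexible aggregations -->
    <Process>
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <DimensionID>Dim Product</DimensionID>
      </Object>
      <Type>ProcessUpdate</Type>
    </Process>
    <!-- Process Indexes then rebuilds the dropped aggregations and indexes -->
    <Process>
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works</CubeID>
      </Object>
      <Type>ProcessIndexes</Type>
    </Process>
  </Batch>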
Thanks!
Coming up…
• P/X001 – The Developer Side of the Microsoft Business Intelligence stack – Sascha Lorenz
• P/L001 – Understanding SARGability (to make your queries run faster) – Rob Farley
• P/L002 – Notes from the field: High Performance storage for SQL Server – Justin Langford
• P/L005 – Service Broker: Message in a bottle – Klaus Aschenbrenner
• P/T007 – Save the Pies for Lunch - Data Visualisation Techniques with SSRS 2008 – Tim Kent

#SQLBITS

				