					Evolving the Enterprise’s
 Database Infrastructure

      “Move to the Grid”
                   Agenda
•   Problems
•   Introduction to Oracle 10g features
•   Demonstrate impact on the Enterprise
•   Propose Phase I Project
    – Consolidation
    – Scalable Grid Architecture
           Top 7 Problems
              for DBAs
• Growth in the number and size of databases
  does not match staffing levels
• Root causes of performance bottlenecks
  are not obvious or easily diagnosed
• After a session ends, statistics and
  troubleshooting information are not always
  available
• Databases are shoehorned onto servers
  without consideration of correct layout
  leading to IO bottlenecks
           Top 7 Problems
              for DBAs
• Impossible to manually monitor and tune
  all databases
• Managing storage correctly is very time
  consuming
• Database tuning is part experience, part
  science, part art and part intuition.
           Top Problems
           for Sysadmins
• Many different servers, different
  architectures
• High number of databases per single
  node – complex to schedule maintenance
  windows
• Grey area between DBA and sysadmin
  responsibilities
             New in 10g
•   The vision for the grid
•   10g not a regular database upgrade
•   RAC enhancements
•   Backup strategy
•   ASM (Automatic Storage Management)
•   ADDM & Advisors
•   DataGuard
     Problems Solved in 10g
           for DBAs
• Some tedious and time consuming DBA
  tasks are now managed by Oracle
• Oracle will identify root causes of
  performance issues and rank the
  effectiveness of fixing them
• Oracle stores statistics about every
  session in its repository
• ASM will rebalance hot spots making it
  easier to have many databases on a
  server
     Problems Solved in 10g
           for DBAs
• 10g metrics and alerts will allow the DBAs
  to be more proactive by providing out of
  the box alerts
• ASM will allow for Oracle to manage
  storage reducing this very time consuming
  problem
• Oracle 10g provides advisors for tuning
     The vision for the Grid
• The “g” in 10g
• Grid is not RAC, RAC is not Grid
• Treat all computing resources like a utility
  in all layers of the product stack
• Clustered application servers (iAS cluster)
• Clustered database (RAC)
• Automatic Storage management (ASM)
  for provisioning Storage
     The vision for the Grid
• Scalability – Easily add more resources
• Management, monitoring and provisioning
  with “Grid Control”
• Virtualization of resources – Applications
  are not tied to specific hardware but rather
  see one large pool of resources
   10g NOT a regular database
            upgrade

• Big learning curve
• Changes at all levels of the hardware
  stack
• Good opportunity to define job
  responsibilities in relation to the hardware
  stack
    The grid hardware stack
• Application servers (ISR/NCS depending on
  application)
• Databases (DBA Team / ISR)
• Load balancers/Interconnects/Network
  Infrastructure (NCS)
• Servers (NCS Sysadmins)
• Storage Architect (NCS)
• Cluster (Sysadmins/Storage Architects)
• Firewall appliances (NCS)
• Backups (DBA / NBU Admins)
      RAC Enhancements
• FAN – Fast Application Notification
• Smarter load balancing across nodes
  – Can now mix different classes of servers in
    your cluster, which makes it possible to
    leverage existing hardware
  – Before grid, some servers were almost always
    idle and some were never idle; grid makes the
    best use of resources
• Assign % of CPU usage to a Service
• Better management of workload
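
To illustrate why mixing server classes can still use every node well, here is a minimal sketch of capacity-weighted connection sharing. It is a toy model, not Oracle's load-balancing algorithm; the node names and capacity weights are hypothetical.

```python
def share_connections(total, capacity):
    """Split `total` connections across nodes in proportion to capacity.

    `capacity` maps a node name to a relative weight (for example a CPU
    count). Illustrative only: this is not how RAC computes its advice.
    """
    weight_sum = sum(capacity.values())
    # Integer share for each node, rounded down.
    shares = {node: total * weight // weight_sum
              for node, weight in capacity.items()}
    # Rounding down can leave a few connections unassigned;
    # hand them to the largest nodes first.
    leftover = total - sum(shares.values())
    for node in sorted(capacity, key=capacity.get, reverse=True)[:leftover]:
        shares[node] += 1
    return shares


# Hypothetical cluster mixing three classes of server.
cluster = {"big": 8, "mid": 4, "small": 2}
print(share_connections(100, cluster))
```

The larger server takes most of the work, but the small one is no longer idle.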
       Backup Philosophy in 10g

• Backups go to disk not tape
• Flashback logs
   – Supports flashback database and recovery through
     resetlogs
• Flash recovery area
   –   On disk
   –   Holds one full backup
   –   Holds all Incrementals
   –   Archive & flashback logs
   –   Backed up and managed by RMAN
   –   Flash recovery area backed up to Tape
   –   Best practice: Use ASM for this area
   –   Shared by all instances on server
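
Since the flash recovery area holds one full backup, all incrementals, and the archive and flashback logs, its size can be estimated with simple arithmetic. The sketch below assumes a fixed retention window and constant daily volumes; every figure in it is hypothetical.

```python
def flash_recovery_area_gb(full_backup_gb, incremental_gb_per_day,
                           archive_gb_per_day, flashback_gb_per_day,
                           retention_days):
    """Rough disk estimate for a flash recovery area.

    One full backup, plus the incrementals, archive logs and flashback
    logs kept for `retention_days`. A planning sketch only; real sizing
    depends on the retention policy and actual change rates.
    """
    daily_gb = (incremental_gb_per_day + archive_gb_per_day
                + flashback_gb_per_day)
    return full_backup_gb + daily_gb * retention_days


# Hypothetical database: 500 GB full backup, 7 days kept on disk.
print(flash_recovery_area_gb(500, 20, 15, 10, 7))  # 500 + 45 * 7 = 815
```

Because the area is shared by all instances on a server, the per-database estimates would be summed.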
     Backup philosophy in 10g

• Benefits
   – Most backup failures are now due to NBU, at a rate of
     5 or 6 per day. Each requires operations to resubmit the
     backup and DBA time to follow up.
   – Backups now take 4-6 hours (for MCGP)
   – Lots of time is spent waiting on tape
   – Recovery from tape is slow; new features help minimize
     downtime
   – All files needed for recovery are in the same location
   – Having this on ASM minimizes the work to maintain
     archivelog free space (avoids a database hang)
         Automatic Storage
           management
• Oracle’s “Smart” Filesystem
• DBAs only have to deal with a few diskgroups
  rather than trying to fit datafiles on fixed-size
  mountpoints.
• Raw partitions have always been recommended
  for performance but before ASM were very
  difficult to manage
• ASM can stripe and mirror your storage
  (Optional)
• ASM can rebalance to avoid hot spots
• Managing storage is very time consuming to do
  right; ASM does the tedious tasks for you.
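
The idea behind striping and rebalancing can be shown with a toy model: spread extents evenly over the disks in a diskgroup, then spread them again when a disk is added. This is only an illustration with hypothetical disk names. It is not ASM's actual placement algorithm, which among other things avoids moving more data than necessary.

```python
from collections import Counter


def stripe(extents, disks):
    """Place extents on disks round-robin so each disk gets an even share."""
    return {extent: disks[i % len(disks)] for i, extent in enumerate(extents)}


def rebalance(placement, disks):
    """Re-spread the same extents after the list of disks has changed.

    Toy version: it simply stripes everything again from scratch.
    """
    return stripe(sorted(placement), disks)


def extents_per_disk(placement):
    """Count how many extents landed on each disk."""
    return Counter(placement.values())


# 12 extents over 3 disks: 4 extents on each disk.
placement = stripe(range(12), ["disk1", "disk2", "disk3"])
print(extents_per_disk(placement))

# Add a fourth disk and rebalance: 3 extents on each disk.
placement = rebalance(placement, ["disk1", "disk2", "disk3", "disk4"])
print(extents_per_disk(placement))
```

With data spread evenly, no single disk becomes a hot spot just because one busy datafile happens to live on it.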
        ADDM & Advisors
• Oracle has internalized metric collection in
  10g
• ADDM runs and looks for problems
• ADDM will recommend the use of advisors
  to further investigate the problem
• Will help the DBA (and developer) by
  providing tuning advice.
               DataGuard
• What is redo?
• RAC = Instance availability
• DataGuard = Database availability
• Logical and Physical standby
• Protect database vs. Provide service
• All enterprise systems should have DataGuard
• Imagine losing an hour of committed
  transactions in Banner or Vista?
• Time to rebuild an enterprise system?
• Uses for DWH
      Phase I Project scope
• Bring in required infrastructure
• Consolidate
   – Tempest/Squall replaced with scalable grid
     technology
   – Migrate DORACs/ORACs into this architecture
   Phase I Project scope

Current grid control implementation not
highly available
– Migrate Grid Control repository database to
  RAC.
– Cluster application server, Norad2
– Leverage virtualization
        Required Infrastructure
            (Grid Control)
• Have been using grid control for the past two years since it
  was beta
• Not optional in 10g*
• Has helped us to develop standards and be proactive
• Upgrade to release 2 in progress
• Release 2 improves on provisioning and RAC management
• Will be used by developers as well as DBAs when we go to
  10g
• In release 2, Oracle has partnered with third parties to
  deploy agents on non-Oracle software and appliances,
  including SQL Server, WebLogic and F5 load balancers
        Losing Grid Control

•   No monitoring and alerts for databases
•   No GUI to manage 10g databases
•   Loss of tools for programmers and DBAs
•   Scheduled DBA jobs would not run
       Required Infrastructure
               (OID)
Oracle Internet Directory
• ONAMES is no longer supported as a naming method in
  10g. ONAMES is a central naming service used to
  translate a name to a connect string and is needed for
  connectivity.
• Bridge from Oracle products to Active Directory
  for single sign-on and authentication
• Could have many other uses to manage and
  simplify security in Oracle products (Needs more
  research)
• Should be highly available or risk users not being
  able to connect to databases
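
What a central naming service does can be sketched in a few lines: look up a short name and return the full connect descriptor. The dictionary below stands in for the directory, and the alias, host and service names are hypothetical. A real deployment would query OID over LDAP rather than an in-memory table.

```python
def connect_descriptor(host, port, service_name):
    """Build an Oracle Net connect descriptor string."""
    return (f"(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))"
            f"(CONNECT_DATA=(SERVICE_NAME={service_name})))")


# Stand-in for the central directory: alias -> (host, port, service name).
# All entries are made up for illustration.
DIRECTORY = {
    "hrprod": ("dbhost1.example.edu", 1521, "hrprod.example.edu"),
    "hrtest": ("dbhost2.example.edu", 1521, "hrtest.example.edu"),
}


def resolve(alias):
    """Translate a short database name into a connect descriptor."""
    try:
        host, port, service_name = DIRECTORY[alias.lower()]
    except KeyError:
        raise LookupError(f"could not resolve database name {alias!r}")
    return connect_descriptor(host, port, service_name)


print(resolve("HRPROD"))
```

If the lookup service is down, every `resolve` call fails, which is why the directory has to be highly available.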
       Required Infrastructure
               (OID)
Establish a two node OID, objectives:
• Replace ONAMES and shared TNSNAMES files
  as a standard naming method
• Clean up of all names as well as investigate the
  use of global_names
• Replace infra1.portal.mcgill.ca for managing
  authentication. (Migrate asdb instance on infra1
  to RAC - solely for Portal metadata)
            Infrastructure
         (worth investigating)
WebCache
• Part of Oracle application server install
• Used by Portal (but not currently installed in HA
  config)
   – Should be made highly available
• Should have a better understanding of how it
  works
• Can it benefit more than just the portal? (Improve
  Registration?)
• Investigate “TimesTen” data cache
              Consolidation
               (Tempest/Squall)
• Tempest and Squall are servers funded by NCS
  as per a Tony Masi initiative to consolidate
  disparate databases from across campus.
• Tempest is a test server containing 12
  databases.
• Squall is a production server containing 20
  databases
• Databases serve mostly E-business group’s
  clients, ICS (HEAT) and ARR (Scheduling)
• On-going demand for new databases
• Difficult to estimate capacity and resource needs
• Not scalable and not highly available
• Best candidate for new architecture
                  Consolidation
                   (Tempest/Squall)
• Set up a 10g test grid to replace Tempest
• Set up a 10g production grid to replace Squall
• Migrate to the 10g grid any applications on Tempest/Squall
  for which 10g is supported, as well as all McGill-developed
  applications currently residing on Tempest/Squall.
• Migrate NCS databases
• Production Grid will provide a location for any 10g database
  that needs to be highly available (Grid Control repository,
  Portal repository)
• Project should include consultant from Oracle to review
  plan, discuss best practices and guide in initial setup of test
  environment.
• Good learning experience before restructuring large
  Enterprise systems (Vista, Banner)
          Risks of non-action
• Not a Tony Masi “Top 5” project but if we do not get Phase I
  accomplished and gain the needed knowledge we will not
  meet next year’s objectives (i.e. Vista upgrade, Banner
  upgrade)
• Staff resources continue to be stressed
• Advantages of new best practices for RMAN and backups
  of flash recovery area
• Development of methodology for migrating to Cost based
  optimizer
• Learning best practices for ASM on Hitachi SAN
• Benefiting from new features in OEM (monitoring, tuning
  and provisioning)
• New failover and load balancing features on RAC (FAN –
  Fast Application Notification)
• Setup and configuration of 10g RAC
       Key Skills to Develop

• Best practice to migrate 9i RAC to 10g
  RAC
• Correct use of WebCache
• Understand implications of
  global_names=true
• Get developers up to speed on writing
  good code and performance tuning as
  well as trained on using new 10g tools
• Oracle Internet Directory
               Summary
• Big learning curve
• Need to move forward or future projects
  will be in jeopardy of failure
• All levels of hardware stack are implicated

				