Docstoc

96 Foresight User Conf speaker template - PowerPoint

Document Sample
96 Foresight User Conf speaker template - PowerPoint Powered By Docstoc
					Progress
Database
Repair &
Recovery
    Dan Foreman
 United Systems, Inc.
Email: danf@prodb.com

www.BravePoint.com
     Introduction- Dan Foreman
• Progress user since 1984 (V2.1)
• Guest speaker at every USA Progress Users
  Conference from 1990 to 1998
• Quest speaker at QAD Conference: 1997, 1999-2001
• Author of:
   • Progress Performance Tuning Guide
   • Progress Database Administration Guide (Nov 2001)
   • Progress Virtual System Tables
   • Progress V9 Database Administration Jumpstart
   • A paper purchase now allows free online access
   • ProMonitor - Performance Monitoring Tool
   • ProD&L - Accelerated Dump/Load Utility
                www.BravePoint.com
     Introduction - Who Are We?
•   BravePoint Inc. (www.BravePoint.com)
•   Formerly United Systems, Inc.
•   125+ Employees
•   Three of us have used Progress since 1984
•   Progress Service Provider
     • Performance Tuning
     • Database Administration and Recovery
     • Progress Training
     • Internet, Webspeed, Java
     • Custom Programming Services
               www.BravePoint.com
     Introduction - Who Are You
•   Progress Version: V6, V7, V8.2, V8.3, V9
•   Unix? NT? Is there anything else?
•   Largest Single Database?
•   Highest User Count?




                www.BravePoint.com
            Special Request
• Phasers on Stun
  • Isn't it interesting that the Microsoft
    spell checker can‟t find the word
    “Phaser”
• Cell Phones on Mute Please!




              www.BravePoint.com
                  Goals
• Can I teach you database brain surgery in
  only one hour?




            www.BravePoint.com
      Preventative Maintenance
• Backups (yes, I know you‟ve heard of them but
  have you tested them recently?)
• Test your Entire Recovery Plan
• Warm Spare Database - A database on another
  machine with a recent copy of the production DB
  This is easy to do in Progress...covered soon
• Unix: don‟t logon as root unless you really need to
• Use O/S security to protect the DB, BI, and AI files
  from accidental/casual/intentional deletion


                www.BravePoint.com
    Preventative Maintenance
• Unix: Don‟t use kill -9 to terminate a Progress
  session; You might bring the database DOWN!
  if you kill a session that is holding a Latch
• Always have an up-to-date Structure (.st) file
  available
• Make sure your Sequences are not about to
  overflow (at approx. 2 billion)




              www.BravePoint.com
     Preventative Maintenance
• Monitor the BI file High Water Mark to avoid the
  2gb size limit bug which will corrupt your DB
  (_dbstatus-bisize Virtual System Table); not
  using VSTs? Turn Them On!
   • Monitor 'Delinquent' Transactions (transactions
     longer than 30-60 minutes)
   • Monitor Large Transactions (-L Overflow)
• longtrx4.p Progress program on the BravePoint
  website to Automatically disconnect Delinquent
  Transactions
• Use the V8.3 -bithold parameter as an extra
  safeguard; Set to 999 or less; Set even in V9
• V9 supports Terabyte sized BI Files
               www.BravePoint.com
     Preventative Maintenance
• Monitor the database High Water Mark to avoid
  growing into the Variable Length Extent
   • Could hit the 2gb size limit
   • Unbuffered I/O to the Variable Extent
   • Sync call issued each time the Extent grows
   • Size is requested from the OS in increasingly
     larger chunks (16-128 DB Blocks)
   • Could run out of disk space


               www.BravePoint.com
            Quiz Question
• Who are the Smartest People in the
  Room?




            www.BravePoint.com
                 Answer
• Those running After Imaging on their
  Mission Critical Databases
• Or stated in another way, if you‟re not
  using AI, you shouldn‟t be responsible for
  your company‟s databases
• If your company can‟t afford to lose a
  day‟s worth of data you Absolutely need
  After Imaging


             www.BravePoint.com
              After Imaging
• Who is currently using After Imaging?
• If no, why not?
• The PSC documentation says it offers
  protection against media failure
   • Disk fails 5 minutes before the backup starts
      on the final day of your year end close
   • No paper trail
   • Ouch! Time to work on your resume (C.V.)
• After Image File(s) + Last (Good) Backup =
  State of DB at time of crash


              www.BravePoint.com
   After Imaging - Why Use It?
• But you say…”I have disk mirroring (also
  known as RAID 1) so I‟m protected against a
  disk failure”
• BUT Mirroring does NOT protect against all
  database evils




             www.BravePoint.com
   After Imaging - Why Use It?
• True Horror Story #1
   • A DBA FTP‟d a test database into the
     directory where the production database
     resided... unfortunately they had the same
     name
   • Disk Mirroring worked just fine…..
   • After Imaging would have probably saved
     the day



              www.BravePoint.com
   After Imaging - Why Use It?
• True Horror Story #2
   • A user ran an archiving program on live
     data that wasn‟t ready to be archived
   • Once again the mirroring performed
     perfectly
• After Imaging might have improved the
  situation as it is possible to Roll Forward to a
  specific point in time



              www.BravePoint.com
   After Imaging - Why Use It?
• True Horror Story #3
• BI file hit the 2GB bug (true even with MV BI files)
  @ 4PM on a busy day (300+ users)
• 10GB Database was corrupted
• Progress Tech Support Recommendation: dump
  & load or restore from backup which meant
  substantial down time or data loss
• Fortunately the customer called me and I was
  able to temporarily patch the database
• I had recommended AI to this customer over one
  year prior to this event

               www.BravePoint.com
   After Imaging - Why Use It?
• Avoid the Problems related to probkup online
   • Transaction Activity is Frozen while the BI File
     is Backed Up
   • The Buffer Cache is "polluted" with Database
     Blocks that may not be Commonly Accessed
   • The Overhead of Running a Streaming Tape
     Drive on the Same Machine as the DB
• Possible Solution
   • Backup the AI Files
   • Restore/Copy to Another System (or disk)
   • Apply the AI Files (Roll Forward)
   • Backup the Replicated Database
               www.BravePoint.com
   After Imaging - Why Use It?
• Easy generation of a Warm Spare DB
• A Warm Spare DB is:
   • A standby database
   • On another machine
   • That can be brought online quickly in case
      of failure to the production system
• It‟s „warm‟ because it is not 100% current
• A HOT spare is not possible using AI but is
  possible using Replication Triggers


              www.BravePoint.com
Online Backup  Warm Spare

  Production
                             Warm Spare
   System




Production                    Warm Spare
   DB                               DB
               AI Files

               www.BravePoint.com
   After Imaging - Why Use It?
• Easy generation of a Report Server DB
• A Report Server DB is:
   • A database on another system
   • Used for reporting only
   • To relieve the production system of the
     load imposed by reporting
   • Doesn‟t require same level of hardware or
     Progress license



              www.BravePoint.com
     Preventative Maintenance
• Corruption Checks
   • proutil dbanalys Blocks, Records, & Indexes
   • probkup/procopy Blocks only
   • proutil dbrpr     Blocks and Records
   • proutil dbscan    Same as dbrpr (cmd line)
   • proutil idxfix    Indexes <-> Records
• Run periodically to make sure you don‟t have
  hidden or unreported corruption


              www.BravePoint.com
           Log File Checking
• Check the Database log (.lg) file for errors DAILY.
  Look for words such as:
• kill*       drastic       warn*       error
  system abnormal exceed*               fail*
  wrong       unexpected*invalid        died
  damage* dead              overflow* violation
  fatal       insufficient missing      disappear*
        corrupt*     allow*       attempt*    cannot
  enough illegal            impossible increase
  unknown unable            stop*       beyond
• There are programs on the our Web Site to assist
  with log file checking or products such as
  ProMonitor
               www.BravePoint.com
                1124 Errors
• SYSTEM ERROR: wrong dbkey in block…..
• Indicates (with 99.9999% probability) a
  hardware problem
• Don‟t limit your search to disks; also consider:
  Disk Controllers, SCSI termination, RAM (parity
  errors), Firmware, etc.
• Don‟t Let the Hardware Technician Blame
  Progress or the Application
• Don‟t let the Hardware Technician escape
  without solving the problem - every component
  might need to be swapped out
              www.BravePoint.com
           Common Problems
•   Corrupt Blocks
•   Running out of disk space
•   Deleted/Damaged Portion of the Database
•   Database brought down by the Watchdog or
    the Broker itself
     • Upgrade Progress (V7.3E with Patches,
       V8.2C with Patches)
     • Use Loopback Client connections (-H & -S)


               www.BravePoint.com
               Corrupt Blocks
• What Kind of Block? Index or Data
   • Block Type 2 (Index - IX) or 3 (Record Mgr - RM)
   • If IX block, try rebuilding the indexes
   • If RM block, consider the following
• Reformat a block as a Free block (proutil dbrpr)
• Replace the block with the same block from
  another DB (probably restored from a backup)
   • proutil dbrpr
   • 5. Dump Block          (from the good database)
   • 4. Load Block          (into the bad database)

                www.BravePoint.com
           Emergency Dump
• „Front and Back‟ 4GL Dump
   • for each customer by cust-num (until you hit
      the bad spot)
   • for each customer by cust-num descending
• If the Primary Index is Damaged, try Dumping
  using a non-Primary Index
• RECID Dump
   • Doesn‟t require an Index
   • Very Slow on a Large Database
   • Last Resort

              www.BravePoint.com
        Miscellaneous Hints
• If you can‟t get into the database with -F or
  any other way, try the Read Only (-RO) option




              www.BravePoint.com
             Deleted Extents
• First, backup the remaining pieces of the
  database
• This may seem like a useless step but if your
  backup is defective you may need to repair the
  broken DB and that‟s difficult if it‟s deleted
• The Backup gives you time to:
   • Prepare a plan of action
   • Call outside resources (like me) for help
   • Calm down
   • Lock your door
   • Prepare a new Resume (C.V.)

               www.BravePoint.com
          Deleted AI Extents
• If an AI Extent is Deleted, simply disable AI
  and...
   • What? You‟re not running AI? BAD IDEA!
   • Disable AI (rfutil aimage end)
   • Fix the problem that caused the lost Extent
   • Recreate the Extent with prostrct add
   • Restart AI (rfutil aimage begin)
• If this doesn‟t work, go to the next slide


              www.BravePoint.com
            Deleted AI Extents
• Disable AI with rfutil aimage end. You may get an
  error message regarding the missing AI Extent
  but typically AI is still disabled
• Truncate the BI file with proutil truncate bi. You
  may get an error message regarding the missing
  AI Extent but typically the BI file is still truncated
• Remove all AI Extents with prostrct remove
• Recreate the original AI Extents with prostrct add
• Restart After Imaging with rfutil aimage begin
• Reformat the truncated BI file with proutil bigrow


                 www.BravePoint.com
            Deleted BI Extents
• Force Access with -F
   • V8.2 and later -F only forces access on proutil
     truncate bi
   • If you Force Access, consider the DB
     damaged!
   • Forcing Access THROWS AWAY the BI file
   • Forcing Access sets the „Tainted Flag‟
   • Even if you fix the Tainted Flag, consider the
     DB damaged!
   • Dump & Load (this is if AI is not enabled)

                www.BravePoint.com
           Deleted DB Extents
•   Restore the the DB and BI from Backup
•   Apply the AI files
•   Re-enable AI
•   BI Grow
•   Done!
•   Still not running AI?, next slide please




                www.BravePoint.com
         Deleted DB Extents
• Use prostrct unlock if the deleted Extent
  was Empty (above the High Water Mark)
• prostrct unlock will recreate missing
  Extents
• However unlock also changes the time
  stamps on the AI files and they can‟t be
  used any longer



             www.BravePoint.com
            Deleted DB Extents
• Backup Extent Substitute
   • This technique is for Extents that contain Data
     except the .d1 Extent (.d1 contains Master Block)
   • Restore a copy of the deleted Extent from a
     Backup
   • The Extent‟s „Last Opened‟ time stamps won‟t
     match
   • Use prostrct unlock to sync the time stamps
     (broken in V8.2 and V9)
   • The data in the Extent might not match but…
   • Use the -miracle option to re-create the Data 

                www.BravePoint.com
          Deleted DB Extents
• If the database Broker is still running:
   • DON‟T Shutdown the Database
   • That „closes‟ the database extents and you
      won‟t be able to re-open them
   • If a Client is still attached to the DB and
      that Client can access the Progress Editor,
      simply Dump the Database from the
      Dictionary
   • Even if they can‟t get into the Editor, put
      dict.p (renamed as a menu item) into their
      PROPATH
              www.BravePoint.com
     Deleted DB Extents - Unix
• Warm Boot the System Immediately
• Don‟t Shut Down the DB First
• When Unix fsck runs, it will *probably*
  recover the deleted Extent
• Why?
• A file is not finally deleted until every process
  that has it open is gone (the Broker still has it
  open)



               www.BravePoint.com
         Overlay Database
• .db File is lost
• Create a Void Multi-volume Structure that
  matches the problem Database (hopefully
  you have a current .st file
• Take the .db from the Void Structure and
  use it for the problem DB
• Use prostrct unlock to Sync the Extent
  Timestamps
• The above steps apply if you are using
  probkup otherwise restore the .db from 3d
  party backup
            www.BravePoint.com
            Overlay Database
• On V9 it is much easier to restore the .db file
• prostrct builddb
• Requires an up-to-date Structure File
  (remember that from the Preventative
  Maintenance list?)




               www.BravePoint.com
                   Disk Full
• Use prostrct repair to relocate Extents to a
  location with more space
   • Copy the Extent
   • Create a new Structure File (.st) that reflect
     the current location of the Database Extents
     (one good reason to have a current one)
   • Run prostrct repair new.st
   • Done!
• There is a bug in prostrct repair. Don‟t
  misspell the name of the Structure File or your
  .db file will vanish forever (fixed in V8.3C)
               www.BravePoint.com
                BI Disk Full
• Do Not Run out of Space on the BI Disk if:
   • You can‟t make more space for the BI file
   • You can‟t relocate the BI file
• To Perform Crash Recovery, the BI file must
  grow
• If there is no space for the BI file to grow,
  there is no Crash Recovery
• Force Access (-F) is the only option (if you
  don‟t have AI enabled)

              www.BravePoint.com
     Progress HA Challenges
• Online Backups
   • probkup online
   • AI Backup (already covered)
   • Split Mirror
• Multi-DB backups cannot be restored to
  exactly the same point in time unless you are
  using Two Phase Commit




              www.BravePoint.com
        Online Backup Options
• probkup online
   • Transaction Freeze during BI Backup
   • Usually High I/O Volume
   • Only DB and BI are Backed Up
   • Performs an AI Extent Switch
   • If backup to disk, backup file can‟t be greater
     2gb (except on NT) until V9.1B
   • Use the Volume Size (-vs) Parameter if backup
     file is greater than 2gb



               www.BravePoint.com
      Online Backup Options
• Split Mirror Backup
   • Quiet the DB with proquiet enable
   • Break the Disk Mirror
   • Unquiet the DB with proquiet disable
   • Backup the Offline Mirror
   • Resync the Mirrors




              www.BravePoint.com
      Online Backup Options
• Split Mirror Backup
   • Database is vulnerable during the backup
     unless you have triple mirroring or similar
     feature
   • Resynchronization of the Disks can be a
     big Performance Problem
   • Still Can‟t Backup (and restore) Multiple
     DBs Simultaneously



              www.BravePoint.com
      Progress HA Challenges
• Schema Changes. It is not possible to make
  Schema changes online, even in V9.1
• Adding New Extents. It is not possible to add
  Extents online, even in V9.1
• The BI Notes counter is limited to 2 billion until
  V8.3D and V9.1C
• Promon/VST Statistics. The numbers recorded by
  promon are limited to 4 billion
• Updating the Progress Version. It is necessary to
  shutdown the DB to change Progress versions
• The DB log (.lg) file is limited to 2gb. A system
  that runs for months could approach this limit.
  There is no way to „truncate‟ the log file online.
               www.BravePoint.com
        Progress HA Challenges
• Database Reorg. In V8 is possible to build indexes
  online. In V9 it is possible move tables (and
  indexes) from one Area to another but the
  tablemove option has the following disadvantages:
   • The table being moved is EXCLUSIVEly locked
     for the duration of the move.
   • Because the RECIDs will be different in the new
     Area, all indexes on the table must be rebuilt.
   • The table is moved in one transaction. This
     means that the BI file can grow quite large. If AI is
     enabled, the notes generated by the table move is
     written to the AI file the same way that other
     transaction related activities are
                 www.BravePoint.com
            Sources of Help
• Progress Documentation
• Progress Database Administration Guide
• dba@peg.com
• Progress Knowledgebase
  (techweb.progress.com)
• My Home Number: 541-754-2116
• My Mobile Number is: 541-829-7741
   • For those weekend emergencies when you
     need expert assistance
   • This is not a free call


             www.BravePoint.com
www.BravePoint.com

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:7
posted:3/4/2010
language:English
pages:49