Identifying the Problem Requiring a Restore

Use

Before you start a restore of the Informix database, make sure that you have identified the problem and know what kind of failure has occurred.

Procedure

Answer the following questions:

  1. Is the database server up or has the failure caused it to go down?

    To find out whether the database server is up or down, enter the following from the command line:

    $ onstat -

    If the response is not similar to the following example, then the database server might have failed:

    INFORMIX-OnLine Version 7.20.UC3 --On-Line-- Up 6 days 20:57:16 -- 80298 Kbytes

    To rule out a normal shutdown of the database server, see "What is in the message-log file?" below.

    If the database server is blocked, the reason is sometimes given in an extra information line, as in the following example:

    Blocked: Media_failure

    If the database server is down due to a problem, this generally means that a failure has occurred in a "critical" dbspace (logdbs, physdbs, or rootdbs). Whether the database server is up or down influences what kind of restore you need.
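    The check described in this step can be scripted. The following is a minimal sketch, not an official tool: the function name classify_server_state is hypothetical, and it assumes that the header line contains "--On-Line--" and that a blocked server adds a "Blocked:" line, as in the examples above. Any other output is reported as "unknown" and needs to be investigated by hand.

```shell
#!/bin/sh
# Classify the server state from the text printed by `onstat -`.
# Reads that output on stdin and prints one of: up, blocked, unknown.
# Assumption: the header line contains "--On-Line--" and a blocked
# server adds a line containing "Blocked:", as in the examples above.
classify_server_state() {
    awk '
        /Blocked:/    { blocked = 1 }
        /--On-Line--/ { online = 1 }
        END {
            if (blocked)     print "blocked"
            else if (online) print "up"
            else             print "unknown"
        }
    '
}

# Usage (on the database host, as a user who is allowed to run onstat):
#   onstat - | classify_server_state
```

    A result of "unknown" does not prove that the database server has failed. It only means that the output did not look like the examples, so continue with the message-log file.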

  2. Which dbspaces have failed?

    You need to identify which dbspaces have failed:

    · Critical dbspaces (logdbs, physdbs, rootdbs)

      If a critical dbspace has failed, then the database server goes down.

    · Non-critical dbspaces (all remaining dbspaces)

      If only a non-critical dbspace has failed, then the database server might still be up.

    The type of dbspace that has failed determines what kind of restore you need to do. If the database server is still up, identify which dbspaces have failed by using the following procedure:

    1. Enter the following command from the command line:

      $ onstat -d

    2. Read the second section of output from this command, as in the following example (only part of the output is shown):

      Chunks
      address   chk/dbs  offset  size    free   bpages  flags  pathname
      c34ac178  1 1      8       75000   20203          PO-    /.../physdev1/data1
      c34ac398  2 2      8       175106  53             PO-    /.../physdev2/data3
      c34ac470  3 3      75008   435000  7              PD-    /.../physdev1/data1
      c34ac548  4 4      175114  205000  3              PO-    /.../physdev2/data3
      c34ac620  5 5      8       10106   53             PO-    /.../physdev1/data2
      c34ac6f8  6 6      8       50000   49939          PO-    /.../physdev2/data4
      c34ac7d0  7 7      10114   350000  74345          PO-    /.../physdev1/data2
      c34ac8a8  8 8      50008   10000   1295           PO-    /.../physdev2/data4
      c34ac980  9 4      360114  150000  265            PO-    /.../physdev1/data2
      ...

    3. Check "flags" for a value of "D" (that is, "down") in the second position, and then read across to find the value in "chk/dbs" (that is, "chunk/dbspace"). In this example, chunk 3, which belongs to dbspace number 3, is down.
    4. Read the first section of the output to find the name of the affected dbspace, as in the following sample of output from this example:

      dbspaces
      address   number  flags  fchunks  nchunks  flags  owner     name
      c34ac108  1       1      1        1        N      informix  rootdbs
      c34ad2c8  2       1      2        1        N      informix  logdbs
      c34ad338  3       1      3        2        N      informix  psapes30e
      ...

      Look for dbspace number 3 and read across to find the name of the dbspace. In this example, the affected dbspace is psapes30e, a non-critical dbspace.
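    The lookup in this procedure (find chunks with "D" in the second flag position, then map the dbspace number to its name) can be sketched as a small filter over the output of onstat -d. This is an illustration under stated assumptions, not a supported utility: the function name list_down_chunks is hypothetical, and the column layout is assumed to match the sample output in this procedure (section headers "Dbspaces" and "Chunks" in any case, the name as the last dbspace column, flags as the second-to-last chunk column because bpages may be empty).

```shell
#!/bin/sh
# List down chunks and the dbspaces they belong to, from `onstat -d`
# output read on stdin. Prints one line per down chunk:
#   chunk=<n> dbspace=<n> name=<name> kind=<critical|non-critical|unknown>
# Assumptions: layout as in the sample output of this procedure; data
# rows start with a hexadecimal address and have at least 8 fields.
list_down_chunks() {
    awk '
        tolower($1) == "dbspaces" { section = "d"; next }
        tolower($1) == "chunks"   { section = "c"; next }
        $1 !~ /^[0-9a-f]+$/       { next }   # skip column headers and blanks
        NF < 8                    { next }   # skip anything that is not a data row
        section == "d" { name[$2] = $NF; next }
        section == "c" && substr($(NF-1), 2, 1) == "D" {
            if ($3 in name) {
                n = name[$3]
                # Critical dbspaces as named in this document
                kind = (n ~ /^(rootdbs|logdbs|physdbs)$/) ? "critical" : "non-critical"
            } else {
                n = "?"
                kind = "unknown"
            }
            printf "chunk=%s dbspace=%s name=%s kind=%s\n", $2, $3, n, kind
        }
    '
}

# Usage (only works while the database server is still up):
#   onstat -d | list_down_chunks
```

    No output means that no chunk was flagged as down in the text that was read. Check the raw onstat -d output as well if the column layout of your release differs from the sample.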

  3. What is in the message-log file?

    Look in this file for any clue to what has happened. The database server keeps a processing audit trail in this file. The file also tells you if the database server has terminated normally, as in the following example:

    16:46:07 INFORMIX-OnLine Stopped.

    In this case, you should be able to see the checkpoint information before this message, indicating that the data on disk is consistent. If so, a restore is not necessary.

    You can use SAPDBA to look at the message file. Refer to Listing System Information with SAPDBA.
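    If you prefer the command line, the check for a normal shutdown can be sketched as follows. The function name last_message_is_stop is hypothetical, the location of the message-log file is site-specific and is passed as an argument, and the sketch assumes that a normal shutdown writes a line containing "INFORMIX-OnLine Stopped." as in the example above. It does not check the checkpoint information, which you still need to read yourself.

```shell
#!/bin/sh
# Report whether the last non-empty line of the message-log file is the
# normal shutdown message.
# Assumptions: $1 is the path to the message-log file (site-specific),
# and a normal shutdown ends with a line containing
# "INFORMIX-OnLine Stopped." as in the example above.
last_message_is_stop() {
    log=$1
    # Remember every non-empty line; the last one remembered is printed
    last=$(awk 'NF { line = $0 } END { print line }' "$log")
    case $last in
        *"INFORMIX-OnLine Stopped."*)
            echo "stopped-normally"
            return 0
            ;;
        *)
            echo "no-stop-message"
            return 1
            ;;
    esac
}

# Usage (the path shown is a placeholder, not a real default):
#   last_message_is_stop /path/to/message.log
#   tail -20 /path/to/message.log    # inspect the checkpoint lines by eye
```

    A result of "no-stop-message" covers both a server that is still running and one that failed, so combine it with the answer to the first question.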

  4. How can I find out exactly what went wrong?

    In most cases, you can use the answers to the previous questions to identify what has happened.

    If you require extra confirmation and can afford to spend more time investigating, you can run the oncheck command to obtain a comprehensive picture of the disk structure. For more information on oncheck, see the Informix documentation. Depending on which parameters you use, oncheck might take up to several hours to complete.

  5. Did the fault occur after a particular point in time?

    After examining the message-log file (see "What is in the message-log file?" above), you might have found that the error that caused the failure occurred at a particular time. You can then do a "Point-in-Time" (PIT) restore (only available if you use ON-Archive or ON-Bar for data recovery and are doing a cold restore). A PIT restore avoids using corrupted or faulty data from after this point. See Performing Logical Restore for Full-System Cold Restore (ON-Archive) or Performing Logical Restore for a Full-System Cold Restore (ON-Bar) for more information on PIT restores.

Result

Now that you have identified the problem, you need to decide what kind of restore is the most appropriate for the situation. If you suspect that the fault lies in the database server rather than in the database data, you need to resolve that fault first, because a restart might not be possible or the failure might recur soon after the restart. In this case, contact the Informix hotline.

 

See also:

Informix documentation