Frederick Tang Weblog

Stories about Oracle concepts I have learnt from work, and the occasional brain-dump…

Archive for May 2008

Failover for 9i RAC

with 3 comments

– Edit: I have updated the title of this post to include 9i. New failover features are available in 10g, so this article is a little out-of-date.

Over the week, my colleagues and I discussed application failover on a Oracle9i RAC database. I don’t know much about how RAC works, so I decided to do some research and post some notes here. Hopefully, this information can add to someone else’s researching into understanding the technology.

Disclaimer: What I am posting here isn’t something new, the information is freely available in Oracle documentation. If you find the information I post here is wrong, please kindly let me know. By all means, test the technology for yourselves. If you would like to ask a question, please feel free and I will try research a bit more to give an educated answer when I have time.

To start this off, I would imagine I have a 2-node Oracle9i RAC database and the simplest client tnsnames entry would look like this:

BLOGRAC =
  (DESCRIPTION =
     (LOAD_BALANCE = ON)
     (ADDRESS_LIST =
        (ADDRESS=(PROTOCOL=TCP)(HOST=blograc1)(PORT=1521))
        (ADDRESS=(PROTOCOL=TCP)(HOST=blograc2)(PORT=1521))
     )
     (CONNECT_DATA = (SERVICE_NAME=BLOGRAC))
   )    

The first failover feature is connect-time failover, where “a client connect request is forwarded to another listener if a listener is not responding”[source][source]. To enable this, the client tnsnames entry might look like this:

BLOGRAC =
  (DESCRIPTION =
     (LOAD_BALANCE = ON)
     (FAILOVER = ON)
     (ADDRESS_LIST =
        (ADDRESS=(PROTOCOL=TCP)(HOST=blograc1)(PORT=1521))
        (ADDRESS=(PROTOCOL=TCP)(HOST=blograc2)(PORT=1521))
     )
     (CONNECT_DATA = (SERVICE_NAME=BLOGRAC))
   )

The next failover feature is Transparent Application Failover (TAF), where “the failover and reestablishment of application-to-service connections. It enables client applications to automatically reconnect to the database if the connection fails, and optionally resume a SELECT statement that was in progress. This reconnect happens automatically from within the Oracle Call Interface (OCI) library.” [source].

TAF has several restrictions [source]:

  • All PL/SQL package states on the server are lost at failover
  • ALTER SESSION statements are lost
  • If failover occurs when a transaction is in progress, then each subsequent call causes an error message until the user issues an OCITransRollback call. Then Oracle issues an Oracle Call Interface (OCI) success message. Be sure to check this message to see if you must perform additional operations.
  • Oracle fails over the database connection and if TYPE=SELECT in the FAILOVER_MODE section of the service name description, Oracle also attempts to fail over the query
  • Continuing work on failed-over cursors can result in an error message

To enable this feature, the client tnsnames entry might look like this:

BLOGRAC =
  (DESCRIPTION =
     (LOAD_BALANCE = ON)
     (FAILOVER = ON)
     (ADDRESS_LIST =
        (ADDRESS=(PROTOCOL=TCP)(HOST=blograc1)(PORT=1521))
        (ADDRESS=(PROTOCOL=TCP)(HOST=blograc2)(PORT=1521))
     )
     (CONNECT_DATA =
        (SERVICE_NAME=BLOGRAC)
        (FAILOVER_MODE=
           (TYPE=select)
           (METHOD=basic))
     )
   )

The TYPE are session| select | none, and METHOD are basic | preconnect [source].

As stated above, TAF is only supported on JDBC OCI drivers and not on JDBC Thin drivers. I found a couple of Oracle Metalink Notes that re-iterates this, although they are updated for Oracle10g, they still apply to Oracle9i. I also found a note which explains how to write Java code to enable OCI TAF callbacks:

Note:297490.1 Is TAF (Transparent Application Failover) Supported On The Latest JDBC Thin Driver (10g)?

Note:465423.1 Can the JDBC Thin Driver Do Failover by Specifying FAILOVER_MODE?

Note:276225.1 JDBC Sample on How to Implement OCI TAF callbacks with DML and PL/SQL

This is an aside… I wanted to know what error will occur to a Java application [source] when there’s instance shutdown or failure. I have written a very simple Java program to test this, and it will query results from a big table I created on a Oracle9i RAC database, using the Oracle9i JDBC Thin Driver:

1. Oracle instance shutdown

Vendor Error Code: 1089
java.sql.SQLException: ORA-01089: immediate shutdown in progress – no operations are permitted
    at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:134)
    at oracle.jdbc.ttc7.TTIoer.processError(TTIoer.java:289)
    at oracle.jdbc.ttc7.Oclose.receive(Oclose.java:122)
    at oracle.jdbc.ttc7.TTC7Protocol.close(TTC7Protocol.java:683)
    at oracle.jdbc.driver.OracleStatement.close(OracleStatement.java:644)
    at TestDBOracle.main(TestDBOracle.java:49)

2. Oracle instance failure (by killing the pmon and smon processes) – yes, it came out with funny characters…

Vendor Error Code: 17410
java.sql.SQLException: �S���q��y����
    at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:134)
    at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:179)
    at oracle.jdbc.dbaccess.DBError.check_error(DBError.java:1160)
    at oracle.jdbc.ttc7.MAREngine.unmarshalUB1(MAREngine.java:963)
    at oracle.jdbc.ttc7.MAREngine.unmarshalSB1(MAREngine.java:893)
    at oracle.jdbc.ttc7.Oclose.receive(Oclose.java:101)
    at oracle.jdbc.ttc7.TTC7Protocol.close(TTC7Protocol.java:683)
    at oracle.jdbc.driver.OracleStatement.close(OracleStatement.java:644)
    at TestDBOracle.main(TestDBOracle.java:49)

[Please note: If your application is getting a 17410 error, it doesn't always mean the database / instance died.]

Written by fredericktang

May 30, 2008 at 9:24 am

Posted in 9i, Oracle

ARC0: Media Recovery Disabled

without comments

Just a quick one… when I was creating a new database today, I encountered this message in the alert log after the database was created and instance restarted. The message repeated itself continuously to the point I felt the need to find out what is going on.

Turns out the instance started the archiver with the database created in NOARCHIVELOG mode:

1. My CREATE DATABASE statement did not include the keyword ARCHIVELOG or NOARCHIVELOG. The default is NOARCHIVELOG.
2. My initialization parameter had log_archive_start=true.

Oracle Metalink Note: 131119.1 explains more about this:

To verify:

SQL> select log_mode from v$database; /* LOG_MODE is NOARCHIVELOG */
SQL> show parameter log_archive_start; /* VALUE is TRUE */

I am not going to copy the fix on the Note here. To fix my problem, I set log_archive_start=false in the initialization parameters and restarted the instance. I wanted my new database to be in NOARCHIVELOG mode initially.

The log_archive_start parameter is deprecated in Oracle10g (10.1).

Written by fredericktang

May 26, 2008 at 8:27 am

Posted in 9i, Oracle

Starting OCA Oracle10g Administration I

with one comment

I have decided to try gain OCA certification for 10g, and maybe OCP later on… who knows. I have got the OCA Study Guide published by Sybex. I taken the initial Assessment Test given by the book and got 28 out of 40 questions correct. My weak points are Listener/Dispatcher, Enterprise Management Console, Backup and Recovery.

Wish me luck!

Written by fredericktang

May 21, 2008 at 6:35 am

Posted in 10g, Oracle

Counting rows in an empty table

without comments

This is of course nothing new, but I stumbled across this interesting scenario today. I was trying to import a large table ~ 14mil rows into a test database for testing purposes. Half way during the import, it failed with “ORA-1653: unable to extend table…”. I simply ran out of space in my tablespace.

What is interesting is when I count the rows in the table, even when import failed.

SQL> set autotrace on
SQL> set timi on
SQL> select count(*) from se_activity_log;

  COUNT(*)
----------
        0 

Elapsed: 00:00:21.21 

Execution Plan
----------------------------------------------------------
     0   SELECT STATEMENT Optimizer=CHOOSE (Cost=14654 Card=1)
1    0   SORT (AGGREGATE)
2    1     TABLE ACCESS (FULL) OF 'SE_ACTIVITY_LOG' (Cost=14654 Card=15100349) 

Statistics
----------------------------------------------------------          
    0  recursive calls          
    0  db block gets      
68253  consistent gets      
62033  physical reads          
    0  redo size        
  490  bytes sent via SQL*Net to client        
  655  bytes received via SQL*Net from client          
    2  SQL*Net roundtrips to/from client          
    0  sorts (memory)          
    0  sorts (disk)          
    1  rows processed

So… it took 20 seconds to tell me I don’t have any rows in my table. Why?

It probably has to do with the High Water Mark of the table. Even though the Oracle Import failed, the extents allocated to the table is not released. A full table scan will scan up to the High Water Mark [reference]. Let’s take a look at the High Water Mark:

VARIABLE total_blocks number;
VARIABLE total_bytes number;
VARIABLE unused_blocks number;
VARIABLE unused_bytes number;
VARIABLE last_used_extent_file_id number;
VARIABLE last_used_extent_block_id number;
VARIABLE last_used_block number; 

begin dbms_space.unused_space(user,'SE_ACTIVITY_LOG','TABLE', :total_blocks, :total_bytes, :unused_blocks, :unused_bytes, :last_used_extent_file_id, :last_used_extent_block_id, :last_used_block);
end;
/
print total_blocks total_bytes unused_blocks unused_bytes last_used_extent_file_id last_used_extent_block_id last_used_block; 

SQL>
TOTAL_BLOCKS
------------       
       68608 

TOTAL_BYTES
-----------  
  562036736 

UNUSED_BLOCKS
-------------            
            0 

UNUSED_BYTES
------------           
           0 

LAST_USED_EXTENT_FILE_ID
------------------------
                      11 

LAST_USED_EXTENT_BLOCK_ID
-------------------------
                   126344 

LAST_USED_BLOCK
---------------
           1024

My test database is built on a slow machine, so full table scanning 500MB data takes a while. There’s no point in freeing the space allocated, because it was a failed Oracle Import, I will have to extend the tablespace and retry the Import again.

 

Written by fredericktang

May 5, 2008 at 3:24 am

Posted in 9i, Oracle