Frederick Tang Weblog

Stories about Oracle concepts I have learnt from work, and the occasional brain-dump…

Archive for April 2008

My query is slow…

without comments

Recently, I have been getting a few calls from application developers asking for help. Oh well… so what do they expect me to do? To determine whether the problem is trivial or complex, I would run through a series of questions with them. Perhaps I will share a few here.

1. What’s the SQL, which tables on which database?
Start off with some basic details, so I know what I am working with and understand the problem I am trying to solve. Most of the time, an index is all that’s required. For more complex problems, more information is needed.
I might even know a bit about the particular database already, because a similar problem might have appeared on that database before as reported by another user.

2. How many rows are on the table? and how many rows are you trying to select? and which columns you are selecting?
This tells me whether your query should select a handful of rows or are you doing a full table scan on a 10 million rows table. Basically, I would like to know the volume of data transferred from database to your client/desktop.

2. How much time does the query take, when you let it run?
Tell me how long is too long, 1 second or 10 seconds or minutes. Did the query finish running at all or did the application timeout before results were returned by the database.

3. How much slower than usual are we talking about? Does it happens all the time, or once in a while, particular time of the day? When was the first time the slowness was noticed?
A few questions combined into one. I would like to know the usual performance of the query, so that I can determine the extent of the tuning effort and how bad the query has performed. Given a timeframe, I can take a look at the Oracle Statspack reports to get an idea what’s running on the database at the time, might be some one-off batch reports or there’s a bottleneck somewhere.

4. Was there anything different between the times when it was faster and slower?
Perhaps the tables just got dumped with a few thousand/million rows with the recent product launch or data migration exercise. Perhaps the environment was different, did you run it from home or office? Was there some database maintenance work carried on the database recently? Worse still, did someone drop an index that was suppose to be there?

4. What tools did you use to run the query?
Some application developers test their queries using PL/SQL Developer, whereas I like to use SQL*Plus. Also taking into account the volume of data to move from database into the client, it might take a few seconds to load all these data into memory, and display them on screen. Some programs might fail to respond if the volume of data to load is a few MBs. Understandably, if the query is slow by minutes, then we have another problem on our hands.

7. What network were you on when you executed the query?
I normally test queries by running SQL*Plus on the database server itself. Some developers were running queries from home over VPN. This can often explain the difference if the slowness is by seconds.

There might be other good questions too, but usually answers to these will give me enough to get me start the investigations. Anyone have questions that they find useful to share?

Written by fredericktang

April 28, 2008 at 8:02 am

Posted in Oracle

Virtual Index

without comments

Douglas was right about virtual index to the Primary Key Puzzle. In fact, Douglas has already replicated the problem, except the name “VI$…..”. It took me a while to understand what caused it. The bigger worry I have is duplicate keys were allowed into the table despite the presence of the primary key constraint. Let me try replicate this in a test environment:

I start off by creating a simple table, with no primary key constraints, then create a virtual index. Virtual (Fake) Indexes are created using the nosegment keyword in CREATE INDEX. It is interesting to see the user_objects, user_ind_columns and user_indexes views.

SQL> create table pktest ( idkey varchar2(100) ); 

Table created. 

SQL> create index ix_pktest_idkey on pktest (idkey) nosegment; 

Index created. 

SQL> select object_name, object_type, status, temporary, generated, secondary
  2  from user_objects
  3  where object_name='IX_PKTEST_IDKEY'; 

OBJECT_NAME     OBJECT_TYPE        STATUS  T G S
--------------- ------------------ ------- - - -
IX_PKTEST_IDKEY INDEX              VALID   N N N 

SQL> select index_name, table_name, column_name
  2  from user_ind_columns
  3  where table_name='PKTEST'; 

INDEX_NAME           TABLE_NAME           COLUMN_NAME
-------------------- -------------------- ---------------
IX_PKTEST_IDKEY      PKTEST               IDKEY 

SQL> select * from user_indexes
  2  where table_name='PKTEST' or index_name='IX_PKTEST_IDKEY'; 

no rows selected 

SQL> select constraint_name, constraint_type, index_name
  2  from user_constraints where table_name='PKTEST'; 

no rows selected

Next, I will create a primary key constraint using ALTER TABLE and see its effects:

SQL> alter table pktest add constraint pk_pktest_idkey primary key (idkey); 

Table altered. 

SQL> select constraint_name, constraint_type, index_name
  2  from user_constraints where table_name='PKTEST'; 

CONSTRAINT_NAME                C INDEX_NAME
------------------------------ - ------------------------------
PK_PKTEST_IDKEY                P IX_PKTEST_IDKEY 

SQL> select index_name, table_name, column_name
  2  from user_ind_columns
  3  where table_name='PKTEST'; 

INDEX_NAME      TABLE_NAME      COLUMN_NAME
--------------- --------------- ---------------
IX_PKTEST_IDKEY PKTEST          IDKEY 

SQL> select object_name, object_type, status, temporary, generated, secondary
  2  from user_objects
  3  where object_name in ('IX_PKTEST_IDKEY', 'PK_PKTEST_IDKEY'); 

OBJECT_NAME     OBJECT_TYPE        STATUS  T G S
--------------- ------------------ ------- - - -
IX_PKTEST_IDKEY INDEX              VALID   N N N 

SQL> insert into pktest (idkey) values ('1'); 

1 row created. 

SQL> / 

1 row created. 

SQL> / 

1 row created. 

SQL> / 

1 row created. 

SQL> select * from pktest; 

IDKEY
--------------------------------------------------------------------------------
1
1
1
1 

SQL> commit; 

Commit complete. 

SQL>

Unfortunately, I couldn’t find how the Virtual Index was created against the PVSSESSION table in the first place (in my last blog entry). I suspect a DBA was investigating whether an Index would be useful on MSISDN in improving query performance, using the Enterprise Manager – Oracle Tuning Pack. Virtual Indexes are used by the Oracle Tuning Pack to test the CBO whether an index would improve query performance without actually creating an index. There are also various Oracle Metalink Notes talking about this (297507.1, 329457.1). A detailed test posted in Donghua’s blog was also useful to know.

Written by fredericktang

April 28, 2008 at 2:54 am

Posted in 9i, Oracle

Primary Key Puzzle

with 2 comments

Okay, this is a little weird… See if anyone can make any sense out of this. This is on a Production Oracle 9.2.0.8 2-node RAC database:

SQL> desc PVSSESSION
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
MSISDN                                    NOT NULL VARCHAR2(20)
CREATIONDATE                              NOT NULL DATE
LASTACCESSDATE                            NOT NULL DATE

  1* select constraint_name, constraint_type, search_condition, status, deferrable, deferred, validated, generated, bad, rely, index_name, invalid from user_constraints where table_name='PVSSESSION'
SQL> / 

CONSTRAINT_NAME           C SEARCH_CONDITION               STATUS
------------------------- - ------------------------------ --------
DEFERRABLE     DEFERRED  VALIDATED     GENERATED      BAD RELY
-------------- --------- ------------- -------------- --- ----
INDEX_NAME                     INVALID
------------------------------ -------
SYS_C002996               C "MSISDN" IS NOT NULL           ENABLED
NOT DEFERRABLE IMMEDIATE VALIDATED     GENERATED NAME 

SYS_C002997               C "CREATIONDATE" IS NOT NULL     ENABLED
NOT DEFERRABLE IMMEDIATE VALIDATED     GENERATED NAME 

SYS_C002998               C "LASTACCESSDATE" IS NOT NULL   ENABLED
NOT DEFERRABLE IMMEDIATE VALIDATED     GENERATED NAME 

PK_PVSSESSION_MSISDN      P                                ENABLED
NOT DEFERRABLE IMMEDIATE VALIDATED     USER NAME
VI$1146123733024$01

SQL> select * from user_indexes where table_name='PVSSESSION';

no rows selected

SQL> select index_name, column_name from user_ind_columns where table_name='PVSSESSION'; 

INDEX_NAME                     COLUMN_NAM
------------------------------ ----------
VI$1146123733024$01            MSISDN

SQL> insert into pvssession select * from pvssession where rownum <= 5; 

5 rows created. 

SQL> select count(*) from (select msisdn from pvssession group by msisdn having count(*) > 1); 

  COUNT(*)
----------
         5 

SQL> rollback; 

Rollback complete. 

SQL>

Written by fredericktang

April 24, 2008 at 2:30 am

Posted in 9i, Oracle

ORA-12535: TNS:operation timed out

with one comment

So this story starts with the error message. One of our testers reported this error when testing connections to our new database. I checked the database configurations and the tnsnames.ora he is using – the details are correct. So I asked him what is his environment, in an attempt I can reproduce the error myself.

Here are some facts:

1. The database name is CIMSTMP, and it is a Oracle9i RAC with 2-nodes – cissdb01/02. The Oracle listener is on port 2521.

2. The Client environment is Windows XP SP2 (SP2 Firewall is OFF). PL/SQL Developer by Allround Automations. Oracle9i (9.2.0.1) Client is installed.

3. In our Production environment, every server has two network interfaces: Production and Management. We are provided VPN software to access Production servers via the Management interface. Firewall restricts access to Production interfaces via the VPN.

4. The Client (my PC) tnsnames.ora file is as follows (uses Management network interface):

CIMSTMP =
  (description =
    (load_balance = on)
    (failover = on)
    (address=(protocol=tcp)(host=cissdb01-mgt)(port=2521))
    (address=(protocol=tcp)(host=cissdb02-mgt)(port=2521))
    (CONNECT_DATA =
      (SERVICE_NAME = CIMSTMP)
    )
  )

CIMSTMP1=
(description=
  (address=(protocol=tcp)(host=cissdb01-mgt)(port=2521))
  (connect_data=
    (service_name=CIMSTMP)
    (instance_name=CIMSTMP1)))

CIMSTMP2=
(description=
  (address=(protocol=tcp)(host=cissdb02-mgt)(port=2521))
  (connect_data=
    (service_name=CIMSTMP)
    (instance_name=CIMSTMP2)))

5. The Server tnsnames.ora file is as follows (uses Production network interface):

CIMSTMP =
  (description =
    (load_balance = on)
    (failover = on)
    (address =(protocol=tcp)(host=cissdb01)(port=2521))
    (address =(protocol=tcp)(host=cissdb02)(port=2521))
    (CONNECT_DATA =
      (SERVICE_NAME = CIMSTMP)
    )
  )

CIMSTMP1=
(description=
  (address=(protocol=tcp)(host=cissdb01)(port=2521))
  (connect_data=
    (service_name=CIMSTMP)
    (instance_name=CIMSTMP1)))

CIMSTMP2=
(description=
  (address=(protocol=tcp)(host=cissdb02)(port=2521))
  (connect_data=
    (service_name=CIMSTMP)
    (instance_name=CIMSTMP2)))

6. After completed my test environment setup, I can SQL*Plus into the database using CIMSTMP1 or CIMSTMP2, but attempts using CIMSTMP returned the ORA-12535 error. However, tnsping CIMSTMP returns OK (90msec). This means I have no problems contacting the listener via the Management network interface.

Of course, if I had read the Oracle documentation and understood how Oracle listener works, I wouldn’t have to go through this exercise. The problem is… I didn’t… So starting from the beginning…

1. I turned on Oracle listener tracing on both instances:

cissdb01$ lsnrctl trace 16 lsnr_cimstmp1

cissdb02$ lsnrctl trace 16 lsnr_cimstmp2

2. I also started a SSH session to monitor the Oracle listener logs:

cissdb01$ tail -f $ORACLE_HOME/network/log/lsnr_cimstmp1.log

cissdb02$ tail -f $ORACLE_HOME/network/log/lsnr_cimstmp2.log

3. On my Windows XP machine, I started a Command Prompt and first tested connections to CIMSTMP1 and CIMSTMP2. Both SQL*Plus connected successfully, the corresponding listener log recorded the following entries:

11-APR-2008 16:25:22 * (connect_data=(service_name=CIMSTMP)(instance_name=CIMSTMP1)(CID=(PROGRAM=C:\Oracle\92\bin\sqlplus.exe)(HOST=MABBO)(USER=fredt))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.140.193.22)(PORT=3688)) * establish * CIMSTMP * 0

11-APR-2008 16:25:28 * (connect_data=(service_name=CIMSTMP)(instance_name=CIMSTMP2)(CID=(PROGRAM=C:\Oracle\92\bin\sqlplus.exe)(HOST=MABBO)(USER=fredt))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.140.193.22)(PORT=3690)) * establish * CIMSTMP * 0

4. Next I tried SQL*Plus using CIMSTMP, but ORA-12535 appeared after ~20 secs of waiting. The listener log on cissdb01 recorded the following entry however. No connection attempts were recorded by the listener log on cissdb02.

11-APR-2008 16:32:00 * (CONNECT_DATA=(SERVICE_NAME=CIMSTMP)(CID=(PROGRAM=C:\Oracle\92\bin\sqlplus.exe)(HOST=MABBO)(USER=fredt))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.140.193.22)(PORT=3710)) * establish * CIMSTMP * 0

Interesting enough, the listener return code is 0 which means a successful connection or command [source].

5. I then examined the listener trace produced by cissdb01, much of the content doesn’t make much sense to me, but something strikes me after staring at it for 10 minutes… I won’t attach the full trace in this blog…

There were two sections of the trace that stood out (nsprecv and nspsend):

a). (nsprecv) packet received and dump. The packet contains the following information:

(DESCRIPTION=
   (load_balance=on)
   (failover=on)
   (address=(protocol=tcp)(host=cissdb01-mgt)(port=2521))
   (connect_data=
      (service_name=CIMSTMP)
      (CID=
         (PROGRAM=C:\Program.Files\PLSQL.Developer\plsqldev.exe)
         (HOST=MABBO)(USER=fredt))))

b). (nspsend) packet send and dump. The packet contains the following information:

(DESCRIPTION=
(address=(protocol=tcp)(host=cissdb02)(port=2521))
(connect_data=(service_name=CIMSTMP)(instance_name=CIMSTMP2)))
.
(DESCRIPTION=
   (load_balance=on)
   (failover=on)
   (address=(protocol=tcp)(host=cissdb01-mgt)(port=2521))
   (connect_data=
      (service_name=CIMSTMP)
      (CID=
         (PROGRAM=C:\Program.Files\PLSQL.Developer\plsqldev.exe)
         (HOST=MABBO)(USER=fredt))
      (SERVER=dedicated)(INSTANCE_NAME=CIMSTMP2)))

It seems that the Listener on cissdb01 has responded to my client (SQL*Plus) with information on how to connect to a particular node. In fact, I have just learnt how Oracle Listener Local Naming works as it is explained here.

The process for establishing a client session with the aid of a naming method is as follows:

  1. The client initiates a connect request by providing a connect identifier.
  2. The connect identifier is resolved to a connect descriptor by a naming method. This information is returned to the client.
  3. The client makes the connection request to the address provided in the connect descriptor.
  4. A listener receives the request and directs it to the appropriate database server.
  5. The connection is accepted by the database server.

The local naming method stores net service names and their connect descriptors in a localized configuration file named tnsnames.ora.

The problem is however, the Server tnsnames.ora only contains Production network interface hostnames. When this information is used by the Client (on my PC), the firewall blocks that traffic. This explains the TNS operation timeout.

Another useful tool on Windows XP I used was `netstat 5` on Command Prompt. It shows all the network connections established or attempts on my PC.  With it, I could see my PC was trying to establish a connection to cissdb02 (SYN_SENT).

So yes (for a long story)… problem was firewall.

Written by fredericktang

April 11, 2008 at 7:42 am

Posted in 9i, Oracle

How to disagree

without comments

http://www.paulgraham.com/disagree.html

This is just gold!

Funny material to read on a Friday afternoon… Let me think if I had an DHn thrown at me in the past…

DH0. Name-calling – nope… nobody has called me a fag yet… or maybe they have without saying aloud.

DH1. Ad Hominem – yes… I had my business-related input dismissed before because I am just a developer. I haven’t yet had my technical input dismissed for being a DBA or maybe I am just lucky. I did get my input disagreed for being an inexperience DBA or being new in the company for not knowing the *process*.

DH2. Responding to Tone – I think this level applies when I don’t sound confident or slightly hesitant when I open my mouth or having English rolling out a bit rough out of my mouth, and get dismissed for not knowing what I am talking about.

Sometimes, I get dismissed even if I don’t open my mouth. It’s often not my fault if you reject my opinion for being too quiet. I only open my mouth if I have important things to say or if I am knowledgeable about the topic.

DH3. Contradiction – Now we are getting somewhere. I get this type of disagreement *a lot*. Take a read at CrazyDBA’s story. I like to think that DBA is a very precise business. We don’t start making conclusions unless we know all the facts. Our customers however, may not. I find myself always have to distinguish whether a customer is telling me a fact or interpretation of a fact. When you tell us you know something, we got to either prove it is true, or refute it is not true with evidence.

An interpretation of a fact might be “I can’t access the database. Is the database up?” – “what’s the error message you are getting?” – “ORA-01017: invalid username/password.”. The contradictious argument might be “It was working before…”.

I had customers telling me he’s dead certain he knows the cause of problem because I cannot prove that’s not the root cause due the limitation of my Oracle skills or the lack of time.

DH4/DH5/DH6 – Counter-argument / Refutation / Refuting the Central Point.

I would like to add: Silence. It is also a form of disagreement.

Written by fredericktang

April 4, 2008 at 7:40 am

Posted in Oracle, bdump

Life as a PhD student…

without comments

I am not a PhD and don’t plan on becoming one. My partner is a PhD student. We’ve been together for more than 5 years, of which she’s been a PhD student for 4 years. Walking that journey together is quite gruesome. To keep herself amused, she likes reading the PhD Comics, and I have started reading it too. We both agreed that almost every story is sadly… very true.

If you ever wonder what it is like being a PhD student, take a look at the comics to get a taste…

Written by fredericktang

April 3, 2008 at 6:02 am

Posted in bdump