Archive for the ‘Oracle’ Category
ORA-01652: unable to extend temp segment
This is a short explanation about one scenario where a SQL query can result in an ORA-01652 error, for the benefit of a database user instead of a DBA. The following has been tested on an Oracle10g (10.2.0.4) database.
This week during a routine check of a pre-production database I built, I found a couple of ORA-01652 errors in the alert log, so I proceeded to extract an AWR report over the timeframe of the error, in the hope that I could guess which SQL caused the error. But to simplify the scenario for this post, I will show a small demo case instead.
An ORA-01652 error typically means that a SQL query required more temporary segments than the temporary tablespace can provide. Temporary tablespace is typically used for sort operations, such as joins, index builds, ordering, computing aggregates (GROUP BY), and collecting optimizer statistics [source].
Let’s just show a simple example:
SQL> select * from mrt_alert order by alertname;
select * from mrt_alert order by alertname
*
ERROR at line 1:
ORA-01652: unable to extend temp segment by 128 in tablespace FREDT_TEMP
The first thing that comes to mind might be report this error to your resident DBA, or simply extend the temporary tablespace to whatever is required. But how big do we need to extend this temporary tablespace to, so that our SQL query will work? Use Explain Plan.
SQL> explain plan for 2 select * from mrt_alert order by songname; Explained. SQL> SELECT * FROM TABLE(dbms_xplan.display); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Plan hash value: 2703675187 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 320K| 153M| | 39661 (1)| 00:07:56 | | 1 | SORT ORDER BY | | 320K| 153M| 384M| 39661 (1)| 00:07:56 | | 2 | TABLE ACCESS FULL| MRT_ALERT | 320K| 153M| | 5405 (2)| 00:01:05 | ---------------------------------------------------------------------------------------- 9 rows selected.
The Explain Plan tells us the SORT ORDER BY operation requires 384MB of free temporary tablespace. The temporary tablespace is usually shared between other database users in the database. Remember to have uptodate statistics on the table, otherwise, the Explain Plan will show non-accurate information. In 10g, this is mostly done for you via the nightly dbms_stats scheduled job run.
Before shooting a request straight to our DBA to ensure the temporary tablespace have more than 384MB of free space, it’s necessary to just do a sanity check on our SQL query. For example:
1. How many rows are there in our table, so much that we need 384MB of space to sort?
2. Can we reduce the number of columns to view?
3. Can we limit our results by adding some conditions using the WHERE clause?
SQL> select count(*) from mrt_alert;
COUNT(*)
----------
320203
SQL> select bytes/1024/1024 as MB from user_segments
2 where segment_name='MRT_ALERT';
MB
----------
192
If we are only interested in one column for example, then we need less space for the SORT ORDER BY operation. As we can show below, we reduced the Temporary Segments required from 384MB to 19MB, but selecting only one column.
SQL> explain plan for 2 select alertname from mrt_alert order by alertname; Explained. SQL> SQL> SELECT * FROM TABLE(dbms_xplan.display); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Plan hash value: 2703675187 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 320K| 7192K| | 7618 (2)| 00:01:32 | | 1 | SORT ORDER BY | | 320K| 7192K| 19M| 7618 (2)| 00:01:32 | | 2 | TABLE ACCESS FULL| MRT_ALERT | 320K| 7192K| | 5401 (2)| 00:01:05 | ---------------------------------------------------------------------------------------- 9 rows selected.
It is possible to further reduce space required for the SORT ORDER BY operation, by adding some conditions using a WHERE clause. First let’s start with the demonstration before explaining what’s going on.
SQL> explain plan for
2 select alertname from mrt_alert where alertname like 'A%' order by alertname;
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1920725646
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3895 | 89585 | 5308 (2)| 00:01:04 |
| 1 | SORT ORDER BY | | 3895 | 89585 | 5308 (2)| 00:01:04 |
|* 2 | TABLE ACCESS FULL| MRT_ALERT | 3895 | 89585 | 5306 (2)| 00:01:04 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("ALERTNAME" LIKE 'A%')
14 rows selected.
We can see that the TempSpc column in our Explain Plan has disappeared. Shervin provided a very simple explanation here. Basically, when the disk based temporary tablespace is not required, the TempSpc column is not shown.
Metalink Note: 102339.1 Temporary Segments: What Happens When a Sort Occurs further explains “the sort area allocated inside the Program Global Area (PGA) is used first. If the sort operation needs additional memory, then the sorted rows are written to disk to free up the sort area so that it can be re-used for the remaining sort. This means that temporary segments are created.“. In simple words, if the sort can be done in memory, it won’t resort to disk.
So there, we have managed reduce the Temporary Segments usage from 384MB to 19MB, and with a WHERE clause, the SORT ORDER BY operation can be done entirely in memory, and we don’t even have to rely on Temporary Segments (i.e. 0MB). Therefore, we don’t have to talk to our DBA about adding more space to the Temporary Tablespace.
There is additionally one extra step we can do however, which can eliminate a SORT ORDER BY operation altogether. This is done by introducing a B*Tree index.
SQL> create index IX_MRT_ALERT_2 on MRT_ALERT(ALERTNAME);
Index created.
SQL> exec dbms_stats.gather_index_stats(user, 'IX_MRT_ALERT_2');
PL/SQL procedure successfully completed.
SQL> explain plan for
2 select alertname from mrt_alert where alertname like 'A%'
3 order by alertname;
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 249942460
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3895 | 89585 | 21 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| IX_MRT_ALERT | 3895 | 89585 | 21 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ALERTNAME" LIKE 'A%')
filter("ALERTNAME" LIKE 'A%')
14 rows selected.
As can be observed from the Explain Plan, Oracle uses Index Range Scan to retrieve results that satisfy our WHERE clause. The result of an Index Range Scan returns results in an order – in this case, a VARCHAR2 column – alphabetical order. Therefore, Oracle can avoid a SORT ORDER BY operation as the results is already ordered.
There’s a caveat though, in Metalink Note: 67409.1 When will an ORDER BY statement use an Index, “Oracle explains that the index is only used when the column must be NOT NULL, otherwise, the index will not be considered“. It then gives an explanation for this behaviour which I won’t copy and paste here.
Conclusion
I guess I have presented an over-simplified scenario here, and the SQL query you are testing might be a lot more complex than this, but it is important to just do some sanity checks before asking a DBA to increase database resource capacity.
I have shown some steps where it is possible to reduce Temporary Segments usage, and with a clever usage of an index, to complete remove the need to perform a SORT. Having said all the above, if you do require the additional Temporary Segments, then by all means, go for it!
sys_connect_by_path and ORA-01489
This is strictly on an issue on a Oracle9i database. It has been fixed in Oracle10g.
Just would like to write a brief note about a test case we found with using sys_connect_by_path to concatenate values on a column from multiple rows into one string, on an Oracle9i database.
This technique has been discussed on a few websites:
http://www.williamrobertson.net/documents/one-row.html
http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php
http://www.oracle.com/technology/oramag/code/tips2006/101606.html
As we might all know, VARCHAR2 in Oracle 9i has a limit of 4000 bytes (note, not characters). This limits applies to string concatenation, and will (or should) result in ORA-01489: result of string concatenation is too long if the result of concatenation yields a string greater than 4000 bytes in length. Eddie has demonstrated some examples here: http://awads.net/wp/2005/12/06/more-on-generating-strings/
I discovered a difference in behaviour when concatenating strings using sys_connect_by_path on Oracle9i, between a Multibyte character set database and a Single-byte character set database, where I would get an ORA-01489 error at just over the 2000 bytes mark on a Multibyte database. For example, UTF8 is Multibyte and WE8ISO8859P1 is a Single-byte character set. (http://download.oracle.com/docs/cd/B10501_01/server.920/a96529/appa.htm#956722)
There are a couple of ways to find out which character set the database is built on (requires dba privilege or select_catalog_role granted):
select value$ from props$ where name='NLS_CHARACTERSET'; select * from database_properties where property_name='NLS_CHARACTERSET';
We construct a simple test case, a table with an ID column, and a VARCHAR2 column. I am going to insert a fixed string of 10 characters into the VARCHAR2 column, then apply the string concatenation technique using sys_connect_by_path. I am not using any special characters, so each character should just be 1 byte. It is possible to get size of a varchar2 string using vsize().
create table test_concat_sys ( id number, str varchar2(4000) ); insert into test_concat_sys SELECT 1 as id, '1111111111' RNDMSTR FROM all_objects WHERE rownum <= 200; commit;
On a UTF8 character set database, this will work, and the last row will be 2002 bytes (182×10+182). That’s 182 rows of 10 bytes, plus 1 byte for each ‘/’ in the concatenation delimiter.
SELECT VSIZE(SYS_CONNECT_BY_PATH(str, '/')) as BYTES FROM ( select A.*, row_number() over (partition by id order by id) row# from test_concat_sys A where rownum <= 182 ) START WITH row#=1 CONNECT BY PRIOR id=id and prior row# = row# -1;
This will result in an ORA-01489 error:
SELECT VSIZE(SYS_CONNECT_BY_PATH(str, '/')) as BYTES FROM ( select A.*, row_number() over (partition by id order by id) row# from test_concat_sys A where rownum <= 183 ) START WITH row#=1 CONNECT BY PRIOR id=id and prior row# = row# -1;
The behavior is working as expected on a Oracle 9i Single-byte database or an Oracle10g (or later) release, and we can set rownum upto 363 rows (363×10+363 = 3993), one more row and it will return ORA-01489 as expected since we will reach the limit (364×10+364 = 4004).
I have logged a SR for this, and the reply from Oracle was that they “officially accepted this is the right candidate for bug.”, but “9i is in sustaining support mode and development will not provide any fix for this. Also this issue is fixed in 10g so they would not accept it.”
All the more reason to upgrade to 10g if not already done so. This bug is rather odd to hit because we wouldn’t normally expect to concatenate a string upto 2000 characters and beyond. But in our case, the business obviously did.
There are other ways to concatenate strings from multiple rows, they are also covered in the websites linked at the beginning of this article.
impdp ORA-06550 ORA-00917
Just want to quickly write about this bug.
I created a Datapump export (expdp) of a partitioned table from an Oracle 10.2.0.3 database, and import (impdp) into an Oracle 10.2.0.4 database.
During impdp, everything worked well until importing statistics, then the process ran into the following errors:
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
ORA-39083: Object type TABLE_STATISTICS failed to create with error:
ORA-06550: line 12, column 17:
PL/SQL: ORA-00917: missing comma
ORA-06550: line 4, column 121:
PL/SQL: SQL Statement ignored
…
Turns out this is a bug with expdp/impdp partitioned table, described in Metalink Note “Bug 6862987 – impdp throws an error when importing partitioned table stats”, which happens “when importing statistics for tables if there are global statistics only.”. This bug is fixed in the 10.2.0.5 Patchset. The workaround is not to import statistics.
I dumped the sqlfile and had a look what’s really going on, it turns out there was a repeated syntax error with some of the INSERT statements:
INSERT INTO “SYS”.”IMPDP_STATS” (type, version,
c1, c2, c3, c4, c5, n1,
n2, n3, n4, n5, n6, n7, n8, n9, n10, n11,
d1, r1, r2, ch1, flags)
VALUES (‘C’, 4, ‘HTTP_LOG’, ‘P20090921′, NULL, ‘PROTOCOL_ID’, ‘PBC_ADMIN’,
0, 0, 0, NULL, 0, NULL, NULL, 0, NULL, NULL, NULL,
‘, NULL), NULL, NULL, NULL, 0);
PL/SQL Developer Associate
I have added another Certification to my bag, passing the 1Z0-147 Program with PL/SQL exam. I studied the Oracle documentation and used the Selftest software to prepare for this exam.
Completed OCE and OCA
I have taken a brief break from blogging for some personal reasons, and lately I have taken the time to pass and complete a couple of Oracle certifications: 1Z0-047 (Oracle Database SQL Expert Exam), and 1Z0-042 (Oracle Database 10g: Administration I Exam).
It’s certainly quite an experience. I must say on-the-job experience helped a lot, and reading up the Oracle documentation helped me to learn a lot of concepts that I obviously overlooked, such as restrictions and notes about a specific feature.
Client side connect time load balancing for RAC
Recently, I took an interest in studying how various load balancing features work in a 10g RAC. I thought Jim has written a very good article on this, in a language most people can understand. In my opinion, there are 3 types of load balancing on a 10g RAC. I say in my opinion because Run Time Load Balancing relies on Server Side Load Balancing configurations, but I have separated it out anyways.
- Client Side Load Balancing
- Server Side Load Balancing
- Run Time Load Balancing (using Implicit Connection Cache)
In this post, I wanted to test for evidence of Client Side Load Balancing, using Client Side Oracle Net Tracing.
Definition
Client Side Load Balancing is by definition “evenly spreads new connection requests across all listeners”, and “Oracle Database randomly selects an address in the address list and connects to that node’s listener. This provides a balancing of client connections across the available listeners in the cluster.” [source]. Another definition, “whereby if more than one listener services a single database, a client can randomly choose between the listeners for its connect requests. This randomization enables all listeners to share the burden of servicing incoming connect requests.” [source].
Client Side Load Balancing is configured by adding LOAD_BALANCE=ON in tnsnames.ora file. The Net Services Reference tells us how and were to embed this parameter – “embed this parameter under either the DESCRIPTION_LIST parameter, the DESCRIPTION parameter, or the ADDRESS_LIST parameter.” [source]. Note however, whether you place LOAD_BALANCE under ADDRESS_LIST or outside, will make a different in terms of load balancing ADDRESS_LISTs or load balancing ADDRESSes.
Configuration
I have a 2-node 10g RAC cluster, and for load balancing, I have the following entry in my tnsnames.ora file.
MODDB=
(description=
(load_balance=on)
(address=(protocol=tcp)(host=moddb1-vip)(port=2521))
(address=(protocol=tcp)(host=moddb2-vip)(port=2521))
(connect_data=
(service_name=MODDB)
)
)
I want to trace the Client Side Load Balancing behaviour from the client side, so I enable Oracle net tracing by adding the following lines in sqlnet.ora [source], and I should fine trace files for each physical connection made under $ORACLE_HOME/network/trace.
TRACE_LEVEL_CLIENT = USER TRACE_FILE_CLIENT = SQLTRC
Testing a connection
We can first test one connection…
$ sqlplus -s tester/tester@MODDB exit
Now take a look at the trace file produced, under $ORACLE_HOME/network/trace. An inspection of of the trace file shows two entries that are of interest (it is possible to observe server-side load balancing as well, but that’s a topic for another post):
niotns: Calling address: (description=(load_balance=on)(failover=on)(address=(protocol=tcp)(host=moddb1-vip)(port=2521))(address=(protocol=tcp)(host=moddb2-vip)(port=2521))(connect_data=(service_name=MODDB)(CID=(PROGRAM=e:\oracle\product\10.2.0\client_1\bin\sqlplus.exe)(HOST=LADDO)(USER=tester)))) nsc2addr: (DESCRIPTION=(load_balance=on)(failover=on)(address=(protocol=tcp)(host=moddb2-vip)(port=2521))(connect_data=(service_name=MODDB)(CID=(PROGRAM=e:\oracle\product\10.2.0\client_1\bin\sqlplus.exe)(HOST=LADDO)(USER=tester))))
Unfortunately, I couldn’t find much documentation on how to read a trace file, I can only guess…
* niotns entry shows a lookup of tns information based on the service name I supplied “MODDB”.
* nsc2addr entry shows Oracle Net connects to listener on moddb2-vip.
It’s difficult to observe load balancing with just one connection, so I wrote a script to run 1000 connections, using Bash Shell scripting:
#!/bin/bash
for a in {1..1000}
do
echo $a
sqlplus -s tester/tester@MODDB<<EOF
EOF
done
exit 0
Examining the traces
Extending the idea above, we can use a combination of grep, and wc to see the result of load balancing. For some reason, each connection produced 2 trace files, so we have 2000 trace files for 1000 connections.
$ ls -l *.trc |wc -l 2000 $ grep nsc2addr *.trc | grep load_balance |grep moddb1-vip |wc -l 498 $ grep nsc2addr *.trc | grep load_balance |grep moddb2-vip |wc -l 502
We can see over 1000 connections, 498 were made to moddb1-vip listener, whilst 502 were made to moddb2-vip listener. It is a fairly even distribution. But just how random is it? I have used the output of the trace files, and plot the first 100 connections on a scattered chart. The first note one can observe is that, these connections are not in a round-robin fashion. (1 = moddb1-vip; 2 = moddb2-vip)

Conclusion
As demonstrated above, it is quite simple to setup Oracle Net tracing on the client, to test and show whether Client Side Load Balancing is working properly. It is worth to note (again), that this type of load balancing has nothing to do with balancing server side load. It is to do with balancing load across the listeners.
Installing Oracle10g JAccelerator (NCOMP) on Solaris 10 (SPARC)
Let’s suppose you find the need to natively compile some Java code on an Oracle10g database, then you would need to have JAccelerator (NCOMP) installed. I have prepared a few installations steps to follow that worked for me. It took me a few days to work this out…
Disclaimer: These steps were tested on Solaris 10 SPARC platform, with a Oracle10g 10.2.0.4 standalone database. Best efforts were made to ensure the steps are correct, but please take extra care when executing them on your platform. Do not run this on your Production environment if you are unsure what it will do.
Let’s start from the beginning…
1. To find out whether you have NCOMP installed, there are two things you can do:
select dbms_java.full_ncomp_enabled from dual * ERROR at line 1: ORA-29558: JAccelerator (NCOMP) not installed. Refer to Install Guide for instructions. ORA-06512: at "SYS.DBMS_JAVA", line 236
or try executing this on command line, which should show two lines if NCOMP is installed.
$ORACLE_HOME/OPatch/opatch lsinventory -detail |grep JAccelerator JAccelerator (COMPANION) 10.2.0.1.0 JAccelerator (COMPANION) Patch 10.2.0.4.0
2. Suppose NCOMP is not installed, then you would need to get the Oracle10g Companion CD (downloadable here). Run the installer, and select the “Oracle Database 10g Products” option which will install into your existing Oracle home. Full installation guide is available here. This will install JAccelerator 10.2.0.1.0 along with other 10g Companion products. If you run the opatch command again, you should now see JAccelerator listed.
3. Next, we need to patch JAccelerator to the latest patchset – 10.2.0.4. Download the 10.2.0.4 patchset from Metalink and start the installer as usual. Metalink Note 293658.1 explains that the OUI will see the new components installed and only install the 10.2 patches associated to the new products. Re-run catupgrd.sql and utlrp.sql .
4. After completing the patchset installation, re-running the first two checks should show you some good news.
SQL> select dbms_java.full_ncomp_enabled from dual; FULL_NCOMP_ENABLED ---------------------------------------------------- OK
$ORACLE_HOME/OPatch/opatch lsinventory -detail |grep JAccelerator JAccelerator (COMPANION) 10.2.0.1.0 JAccelerator (COMPANION) Patch 10.2.0.4.0
5. Before one can natively compile anything, we need to follow a few more steps as documented here. Verify a C compiler and linker is installed, and the ncomp properties file knows their full path. Suppose your platform is like mine, then it means you probably need a proper C compiler:
$ which cc /usr/ucb/cc $ cc /usr/ucb/cc: language optional software package not installed
6. Metalink Note: 43208.1 documents a list of Certified Compilers. For Solaris SPARC:
* 10.1.0 Sun ONE Studio 8, C/C++ 5.5 * 10.2.0 Sun ONE Studio 8, C/C++ 5.5 * 11.1.0 Sun ONE Studio 11, C/C++ 5.8 Note: Sun ONE Studio 8 or higher is supported with 9.2, 10.1, 10.2
In any case, I downloaded Sun Studio 11 software here. This software installation requires root privilege and installs the C compiler into /opt/SUNWspro by default. Check all system requirements before installation. As an aside, I also tried gcc 3.4.6 but NCOMP didn’t like it very much…
7. After the Sun Studio software is installed, update the PATH and LD_LIBRARY_PATH environment variables, e.g.
export PATH=/opt/SUNWspro/bin:$PATH export LD_LIBRARY_PATH=/opt/SUNWspro/lib:$LD_LIBRARY_PATH
8. Inside the $ORACLE_HOME/javavm/jahome directory, you will possibly find two properties files.
$ grep 'CC =' Settings*.properties Settings.properties:CC = cc Settings_os.properties:CC = /opt/SunProd/SUNWspro8/bin/cc
Update these two lines to the C compiler location, such that if I do another grep:
Settings.properties:CC = /opt/SUNWspro/bin/cc Settings_os.properties:CC = /opt/SUNWspro/bin/cc
9. We are just about done… to enable the database user to perform native compilation, the user needs to be granted certain privileges as documented here, where /app/oracle/10/db is my $ORACLE_HOME:
grant java_deploy to fredt;
grant javasyspriv to fredt;
call dbms_java.grant_permission('FREDT', 'java.io.FilePermission', '/app/oracle/10/db/-', 'read,write');
10. We can now ncomp away!
ncomp -u fredt/fredt@MODDB <myJar.jar>
PL/SQL Instrumentation using ILO
I worked on a project to develop a domain index using the Oracle Data Cartridge Interface last year, When I was testing my code, I needed to know what the progress and state of the PL/SQL execution, and found dbms_output.put_line() inadequate as it only outputs when the call is returned to SQL*Plus, i.e. the output is not displayed during run-time. This is better explained by Robert Vollman. That was until Douglas showed me how to use DBMS_APPLICATION_INFO, MODULE, ACTION and CLIENT_INFO.
Whilst reading blogs of my favourite Oracle bloggers, I found Instrumentation Library for Oracle (ILO) developed by Method-R, presented here by Karen Morton. Instrumentation is described by Tom Kyte in this blog post. The idea behind ILO is quite interesting, a comment by Cary Millsap reveals the motivation behind this tool. I downloaded ILO 2.2, installed it on my 10g test database and test ran a very simple test case. I will write a few words about what I found.
I start off with a very simple test code, I granted myself privilege to execute dbms_lock:
DECLARE
BEGIN
-- Start Task 1
ILO_TASK.BEGIN_TASK(module=>'ILO Tester', action=>'Task 1', comment=>'Time 0s', begin_time=>sysdate);
dbms_lock.sleep(10);
-- Start Task 2
ILO_TASK.BEGIN_TASK(module=>'ILO Tester', action=>'Task 2', comment=>'Time 10s',begin_time=>sysdate);
dbms_lock.sleep(10);
-- End Task 1
ILO_TASK.END_TASK(end_time=>sysdate);
-- End Task 2
ILO_TASK.END_TASK(end_time=>sysdate);
EXCEPTION
WHEN OTHERS
THEN
dbms_output.put_line('Exception thrown');
ILO_TASK.END_ALL_TASKS(p_error_num=>SQLCODE, p_end_time=>sysdate);
END;
I am logged on the database using SQL*Plus, before I execute the above PL/SQL block, my session information looks like this:
SQL> select module, action, client_info from v$session where username='FREDT'; MODULE ACTION CLIENT_INFO --------------- --------------- --------------- SQL*Plus
When the PL/SQL block is executed:
MODULE ACTION CLIENT_INFO --------------- --------------- --------------- ILO Tester Task 1
10 seconds later inside the PL/SQL block:
MODULE ACTION CLIENT_INFO --------------- --------------- --------------- ILO Tester Task 2
10 seconds more, and PL/SQL block exits:
MODULE ACTION CLIENT_INFO --------------- --------------- --------------- SQL*Plus
ILO manages tasks in a stack like data structure, therefore, a developer can start multiple tasks using BEGIN_TASK and end the last (started) task using END_TASK, or end all tasks using END_ALL_TASKS. To view session information, one needs either DBA or SELECT ANY DICTIONARY privilege.
This most basic use of ILO requires continuous query on v$session to get anything meaningful. If one wants to take this further, ILO provides the framework to do so. For example, one can enable tracing by inserting this line of code at the beginning of the PL/SQL block (after the line BEGIN):
ILO_TIMER.set_mark_all_tasks_interesting(true, true);
A trace file will be produced in USER_DUMP_DEST, and one can find ILO information inside the trace file. It seems feasible to develop a script which can extract this information out of a trace file into something meaningful or simply more human-readable:
ILO_TASK.BEGIN_TASK[][ILO Tester][Task 1][oracle~pts/2~sqlplus@moddb1 (TNS V1-V3)~SYS$USERS][Time 0s] *** 2009-01-13 16:50:02.919 *** ACTION NAME: (Task 1) 2009-01-13 16:50:02.919 *** MODULE NAME: (ILO Tester) 2009-01-13 16:50:02.919 ILO_TASK.BEGIN_TASK[][ILO Tester][Task 2][oracle~pts/2~sqlplus@moddb1 (TNS V1-V3)~SYS$USERS][Time 10s] *** 2009-01-13 16:50:12.921 *** ACTION NAME: (Task 2) 2009-01-13 16:50:12.921 *** MODULE NAME: (ILO Tester) 2009-01-13 16:50:12.921 ILO_TASK.END_TASK[][ILO Tester][Task 2][oracle~pts/2~sqlplus@moddb1 (TNS V1-V3)~SYS$USERS][Time 10s][] *** ACTION NAME: (Task 1) 2009-01-13 16:50:22.931 *** MODULE NAME: (ILO Tester) 2009-01-13 16:50:22.931 ILO_TASK.END_TASK[][ILO Tester][Task 1][oracle~pts/2~sqlplus@moddb1 (TNS V1-V3)~SYS$USERS][Time 0s][] *** ACTION NAME: () 2009-01-13 16:50:22.931 *** MODULE NAME: (SQL*Plus) 2009-01-13 16:50:22.931
After reading the ILO source code, and playing around, I could think of a few more ways to take this idea further, but it will require additional development of the library. Most of the “hooks” are inside ILO_TIMER. Indeed, I believe that is why ILO is licensed as LGPL, allowing developers the flexibility to decide how they want to make use of the instrumentation.
Perhaps I will write about these ideas in another post. My parting thoughts were how to have Developers instrument their code the same way, if I am the DBA. Perhaps, if DBA=Developer then it would be a different story. I think about Apache Log4j and if the same adoption can be applied to a tool like ILO, that would be great.
HP-UX PA-RISC or Itanium
Don’t do what I did… I got this error when mistakenly installing Oracle 10gR2 (10.2.0.1) HP-UX PA-RISC distribution on a HP-UX Itanium platform:
ar: warning: the file <some_file> is from an incompatible architecture
Metalink Note: 603735.1 - HP-UX: “Incompatible Architecture” Error During The Relink Phase
… explains this.
uname -m
On HP-UX Itanium systems, the “uname -m” command returns “ia64″.
On HP-UX PA-RISC systems, the “uname -m” command returns something similar to “9000/800″.
Foreign keys and locking
Sometimes things are learnt the hard way. A couple of weeks ago, I was involved in purging some historical data from a production database using a set of DELETE statements. I assumed these transactions will not impact the Application (performance aside) because of how Oracle maintains read consistency and concurrency. This assumption was formed on weak knowledge on my part …
In short, the DELETE statements were executed on a parent table. It has foreign key on delete cascade relationships with several child tables. There were no indexes defined on the foreign key columns of the child tables. Whilst the DELETE statements were executing, the child tables became locked against other DMLs until a COMMIT or ROLLBACK is issued. Other DML transactions on these unindexed child tables appeared to hang, i.e. enqueued.
The on delete cascade constraint was the key, if it was disabled, locking would not happen but instead the transaction will fail with ORA-02292: constraint violation – child records found. However, enabling the on delete cascade constraint wass the right thing to do. To prevent locking, we just had to define indexes on the foreign key columns.
This apparent database design knowledge showed how little I know about Oracle, and how inexperienced I am.
I conjured a more simplistic example, and also found some reference to this in Metalink.
(Sorry… I am just being very silly with the example…)
Create tables:
Suppose I create two parent tables, and a child table with foreign keys linked to them. The foreign keys do not have indexes.
create table dad_table
(
id number(10),
name varchar2(10),
constraint pk_dad_table_id primary key (id)
);
create table mum_table
(
id number(10),
name varchar2(10),
constraint pk_mum_table_id primary key (id)
);
create table child_table
(
dad number(10)
constraint fk_child_dad_table_id
references dad_table(id)
on delete cascade,
mum number(10)
constraint fk_child_mum_table_id
references mum_table(id)
on delete cascade,
name varchar2(10)
);
SQL> select index_name from user_indexes
2 where table_name='CHILD_TABLE';
no rows selected
Populate data:
insert into dad_table (id, name) values (1, 'john'); insert into dad_table (id, name) values (3, 'peter'); insert into dad_table (id, name) values (5, 'david'); insert into mum_table (id, name) values (2, 'mary'); insert into mum_table (id, name) values (4, 'jane'); insert into mum_table (id, name) values (6, 'ivy'); insert into child_table (dad, mum, name) values (1, 2, 'ashton'); insert into child_table (dad, mum, name) values (3, 4, 'james');
Testing:
I am going to start three Oracle sessions, two as the schema user, one as SYSDBA which I will use to monitor locks. I used the locks information script from Metalink Note 1020012.6. I also found a script in Tom Kyte’s book (Expert Oracle Database Architecture – Chapter 6 – Locking and Latching) useful in displaying locks blocker and blockee.
i. Without indexes on foreign keys:
Session #1 (SID=11): SQL> delete from dad_table where id=1; 1 row deleted. Session #2 (SID=10): SQL> delete from mum_table where id=4; (... hangs) Session SYSDBA: * Note 1020012.6 - SID SERIAL# USERNAME Resource Type TAB Lock Held Lock Req. ---- ------- ---------- -------------------- ---------- -------------- -------------- 10 5596 FREDT TM - DML Enqueue DAD_TABLE Row Share 10 5596 FREDT TM - DML Enqueue MUM_TABLE Row Exclusive 10 5596 FREDT TM - DML Enqueue CHILD_TABL Shr Row Excl 11 3311 FREDT TM - DML Enqueue DAD_TABLE Row Exclusive 11 3311 FREDT TM - DML Enqueue MUM_TABLE Row Share 11 3311 FREDT TM - DML Enqueue CHILD_TABL Row Exclusive * Tom's Blocker, Blockee and Object Name - BLOCKER SID BLOCKING BLOCKEE SID ON_OBJ OBJECT_NAME ---------- ------ --------------- ---------- ------ ------ -------------------- CIMS 11 is blocking CIMS 10 on CHILD_TABLE * V$Session_Wait - SQL> select sid, event from v$session_wait where event='enqueue'; SID EVENT ---- ---------------------------------------------------------------- 10 enqueue
Comments:
The two sessions were deleting different rows of data, but one was blocking the other. The lock information showed SID 10 was requesting a Share Row Exclusive lock (SSX) on the child_table, when a Row Exclusive (SX) lock on the same table was held by SID 11. This conflict is explained in 10g Concept Guide (the 9i Concept Guide html formatting looks a bit out of whack).
The puzzling thing is why request a SSX lock? Metalink Note 33453.1 explains this behaviour, but it still does not explain the *why*:
Without indexes… A share lock (LMODE=4) of the entire child table is required until the transaction containing the insert/delete/update statement for the parent table is committed, thus preventing any modifications to the child table. It even can be a SSX (LMODE=5) lock when deleting from the parent table with a delete cascade constraint. In 9.2.0, the downgraded ‘mode 3 Row-X (SX)’ locks are no longer required except when deleting from a parent table with a ‘delete cascade’ constraint.
Another interesting point – SID 11 also acquired a Row Share (SS) lock on mum_table. It isn’t immediately obvious, but a SS lock (only) prohibits other transactions from acquiring an Exclusive (X) lock on table, either explicitly (lock table … in exclusive mode) or implicitly (alter table). Note 223303.1 also made this statement:
Starting in version 9.2.0, a Row-SS lock is taken on the parent table for any DML issued against the child table. This will occur with or without an index on the foreign key column of the child table.
The table structure is somewhat similar to the application schema I was working on. Whilst I was deleting from the parent table, the Application was also deleting rows from another parent table, to which both parent tables are linked to a single child table. Given the above experiment, I now understand why the Application appeared to have “hang” and threads were reporting timeout errors.
ii. With indexes on foreign keys:
Let’s first create two indexes on the child table:
create index ix_dad_child on child_table(dad); create index ix_mum_child on child_table(mum); Session #1 (SID=11): SQL> delete from dad_table where id=1; 1 row deleted. Session #2 (SID=10): SQL> delete from mum_table where id=4; 1 row deleted. Session SYSDBA: * Note 1020012.6 - SID SERIAL# USERNAME Resource Type TAB Lock Held Lock Req. ---- ------- ---------- -------------------- ---------- -------------- -------------- 10 5596 FREDT TM - DML Enqueue DAD_TABLE Row Share 10 5596 FREDT TM - DML Enqueue MUM_TABLE Row Exclusive 10 5596 FREDT TM - DML Enqueue CHILD_TABL Row Exclusive 11 3311 FREDT TM - DML Enqueue DAD_TABLE Row Exclusive 11 3311 FREDT TM - DML Enqueue MUM_TABLE Row Share 11 3311 FREDT TM - DML Enqueue CHILD_TABL Row Exclusive * Tom's Blocker, Blockee and Object Name - no rows selected * V$Session_Wait - SQL> select sid, event from v$session_wait where event='enqueue'; no rows selected
Comments:
The two sessions are no longer in blocker-blockee state, and each can delete their different rows. The indexes defined on the foreign key columns have resolved the locking scenario introduced in this example.
Note 33453.1 explains this:
An Insert/Delete/Update on the parent table will only acquire a row level lock on the parent table if there is an index on the foreign key of the child table. The child table will have NO locks on it and so any type of modifications can be made to the child table…
If the child table specifies ON DELETE CASCADE, waiting and locking rules are the same as if you deleted from the child table after performing the delete from the parent. In 9.2.0 onwards, Oracle requires ‘mode 2 Row-S (SS)’ locks on the child table.
Metalink References:
Note:33453.1 – REFERENTIAL INTEGRITY AND LOCKING.
Note:16428.1 – Prevent Deadlock in Existing Parent- Child Tables By INDEXING PK and FK.
Note:223303.1 – Correction to 9.2.0 foreign key constraint locking behaviour, per documentation.
Note:29787.1 – VIEW: “V$LOCK” Reference Note.
Note:11828.1 – FOREIGN KEYS, INDEXES AND PARENT TABLE LOCKING.
