NZ DBA: Oracle Database Tuning & Tips

June 14, 2014

Undone by Undo

The Puzzle

After switching to a new undo tablespace, transactions would fail with ORA-01552.

The Background

During a data warehouse migration using GoldenGate, the target database returned an ORA-1628 error when applying transactions from the source.
The assigned DBA set a couple of initialisation parameters as prescribed by the MOS note
Troubleshooting ORA-1628 – max # extents (32765) reached for rollback segment <SEGMENT_NAME> (Doc ID 1580182.1):

alter system set "_smu_debug_mode" = 4 scope=both sid='*';
alter system set "_rollback_segment_count"=999999 scope=both sid='*';

The second parameter caused this warning:

ORA-30018: Create Rollback Segment failed, USN 32760 is out of range

During the migration, the undo tablespace grew to a ridiculous size.  In an attempt to reclaim terabytes of space, the DBA tried to replace the undo tablespace with a smaller one.  However, after creating a new tablespace and switching to it, transactions failed with:

ORA-01552: cannot use system rollback segment for non-system tablespace...

So the original, huge undo tablespace was retained for the ‘go-live’ while Oracle Support tried to work out what caused the ORA-01552.

After going live on the new database, ETL performance was disappointing due to the overhead of the DDL used to manage the undo segments and the domination of the buffer cache by undo blocks.

At first Oracle Support claimed that creating the maximum number of undo segments was beneficial.  However, when pushed, they could not explain exactly how having 32,760 undo segments helped in a database with a low number of concurrent transactions.  How would it prevent hitting the maximum number of extents (ORA-1628)?  What was the cost?

After removing the undo parameters prescribed by 1580182.1, there was no improvement in performance, and the ORA-01552 still prevented a switch to a smaller undo tablespace.

The Solution

I started investigating and found that when a new undo tablespace was created, no new rows were inserted into UNDO$ because the limit had already been reached:

select ts#,name from ts$ where name like 'UNDO%';

TS# NAME
--- --------
  2 UNDOTBS1
  4 UNDOTBS2
145 UNDOTBS3
break on report
compute sum of segments on report
select max(US#),count(1) segments,ts# from sys.undo$ group by ts#;

  MAX(US#)   SEGMENTS TS#
---------- ---------- ---
     32758      20753   2
     32759      12006   4
         0          1   0
           ----------
sum             32760

The UNDOTBS3 tablespace had no undo segments in it because the database couldn’t create any more!

Once the UNDOTBS1 tablespace was dropped, new undo segments could be created in other undo tablespaces, and the ORA-01552 was avoided.
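Putting the pieces together, the working sequence looks something like this (a sketch; the datafile location and size are illustrative, and the old tablespace can only be dropped once no active transactions still reference its undo segments):

-- Create the smaller replacement undo tablespace
create undo tablespace UNDOTBS3 datafile '/u01/oradata/DW/undotbs3_01.dbf' size 32g;

-- Make it the active undo tablespace
alter system set undo_tablespace='UNDOTBS3' scope=both sid='*';

-- Dropping the bloated tablespace removes its rows from UNDO$, freeing slots so
-- that new undo segments can finally be created in UNDOTBS3
drop tablespace UNDOTBS1 including contents and datafiles;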

I suggested that Oracle Support change the careless recommendations in Doc 1580182.1.  At the time of writing, the note now says:

alter system set "_rollback_segment_count"=1000 scope=spfile;

January 14, 2014

SQL Tuning Advisor and DML

Filed under: Oracle Database,Performance Tuning — Ari @ 5:07 pm

Does the SQL Tuning Advisor auto task tune DML statements?

Sort of; sometimes.  Read on.

The documentation says:

Oracle Database automatically runs the SQL Tuning Advisor on selected high-load SQL statements from the Automatic Workload Repository (AWR) that qualify as tuning candidates…..
<skip some lines>
….. Oracle Database analyzes statistics in the AWR and generates a list of potential SQL statements that are eligible for tuning. These statements include repeating high-load statements that have a significant impact on the system. Only SQL statements that have an execution plan with a high potential for improvement will be tuned. Recursive SQL and statements that have been tuned recently (in the last month) are ignored, as are parallel queries, DMLs, DDLs, and SQL statements with performance problems that are caused by concurrency issues.

However, I was alarmed to see statements like this being run by the SQL Tuning autotasks:

/* SQL Analyze(186,1) */ INSERT INTO TABLE_A SELECT * FROM ...

What would the implications be?  Consider locking, segment growth, redo, triggers, etc.

I could see that the execution plan in the SGA started with:

 0 INSERT STATEMENT
 1  LOAD TABLE CONVENTIONAL

The session also had a DML lock on TABLE_A.

I tested with dbms_sqltune, trying to replicate the situation, but after opening the cursor a couple of times, the report just said:

Type of SQL statement not supported.
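For anyone wanting to reproduce the test, this is roughly what I ran with dbms_sqltune (a sketch; the statement, table names and task name are made up).  The report came back with the same “not supported” message:

declare
  l_task varchar2(64);
begin
  -- Create and execute a manual tuning task for an INSERT ... SELECT
  l_task := dbms_sqltune.create_tuning_task(
              sql_text  => 'insert into table_a select * from table_b',
              task_name => 'DML_TUNE_TEST');
  dbms_sqltune.execute_tuning_task(task_name => l_task);
end;
/

set long 1000000
select dbms_sqltune.report_tuning_task('DML_TUNE_TEST') from dual;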

I logged a Service Request with Oracle Support who said:

…the sql tuning advisor will check the select part only and not the DML part,yes technically it is considered DML but actually internally it will not use the DML but will use the SELECT part…

Another clarification is for such DMLs, the SELECT parts are only used i,e for execution internally and not the DML part and therefore there will be no redo generation or any data update/delete/insert …

I can’t find this treatment of DMLs documented anywhere, so I suggested to Oracle that they correct their documentation and I created this post.

January 9, 2014

Memory vs SQL Loader and Wide Tables

Filed under: Oracle Database — Ari @ 12:33 pm

A customer migrated an ETL application to a new Linux server and found that a SQL Loader job was now failing with a return code of 137.

I couldn’t find anything useful on the internet, in the documentation or on the Oracle Support site about this sqlldr exit code, and there was nothing in the sqlldr logs.

I ran the job myself and noticed that the free memory on the server disappeared very quickly and remained low until the sqlldr process died.  (I didn’t have root access to check the OS message logs.)  An exit code of 137 is 128 + 9, ie the process was killed with SIGKILL, which on Linux usually points to the out-of-memory killer.

The table being loaded had more than fifty VARCHAR2(4000) columns, and the direct path method was used.  With the default SQL*Loader parameter values, the direct load buffer needed over 1GB!
Setting a lower value for COLUMNARRAYROWS in the control file reduced the memory requirement for this table and allowed the job to complete successfully.
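A hedged sketch of the control file change (the table, datafile and column names are made up).  With the default COLUMNARRAYROWS of 5000, fifty-plus VARCHAR2(4000) columns imply a column array of roughly 5000 x 50 x 4000 bytes, ie about 1GB, which matches what I saw:

-- Lower COLUMNARRAYROWS so the direct path column array stays small
OPTIONS (DIRECT=TRUE, COLUMNARRAYROWS=500)
LOAD DATA
INFILE 'wide_table.dat'
APPEND
INTO TABLE wide_table
FIELDS TERMINATED BY '|' TRAILING NULLCOLS
(
  col001 CHAR(4000),
  col002 CHAR(4000)
  -- ... remaining VARCHAR2(4000) columns
)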

OVM for SPARC vs TCP

Filed under: OVM SPARC,UNIX / Linux — Ari @ 12:05 pm

When connecting to new OVM for SPARC guests (LDOMs), I found a network problem that does not appear to be a known issue.  This post might be of interest to you if you plan to use a similar platform.

We have a primary control domain and an LDOM (Solaris VM) created on a T4-2 server.  Solaris 11.1 is used on the primary domain and the LDOM.  Virtualised network devices are created over link aggregates with VLAN tagging over 10Gb Oracle CNA HBAs in the primary domains and are assigned to the LDOM.  The problem occurs with any outbound traffic on the 10Gb HBA.

The first symptoms were the inability to transfer large files from Windows using SFTP, and PuTTY SSH or VNC sessions being disconnected frequently and randomly.

I used Microsoft Network Monitor to capture the network traffic during these problems and found that acknowledgement packets from Solaris that should have been empty had two extra bytes added, and so the checksum was incorrect.  After a while, these checksum errors and the retransmissions would result in the TCP session being disconnected.

The problem affected multiple protocols, but only from the LDOMs, not the primary domain.

The closest issues I could find related to checksum offloading (where the network card performs the checksum processing to reduce the load on the server’s CPUs).
Disabling checksum offloading on the LDOMs for the 10Gb CNA cards successfully worked around the problem.  (The 1Gb cards were not affected.)
For example, add

set ip:dohwcksum=0x0

to /etc/system in the LDOM and reboot.
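To confirm the setting after the reboot, reading the variable back with the kernel debugger should work (a sketch; it assumes the ip module’s dohwcksum symbol is visible to mdb):

# Display the current value of ip:dohwcksum (expect 0 after the change)
echo 'dohwcksum/X' | mdb -k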
I’ll update this post as Oracle Support provide more information.

OVM for SPARC vs CPU Metrics

A couple of months ago, I created an Oracle database cluster (11.2.0.4.0 RAC) on a Solaris 11 LDOM.

I found that the CPU time statistics in the database were all zero (eg in V$SYS_TIME_MODEL, V$SYSSTAT and AWR reports).
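The quickest way to see the symptom is to query the time model directly; on the affected LDOM these rows came back as zero (standard views, nothing exotic):

-- CPU time as seen by the database; all zero on the affected LDOM
select stat_name, value
from   v$sys_time_model
where  stat_name in ('DB CPU', 'background cpu time');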

I couldn’t find any notes or bugs that matched, but Oracle Support have just created Solaris 11.1.15.4.0, which they say will address the issue (an OS bug).

October 18, 2013

VxFS File Fragmentation

Filed under: VxFS — Ari @ 10:14 pm

This post might interest you if your organisation has Oracle databases running on Veritas filesystems (VxFS).

What Does Veritas File Fragmentation Mean?

The extents allocated to a datafile are not contiguous in the filesystem.  If the extents are too small and scattered, then large or sequential I/Os (eg full table scans) might be inefficient.

Having learned how to avoid UFS fragmentation for Oracle datafiles, I was able to recognise a similar problem with a database on a Veritas volume/filesystem (Solaris 10).  Many of the notes I made about UFS fragmentation apply to Veritas fragmentation, so check that page for more information.

A Real World Example

The customer’s DBA had complained that the performance of a training database was significantly worse than that of a test database; both were RMAN duplicates.  There were very few differences between the two, and they used the same storage (XIV), but the difference had been narrowed down to I/O performance.

Here is an example, using a full scan of a big table, showing the difference between ORATRN and ORATST:

alter session set db_file_multiblock_read_count=128;
select /*+ full(t) parallel(t) nocache */ count(1) from A_BIG_TABLE t;

Output from vxstat -i 1

 OPERATIONS          BLOCKS           AVG TIME(ms)
 TYP NAME              READ     WRITE      READ     WRITE   READ  WRITE
Fri Oct 18 11:09:36 2013
 vol oratrn           2224         5    429840        80  17.46   2.20
Fri Oct 18 11:09:37 2013
 vol oratrn           1890         4    355216        36  20.78   1.32
Fri Oct 18 11:09:38 2013
 vol oratrn           2008         4    313488        34  19.12   1.11
Fri Oct 18 11:09:39 2013
 vol oratrn           2325         7    328352        84  15.96   1.51
Fri Oct 18 11:09:40 2013
 vol oratrn           1960         7    310064        84  19.62   1.33
Fri Oct 18 11:09:41 2013
 vol oratrn           1895         7    320480        88  19.63   2.62

Fri Oct 18 10:48:45 2013
 vol oratst           1157         2   2107904         8  26.73   3.07
Fri Oct 18 10:48:46 2013
 vol oratst           1178         2   2154944         4  27.32   9.16
Fri Oct 18 10:48:47 2013
 vol oratst           1167         3   2172352        48  26.82   1.80
Fri Oct 18 10:48:48 2013
 vol oratst           1279         4   2346320         8  25.14   7.13
Fri Oct 18 10:48:49 2013
 vol oratst           1223         2   2219488        32  25.97   9.91

The throughput is much higher in the ORATST filesystem, and the average I/O size is larger.

Truss of the database session showed pread system calls of 1MB (128x8KB DB blocks) for both databases, and the Veritas settings were the same.  (The XIV distributes space in 1MB chunks, and 1MB is a common maximum for I/Os, so this should be an efficient value).
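The system call sizes are easy to check from the OS side (a sketch; the PID is that of the dedicated server process running the full scan, so it is specific to your session):

# Trace only the pread family of calls for the session's server process;
# the third argument is the request size (1048576 bytes = 1MB)
truss -t pread,pread64 -p <server_process_pid>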

I suspected the files were more fragmented in the ORATRN volume than in the ORATST volume because the I/Os were being broken up into smaller pieces, despite the environments being almost identical.  I wasn’t sure though, because Veritas is an extent-based filesystem, so I wondered whether database files could become fragmented in the same way as files in UFS can.
The customer’s DBA confirmed that the ORATRN database was created with several RMAN channels, while the ORATST database was created with only one.

How can we see the file fragmentation?  I found a standard Veritas command called fsmap.

$: find . | /opt/VRTSvxfs/sbin/fsmap -aH -
                    Volume  Extent Type     File Offset     Extent Size     File
                    oratrn         Data         0 Bytes       264.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data       264.00 KB       128.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data       392.00 KB       128.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data       520.00 KB       128.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data       648.00 KB       384.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data         1.01 MB       128.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data         1.13 MB       304.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data         1.43 MB        80.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data         1.51 MB       128.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data         1.63 MB       384.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratrn         Data         2.01 MB       128.00 KB     ./TWO_GB_DATAFILE.DBF
 ... etc

$: find . | /opt/VRTSvxfs/sbin/fsmap -aH -
                    Volume  Extent Type     File Offset     Extent Size     File
                    oratst         Data         0 Bytes         8.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         8.00 KB        32.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data        40.00 KB        64.00 KB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       104.00 KB         5.31 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         5.41 MB        24.81 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data        30.23 MB        25.88 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data        56.10 MB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data        72.10 MB        19.25 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data        91.35 MB        27.69 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       119.04 MB        23.38 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       142.41 MB        32.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       174.41 MB        17.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       191.41 MB        34.75 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       226.16 MB        42.19 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       268.35 MB        17.94 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       286.29 MB        38.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       324.29 MB        32.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       356.29 MB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       372.29 MB        32.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       404.29 MB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       420.29 MB        18.31 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       438.60 MB        16.25 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       454.85 MB        21.75 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       476.60 MB        17.38 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       493.98 MB        23.75 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       517.73 MB        23.50 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       541.23 MB        20.12 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       561.35 MB        16.25 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       577.60 MB        16.69 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       594.29 MB        16.12 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       610.41 MB        18.12 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       628.54 MB        22.88 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       651.41 MB        34.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       685.41 MB        43.06 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       728.48 MB        45.12 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       773.60 MB        23.88 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       797.48 MB        24.25 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       821.73 MB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       837.73 MB        18.12 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       855.85 MB        22.12 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       877.98 MB        64.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data       941.98 MB        96.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.01 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.03 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.04 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.06 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.08 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.09 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.11 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.12 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.14 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.15 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.17 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.19 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.20 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.22 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.23 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.25 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.26 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.28 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.29 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.31 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.33 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.34 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.36 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.37 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.39 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.40 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.42 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.44 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.45 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.47 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.48 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.50 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.51 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.53 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.54 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.56 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.58 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.59 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.61 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.62 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.64 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.65 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.67 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.69 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.70 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.72 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.73 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.75 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.76 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.78 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.79 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.81 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.83 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.84 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.86 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.87 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.89 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.90 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.92 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.94 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.95 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.97 GB        16.00 MB     ./TWO_GB_DATAFILE.DBF
                    oratst         Data         1.98 GB        10.02 MB     ./TWO_GB_DATAFILE.DBF

After copying the ORATRN files to a new filesystem using rsync, the extent sizes were up to 256MB each, eg:

$: find . | /opt/VRTSvxfs/sbin/fsmap -aH -
                   oranew          Data         0 Bytes        24.00 KB     ./TWO_GB_DATAFILE.DBF
                   oranew          Data        24.00 KB       246.62 MB     ./TWO_GB_DATAFILE.DBF
                   oranew          Data       246.65 MB       256.00 MB     ./TWO_GB_DATAFILE.DBF
                   oranew          Data       502.65 MB       256.00 MB     ./TWO_GB_DATAFILE.DBF
... etc

Also see Symantec’s note on this issue.

Update

During another project, I had the opportunity to examine how RMAN caused this fragmentation.

The database version was 9.2, so RMAN wrote from a 256KB output/disk buffer using four asynchronous threads.  The average VxFS extent size of the restored datafiles was 256KB.

Later versions of Oracle use larger and configurable buffers, so the problem will not be as severe.

We are going to try to use the VxFS tunables “initial_extent_size” and “max_seqio_extent_size” to reduce the fragmentation during the next restore test.
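A sketch of what we intend to try (the mount point and values are illustrative; both tunables are specified in filesystem blocks, and matching entries in /etc/vx/tunefstab would make them persist across remounts):

# Increase the first extent allocation and the cap on extents grown by sequential writes
vxtunefs -o initial_extent_size=8192 /oranew
vxtunefs -o max_seqio_extent_size=32768 /oranew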

June 30, 2013

Row Migration vs Datapump

Symptoms

Datapump takes a long time to export a small table, waiting for db file sequential read events many more times than there are blocks in the table.
The table doesn’t have any LOBs.

This problem has been seen in 10.2, 11.2 and 12.1.  I couldn’t find a bug for this; the closest to an acknowledgement of the problem is MOS note 1086414.1.

Cause

The problematic table has a high percentage of migrated rows and Datapump is using the direct path method to export the table, possibly because of the table’s small size.

A trace will show that direct path reads are used to get the first table blocks, but then each migrated row will be fetched by executing a sequential block read. Blocks will be read multiple times, once for each row migrated to it. Subsequent reads are not satisfied from cache, so each migrated row results in a physical block read!

In the trace file for a block dump, a migrated row piece has no Header flag, just the First and Last flags, eg:

fb: ----FL--

Workarounds

  1. Defragment the table’s rows with a CTAS or an ALTER TABLE ... MOVE (see the sketch after this list).
  2. Use expdp with access_method=external_table.
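A sketch of the first workaround, using the test table from the steps below (the index name is made up; a move leaves any indexes on the table UNUSABLE, so each one needs rebuilding):

-- Rebuild the table so each row is re-inserted as a single, intact row piece
alter table test_tab move;

-- Rebuild any index left UNUSABLE by the move (repeat per index)
alter index test_tab_idx rebuild;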

Test Steps

The problem can be replicated by following these steps:

create table test_tab (col1 number, col2 varchar2(10));

begin
  for i in 1..1000000 loop
    insert into test_tab values (i, null);
  end loop;
  commit;
  update test_tab set col2 = 'xxxxx';
  commit;
end;
/
@?/rdbms/admin/utlchain
analyze table test_tab list chained rows;
select count(1) from chained_rows;

expdp / directory=DATA_PUMP_DIR tables=test_tab dumpfile=slow.dmp

expdp / directory=DATA_PUMP_DIR tables=test_tab access_method=external_table dumpfile=fast.dmp

September 9, 2012

Diagnostic Directory Disaster

Have you thought about where your database takes a dump?

It is common to install the Oracle software on internal disks and place the database files on external storage.  The DIAGNOSTIC_DEST (and Automatic Diagnostic Repository) is stored under $ORACLE_BASE or $ORACLE_HOME by default, so without further consideration this ends up on local disk too.

Often there are only a couple of internal disks, which are mirrored.  This should be adequate for the OS and Oracle binaries, but not for operations with sustained high IOPS.

In some situations, many Oracle processes may write a lot of data to different files in the ADR concurrently, which may overwhelm a single internal disk.

I’ve seen an extreme example on a Solaris server with multiple test databases sharing a single (mirrored) internal disk and UFS filesystems for their ADR, Oracle binaries, swap space and OS files.  When one highly parallel statement started hitting ORA-600 errors, each parallel slave process started writing trace and core files to the ADR.  The server became very sluggish and many OS commands were affected.  A shutdown abort of the affected instance left behind the parallel slave processes, and even kill -9 wouldn’t terminate them.  The internal disks were swamped; the service times for I/Os were reaching 40 seconds while the disk heads thrashed back and forth!  This situation continued for 30 minutes before all the processes finished writing and finally terminated.

Enabling SQL tracing for an instance could also put more pressure on the internal disk(s) than they can cope with.

If the ADR / DIAGNOSTIC_DEST were on SAN storage (multiple spindles, large cache), then the spike in I/O activity would be better accommodated.
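Relocating the ADR is a single-parameter change if you decide to do it (a sketch; the path is made up and must already exist with the right permissions):

-- Move the ADR base for this instance off the internal disks
alter system set diagnostic_dest='/san/oracle/diag_base' scope=both;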

For Oracle clusters there is another reason to avoid internal/local disk for the ADR: it is easier to manage RAC instances with their logs (alert, listener, etc) on shared (and highly available) storage, avoiding the need to log into every server.

The issue could be mitigated by setting a low value for MAX_DUMP_FILE_SIZE, but in production environments you may not want to truncate trace files: keeping them whole means the entire trace file can be sent to Oracle Support without having to replicate the issue (eg internal errors causing an instance crash or data corruption).

Pointing DIAGNOSTIC_DEST at a ZFS filesystem might also mitigate the load on internal disks, because of the largely sequential nature of ZFS writes.

August 2, 2012

Solaris Cluster vs Network Speed

Filed under: Solaris Cluster,UNIX / Linux — Ari @ 1:56 pm

While investigating rsync/ssh performance, I discovered that our Solaris Clusters had just over half the expected throughput on their gigabit NICs.

This was caused by the cluster/streams-related module clhbsndr being automatically pushed onto each public NIC via /etc/iu.ap.

Symptoms:

  • Round trip time via ping was higher than on non-cluster servers on the same network
  • The throughput measured with iperf (and other tools) was around 400–600 Mbit/s instead of ~940 Mbit/s
  • The clhbsndr module is loaded for the public interfaces, eg the output of ifconfig e1000g1 modlist is:

0 arp
1 ip
2 clhbsndr
3 e1000g

Status

We have confirmed that bug 6625886 is the cause and are now waiting for Oracle to say if the module can safely be removed from the public interfaces of a Solaris 10 cluster, eg:

ifconfig e1000g1 modremove clhbsndr@2

Encryption vs Transfer Speed

Filed under: ssh — Ari @ 1:46 pm

Do you want to make scp and rsync go faster?

LAN

One easy way to squeeze a little more speed out of file transfer tools that use ssh is to try a different encryption cipher.

For example, we saved 15 seconds per gigabyte sent across our LAN on our SPARC Solaris servers by using:

rsync --rsh="ssh -c arcfour" ...

or

scp -c arcfour source dest

WAN

If your problem is latency across a WAN, then HPN-SSH might be the solution.

Also, if you want secure authentication but don’t need the data encrypted, then you could use SSH with no encryption.  HPN-SSH allows the use of a ‘none’ cipher.

For example, on a network with 32.5 ms RTT (ping) and Gigabit NICs, scp transfer rates increased from 1.35MB/s to around 100MB/s with HPN-SSH, 4MB buffers (TCP and SSH), and the “NONE CIPHER” enabled.

HPN-SSH Settings

  • HPNDisabled no
  • NoneEnabled yes (optional)

Kernel Settings

All these were set to 4MB from their defaults:

  • tcp_cwnd_max
  • tcp_max_buf
  • tcp_recv_hiwat
  • tcp_xmit_hiwat

(The send and receive buffers could be set per route instead to differentiate between LAN and WAN connections).
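On Solaris 10, for example, these can be set with ndd (a sketch; 4194304 bytes = 4MB, and the settings do not persist across a reboot unless scripted):

# Raise the TCP buffer and congestion window limits to 4MB
ndd -set /dev/tcp tcp_max_buf 4194304
ndd -set /dev/tcp tcp_cwnd_max 4194304
ndd -set /dev/tcp tcp_xmit_hiwat 4194304
ndd -set /dev/tcp tcp_recv_hiwat 4194304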

Example syntax (omit -oTcpRcvBufPoll=no on Linux and Windows):

/opt/openssh-5.9p1-hpn/bin/scp -oTcpRcvBufPoll=no -oNoneSwitch=yes -oNoneEnabled=yes sourcefile remoteserver:/destpath/destfile

If the NONE cipher is not appropriate, then HPN-SSH’s multi-threaded AES cipher might be the fastest option if your server has multiple cores.  Alternatively, I found that arcfour used less CPU, so it was better for several concurrent rsync and scp transfers because it offered greater throughput on busy processors.
