Beware When Installing a New Oracle Home on Windows

When running Oracle databases on UNIX or Linux, it is easy to install a new Oracle Home without an outage.  The work can be done during working hours, long before any change window or outage affecting existing databases and applications.  The switch to the new Oracle Home is also straightforward.

However, on Windows, there are a number of considerations for a cautious DBA, and a longer outage for upgrades may be unavoidable.

I’ve written some brief notes, based on my experience patching 11.2.0.1 to 11.2.0.4.21 on Windows Server 2008: Patching/Upgrading Oracle On Windows

 

OERR Finally Available on Windows

I have relied on oerr to quickly find the meaning of error messages (ORA-, TNS-, RMAN-) on UNIX and Linux for decades.

How annoying that Oracle on Windows doesn’t provide this functionality!  Sure, I can use the error documentation, or SQLCL (oerr is built in), but it wastes time.

I downloaded Oracle Database 12c Release 1 (12.1.0.2.0) for Microsoft Windows (x64), and found oerr now exists, but the message files don’t!

There is an easy solution: OERR on Windows 12.1

SQL*Plus on Windows

Younger DBAs, or those that started on Windows, may rely on GUI tools like Enterprise Manager and SQL Developer.  However, this old DBA is entrenched in the command line world.  GUI tools have their uses, but using a suite of scripts designed for SQL*Plus is the most flexible and portable practice.  Independence from tools that are not universally installed is especially useful for contractors, consultants or anyone that wants to change organisations during his/her career.

The Windows GUI version of SQL*Plus disappeared with 11g, but we still have the command line (DOS) version available.  Until SQLCL is ready for me (and me for SQLCL), I’ll be using SQL*Plus.  See the first of my Oracle on Windows tips: SQL*Plus on Windows 7

The Not-So-Dynamic SGA

Beware of DISM. It bites.

Pre-12c databases on Solaris use DISM to allow the SGA to be resized without restarting the database.  That is the theory at least, but is it really safe to do so?

Increasing the SGA dynamically can introduce a severe performance issue that can slow down the database so much that it eventually crashes.  That could be a nasty surprise for the DBA that attempts to increase the buffer cache or shared pool on a busy production database that is subjected to a heavier workload than usual.

Recommendation

Don’t use DISM!   Don’t touch that SGA_MAX_SIZE parameter.

Use ISM instead.  The instance will have a greater chance to obtain large pages and the whole SGA will be locked in RAM.
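On Solaris, the instance falls back to ISM when the SGA is not dynamically resizable, i.e. when SGA_MAX_SIZE is unset or equal to the SGA size.  A minimal pfile sketch (the sizes are placeholders, not recommendations):

```
# ISM sketch: no headroom for online SGA growth, so Oracle uses ISM
sga_target   = 4G
sga_max_size = 4G    # equal to sga_target, or simply leave it unset
# leave memory_target / memory_max_target unset (no AMM)
```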

Even if you use DISM but never increase the SGA, you will probably end up with lots of small pages (bad for TLB/TSB misses).  What is the point if you can only safely increase the SGA size by restarting the instance?

The same goes for MEMORY_MAX_TARGET – but AMM is for ASM, trivial databases or “Oracle for Dummies” anyway.

The Details

If you want to know more, read on.

Environment

  • Solaris 10 10/08 s10s_u6wos_07b Zone running inside a Solaris 11.2 LDOM
  • RDBMS 11.2.0.2.11
  • SGA_MAX_SIZE parameter set
  • Veritas Filesystem with Direct I/O enabled

Background

A DR zone running standby databases shared the same LDOM as the test zone.  In order to make the most of the RAM, the standby databases had their SGAs reduced while they were not activated.  The plan was to dynamically increase them just before or after a switchover/failover.  The SGA_MAX_SIZE was set to match the primary databases, but the individual components were lower to make more RAM available for use by the test databases and application processes.

Symptoms

Soon after increasing the shared pool and buffer cache sizes on the standby databases, the database processes started to use a lot of CPU.  The OS showed high system time, mutex contention and context switches.  The output from commands like ps -ef | grep pmon would pause for 30 seconds when it reached the affected database processes.

Cause

When the shared segment using DISM is created, swap space is reserved for the whole segment.  The granules in use are allocated and their underlying pages locked in RAM, while the free granules remain mostly unallocated.

col component format a32
select min(BASEADDR), max(BASEADDR), count(1) Granules,
       sum(a.gransize)/1048576 MB, a.GRANFLAGS, component, a.GRANSTATE
from   x$ksmge a, x$kmgsct b
where  a.grantype = b.grantype (+)
group  by a.GRANFLAGS, component, a.GRANSTATE
order  by 1, 2
/
MIN(BASEADDR)    MAX(BASEADDR)      GRANULES         MB  GRANFLAGS COMPONENT                        GRANSTATE
---------------- ---------------- ---------- ---------- ---------- -------------------------------- ----------------
0000000400000000 0000000454000000         85       1360          0                                  FREE
0000000455000000 000000045C000000          8        128          4 DEFAULT buffer cache             ALLOC_LOCKED
000000045D000000 000000045D000000          1         16          4 java pool                        ALLOC_LOCKED
000000045E000000 000000045E000000          1         16          4 large pool                       ALLOC_LOCKED
000000045F000000 000000047E000000         32        512          4 shared pool                      ALLOC_LOCKED

pmap -xs on the ora_dism process, run from the LDOM, shows:

         Address     Kbytes        RSS       Anon     Locked Pgsz Mode   Mapped File
0000000380000000      12288      12288          -       8192   4M rwxs-    [ dism shmid=0x7d000027 ]
0000000380C00000       4096       4096          -       4096   8K rwxs-    [ dism shmid=0x7d000027 ]
0000000400000000    1310720          -          -          -    - rwxs-    [ dism shmid=0x52000028 ]
0000000450000000     524288     524288          -     442368   8K rwxs-    [ dism shmid=0x52000028 ]
0000000470000000     151552     151552          -     151552   4M rwxs-    [ dism shmid=0x52000028 ]
0000000479400000      94208      94208          -      94208   8K rwxs-    [ dism shmid=0x52000028 ]
0000000480000000       4096       4096          -          -   4M rwxs-    [ dism shmid=0x52000029 ]

Now increase the buffer cache by 128MB:
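The resize itself is an ordinary online change, along these lines (the exact value is illustrative):

```sql
-- illustrative: raise db_cache_size above its current value
alter system set db_cache_size = 384M scope=memory;
```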

         Address     Kbytes        RSS       Anon     Locked Pgsz Mode   Mapped File
0000000380000000      12288      12288          -       8192   4M rwxs-    [ dism shmid=0x7d000027 ]
0000000380C00000       4096       4096          -       4096   8K rwxs-    [ dism shmid=0x7d000027 ]
0000000400000000     262144     262144          -          -   8K rwxs-    [ dism shmid=0x52000028 ]
0000000410000000    1048576          -          -          -    - rwxs-    [ dism shmid=0x52000028 ]
0000000450000000     524288     524288          -     442368   8K rwxs-    [ dism shmid=0x52000028 ]
0000000470000000     151552     151552          -     151552   4M rwxs-    [ dism shmid=0x52000028 ]
0000000479400000      94208      94208          -      94208   8K rwxs-    [ dism shmid=0x52000028 ]
0000000480000000       4096       4096          -          -   4M rwxs-    [ dism shmid=0x52000029 ]

Notice that the newly mapped 256MB is resident but is using unlocked 8KB pages.

MIN(BASEADDR)    MAX(BASEADDR)      GRANULES         MB  GRANFLAGS COMPONENT                        GRANSTATE
---------------- ---------------- ---------- ---------- ---------- -------------------------------- ----------------
0000000400000000 0000000407000000          8        128          4 DEFAULT buffer cache             ALLOC
0000000408000000 0000000454000000         77       1232          0                                  FREE
0000000455000000 000000045C000000          8        128          4 DEFAULT buffer cache             ALLOC_LOCKED
000000045D000000 000000045D000000          1         16          4 java pool                        ALLOC_LOCKED
000000045E000000 000000045E000000          1         16          4 large pool                       ALLOC_LOCKED
000000045F000000 000000047E000000         32        512          4 shared pool                      ALLOC_LOCKED

Look for the granule state of “ALLOC” as opposed to “ALLOC_LOCKED”.  (When using ISM instead of DISM, “ALLOC” is okay – the granules are still locked in RAM).
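A quick way to spot the condition, using the same fixed table as above (requires a SYS connection; a sketch, not a supported diagnostic):

```sql
-- granules in state ALLOC under DISM are candidates for the unlocked-page problem
select granstate, count(*) granules
from   x$ksmge
group  by granstate;
```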

In our case it wasn’t a problem with resource limits preventing the pages from being locked.  Truss showed that the ora_dism process didn’t even try to lock the granules (no memcntl system calls).  The output from prctl showed no resource limitation was reached for the project or zone.  Also, all the memory could be locked by restarting the database, or by growing and shrinking the SGA components repeatedly without a restart.

Lockstat showed:

lockstat -A -D 10 -n 100000 -ckWw sleep 1
Adaptive mutex spin: 94006 events in 2.089 seconds (45004 events/sec)

Count indv cuml rcnt     nsec Hottest Lock           Hottest Caller
-------------------------------------------------------------------------------
78401  83%  83% 0.00   591979 0x10012cf59f00         anon_array_enter
 4226   4%  88% 0.00    23909 volgl_spin[16]         voliod_iohandle
 3658   4%  92% 0.00   188558 cpu_info_template_lock[8] read_kstat_data
 2200   2%  94% 0.00    34158 0x4000083e880          volpage_getlist_internal
 1554   2%  96% 0.00   162917 0x10017fff0d00         vdc_send_request
  966   1%  97% 0.00    11957 0x1001bdc31b08         i_ldc_invoke_cb
  868   1%  98% 0.00    56698 cpu_lock[8]            p_online_internal
  207   0%  98% 0.00     9404 0x300e9475100          sfmmu_hblk_alloc
  144   0%  98% 0.00   323738 0x10018fe95080         vdc_handle_cb
  139   0%  98% 0.00    26508 vx_worklist_lk[16]     vx_worklist_process
-------------------------------------------------------------------------------

Adaptive mutex block: 986 events in 2.089 seconds (472 events/sec)

Count indv cuml rcnt     nsec Hottest Lock           Hottest Caller
-------------------------------------------------------------------------------
  748  76%  76% 0.00   316080 0x10012cf59f00         anon_array_enter
  113  11%  87% 0.00 12666391427 cpu_lock[8]            getloadavg
   27   3%  90% 0.00  2371473 cpu_info_template_lock[8] read_kstat_data
   25   3%  93% 0.00   455025 0x10018ffe1b00         vdc_send_request
   23   2%  95% 0.00   318199 0x4000083e880          volpage_freelist
   22   2%  97% 0.00   431246 volgl_spin[16]         voliod_iohandle
    6   1%  98% 0.00 17120585 0x10012cef8140         anon_map_privatepages
    3   0%  98% 0.00   194315 0x1001da4a9df8         vol_nm_debug_log_entry
    3   0%  98% 0.00   305389 0x10016fff0c40         vdc_handle_cb
    2   0%  99% 0.00    90667 0x10025fd7b0d0         vx_rwsleep_rec_lock
-------------------------------------------------------------------------------

Affected database sessions showed stacks similar to this:

echo "::threadlist -v"|mdb -k >threadlist.out
    default_mutex_lock_delay+0x6c()
    mutex_vector_enter+0x17c()
    anon_array_enter+0xbc()
    spt_anon_getpages_task+0x328()
    vmtask_task+0x7c()
    vmtask_run_xjob+0x1bc()
    spt_anon_getpages+0x1f0()
    segspt_dismfault+0x2f4()
    as_lock_mwlist+0x1bc()
    mvec_activate+0x178()
    as_pagelock+0x48()
    vx_bp_lock_pages+0x48()
    vx_dio_physio+0x2e8()
    vx_dio_rdwri+0x668()
    vx_dio_read+0x220()
    vx_read1+0x6f0()
    vx_read+0x404()
    fop_read+0x84()
    lo_read+0x58()
    fop_read+0x84()
    pread+0x224()
    syscall_trap+0xc8()

Increasing the SGA_TARGET value dynamically has the same effect.

Resolution

The quickest fix is to shut down the instance and restart it with the same memory parameters.

Conjecture

The Oracle developers may have purposely avoided locking newly allocated 8k pages immediately to help Solaris coalesce the fragmented memory to create large pages.  However, this coalescing behaviour caused performance issues of its own in Solaris 10.  (Search for set mpss_coalesce_disable=1 and set pg_contig_disable=1 ).

Oracle 12c uses OSM instead of DISM, which is a big clue as to how we should configure older versions.

References

ISM or DISM Misconfiguration can Slow Down Oracle Database Performance (Doc ID 1472108.1)

SuperCluster – OSM ( Optimized Shared Memory ) (12c) is Supported on SuperCluster; DISM ( Dynamic Intimate Shared Memory )(11g) is not (Doc ID 1468297.1)

Oracle Scratchpad post about OSM and 8KB pages

Note that BUG:15794950 is the most common DISM problem referenced in MOS notes, but it isn’t relevant to this OS version.  Also, this case involves performance issues with 8k pages, not large pages.

Dealing with Millions of RMAN Backupsets is Painful

Overview

RMAN backups, restores and maintenance activities were taking far too long for a data warehouse on Oracle.
The throughput from disk and to the tape library were not the only factors; meta-data processing, latency communicating with the media manager and the RMAN configuration were also significant.

Changing a few RMAN configuration parameters and creating new indexes in the RMAN repository database made a dramatic improvement.

The environment:

  • Oracle 10gR2 Database with 7,500 datafiles.
  • Daily level 1 backups and weekly level 0 backups.
  • Backed up to a virtual tape library (VTL) with de-duplication and compression enabled.
  • VTL vendor specified FILESPERSET=1 to allow the VTL’s de-duplication technology to save space.
  • Using an 11.2 RMAN catalogue database that also records backups for hundreds of other databases.
  • 365 day backup retention.
  • Datafiles stored on NFS (no OS caching).

Issue – Too Many Backup Sets

After a year of backing up thousands of datafiles and archive logs each day, there were 3,000,000 backup sets in the RMAN repository for this database.

  1. Many meta-data operations were slow, so that backups would seem to stall for an hour before beginning to send data to the VTL.
  2. Backups of other databases using the same RMAN catalogue were also affected, although not to the same degree.
  3. Deleting a week’s cycle of obsolete RMAN backups for the data warehouse took more than four days.
  4. Before each datafile was backed up, the control file would be scanned.  The control file had grown to 500MB because of all the backup objects, and the NFS file system didn’t provide OS caching, so 3.75 terabytes of control file data was read from the storage during each backup just to process the meta-data.
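That last figure is simple arithmetic: each of the 7,500 datafile backups re-reads the 500MB control file.

```shell
# 7,500 datafiles, each preceded by a full scan of the 500MB control file
echo "$((7500 * 500)) MB"    # 3,750,000 MB, i.e. roughly 3.75 terabytes
```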

Solutions

The proliferation of small tablespaces and datafiles was an obvious design flaw, but not one that we could address in the medium term because they are created automatically by the third party data warehouse application.

Did I really need to set FILESPERSET=1?  The VTL de-duplication technology breaks each backup piece into chunks (256KB?) and replaces the repeated chunks with pointers, so the vendor recommends not multiplexing, compressing or encrypting the RMAN backups.  However, setting MAXOPENFILES=1 also disables multiplexing, while still allowing a backup set to contain multiple datafiles.  The datafiles are now concatenated into the backup pieces instead of being interleaved.  I don’t expect any alignment issues because the RMAN output buffer is 256KB in this case.

The VTL vendor also requested that backup pieces be kept below 100GB to reduce locking and capacity management issues.  As a DBA, I want to limit the time it takes to restore part of the database (e.g. a datafile or tablespace).  At first I thought I could use MAXPIECESIZE to achieve these aims without compromising on the number of backup sets, (i.e. have multiple backup pieces per backup set).  However, I was misled by some Oracle MOS notes that confused “backup pieces” with “backup sets”.   Datafiles are mapped to backup sets, not to each backup piece, even when multiplexing is not involved.  The worst case for restoring a single datafile would be to read through all the backup pieces in a set only to find the datafile at the end of the last piece.  Instead, I used MAXSETSIZE and set it to be slightly larger than the greatest datafile size.
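The resulting channel configuration looked something like this (a sketch; the device type and size are illustrative, with MAXSETSIZE chosen to sit just above the largest datafile):

```
# RMAN configuration sketch
CONFIGURE CHANNEL DEVICE TYPE sbt MAXOPENFILES 1;
CONFIGURE MAXSETSIZE TO 40G;
```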

Now that there were a reasonable number of backup sets being created each day, we just had to wait for the old backups to age out of the repository.  What could I do to speed up RMAN catalogue access in the meantime?  Separating this database’s backup repository into a different schema didn’t help because some of the execution plans were shared across schemata.  The same goes for creating SQL Profiles – one plan didn’t always suit every DB’s RMAN repository data.

I did create some indexes, which may be of use to others with large RMAN repositories:

create index BCF_I98 on BCF(DBINC_KEY, BS_KEY) tablespace RMAN_CATALOG compress 1 ;
create index BCF_I99 on BCF(DBINC_KEY, CKP_SCN) tablespace RMAN_CATALOG compress 1 ;


create index BDF_I98  on BDF(DBINC_KEY, FILE#, BS_KEY) tablespace RMAN_CATALOG compress 2 parallel 4;
alter  index BDF_I98 noparallel;

create index BDF_I99  on BDF(DBINC_KEY, CKP_SCN) tablespace RMAN_CATALOG compress 1 parallel 4;
alter  index BDF_I99 noparallel;


create index BP_I98 on BP(HANDLE) tablespace RMAN_CATALOG parallel 4;
alter  index BP_I98 noparallel;

create index BP_I99 on BP(DB_KEY, STATUS, DEVICE_TYPE, COPY#, PIECE#, SITE_KEY, TAG, HANDLE, BS_KEY) tablespace RMAN_CATALOG compress 7 parallel 4;
alter  index BP_I99 noparallel;



create index BRL_I99 on BRL(DBINC_KEY, LOW_SCN) tablespace RMAN_CATALOG compress 1 parallel 4;
alter  index BRL_I99 noparallel;


create index BS_I98 on BS(BCK_TYPE, STATUS, PIECES, KEEP_OPTIONS, BS_KEY, COMPLETION_TIME) tablespace RMAN_CATALOG compress 4 parallel 4;
alter  index BS_I98 noparallel;

create index BS_I99 on BS(DB_KEY,STATUS,INPUT_FILE_SCAN_ONLY) tablespace RMAN_CATALOG compress parallel 4;
alter  index BS_I99 noparallel;


create index BSF_I99 on BSF(DB_KEY, BS_KEY) tablespace RMAN_CATALOG compress 1;

create index RLH_I99 on RLH(DBINC_KEY,LOW_SCN) tablespace RMAN_CATALOG compress 1 parallel 4;
alter  index RLH_I99 noparallel;


create index RSR_I99 on RSR(RSR_END,DBINC_KEY) tablespace RMAN_CATALOG;

create index DBINC_I99 on DBINC(DB_KEY,DBINC_KEY) tablespace RMAN_CATALOG;

Update

The team looking after the VTL didn’t notice any reduction in the effectiveness of the de-duplication, so this approach was successful.  Six months later, the database was rebuilt with 1/10th the number of datafiles and tablespaces, resolving this issue and a few others.

Undone by Undo

The Puzzle

After switching to a new undo tablespace, transactions would fail with ORA-01552.

The Background

During a data warehouse migration using GoldenGate, the target database returned an ORA-1628 error when applying transactions from the source.
The assigned DBA set a couple of initialisation parameters as prescribed by
Troubleshooting ORA-1628 – max # extents (32765) reached for rollback segment <SEGMENT_NAME> (Doc ID 1580182.1)

alter system set "_smu_debug_mode" = 4 scope=both sid='*';
alter system set "_rollback_segment_count"=999999 scope=both sid='*';

The second parameter caused this warning:

ORA-30018: Create Rollback Segment failed, USN 32760 is out of range

During the migration, the undo tablespace grew to a ridiculous size.  In an attempt to reclaim terabytes of space, the DBA tried to replace the undo tablespace with a smaller one.  However, after creating a new tablespace and switching to it, transactions failed with:

ORA-01552: cannot use system rollback segment for non-system tablespace...

So, the original and huge undo tablespace was retained for the ‘go-live’ while Oracle Support tried to work out what caused the ORA-01552.

After going live on the new database, ETL performance was disappointing due to the overhead of DDL for management of the undo segments and the domination of the buffer cache by undo blocks.

At first Oracle Support claimed that creating the maximum number of undo segments was beneficial.  However, when pushed, they could not explain exactly how having 32,760 undo segments helped in a database with a low number of concurrent transactions.  How would it prevent hitting the maximum number of extents (ORA-1628)?  What was the cost?

After removing the undo parameters prescribed by 1580182.1, there was no improvement to performance and the ORA-01552 still prevented a switch to a smaller undo tablespace.
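Backing the hidden parameters out was straightforward (a sketch; the resets take effect at the next restart):

```sql
alter system reset "_smu_debug_mode" scope=spfile sid='*';
alter system reset "_rollback_segment_count" scope=spfile sid='*';
```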

The Solution

I started investigating and found that, when a new undo tablespace was created, no new rows were inserted into UNDO$ because the limit on the number of undo segments had already been reached:

select ts#,name from ts$ where name like 'UNDO%';

TS# NAME
--- --------
  2 UNDOTBS1
  4 UNDOTBS2
145 UNDOTBS3
break on report
compute sum of segments on report
select max(US#),count(1) segments,ts# from sys.undo$ group by ts#;

  MAX(US#)   SEGMENTS TS#
---------- ---------- ---
     32758      20753   2
     32759      12006   4
         0          1   0
           ----------
sum             32760

The UNDOTBS3 tablespace had no undo segments in it because the database couldn’t create any more!

Once the UNDOTBS1 tablespace was dropped, new undo segments could be created in other undo tablespaces, and the ORA-01552 was avoided.
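The clean-up then reduced to the usual undo tablespace switch, roughly as follows (a sketch using the tablespace names from this case):

```sql
alter system set undo_tablespace = 'UNDOTBS3' scope=both sid='*';
-- once no active transactions reference the old undo segments:
drop tablespace UNDOTBS1 including contents and datafiles;
```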

I suggested that Oracle Support change the careless recommendations in Doc 1580182.1.  At the time of writing, Doc 1580182.1 now says:

alter system set "_rollback_segment_count"=1000 scope=spfile;