When you back up your Oracle database using RMAN to create compressed backup sets, it can take a long time unless you use multiple channels to take advantage of all of the server’s CPU power.

If the datafiles are vastly different in size, then the backup completion could be delayed while one channel finishes compressing one of the largest files.

A solution in 11g is to use RMAN’s section size option so that each large datafile can be compressed by multiple cores. This feature was probably created to allow bigfile tablespaces in ASM to be backed up quickly, but it is also useful for speeding up backups on UFS.
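To make the idea concrete, here is a minimal sketch of how a datafile might be carved into sections so that several channels can work on it at once. It is only an illustration of the arithmetic: `plan_sections` is a hypothetical helper, not anything RMAN exposes, and RMAN's own rules for choosing section boundaries may differ.

```python
def plan_sections(file_bytes, section_bytes):
    """Return (offset, length) pairs that cover a file in fixed-size sections.

    Hypothetical helper for illustration; RMAN's real logic may differ.
    """
    if file_bytes < 0 or section_bytes <= 0:
        raise ValueError("file size must be >= 0 and section size must be > 0")
    sections = []
    offset = 0
    while offset < file_bytes:
        # The last section is shorter when the file is not an exact multiple.
        length = min(section_bytes, file_bytes - offset)
        sections.append((offset, length))
        offset += length
    return sections


GIB = 1024 ** 3
# A 10 GiB datafile with a 4 GiB section size gives three pieces of work.
for offset, length in plan_sections(10 * GIB, 4 * GIB):
    print(f"offset={offset // GIB} GiB, length={length // GIB} GiB")
```

Each pair is an independent unit of work, which is why one very large file no longer holds up a single channel.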

Or is it?

When restoring from an RMAN backup that was created with the section size option, the Oracle server process opens each datafile with the O_DSYNC flag! (Each data write must reach the disk before the write call returns to the server process.)

This disables direct I/O (because RMAN is filling holes in a sparse file when restoring a datafile) and so a single writer lock is enforced. Blocks are processed one at a time through the OS cache (a longer code path), the writes are smaller and more numerous, and channels working on different sections of the same file have to wait for each other to obtain the file lock.
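The write pattern described above can be imitated with a short Python sketch. This is not Oracle's code, only a stand-in under stated assumptions: a Unix-like system where `os.pwrite` is available, an 8 KiB block size chosen for illustration, and hypothetical names (`restore_section_dsync`, `demo`, `datafile.dbf`). It writes a later section first, which leaves a hole at the start of the file, and every small write is synchronous because the file is opened with O_DSYNC.

```python
import os
import tempfile


def restore_section_dsync(path, offset, data, block_size=8192):
    """Write one section at `offset`, block by block, through an O_DSYNC descriptor."""
    # O_DSYNC: each write returns only after the data is on stable storage.
    # getattr() keeps the sketch importable where the flag is not defined.
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        written = 0
        for start in range(0, len(data), block_size):
            chunk = data[start:start + block_size]
            written += os.pwrite(fd, chunk, offset + start)
        return written
    finally:
        os.close(fd)


def demo():
    section = 64 * 1024
    first, second = b"A" * section, b"B" * section
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "datafile.dbf")
        # Restoring section 2 first leaves a hole where section 1 belongs.
        restore_section_dsync(path, section, second)
        size_after_second = os.path.getsize(path)
        with open(path, "rb") as f:
            hole_is_zeros = f.read(section) == bytes(section)
        # Restoring section 1 afterwards fills the hole.
        restore_section_dsync(path, 0, first)
        with open(path, "rb") as f:
            complete = f.read() == first + second
    return size_after_second, hole_is_zeros, complete
```

Timing this against a plain buffered write on your own filesystem is the easiest way to see how much the per-write synchronisation costs; the sketch makes no claim about the size of the effect.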

For the system I was working on, this increased the restore time by a factor of eight or nine.

I have created a service request and a bug (13566617) with Oracle, who have confirmed this is an issue on Solaris. The SR notes say it can’t be replicated on Linux because the datafile is opened with the O_SYNC flag instead. (I’ve asked them to review that statement because only the flag name is different. I am not in a position to test performance on Linux, but it looks like the same situation.)

If Oracle changed their code to open the datafile being restored without O_DSYNC, then they would need to call fdsync at the end of writing each section. Perhaps they didn’t want to do this when multiple processes (channels) could be restoring different sections at the same time. With direct I/O and without O_DSYNC, single writer locks would still be used unless the datafile blocks were preallocated before restoration started. (That is not a better solution in some environments, I imagine, so I doubt Oracle will want to write datafiles twice during restoration.)

Section size is most useful for compressed backups, so not all of the benefits of parallelism would be lost if the single writer lock remained, but at least we could get the benefit of large, direct I/Os. Even environments without direct I/O would benefit from larger write sizes (maxcontig) if the O_DSYNC flag were omitted.
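The alternative suggested here, large writes without O_DSYNC followed by one sync per section, might look roughly like the following sketch. It is an assumption-laden illustration rather than a proposal for Oracle's implementation: it uses `os.fdatasync` (falling back to `os.fsync`) as a stand-in for the sync call, a 1 MiB write size picked arbitrarily, ordinary buffered I/O rather than direct I/O, and hypothetical names throughout.

```python
import os
import tempfile


def pwrite_all(fd, data, offset):
    """Write all of `data` at `offset`, retrying after any short write."""
    view = memoryview(data)
    while len(view) > 0:
        n = os.pwrite(fd, view, offset)
        view = view[n:]
        offset += n


def restore_section_buffered(path, offset, data, write_size=1024 * 1024):
    """Write one section in large chunks, then sync once at the end."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)  # no O_DSYNC
    chunks = 0
    try:
        for start in range(0, len(data), write_size):
            pwrite_all(fd, data[start:start + write_size], offset + start)
            chunks += 1
        # One sync per section instead of one per write.
        sync = getattr(os, "fdatasync", os.fsync)
        sync(fd)
    finally:
        os.close(fd)
    return chunks


def demo():
    data = b"C" * (2 * 1024 * 1024)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "datafile.dbf")
        chunks = restore_section_buffered(path, 0, data)
        with open(path, "rb") as f:
            intact = f.read() == data
    return chunks, intact
```

The durability guarantee at the end of the section is the same as with O_DSYNC; what changes is that the operating system is free to combine and schedule the writes in between.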
