This post is about your options for monitoring your Oracle databases running on Windows with System Center Operations Manager (SCOM). I’ll start with why you might want to rule out Oracle Enterprise Manager (OEM).
Why not OEM?
OEM is becoming more common, and it is appropriate for organisations with large Oracle estates, and possibly for smaller ones that don’t require high availability or disaster recovery. However, some times is just isn’t a good fit. These are the reasons I chose not to use OEM for monitoring a particular organisation’s Oracle databases:
- SCOM was already the main monitoring and alerting tool. It could create tickets in the service desk system, send email notifications and phone on-call staff.
- OEM can plug into SCOM, but the Oracle connector costs extra, and would add complexity.
- To make OEM highly available (RAC) and recoverable in a disaster (Data Guard), two or three extra servers would be required. In most cases, this configuration requires extra Oracle licences.
- OEM is a bloated solution at the best of times, with a lot of maintenance, bugs and quirks that result in unnecessary call outs for DBAs. In this case the OEM installation would have been more complex (eg RAC) than the Oracle databases it would monitor. The DBA team (new to Oracle) would need to learn a lot just to support and use OEM.
SCOM Management Packs for Oracle
At the time of writing, there were three commercially provided management packs (MPs) available on the market. (I saw someone had also written their own MP, but it wasn’t freely available and seemed to be too much effort to create).
My colleague and I evaluated all three MPs, and were very disappointed. None of them were fit for purpose.
- Two MPs failed to detect the database instance not open.
- Two MPs failed to detect errors in the alert log. Both of them may have worked eventually if we’d spent more time with the support staff.
I lost confidence in the first MP when told that the alert log had to be under 2MB until the relevant bug could be fixed (several weeks!?).
The other MP required every ORA-nnnnn message had to be listed in a file next to a custom description, rather than using regular expressions!?
- One MP was missing key sections / instructions in the documentation. (Even important features were lacking from the marketing material).
- One MP created too many sessions, so if the server was busy, the sessions could accumulate.
- One MP ran extremely inefficient SQL, eg measuring segment size by adding up the size of the extents. (Imagine doing that on a data warehouse).
- One MP offered templates for custom monitors and rules, which seemed promising, but they were too inflexible and we couldn’t get them to work as intended.
- Support was slow in our case, because of the time zone difference.
A Simple Solution
SCOM comes with the standard functionality to monitor Windows event logs. If we can write Oracle database related notifications to a custom event log, then the SCOM integration is simple.
It didn’t take long to put together a simple but flexible Oracle monitoring solution using standard components:
- Task Scheduler – runs a Powershell Script
- Powershell script – runs SQL
- Powershell script – processes the output
- Powershell script – writes to the event log (and optionally sends emails)
- SCOM – raises alerts based on the event log records
- The connection to SCOM is simple, fast and reliable (based on standard functionality)
- The Oracle monitoring can have new metrics added and bugs fixed quickly
- The Oracle monitoring can be calibrated to each instance to reduce unnecessary alerts
- The Powershell code required is not complex. It is basically parsing text files (from SQL*Plus output), so a DBA with UNIX shell scripting experience can put it together easily
- It’s free!