Happy Alarms

Configuring the alarms from a health monitoring system can be challenging. The idea is to create alarms that will get the operator’s attention, won’t get ignored, are sent to the appropriate parties, and are clear and unambiguous.

To make this even more complex, numerous systems use email to send out alarms in a distributed manner. There may be a portable backup storage box with configured alarms in a non-standard format that still needs to be handled by someone monitoring the network. To make matters worse, often such devices have email alarms configured to an individual’s email address. This can cause problems when there is turnover.

While working on this, I explored many ways of indicating the status of a system.

Borrowing from the Six Sigma tool set, one scheme involved using a 1-3-9 scale for ranking the severity of an item. A 1-3-9 scale forces the ranking of severity into meaningful categories, whereas a 1-10 scale or similar leaves room for ambiguity.

Many systems use the existing syslog “standards” for ranking the severity of messages. This had to be incorporated.

For example:

AUTH,EMERGENCY
GENERAL,CRITICAL
AUTH,INFO

It made sense to develop a scheme that would incorporate the syslog “standards” and the 1-3-9 scale, and that would give unambiguous information about the severity of an alarm even to someone who had never seen one before.
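
As a rough illustration of how the two rankings could be combined, here is a minimal Python sketch. The post does not say which syslog severity lands in which bucket, so the `RANK_BY_SEVERITY` table and the `rank_alarm` helper are assumptions made up for this example, not the scheme that was actually deployed.

```python
from typing import Tuple

# Hypothetical bucketing of syslog severity keywords into the 1-3-9 scale.
# The post does not give the real mapping; adjust to taste.
RANK_BY_SEVERITY = {
    "EMERGENCY": 9, "ALERT": 9, "CRITICAL": 9,
    "ERROR": 3, "WARNING": 3,
    "NOTICE": 1, "INFO": 1, "DEBUG": 1,
}


def rank_alarm(tag: str) -> Tuple[str, str, int]:
    """Split a 'FACILITY,SEVERITY' tag and attach its 1-3-9 rank."""
    facility, severity = (part.strip().upper() for part in tag.split(","))
    if severity not in RANK_BY_SEVERITY:
        raise ValueError(f"unknown syslog severity: {severity!r}")
    return facility, severity, RANK_BY_SEVERITY[severity]


for tag in ("AUTH,EMERGENCY", "GENERAL,CRITICAL", "AUTH,INFO"):
    print(rank_alarm(tag))
```

With only three possible ranks, every alarm ends up as clearly low, medium, or high, which is the point of avoiding a 1-10 scale.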

A number of distribution lists were created based on target groups.

The following are some examples:

ALL_ALL_CRITICAL
ALL_MGMT_CRITICAL
DEVELOPERS_MGMT_CRITICAL
DEVELOPERS_MGMT_INFO
OPERATIONS_STAFF_INFO
SECURITY_STAFF_EMERG
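
The list names above appear to follow a three-part pattern. The sketch below assumes that pattern is team, audience, severity, and that `ALL` acts as a wildcard. Both are my reading of the examples rather than something the post states, and `parse_list_name` and `lists_for` are hypothetical helpers.

```python
from typing import List, NamedTuple


class DistributionList(NamedTuple):
    team: str      # e.g. DEVELOPERS, OPERATIONS, SECURITY, or ALL
    audience: str  # e.g. MGMT, STAFF, or ALL
    severity: str  # e.g. CRITICAL, INFO, EMERG


def parse_list_name(name: str) -> DistributionList:
    """Split a TEAM_AUDIENCE_SEVERITY list name into its three fields."""
    parts = name.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected TEAM_AUDIENCE_SEVERITY, got {name!r}")
    return DistributionList(*parts)


def lists_for(team: str, audience: str, severity: str,
              known: List[str]) -> List[str]:
    """Return the known lists that should receive an alarm.

    ALL is treated as a wildcard for team and audience (an assumption).
    """
    matches = []
    for name in known:
        entry = parse_list_name(name)
        if (entry.team in ("ALL", team)
                and entry.audience in ("ALL", audience)
                and entry.severity == severity):
            matches.append(name)
    return matches


KNOWN_LISTS = [
    "ALL_ALL_CRITICAL", "ALL_MGMT_CRITICAL", "DEVELOPERS_MGMT_CRITICAL",
    "DEVELOPERS_MGMT_INFO", "OPERATIONS_STAFF_INFO", "SECURITY_STAFF_EMERG",
]
print(lists_for("DEVELOPERS", "MGMT", "CRITICAL", KNOWN_LISTS))
```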

The last thing was designing the actual messages. It was decided that it would be important to specify fields in emails in the event that automated processing / parsing systems would have some role in reviewing messages from distributed systems in the future.

Here is a sample message:

“PROBLEM: sw3.local.X.com
Interface(10125) inside is Down at least 2 min on Switch: sw3.local.X.com (192.168.10.X).
Details:
Monitors that are down include: Interface(10125) inside
Monitors that are up include: Ping,SNMP,HTTP,Telnet,Interface(1) Vlan1,Interface(100) Vlan100 (192.168.10.253),Interface(5010) Port-channel10,Interface(5011) Port-channel11,Interface(5015) Port-channel15,Interface(5016) Port-channel16,Interface(10101) dmz,Interface(10118) Inside – Alltel,Interface(10127) inside,Interface(10131) inside,Interface(10133) prd-003-vmi4, Channel-Group 10,Interface(10134) prd-003-vmi4, Channel-Group 10,Interface(10135) prd-004-vmi4, Channel-Group 11,Interface(10136) prd-004-vmi4, Channel-Group 11,Interface(10145) GigabitEthernet0/45,Interface(10146) sw-1 dmz trunking port,Interface(10147) sw2 inside trunking port,Interface(10148) storage trunking port,Interface(10501) Null0,”
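
Because the message uses fixed phrases, an automated system could pull the fields back out with a couple of regular expressions. The following is only a sketch based on this one sample: the patterns, the field names, and the `parse_alarm` helper are assumptions, and real messages from other devices may well differ.

```python
import re

# Patterns inferred from the single sample message above (an assumption).
HEADER = re.compile(r"^(?P<status>[A-Z]+):\s*(?P<host>\S+)$")
DETAIL = re.compile(
    r"^(?P<monitor>.+?) is (?P<state>\w+) at least (?P<minutes>\d+) min "
    r"on (?P<kind>\w+): (?P<device>\S+) \((?P<address>[^)]+)\)\.?$"
)


def parse_alarm(message: str) -> dict:
    """Extract status, host, monitor, and duration from an alarm email body."""
    # Drop surrounding whitespace and quotation marks, then empty lines.
    body = message.strip("“”\" \n")
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header line and a detail line")
    header = HEADER.match(lines[0])
    detail = DETAIL.match(lines[1])
    if header is None or detail is None:
        raise ValueError("message does not match the expected layout")
    fields = {**header.groupdict(), **detail.groupdict()}
    fields["minutes"] = int(fields["minutes"])
    return fields


sample = (
    "PROBLEM: sw3.local.X.com\n"
    "Interface(10125) inside is Down at least 2 min on Switch: "
    "sw3.local.X.com (192.168.10.X).\n"
    "Details:\n"
)
print(parse_alarm(sample))
```

Keeping the field labels stable across devices is what makes this kind of parsing practical; free-form wording would need a pattern per device.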

This system has been in place for some time and seems to work well.

I kept thinking about this and wondered what else could make these alarms register and get people to react a bit better still.

As I was thinking about this, I was shocked to discover that one of the Exchange servers had become self-aware. Not one to waste an opportunity, I asked it about additional ways to improve this process. It reminded me that humans have emotions and suggested that another way to improve the health monitoring system might be to associate emotion with the status of various alarms.

So instead of saying that the DISK on Server A is RESTORE, we might say “Server A is relieved that its disk was replaced before a total system crash!”.

This self-aware Exchange server, which we have now dubbed Fred, has a weird sense of humor.

Zabbix – IReport – Custom Reporting Tutorial – Part 1

This is the first part of a series of articles to get people started with JasperReports and connected to the Zabbix health monitoring system. In the second part I’ll explain how to access your database to run simple queries and show them in a very simple report created with iReport.

The first thing we need to do to get moving is to have iReport ready, so if you haven’t got it installed, do so. The next step is to put the database driver in your iReport library folder. This means copying the JDBC driver, packaged as a jar, to the ./lib/ directory of iReport’s base folder. This step is really important: without it you will not be able to connect to the database. Most databases provide their own JDBC driver, and there are many tutorials and references that cover JDBC.

Now is the time to configure access to your database. You have to go to the menu “Data -> Connections/Datasources”. Then you must click the “New” button.

When you get this dialog, the first thing to do is to give the connection a name in the “Name” field. Next, specify the driver used to connect to your database. Then modify the JDBC URL to point at your database (you can use the wizard to do this too). Finally, specify the username and password of an account with permission to access the database. It’s a good idea to test the connection before saving.
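
To make the dialog fields a little more concrete, here is a small Python sketch of the values involved, assuming a PostgreSQL database. The URL layout and driver class are the standard ones for the PostgreSQL JDBC driver; the connection name, host, database name, and username are hypothetical placeholders, and `postgres_jdbc_url` is just a helper invented for this example.

```python
def postgres_jdbc_url(host: str, database: str, port: int = 5432) -> str:
    """Build a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/db."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


# Values you would type into the connection dialog (all placeholders).
connection = {
    "name": "zabbix-db",                # hypothetical connection name
    "driver": "org.postgresql.Driver",  # PostgreSQL JDBC driver class
    "url": postgres_jdbc_url("localhost", "zabbix"),
    "username": "report_reader",        # hypothetical read-only account
}

for field, value in connection.items():
    print(f"{field}: {value}")
```

Other databases use a different driver class and URL prefix, so check the documentation for the driver you copied into the lib folder.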

With your connection ready, it is time to query the database. In my case I will be using PostgreSQL, and I will build a simple query to get the database table names. For this you must go to the menu “Data -> Report Query”.

When you insert your SQL query, in my case “SELECT * FROM pg_tables ORDER BY 1”, iReport automatically gets the metadata for your query and stores the columns as report fields so you can use them during the development of your report.
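
The idea of deriving fields from query metadata is not specific to iReport. The sketch below shows the same principle in Python, using an in-memory SQLite database as a stand-in so it runs without a PostgreSQL server. It queries `sqlite_master`, SQLite's rough counterpart to `pg_tables`; the `hosts` table and the `query_fields` helper are made up for the example, and this is not how iReport itself is implemented.

```python
import sqlite3


def query_fields(connection: sqlite3.Connection, sql: str) -> list:
    """Run a query and return its column names from the cursor metadata."""
    cursor = connection.execute(sql)
    # Each entry in cursor.description is a tuple whose first item is the name.
    return [column[0] for column in cursor.description]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (hostid INTEGER, host TEXT)")  # sample table

fields = query_fields(conn, "SELECT * FROM sqlite_master ORDER BY 1")
print(fields)  # ['type', 'name', 'tbl_name', 'rootpage', 'sql']
```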

Finally with the retrieved report fields we can now create our report. This is the result: