Friday, August 24, 2012

An example of analyzing issues with Tivoli Data Warehouse

I had an interesting project this week: getting Tivoli Data Warehouse (TDW) operational for a customer.  With the assistance of an IBM colleague (thank you, Andrew), we got TDW fully configured and (seemingly) connected.  The Warehouse Proxy and Summarization/Pruning agents were started.  The Warehouse Proxy log showed that it was connecting to the TEMS, and the Warehouse Proxy configuration workspace showed everything green and connected.  Real-time monitoring was working great.  We could get 24 hours' worth of history, but nothing more.  So why were we not seeing any data being written to the TDW?

We looked at the Warehouse Proxy log on the TDW box.  There were no obvious error messages, but there were also none of the eyecatchers we noted in earlier posts (such as messages showing that TDW database objects were being created).  So if the Warehouse Proxy had no errors, what was the issue?

In this shop, the agents were mainframe TEMAs, so the next step was to look at the RKLVLOGs of the agents running on z/OS.  Looking through the logs, I noticed messages such as the following, recurring every hour.  Since we had set the warehouse interval to one hour for this collection, this seemed like more than just a coincidence.

+4EBF9D4E.0002 ERROR MESSAGE: "RPC Error"

(4EBF9D4E.0003-13B4:khdxdacl.cpp,577,"routeExportRequest") Export for object  (table POOLS appl KIP) failed in createRouteRequest, Status = 8.
(4EBF9D63.0000-13B4:kdcc1sr.c,460,"rpc__sar") Connection failure: "ip.pipe:#xx.xxx.x.xxx:63358", 1C010001:1DE00045, 21, 100(3), FFFF/4119,

These messages indicated that the TEMA was attempting to connect to the warehouse infrastructure to send data to the TDW, but was getting a connection error.  And this error was happening every hour.
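If the log is long, a small script can pull out these failures for you.  What follows is only a minimal sketch in Python, not an IBM tool.  It assumes the RKLVLOG has been saved as plain text, that the "Connection failure" lines follow the layout quoted above, and that the hexadecimal prefix on each message is a Unix epoch timestamp in seconds (the two samples above are 0x15 = 21 seconds apart, which fits, but treat it as an assumption).  The names FAILURE_RE, find_connection_failures and SAMPLE_LOG are made up for this example.

```python
import re
from datetime import datetime, timezone

# Layout assumed from the RKLVLOG excerpt above; other releases may differ.
FAILURE_RE = re.compile(
    r'\((?P<stamp>[0-9A-Fa-f]{8})\.[0-9A-Fa-f]{4}-[^)]*\)\s*'
    r'Connection failure: "(?P<proto>[\w.]+):#(?P<host>[^:"]+):(?P<port>\d+)"'
)

def find_connection_failures(lines):
    """Return (time, protocol, host, port) for each connection-failure line."""
    failures = []
    for line in lines:
        match = FAILURE_RE.search(line)
        if match is None:
            continue
        # Assumption: the hex prefix is seconds since the Unix epoch.
        when = datetime.fromtimestamp(int(match["stamp"], 16), tz=timezone.utc)
        failures.append(
            (when, match["proto"], match["host"], int(match["port"]))
        )
    return failures

# The lines quoted above, with the address masked as in the post.
SAMPLE_LOG = [
    '+4EBF9D4E.0002 ERROR MESSAGE: "RPC Error"',
    '(4EBF9D4E.0003-13B4:khdxdacl.cpp,577,"routeExportRequest") Export for '
    'object  (table POOLS appl KIP) failed in createRouteRequest, Status = 8.',
    '(4EBF9D63.0000-13B4:kdcc1sr.c,460,"rpc__sar") Connection failure: '
    '"ip.pipe:#xx.xxx.x.xxx:63358", 1C010001:1DE00045, 21, 100(3), FFFF/4119,',
]

for when, proto, host, port in find_connection_failures(SAMPLE_LOG):
    print(when.isoformat(), proto, host, port)
```

Run against a full log, the times should land roughly one warehouse interval apart if the failures line up with the export attempts, and the port column tells you which port to ask the firewall team about.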

The problem?  In this case, an internal firewall was blocking the port we needed (63358 in this example; see the connection failure message above).  Although real-time monitoring was working and the TDW configuration panel showed green, the agent still needed this port to send data to the TDW.  Once the port was opened, the TDW worked perfectly.
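A quick way to confirm or rule out a blocked port is to try a plain TCP connection to it.  The sketch below is one hedged way to do that in Python; port_is_reachable is a made-up helper, and the host name in the usage comment is a placeholder, not a value from this environment.  It needs to run from a machine on the same side of the firewall as the agent, otherwise it tests a different network path.

```python
import socket

def port_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False

# Example usage (placeholder host name; substitute your own):
# print(port_is_reachable("warehouse-proxy.example.com", 63358))
```

A True result only shows that something accepted the connection on that port; it does not prove the Warehouse Proxy itself is healthy.  A False result when the proxy is known to be listening is a good reason to go and talk to whoever owns the firewall rules.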

The moral of the story?  When analyzing these types of issues, it is important to look at things from multiple perspectives, look at all the logs, and not forget about things like firewalls and other security challenges.
