Friday, April 23, 2010

Upcoming System z technology summits

Another round of System z technology summits will be happening in the next few weeks. The System z technology summit is a nice event with multiple tracks on System z, database, application development, and, of course, systems management. There is a lot of good technical content (meaning not just marketing fluff). Upcoming cities include Columbus, Costa Mesa, and San Francisco. I will be presenting "Leveraging The Tivoli Enterprise Portal" at the event in Columbus.

If you are interested, we would love to have you attend. Here are the dates, places, and links to sign up for the events:

Columbus, OH - May 11 - www.ibm.com/events/software/systemz/seminar/TechSum1
Costa Mesa, CA - June 16 - www.ibm.com/events/software/systemz/seminar/TechSum3
San Francisco, CA - May 13 - www.ibm.com/events/software/systemz/seminar/TechSum2

Thursday, April 22, 2010

Article on OMEGAMON XE For Storage V4.20

Here's a nice article in IBM Systems Magazine on the capabilities and facilities of OMEGAMON XE For Storage V4.20. The article provides a good overview of the tool and goes into what was added in V4.20. Here's the link:

http://www.ibmsystemsmag.com/mainframe/marchapril10/focusonstorage/32105printp1.aspx

OMEGAMON z/OS DASD monitoring considerations


We've discussed OMEGAMON CICS and DB2 DASD collection considerations. Now let's take a look at OMEGAMON z/OS. OMEGAMON z/OS collects its device and performance information, such as MSR times, through its API to RMF. RMF collects the device-level statistics, and OMEGAMON displays and analyzes them. DASD information in OMEGAMON z/OS may be seen both in the 3270 interfaces (Classic and CUA) and in the Tivoli Enterprise Portal GUI.

Recently I was asked by a customer about a message that started showing up on their Tivoli Portal event console. Here is an example of what they were seeing: a situation called "KM5_No_SYSPLEX_DASD_FILTER_WARN". What this situation highlights is that monitoring DASD devices without a filter that eliminates some of the devices can lead to high CPU or storage consumption within the monitoring infrastructure. The situation notifies the user that OMEGAMON z/OS does not collect shared DASD device data unless a DASD filter situation has been created and is active.
So the next question may be, how do you enable the DASD filter situation? The procedure is pretty well documented in the OMEGAMON z/OS User Guide. Here is a link to the appropriate documentation pages:
The User's Guide also has some good recommendations on situation settings, such as collection intervals. I suggest you take the time to review these options. Keep in mind that the more data you collect, and the more frequently you collect it, the greater the potential monitoring overhead.
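As for what a DASD filter situation actually looks like: it is just a situation whose predicate selects the subset of devices you want sysplex-level data collected for. Here is a rough sketch in the Tivoli situation formula style; the situation name, attribute group, and attribute names are illustrative only, so take the real ones from the User Guide and the situation editor in the TEP:

   Situation: DASD_Filter_Example                          (illustrative name)
   Formula:   *IF *VALUE DASD_MVS_Devices.Millisecond_Response_Time *GE 20
              *OR *VALUE DASD_MVS_Devices.Volume_Serial *EQ 'PRD001'

The idea is that once a situation along these lines is distributed to the z/OS managed systems of interest and started, collection is scoped to the devices that satisfy the predicate rather than every shared device in the sysplex.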
I will talk more about OMEGAMON z/OS DASD monitoring in a subsequent post.

Saturday, April 17, 2010

Cost of monitoring considerations for DASD

DASD is a major area of interest when we talk about the information needed for performance analysis on z/OS, and it is also a major consideration when we talk about the cost of monitoring on z/OS. When you look at the OMEGAMON suite on z/OS, consider the following: OMEGAMON z/OS monitors DASD, OMEGAMON Storage monitors DASD, OMEGAMON DB2 monitors DASD, OMEGAMON CICS monitors DASD, and OMEGAMON IMS monitors DASD. See a pattern here? Each of the core OMEGAMON monitors provides some form of DASD or I/O analysis, and for good reason: understanding I/O is an essential component of understanding the performance of the system and the workload. The trade-off comes in understanding the optimal way to monitor DASD relative to the cost.

As the prior paragraph implies, if you have the full suite of OMEGAMON tools, you have the potential to monitor the same string of DASD devices multiple times. That brings us to the first recommendation: monitor DASD once, not multiple times. If you use OMEGAMON z/OS as your primary DASD analysis tool, then you can be more selective when enabling DASD analysis in OMEGAMON CICS or OMEGAMON DB2. In a prior post I mentioned how you can leave Object Analysis in OMEGAMON DB2 off by default and turn it on as needed. That is a good strategy, since there are times when the Object Analysis data is useful, but you probably don't need to collect it on an ongoing basis.

Now let's consider OMEGAMON CICS. If you have OMEGAMON z/OS, then you don't really need to set DASD exceptions in OMEGAMON CICS, and you may want to think carefully before enabling I/O clocks and counters for CICS task history (ONDV, online data viewing). When setting up the task history parms in the KC2GLBxx module, consider your options related to I/O. DB2_CLOCKS_AND_COUNTERS, as well as the equivalent VSAM, ADABAS, and DLI options, are potentially expensive data collection choices. If you don't really need it, don't turn it on.
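To make that concrete, here is a minimal sketch of the sort of entries involved in a KC2GLBxx member, with the expensive collectors left off. Only DB2_CLOCKS_AND_COUNTERS is named in this post; the other parameter names below are assumptions that simply follow the same pattern, so verify the exact spellings against your OMEGAMON CICS configuration documentation before relying on them:

   * Task history (ONDV) clock/counter options - keep these off unless needed
   * (parameter names other than DB2_CLOCKS_AND_COUNTERS are assumed for illustration)
   DB2_CLOCKS_AND_COUNTERS=NO
   VSAM_CLOCKS_AND_COUNTERS=NO
   ADABAS_CLOCKS_AND_COUNTERS=NO
   DLI_CLOCKS_AND_COUNTERS=NO

The point is simple: each clock and counter you switch on adds to what is collected for every task history record, so enable them selectively and only while you are actively chasing a problem.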

There are more considerations specific to OMEGAMON Storage and OMEGAMON z/OS, which I will cover in a later post.

Thursday, April 15, 2010

An upcoming webcast on z/OS security management

I wouldn't be the first person to mention the strength of security on z/OS as one of its inherent advantages. RACF is powerful, detailed, and flexible. But, to be honest, I'm not sure I would use the term "ease of use" when talking about RACF. That's where tools like IBM Tivoli zSecure come into play.

On April 22nd, at 11 AM ET, there will be a free webcast on "Cost-effective mainframe security compliance with zSecure suite v1.11". The webcast will cover how you can reduce costs with simplified mainframe and RACF security administration, reduce exposures with automated security policy controls that enforce privileged user monitoring and other best practices, and manage risks with closed-loop monitoring and remediation to reduce security threats. The speakers are Jamie Pease, IT Specialist, System z Security, and Glinda Cummings, Worldwide Tivoli Senior Product Manager.

If you are interested in simplifying z/OS security management in your shop, this might be worth your time. And the price of the webcast is right: it's free. Here's a link to sign up for the webcast:

ibm.com/software/systemz/telecon/22apr

Optimizing the cost of monitoring on z/OS

One of the things I commonly work on with my customers is optimizing the cost of monitoring. When you think about it, there is an inherent balancing act between the cost of monitoring and the analytic value of the information collected and presented by the monitoring suite. If you ever read the classic 1966 Robert Heinlein novel "The Moon Is A Harsh Mistress" (one of my favorite books when I was a kid), you may remember the saying "TANSTAAFL", short for "there ain't no such thing as a free lunch". Monitoring and performance management adhere to this saying as well.

I often get asked questions like "what is the monitoring overhead for the tool?" or "what is the cost if I enable a given function of the tool?". The answer is the eternal running joke of all technology vendors: "it depends". That answer, while often a bit of a dodge, is actually a truism. The cost and overhead of a tool correspond directly to how the user chooses to employ it.

Now that I've gotten the "it depends" statement out of the way, there are some general rules of thumb that can be applied. Having a long-standing DB2 and relational database background, I sometimes draw analogies between database optimization and monitoring optimization. One of the key ones is this: the more data I request, and the more data I store and/or act on, the higher the cost of collection and, potentially, the greater the overhead. If I'm coding a DB2 SELECT that results in a large tablespace scan and pulls back a multi-million row result set, it will usually run longer and cost more than a singleton SELECT that uses index access and pulls out a single row.
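As a quick illustration of that database analogy (the table and column names here are invented purely for the example), compare the two DB2 queries below; the first forces the database to touch and return vastly more data than the second:

   -- Expensive: likely a tablespace scan returning millions of rows
   SELECT *
     FROM PROD.TRAN_HISTORY
    WHERE AMOUNT > 0;

   -- Cheap: a singleton select using an index on TRAN_ID, one row returned
   SELECT TRAN_ID, AMOUNT
     FROM PROD.TRAN_HISTORY
    WHERE TRAN_ID = '0000012345';

Monitoring choices work the same way: the scope of what you ask for drives the cost of answering the request.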

You can apply the same logic to monitoring. From a real-time perspective, monitoring the MSR (millisecond response time) of thousands of DASD volumes on an ongoing basis will be more expensive than looking at a single volume in detail, as needed. From a history perspective, the more history data I gather and the more data I store as a result, the higher the cost of history collection. And let's not forget alerting: the more alerts I define, the more information I alert on, and the larger the number of managed systems I alert against, the higher the potential cost of alerting.

What I plan to do over the next few weeks is a series of postings on this balancing act, in essence to help you answer the question: "what is the cost of monitoring versus the diagnostic value of the data?". I will be covering all the core OMEGAMON tools on z/OS, and we will look at real-time collection, historical collection, and alert management.

Friday, April 9, 2010

OMEGAMON IMS V4.20 provides new situation options


One of the nice enhancements in OMEGAMON IMS V4.20 was finally getting Bottleneck Analysis information into the TEP. This is useful for a variety of reasons. Bottleneck Analysis is a key analytic component of the tool; it shows you where the workload is spending its time, and therefore gives you an indication of where you may want to direct your tuning efforts.

Plus, having Bottleneck Analysis data in the TEP means the data may be used for situation creation. Here is an example of how to create a situation using Bottleneck Analysis data. By creating a situation like this, you can look for workloads with high percentile wait times (such as waits for CPU, waits for locks, or waits for other resources) and include that information in the situation logic, making the situation more meaningful and flexible.
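As a rough sketch of what such a situation's formula might look like, here is an example in the Tivoli situation formula style. The situation name, attribute group, attribute names, and thresholds are illustrative only; take the real names from the Bottleneck Analysis workspaces and the situation editor in the TEP:

   Situation: IMS_Bottleneck_CPU_Wait_Warn               (illustrative name)
   Formula:   *IF *VALUE Bottleneck_Analysis.CPU_Wait_Percent *GE 30
              *AND *VALUE Bottleneck_Analysis.Region_Name *EQ 'IMSPROD1'

A situation along these lines would fire only when a specific workload is spending a large percentage of its time waiting on CPU, which is more meaningful than alerting on a raw resource metric in isolation.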

Friday, April 2, 2010

A handy link to find Tivoli documentation

I often get requests for where to find links to certain documentation, such as the OMEGAMON doc. Here is one link that pulls together the doc links for a whole bunch of IBM Tivoli products. I also keep this link on the right side of the blog page under "Useful Links".

http://www.ibm.com/developerworks/wikis/display/tivolidoccentral/Home