Thursday, June 3, 2010

More on DASD monitoring - OMEGAMON XE For Storage

We've discussed DASD monitoring and the cost of monitoring for several of the OMEGAMON solutions. Now let's consider OMEGAMON XE For Storage, and the cost of monitoring considerations for this solution. As the name implies, OMEGAMON XE For Storage is an OMEGAMON solution focused on the monitoring of DASD on the z/OS platform. But OMEGAMON Storage provides more than just DASD monitoring. It also provides information on virtual tape, channels, control units, cache controllers, HSM activity, shared DASD, and more. Another thing to be aware of is that OMEGAMON Storage has been evolving over the years, and has morphed into a much broader storage management tool, as well as a monitoring tool. OMEGAMON Storage now includes integration with an array of Tivoli Storage solutions, including HSM and backup tools, via the Tivoli Enterprise Portal.

Referring back to a primary consideration I mentioned in earlier posts: the more data you ask for, the more it potentially costs from a cost of monitoring perspective. OMEGAMON Storage certainly falls into this area. It provides the ability to monitor storage at many different levels: shared DASD, the individual LPAR, the channel level, the controller level, the UCB level, even down to the individual dataset level. The user also has the ability to monitor DASD, I/O, and space utilization from the perspective of the workload applications on z/OS, and to create user-defined groups for monitoring and space analysis.

Clearly, OMEGAMON Storage provides useful and detailed information. It is important that the user have a well-conceived plan when deploying OMEGAMON Storage to avoid redundant monitoring cost. The tool is usually installed on each LPAR in the z/OS environment. In a shared DASD environment, probably the first recommendation is to have the OMEGAMON Storage instance on a designated LPAR in the shared environment be the primary collection point for DASD information, thus avoiding redundant collection of DASD information on multiple LPARs.

There are quite a few more considerations for the set up and optimization of OMEGAMON Storage, and I will get into these in more detail in later posts.

Wednesday, May 26, 2010

Pulse Comes To You

Pulse is the annual IBM event that covers the wide array of Tivoli solutions. If you did not get a chance to go to Las Vegas to participate in Pulse, you now have a chance to check out Pulse as part of Pulse Comes To You. Pulse Comes To You is a series of events at various locations around the world, and provides you with a chance to learn more about systems management and IBM Tivoli solutions. Pulse Comes To You will be happening in a variety of cities over the next two months. For a list of locations, and how to sign up (it's free), here is a link for more information:

http://www-01.ibm.com/software/tivoli/pulse/pulsecomestoyou/2010/

Tuesday, May 25, 2010

Upcoming webcast on mainframe network management

If you are interested in learning more about mainframe network management, there is an upcoming webcast on "Proactive problem determination and automation for IBM System z TCP/IP networks".

This is a free webcast sponsored by IBM that will help you learn how to better manage and optimize mainframe network throughput and technologies with IBM Tivoli System z network management solutions, and achieve the highest degree of mainframe network performance. You’ll come away with a better understanding of how IBM Tivoli can help you get the most out of System z networking components such as Enterprise Extender (EE), Open Systems Adapter (OSA) network controllers and TN3270 applications.

If you are interested, the webcast is June 3rd, at 11 AM ET. Here is a link to sign up for the event:


ibm.com/software/systemz/telecon/3jun

Monday, May 24, 2010

(You may not realize) NetView offers considerable TCP/IP management capabilities

I was having a discussion with colleagues last week, and the topic of NetView came up. It seems that many IBM customers do not realize that there is a considerable amount of TCP/IP monitoring and management capability within the tool, along with the SNA functions that have been in the tool for many years. I've mentioned in earlier posts that I view NetView as a complementary tool that works hand in hand with OMEGAMON XE For Mainframe Networks. Both tools provide considerable network management options, and both tools integrate via the Tivoli Enterprise Portal.

If you want a little more info on NetView, here is a link to a short YouTube video on the capabilities of NetView V5.4:

http://www.youtube.com/watch?v=go58kv5o88w

Friday, May 21, 2010

DASD monitoring considerations for OMEGAMON IMS

So far I've talked about the "Big Three" (z/OS, CICS, DB2). Now, I will start to address the DASD and I/O monitoring considerations for some of the other OMEGAMON tools.

OMEGAMON XE for IMS provides DASD and I/O related information in several different areas. In the real time 3270 displays, OMEGAMON IMS provides IMS database I/O rates, IMS Log I/O rate info, Long and Short Message Queue rate info, Fast Path Database I/O info, plus information on the various OSAM and VSAM database buffer pools. This information is often useful from a diagnostic and tuning standpoint, and there are no real overhead concerns in terms of collecting the data.

There are a couple of areas where DASD, I/O, and I/O-related information can impact the cost of monitoring. One area is Bottleneck Analysis. Bottleneck Analysis is a very useful and powerful analysis tool for understanding where an IMS workload is spending its time. One of the sub-options of Bottleneck Analysis is a database switch (the DBSW option). If you have Bottleneck Analysis ON, but the database switch option OFF, you will save some CPU in the OMEGAMON IMS collector task. Another consideration is Epilog history. Epilog does a nice job of gathering historical performance analysis information, but you can save some cost of collection by turning off DASD collection in the Epilog history options. This is done by specifying the NORESC(DAS,DEV) option.
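As a sketch, that option might appear in the Epilog collector's startup parameters something like the following. Only the NORESC(DAS,DEV) keyword itself comes from this post; the surrounding member layout and comments are illustrative:

```
* Epilog history collector options - illustrative fragment.
* NORESC(DAS,DEV) turns off DASD and device resource collection,
* which reduces the cost of Epilog history collection.
NORESC(DAS,DEV)
```

Check your shipped Epilog parameter member for the exact placement and any related resource collection keywords.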

Probably the biggie related to database and I/O monitoring in OMEGAMON IMS is the Transaction Reporting Facility (TRF). If TRF is enabled, OMEGAMON IMS will typically generate records on transaction and database activity into the IMS log. This data is often useful for performance analysis and chargeback, but it is potentially voluminous. If you turn it on, be aware of the options for TRF, and recognize that there will be costs in terms of additional CPU usage by the OMEGAMON collector task, and more data written to the IMS log files.

Wednesday, May 19, 2010

DASD Considerations for OMEGAMON DB2

So far we've discussed DASD monitoring considerations for OMEGAMON z/OS and OMEGAMON CICS. Now let's consider OMEGAMON DB2.

There are multiple aspects to consider when we are talking about OMEGAMON DB2. OMEGAMON DB2 collects DASD-relevant data for such things as DB2 log I/O, EDM pool I/O, and object I/O when doing drill-down analysis of virtual pool statistics. This information is provided essentially out of the box, and does not have major overhead considerations.

There are some areas where DASD and I/O monitoring can add overhead, and you do have the ability to control whether the data collection is on or off. The first major facility is Object Analysis. Object Analysis is an I/O and getpage analysis facility that will look at all the I/O and getpage activity being done on the subsystem, and correlate that getpage and I/O activity by object, and by DB2 thread. Object Analysis does have an ongoing cost of collection. It does not use DB2 traces, but if the Object Analysis collector is allowed to run all the time, it will add to the CPU usage of the OMEGAMON DB2 collector task. In some shops this is not a big issue; in other (usually larger) shops, it is a consideration. You can optionally configure Object Analysis so that it is off by default, and start it as needed. This is a good strategy for those who want to reduce the cost of monitoring, but still have access to the data when needed. I had an earlier post that describes how to configure Object Analysis to achieve this. Another option to consider with Object Analysis is the thread correlation option. If it is enabled, Object Analysis will use more resource, but I find the thread data to be quite useful.

For those DB2 data sharing users, there is another option to consider, Group Object Analysis. If Group Object Analysis is enabled, that means you are running Object Analysis at the level of each DB2 subsystem (i.e. member) within the data sharing group. That means you have the ongoing cost of running Object Analysis at that level, plus you also have the cost at the level of the OMEGAMON DB2 agent task of correlating the data, in particular if thread correlation is enabled. Group Object Analysis is useful data for a data sharing shop, but understand that you will be pulling potentially a fair amount of data on an ongoing basis.

Now let's consider history. In addition to the Accounting and Statistics trace options in the Near Term History setup options, you also have the option to enable such things as SQL, Sort, and Scan data collection. My recommendation, in general, is to set the Scan option to off. The data will give you an indication of some aspects of scan activity done by the thread, but be advised this data is collected via DB2 IFCID traces, and may add more cost to running Near Term History.

Thursday, May 13, 2010

OMEGAMON Currency Support for z/OS 1.11

If you are looking at going to z/OS V1.11, here is some information on recommended maintenance for OMEGAMON V4.1 and OMEGAMON V4.2. Follow the link for more information:

http://www-01.ibm.com/support/docview.wss?uid=swg21382493

Tuesday, May 11, 2010

Windows on System z

I had heard a bit about this a while back. Supposedly there was a SHARE presentation in 2009 that mentioned the feasibility of this. Apparently, there is now an appliance that will allow you to run Windows on System z hardware, in a similar manner to how you run Linux on z. I'm not sure yet of its potential, but it sounds interesting, to say the least.

Here is a link to an article on this:

http://www.mainframezone.com/it-management/windows-and-other-x86-operating-systems-on-system-z

Thursday, May 6, 2010

More on DASD monitoring with OMEGAMON z/OS


In earlier posts I've talked about the cost of DASD monitoring, and the notion of the more data you ask for, the more it will potentially cost. This concept can apply, as well, when using the OMEGAMON z/OS CUA and Classic interfaces when monitoring DASD.

It's not unusual for large shops to have thousands of DASD devices connected in their environment. What that means is if you do a lot of ongoing monitoring of these thousands of devices, there is the potential for more OMEGAMON cycle usage to gather and display all this data.

One way to reduce this concern is to take advantage of filter options when displaying DASD devices. Do you really need to see every device, or just the ones that meet certain criteria, such as high MSR (millisecond response) times, or high I/O rates? Some basic benchmark testing I've done on IBM systems has shown measurable OMEGAMON CPU savings from using filter options to reduce the amount of device data displayed. This is especially true if you, like many users, like to display DASD devices and watch them in auto-update mode.

The example I show here uses the filter options of the CUA interface to focus on the DASD devices of most interest. You can use a similar technique in the Classic 3270 interface.

Wednesday, May 5, 2010

Managing Workload on Linux on System z seminars

If you are interested in learning more about Linux on System z, there is a series of upcoming seminars on the topic. "Managing Mission-Critical Workload on Linux on System z" is a free technology update to learn, through case studies, how IBM and its Business Partners are implementing virtualized enterprises using Linux on System z.


Some of the objectives of the seminar include understanding how to increase system utilization to avoid investing in and powering unneeded hardware, how to give technical, management, and business teams relevant views of the data they need, and how to investigate the performance of all mainframe and distributed systems.


Dates and locations are as follows:

Dallas, May 11

https://www-950.ibm.com/events/wwe/grp/grp004.nsf/v16_agenda?openform&seminar=QAFNJTES&locale=en_US

Minneapolis, May 18

https://www-950.ibm.com/events/wwe/grp/grp004.nsf/v16_agenda?openform&seminar=Q72PMKES&locale=en_US

Atlanta, May 20

https://www-950.ibm.com/events/wwe/grp/grp004.nsf/v16_agenda?openform&seminar=F86QRFES&locale=en_US

Houston, May 25

https://www-950.ibm.com/events/wwe/grp/grp004.nsf/v16_agenda?openform&seminar=62BRBDES&locale=en_US

NYC, June 1

https://www-950.ibm.com/events/wwe/grp/grp004.nsf/v16_agenda?openform&seminar=4ZEAU5ES&locale=en_US

Boston, July 7

https://www-950.ibm.com/events/wwe/grp/grp004.nsf/v16_agenda?openform&seminar=296A58ES&locale=en_US

Friday, April 23, 2010

Upcoming System z technology summits

Another round of System z technology summits will be happening in the next few weeks. The System z technology summit is a nice event with multiple tracks on System z, Database, Application development, and of course Systems Management. There is a lot of good technical content (meaning not just marketing fluff). Upcoming cities include Columbus, Costa Mesa, and San Francisco. I will be presenting "Leveraging The Tivoli Enterprise Portal" at the event in Columbus.

If you are interested, we would love to have you attend. Here are the dates, places, and links to sign up for the events:

Columbus, OH - May 11 - www.ibm.com/events/software/systemz/seminar/TechSum1
Costa Mesa, CA - June 16 - www.ibm.com/events/software/systemz/seminar/TechSum3
San Francisco, CA - May 13 - www.ibm.com/events/software/systemz/seminar/TechSum2

Thursday, April 22, 2010

Article on OMEGAMON XE For Storage V4.20

Here's a link to a nice article in IBM Systems magazine on the capabilities and facilities of OMEGAMON XE For Storage V4.20. The article provides a good overview of the tool, and goes into what was added in V4.20. Here's a link to the article:

http://www.ibmsystemsmag.com/mainframe/marchapril10/focusonstorage/32105printp1.aspx

OMEGAMON z/OS DASD monitoring considerations


We've discussed OMEGAMON CICS and DB2 DASD collection considerations. Now let's take a look at OMEGAMON z/OS. OMEGAMON z/OS collects its device and performance information, such as MSR times, via its API to RMF. RMF collects the device-level statistics, and OMEGAMON displays and analyzes them. DASD information in OMEGAMON z/OS may be seen in both the 3270 interfaces (Classic and CUA) and in the Tivoli Portal GUI interface.

Recently I was asked by a customer about a message that started showing up on their Tivoli Portal event console. Here is an example of what they were seeing, a situation called "KM5_No_SYSPLEX_DASD_FILTER_WARN". What this situation highlighted was that monitoring DASD devices without a filter that eliminates some of the devices can lead to potentially high CPU or storage usage within the monitoring infrastructure. The situation notifies the user that OMEGAMON z/OS does not collect shared DASD device data unless a DASD filter situation has been created, and is active.
So the next question may be, how do you enable the DASD filter situation? The procedure is pretty well documented in the OMEGAMON z/OS User Guide. Here is a link to the appropriate documentation pages:
The User's Guide also has some good recommendations on situation settings, such as collection intervals. I suggest you take the time to review these options. Keep in mind that the more data you collect, and the more frequently you collect it, the more potential monitoring overhead.
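To make the filter idea concrete: a DASD filter situation is an ordinary situation whose predicate limits which devices get collected. A hedged sketch of what such a situation formula might look like in the situation editor follows; the attribute group and attribute names here are illustrative, and the User's Guide documents the exact names to use:

```
*IF *VALUE DASD_MVS_Devices.Response_Time *GE 30
*OR *VALUE DASD_MVS_Devices.IO_Rate *GE 100
```

With a filter situation like this active, only the busy or slow devices are collected, rather than every shared device in the sysplex.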
I will talk more about OMEGAMON z/OS DASD monitoring in a subsequent post.

Saturday, April 17, 2010

Cost of monitoring considerations for DASD

DASD is a major area of interest for performance analysis on z/OS, and also a major consideration for the cost of monitoring. When you look at the OMEGAMON suite on z/OS, consider the following: OMEGAMON z/OS monitors DASD, OMEGAMON Storage monitors DASD, OMEGAMON DB2 monitors DASD, OMEGAMON CICS monitors DASD, OMEGAMON IMS monitors DASD. See a pattern here? Each of the core OMEGAMON monitors provides some form of DASD or I/O analysis, and for very good reasons. Understanding I/O is an essential component of understanding the performance of the system and the workload. The trade-off comes with understanding the optimal method to monitor DASD relative to the cost.

As the prior paragraph implies, if you have the full suite of OMEGAMON tools, you have the potential to monitor the same string of DASD devices multiple times. So that brings us to the first recommendation: monitor DASD once, not multiple times. If you use OMEGAMON z/OS as your primary DASD analysis tool, then you can consider being more selective when enabling DASD analysis in OMEGAMON CICS or OMEGAMON DB2. In a prior post I mentioned how you can leave Object Analysis in OMEGAMON DB2 off by default, but turn on as needed. That is a good strategy since there are times where the Object Analysis data is useful, but you probably don't need to collect it on an ongoing basis.

Now let's consider OMEGAMON CICS. If you have OMEGAMON z/OS, then you don't really need to set DASD exceptions in OMEGAMON CICS, and you may want to carefully consider whether to enable I/O clocks and counters for CICS task history (ONDV, online data viewing). When setting up task history parms in the KC2GLBxx module, consider your options related to I/O. DB2_CLOCKS_AND_COUNTERS, as well as the VSAM, ADABAS, DLI, and similar options, are potentially expensive data collection choices. If you don't really need the data, don't turn them on.
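As a hedged sketch, the relevant KC2GLBxx entries might look something like the fragment below. DB2_CLOCKS_AND_COUNTERS is named above; the other parameter names follow the same pattern, but verify them against the KC2GLBxx sample shipped with your release:

```
* KC2GLBxx task history (ONDV) options - illustrative fragment.
* Clock/counter collection is potentially expensive; leave it
* off unless the extra I/O detail is actually needed.
DB2_CLOCKS_AND_COUNTERS=NO
VSAM_CLOCKS_AND_COUNTERS=NO
DLI_CLOCKS_AND_COUNTERS=NO
```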

There are more considerations specific to OMEGAMON Storage and OMEGAMON z/OS, and I will cover this in a later post.

Thursday, April 15, 2010

An upcoming webcast on z/OS security management

I wouldn't be the first person to mention the strength of security on z/OS as one of its inherent advantages. RACF is powerful, detailed, and flexible. But, to be honest, I'm not sure I would use the term "ease of use" when talking about RACF. That's where tools like IBM Tivoli zSecure come into play.

On April 22nd, at 11 AM ET, there will be a free webcast on "Cost-effective mainframe security compliance with zSecure suite v1.11". The webcast will cover how you can reduce costs with simplified mainframe and RACF security administration, reduce exposures with automated security policy controls that enforce privileged user monitoring and other best practices, and manage risks with closed-loop monitoring and remediation to reduce security threats. The speakers are Jamie Pease, IT Specialist, System z Security, and Glinda Cummings, Worldwide Tivoli Senior Product Manager.

If you are interested in simplifying z/OS security management in your shop, this might be a topic of interest. And the price of the webcast is right, it's free. Here's a link to sign up for the webcast:

ibm.com/software/systemz/telecon/22apr

Optimizing the cost of monitoring on z/OS

One of the things I commonly work on with my customers is optimizing the cost of monitoring. When you think about it, there is an inherent balancing act between the cost of monitoring and the analytic value of the information collected and presented by the monitoring suite. If you ever read the classic 1966 Robert Heinlein novel "The Moon Is A Harsh Mistress" (one of my favorite books when I was a kid), the saying was "Tanstaafl", roughly meaning "there ain't no such thing as a free lunch". Monitoring and performance management methods adhere to this saying, as well.

I often get asked questions like "what is the monitoring overhead for the tool?", or "what is the cost if I enable a given function of the tool?". The answer is the eternal running joke of all technology vendors, "it depends". The answer, while often being a bit of a dodge, is actually a truism. The cost and overhead of the tool is often in direct correspondence to how the user chooses to employ and use the tool.

Now that I've made the "it depends" statement, there are some general rules of thumb that can be applied. Having a long-standing DB2 and relational database background, there are certain analogies I sometimes draw between database optimization and monitoring optimization. One of the key ones is: the more data I request, and the more data I store and/or act on, the higher the cost of collection will often be, and potentially the greater the overhead. If I code a DB2 SELECT that results in a large tablespace scan and pulls back a multi-million row result set, that will usually run longer and cost more than a singleton SELECT that uses index access and pulls back a single row.
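To illustrate the database side of the analogy (the table and column names here are purely hypothetical):

```sql
-- Expensive: the function on the predicate column defeats index
-- access, so DB2 scans the tablespace and returns millions of rows.
SELECT * FROM ORDER_HIST WHERE YEAR(ORDER_DATE) > 2005;

-- Cheap: a singleton SELECT using index access on the key column,
-- returning a single row.
SELECT ORDER_TOTAL FROM ORDER_HIST WHERE ORDER_ID = 12345;
```

The monitoring parallel is the same: scope the request down to what you actually need, and the collection cost drops accordingly.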

You can apply the same logic to monitoring. From a real-time perspective, if I'm monitoring the MSR (millisecond response time) for thousands of DASD volumes on an ongoing basis, that will be more expensive than looking at a single volume in detail, as needed. From a history perspective, the more history data I gather, and the more data I store as a result, the higher the cost of history collection may be. But let's not forget alerting. The more alerts I define, the more information I alert on, and the larger the number of managed systems I alert on, the higher the potential cost of alerting.

What I plan to do over the next few weeks is a series of postings on this balancing act, in essence to help you answer the question: "what is the cost of monitoring, versus the diagnostic value of the data?" I will be covering all the core OMEGAMON tools on z/OS, and we will look at real-time collection, historical collection, and alert management.

Friday, April 9, 2010

OMEGAMON IMS V4.20 provides new situation options


One of the nice enhancements in OMEGAMON IMS V4.20 was finally getting Bottleneck Analysis information in the TEP. This is useful for a variety of reasons. Bottleneck Analysis is a key analytic component of the tool, and provides the user with information on where the workload is spending its time, therefore giving you an indication of where you may want to direct your tuning efforts.

Plus, having Bottleneck Analysis data in the TEP means the data may be used for situation creation. Here is an example of how to create a situation using Bottleneck Analysis data. By creating a situation like this you can look for workloads with high percentile wait times (such as waits for CPU, waits for locks, or waits for other resources), and include that information in the situation logic, making the situation more meaningful and flexible.
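As a sketch, a situation built on Bottleneck Analysis data might use a formula along these lines. The attribute group and attribute names here are illustrative; pick the actual Bottleneck Analysis attributes from the situation editor:

```
*IF *VALUE Bottleneck_Analysis.Wait_For_CPU_Percent *GE 40
*AND *VALUE Bottleneck_Analysis.Active_Transactions *GE 10
```

A predicate like this fires only when a meaningful number of transactions are spending a large share of their time waiting on CPU, rather than alerting on every momentary blip.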

Friday, April 2, 2010

A handy link to find Tivoli documentation

I often get requests for where to find links for certain documentation, such as OMEGAMON doc. Here is one link that pulls together the doc links for a whole bunch of IBM Tivoli products. I have this link on the right side on the blog page under "Useful Links".

http://www.ibm.com/developerworks/wikis/display/tivolidoccentral/Home

Wednesday, March 31, 2010

An easy way to generate OMEGAMON DB2 report JCL


OMEGAMON DB2 (the original Candle product) and IBM DB2 PM/PE were merged together into a single set of solutions a few releases ago. The current solution is now called OMEGAMON XE for DB2 PM/PE, to reflect that the tools have been merged.

When the merger occurred, the historical reporting mechanisms of DB2 PM/PE became the core historical functions of OMEGAMON DB2. If you were a new user of the solution, that meant you had to learn a new reporting language, and create new reporting JCL. But there is an easy way to create OMEGAMON DB2 reporting JCL: if you invoke the ISPF interface that comes with OMEGAMON DB2, you have an option to generate reporting JCL from within that interface.

The example shown here illustrates how to invoke this function. You then specify which reports you want, and hit F5 to generate the JCL. You can then edit and submit the resulting JCL.
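The generated JCL typically ends up looking something like the sketch below. Treat the job card, program name, DD names, and dataset names as placeholders to verify against what the ISPF interface actually generates at your release level:

```
//OMPERPT  JOB (ACCT),'OMEGAMON DB2 RPT',CLASS=A,MSGCLASS=X
//STEP1    EXEC PGM=FPECMAIN
//STEPLIB  DD DISP=SHR,DSN=omegamon.loadlib       placeholder
//INPUTDM  DD DISP=SHR,DSN=your.db2.trace.data    placeholder
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  ACCOUNTING REPORT LAYOUT(SHORT)
  EXEC
/*
```

The SYSIN command stream is where the reporting language lives; once you have a working skeleton, adding Statistics or other report types is mostly a matter of editing that stream.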

This is a convenient and easy way to get started with the history functions of OMEGAMON DB2, and quickly create reports using the tool.

Tuesday, March 30, 2010

Recommended maintenance levels for OMEGAMON

The following link is to a page which pulls together a lot of useful information on recommended maintenance levels for the various OMEGAMON monitoring tools (both V4.10 and V4.20). There are also links on this page to interim fixes, and other interesting stuff.

http://www-01.ibm.com/support/docview.wss?uid=swg21290883&myns=swgtiv&mynp=OCSSQH9J&mynp=OCSSZJZF&mynp=OCSSRLD6&mynp=OCSSZ8F3&mynp=OCSSSQ3B&mynp=OCSSUSPS&mynp=OCSSUSPA&mynp=OCSSXS8U&mynp=OCSS2JNN&mynp=OCSS2JFP&mynp=OCSS2JL7&mynp=OCSSLSDR&mync=R