There is an interesting technote that discusses collection overhead when doing dataset-level monitoring with OMEGAMON Storage. The symptoms may include high CPU in the KCNDLxxx address space, and you may see the following message:
KDFS023A CANDLE DATASET I/O KDFSMIBF ROUTINE IEXP FAILED 0000000C 00000000
This indicates that the dataset collection process is overwhelmed with data and cannot keep up.
This goes back to posts I made about a year ago on optimal monitoring strategies. In general, the more data you request, the more it will cost to collect, display, and manage that data. So consider carefully what you ask for, because you may get more than you bargained for.
In this technote the author references the "OMEGAMON XE For Storage Tuning Guide".
The technote also suggests collecting data every so often (for example, every 20 I/Os with SAMPCT=20) rather than on every I/O. Another option is to collect data based on an exception criterion, such as a high MSR time for a device.
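To get a feel for how much these two strategies can cut the volume of collected data, here is a minimal Python sketch. It is only an illustration and not how OMEGAMON implements collection: the `IoEvent` record, the function names, the dataset and device names, and the 25 ms threshold are all made up for the example, and it assumes SAMPCT=20 means "keep one I/O out of every 20" as described above.

```python
from dataclasses import dataclass


@dataclass
class IoEvent:
    """Hypothetical record for one dataset I/O (not an OMEGAMON structure)."""
    dataset: str
    device: str
    msr_ms: float  # device response time in milliseconds at the time of the I/O


def sample_every_nth(events, n=20):
    """Keep one event out of every n: the n-th, 2n-th, 3n-th, and so on."""
    return [e for i, e in enumerate(events, start=1) if i % n == 0]


def sample_on_exception(events, msr_threshold_ms):
    """Keep only events whose device response time exceeds the threshold."""
    return [e for e in events if e.msr_ms > msr_threshold_ms]


# Synthetic workload: 1000 I/Os, where every 50th one hits a slow device.
events = [
    IoEvent(
        dataset="PROD.PAYROLL.DATA",            # made-up dataset name
        device="D001" if i % 2 else "D002",     # made-up device names
        msr_ms=30.0 if i % 50 == 0 else 2.0,
    )
    for i in range(1, 1001)
]

sampled = sample_every_nth(events, n=20)
exceptions = sample_on_exception(events, msr_threshold_ms=25.0)

print(f"every I/O:        {len(events)} records")
print(f"every 20th I/O:   {len(sampled)} records")
print(f"MSR over 25 ms:   {len(exceptions)} records")
```

With this synthetic workload, sampling keeps 50 of the 1000 records and the exception filter keeps 20. Real savings depend entirely on the workload and the thresholds you choose.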
With so many active UCBs in many shops and potentially so many datasets, it makes sense to have a strategy for collection, versus just turning it on for everything.
Here's a link to the technote: