The remaining sections of this chapter discuss the important issue of monitoring. Monitoring tools are used to obtain input parameters for performance models from measurement data. They are also used to validate performance predictions made by the models.
• Table of Contents
Performance by Design: Computer Capacity Planning by Example
By Daniel A. Menascé, Virgilio A.F. Almeida, Lawrence W. Dowdy
Publisher: Prentice Hall PTR Pub Date: January 05, 2004
ISBN: 0-13-090673-5 Pages: 552
Individual organizations and society as a whole could face major breakdowns if IT systems do not meet their Quality of Service (QoS) requirements on performance, availability, security, and maintainability. Corporations stand to lose valuable income, and public interests could be put at great risk. System designers and analysts usually do not take QoS requirements into account when designing and/or analyzing computer systems, due mainly to their lack of awareness about the issues that affect
performance and the lack of a framework to reason about performance. This book describes how to map real-life systems (e.g., databases, data centers, e-commerce applications) into analytic performance models. The authors elaborate upon these models, and use them to help the reader thoroughly analyze and better understand potential performance issues.
5.5 Monitoring Tools
Monitors are used for measuring the level of activity (i.e., workload intensities, device utilizations) of a
computer system [2]. Ideally, a monitor should interfere as little as possible with the operation of the system being measured, so that it degrades performance minimally. Monitors are characterized by their type and mode. There are three types of monitors, depending upon their implementation: hardware, software, and hybrid. There are two data collection modes: event trace and sampling.
5.5.1 Hardware Monitors
A hardware monitor is a specialized measurement tool that detects certain events (e.g., the setting of a register) within a computer system by sensing predefined signals (e.g., a high voltage at the register's control point). A hardware monitor captures the state of the computer system under study via electronic probes that are attached to its circuitry and records the measurements. The electronic probes sense the state of hardware components of the system, such as registers, memory locations, and I/O channels. For example, a hardware monitor may detect a memory-read operation by sensing that the read probe to the memory module changes from the inactive to the active state [3, 4, 6].
The main advantages of hardware monitors are that they do not consume resources from the monitored system, they do not affect the operation or performance of the system, and they do not place any
overhead on the system. One of the major problems of hardware monitors is that software features (e.g., the completion of a specific job) are difficult to detect, since these monitors do not have access to
software-related information such as the identification of the process that triggered a given event. Thus, workload-specific data and the number of transactions executed by a given class are difficult to obtain using a hardware monitor.
5.5.2 Software Monitors
A software monitor consists of routines inserted into the software, either at the user level or (more often) at the kernel level, of a computer system with the aim of recording status information and events of the system [1, 4]. These routines gather performance data about the execution of programs and/or about the components of the hardware. Software monitoring is activated either by the occurrence of specific events (e.g., an interrupt signaling an I/O completion) or by timer interrupts (e.g., to see if a particular disk is active or not every 5 msec), depending on the monitoring mode.
Software monitors can record essentially any information that is available to programs and operating systems. This feature, together with the flexibility to select and reduce performance data, makes software monitors a powerful tool for analyzing computer systems. The IBM Resource Management Facility (RMF) and Windows XP's Performance Monitor are examples of software monitors that provide performance information such as throughput, device utilizations, I/O counts, and network activity. A drawback of software monitors is that they use the very resources that they measure. Therefore, software monitors may interfere, sometimes significantly, with the system being measured. If the overhead they introduce is high enough, the results they yield may be of minimal value. Two special classes of software monitors are accounting systems and program analyzers. Each provides useful information that helps to parameterize QN models.
Accounting Systems. Accounting systems are tools primarily intended to apportion financial charges to
users of a system [7, 8]. They are an integral part of most multiuser operating systems. IBM's SMF (System Management Facilities) is a standard feature of IBM's MVS operating system that collects and records data related to job executions. UNIX's sar (System Activity Reporter) is another example of an accounting system.
parameters in capacity planning studies. In general, accounting data include three groups of information.
• Identification. Specifies user, program, project, accounting number, and class of the monitored event.
• Resource usage. Indicates the resources (e.g., CPU times, I/O operations) consumed by programs.
• Execution time. Records the start and completion times of program execution.
Although accounting monitors provide useful data, there are often problems with their use in
performance modeling. Accounting monitors typically do not capture the use of resources by the operating system. That is, they do not measure any unaccountable (i.e., non-user billable) system overhead. Another problem is the way that accounting systems view some special programs, such as database management systems (DBMS) and transaction monitors. These programs have transactions and processes that execute within them. Since accounting systems treat such special programs as single entities, they normally do not collect any information about what is executed inside these programs. However, in order to model transaction response time, information about individual transactions, such as arrival rates and service demands, is required. Thus, special monitors are required to examine the performance of some programs.
Program Analyzers. Program analyzers are software tools that collect information about the execution
of individual programs. These analyzers can be used to identify the parts of a program that consume significant computing resources [8]. They are capable of observing and recording events internal to the execution of specific programs. In the case of transaction-oriented systems, program analyzers provide information such as transaction counts, average transaction response time, mean CPU time per
transaction, mean number of I/O operations per transaction, and transaction mix. Examples of program analyzers include monitors for special programs such as IBM's database products (e.g., DB2 and IMS) and transaction processing products (e.g., CICS).
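The per-transaction statistics listed above can be derived from a log of individual transaction records. The following is a minimal sketch in Python, assuming a hypothetical record layout (TransactionRecord and its field names are illustrative and do not correspond to the output format of any particular analyzer):

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TransactionRecord:
    # Hypothetical record layout; real analyzers define their own fields.
    tx_class: str         # transaction class, e.g. "query" or "update"
    response_time: float  # seconds
    cpu_time: float       # seconds
    io_count: int         # number of I/O operations


def summarize(records):
    """Return per-class count, means, and share of the transaction mix."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.tx_class].append(rec)

    total = len(records)
    summary = {}
    for tx_class, recs in groups.items():
        n = len(recs)
        summary[tx_class] = {
            "count": n,
            "mean_response_time": sum(r.response_time for r in recs) / n,
            "mean_cpu_time": sum(r.cpu_time for r in recs) / n,
            "mean_io_count": sum(r.io_count for r in recs) / n,
            "mix": n / total,  # fraction of all transactions in this class
        }
    return summary


# Made-up sample data: three queries and one update.
records = [
    TransactionRecord("query", 0.50, 0.10, 4),
    TransactionRecord("query", 0.30, 0.06, 2),
    TransactionRecord("query", 0.40, 0.08, 3),
    TransactionRecord("update", 1.20, 0.30, 10),
]
summary = summarize(records)
```

With this sample data, queries make up three quarters of the transaction mix and average three I/O operations per transaction.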
5.5.3 Hybrid Monitors
The combination of hardware and software monitors results in a hybrid monitor, which shares the best features of both types. In a hybrid monitor, software routines are responsible for sensing events and storing this information in special "monitoring registers". The hardware component of the monitor records the data stored in these registers, avoiding interference with the normal I/O activities of the system. Thus, a hybrid monitor can capture specific job-related events (i.e., the primary benefit of software monitors) without placing significant overhead on the system or altering its performance (i.e., the primary benefits of hardware monitors). The primary disadvantages associated with hybrid monitors are the requirements of special hardware (e.g., monitoring registers) and more specialized software routines (i.e., to record a more limited set of program events). Unless hybrid monitors are designed as an integral part of the system architecture, their practical use is limited.
5.5.4 Event-trace Monitoring
Any system interrupt, such as an I/O interrupt indicating the completion of a disk read/write operation, can be viewed as an event that changes the state of a computer system. At the operating system level, the state of the system is usually defined as the number of processes that are "at" each system device, either in the device's ready queue, blocked in the device's waiting queue, or executing in the device. Examples of events at this level are OS system calls that change a process's status (e.g., an I/O request that moves a process from executing in the CPU to the waiting queue at a disk). At a higher level, where the number of transactions in memory represents the state of the system, the completion of a
transaction (e.g., an interrupt to swap out a job) is an event. An event trace monitor collects information and chronicles the occurrence of specific events.
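To make the notion of state concrete, the sketch below replays a small event trace and reconstructs, after each event, how many processes are at each device. It is a minimal illustration in Python: the trace format (timestamp, process id, destination device) and the name replay are assumptions made for this example, not the record format of any real monitor.

```python
from collections import Counter

# Hypothetical trace: (timestamp in seconds, process id, device the process moves to).
# A destination of None means that the process leaves the system.
trace = [
    (0.0, "p1", "cpu"),
    (0.0, "p2", "cpu"),
    (1.0, "p1", "disk"),   # e.g., an I/O request moves p1 from the CPU to the disk
    (3.0, "p1", "cpu"),    # I/O completion returns p1 to the CPU
    (4.0, "p2", None),
    (5.0, "p1", None),
]


def replay(trace):
    """Replay the trace and return the system state after each event."""
    location = {}      # process id -> device the process is currently at
    state = Counter()  # device -> number of processes at that device
    history = []
    for timestamp, pid, device in trace:
        previous = location.pop(pid, None)
        if previous is not None:
            state[previous] -= 1
        if device is not None:
            location[pid] = device
            state[device] += 1
        # Record only the devices that currently hold at least one process.
        history.append((timestamp, {d: n for d, n in state.items() if n > 0}))
    return history


history = replay(trace)
```

Each entry of history pairs an event time with the resulting state; for instance, after the event at time 1.0 there is one process at the CPU and one at the disk.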
Usually, an event trace software monitor consists of special pieces of code inserted at specific points in the operating system, typically within interrupt service routines. Upon detection of an event, the special code generates a record containing information such as the date, time, and type of event. In addition, the record contains any relevant event-specific data. For instance, a record corresponding to the completion of a process might contain the CPU time used by the process, the number of page faults
When the event rate becomes very high, the monitor routines are executed frequently. This may introduce significant overhead in the measurement process. Depending on the events selected and the event rate, the overhead may reach levels as high as 30% or more. Overheads up to 5% are regarded as acceptable for measurement activities [8]. Since the event rate cannot be controlled or predicted by the monitor, the measurement overhead, likewise, becomes unpredictable. This is one of the major
shortcomings of event trace monitors.
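The dependence of overhead on the event rate can be illustrated with simple arithmetic. The sketch below assumes, as a simplification, that recording one event costs a fixed amount of CPU time on a single processor; the 25-microsecond cost and the event rates are made-up values chosen for illustration, and trace_overhead is a hypothetical helper.

```python
def trace_overhead(events_per_second, seconds_per_event):
    """Fraction of CPU time spent in the monitor's event-recording code."""
    return events_per_second * seconds_per_event


COST = 25e-6  # assumed cost of recording one event: 25 microseconds

for rate in (400, 1_600, 12_000):
    overhead = trace_overhead(rate, COST)
    verdict = "acceptable" if overhead <= 0.05 else "too high"
    print(f"{rate:>6} events/s -> {overhead:.1%} overhead ({verdict})")
```

Under these assumptions, the same monitor that is acceptable at a few hundred events per second reaches the 30% range at 12,000 events per second, even though nothing about the monitor itself has changed.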
5.5.5 Sampling Monitoring
A sampling monitor collects information about a system (i.e., recorded state information) at specified time instants. Instead of being triggered by the occurrence of an internal event such as an I/O interrupt, the data collection routines of a sampling software monitor are triggered by an external timer event. Such events are activated at predetermined times, which are specified prior to the monitoring session. The sampling is driven by timer interrupts, based on a hardware clock.
The overhead introduced by a sampling monitor depends on two factors: the number of variables measured and the sampling frequency. With the ability to limit both factors, a sampling monitor is able to strictly control its overhead. Long sampling intervals result in low overhead. On the other hand, if the intervals are too long, the number of samples decreases and the confidence in the data collected, likewise, decreases. Conversely, the higher the sampling rate, the higher the accuracy, and the higher the overhead. Thus, there exists a clear trade-off between overhead and quality of the measurements. When compared to event trace monitoring, sampling provides a less detailed observation of a computer system but at a controllable overhead level. Errors may also be introduced because a certain percentage of potentially important interrupts are masked. For example, if some routines within the operating system cannot be interrupted by the timer, their contribution to the CPU utilization will not be accounted for by a sampling monitor [3].
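The trade-off between overhead and measurement quality can be sketched numerically. The Python example below estimates a device's utilization as the fraction of samples that found it busy and attaches an approximate 95% confidence half-width. It relies on the normal approximation to the binomial distribution and assumes independent samples, which real timer-driven samples only approximate; the function names, the 50-microsecond sampling cost, and the sample data are all assumptions made for this illustration.

```python
import math


def utilization_estimate(samples):
    """Estimate utilization as the fraction of samples that found the device busy."""
    n = len(samples)
    p = sum(samples) / n
    # Normal approximation to the binomial; assumes independent samples.
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)
    return p, half_width


def sampling_overhead(interval_s, cost_per_sample_s):
    """Fraction of time spent taking samples."""
    return cost_per_sample_s / interval_s


# Hypothetical observations: 1 = device busy, 0 = device idle.
few = [1] * 30 + [0] * 70         # 100 samples
many = [1] * 3000 + [0] * 7000    # 10,000 samples

p_few, hw_few = utilization_estimate(few)
p_many, hw_many = utilization_estimate(many)

# Assumed cost of 50 microseconds per sample, taken every 5 msec or every 500 msec.
overhead_short_interval = sampling_overhead(0.005, 50e-6)
overhead_long_interval = sampling_overhead(0.5, 50e-6)
```

Both sample sets give the same utilization estimate, but the half-width shrinks only with the square root of the number of samples: taking 100 times as many samples narrows the interval by a factor of 10 while raising the overhead by a factor of 100.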
Sampling monitors typically provide information that can be classified as system-level statistics: for example, the number of processes in execution and resource use, such as CPU and disk utilization. Process-level statistics are better captured by event trace monitors, because it is easier to associate events with the start and completion of processes.
5.6 Measurement Techniques
As illustrated in Fig. 5.6, a measurement process involves three major steps: measurement specification, system instrumentation, and data analysis.
1. Specify measurements. In this step, the performance variables to be measured are selected. For example, suppose that the behavior of a specific virtual memory policy is of interest. In this case, performance variables such as the page fault rate, the throughput of paging disks, and the average number of jobs competing for memory space are required by the system model.
2. Instrument and gather data. After selecting the variables to be observed, the system is instrumented to gather the specified measurement data. This involves configuring the tools to measure the specified variables during the observation period and recording the required information. A computer system can be viewed as a series of layers that create an environment for the execution of application programs, as shown in Fig. 5.6. Depending upon the variables selected, several measurement tools may be required at various layers. For instance, if transaction service demands are required, measurement tools are needed at both the operating system level and at the transaction/DBMS level.
3. Analyze and transform data. Measurement tools gather potentially huge amounts of raw data, corresponding to a detailed observation log of system activities. Usually, raw data specify time intervals, event counts, transaction IDs, bytes transferred, and resources consumed. To be useful, these data items have to be mapped to their corresponding logical functions. That is, these bulky data items must be analyzed and transformed into meaningful information. For instance, information recorded by software measurement tools might include a record for each process that starts or completes during the observation period. These records must be manipulated to yield useful results, such as average execution time, number of processes executed, and device utilizations.
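The third step can be sketched as a small data-reduction routine. The Python example below is a minimal illustration under several assumptions: the record layout (ProcessRecord and its fields) is hypothetical, the system has a single CPU and a single disk, every process starts and completes within the observation period, and utilization is computed as total busy time divided by the length of the observation period.

```python
from dataclasses import dataclass


@dataclass
class ProcessRecord:
    # Hypothetical fields; all times are in seconds.
    start: float
    end: float
    cpu_time: float
    disk_time: float


def analyze(records, period):
    """Reduce per-process records to summary measures over the observation period."""
    n = len(records)
    return {
        "processes": n,
        "avg_execution_time": sum(r.end - r.start for r in records) / n,
        # Utilization = total busy time / length of the observation period.
        "cpu_utilization": sum(r.cpu_time for r in records) / period,
        "disk_utilization": sum(r.disk_time for r in records) / period,
    }


# Made-up records for a 10-second observation period.
records = [
    ProcessRecord(start=0.0, end=4.0, cpu_time=1.5, disk_time=2.0),
    ProcessRecord(start=1.0, end=7.0, cpu_time=2.5, disk_time=3.0),
    ProcessRecord(start=5.0, end=10.0, cpu_time=2.0, disk_time=1.0),
]
result = analyze(records, period=10.0)
```

For these made-up records, three processes executed with an average execution time of 5 seconds, and both the CPU and the disk were busy for 6 of the 10 seconds observed.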