• No se han encontrado resultados

Evaluación estadística para replicar el caso base

In document MONTERREY TECNOLÓGICO DE (página 59-77)

4 RESULTADOS Y DISCUSIÓN DE RESULTADOS

4.3 Evaluación estadística para replicar el caso base

This section covers troubleshooting of transaction problems. From an end user perspective, an application runs sluggishly, is unresponsive or can even seem to hang if there are issues with uncommitted transactions, long-lived cursors blocking garbage collection, a high number of active versions or blocked transactions.

3.8.1 Blocked Transactions

Blocked transactions are write transactions that are unable to be further processed because they need to acquire transactional locks (record or table locks) which are currently held by another write transaction. Note that transactions can also be blocked waiting for physical resources like network or disk. Those situations are not covered in this section.

3.8.1.1 Identify and Assess Blocked Transaction Issues

The first signs of blocked transactions are poor application response or alerts 49 or 59 are raised.

The initial indicators of blocked transactions are given by:

● Users reporting bad application responsiveness

● Alert 49 - Long-running blocking situations

● Alert 59 - Percentage of transactions blocked

To confirm the database performance is harmed by blocked transactions, you should check the following SAP HANA studio monitors under the Performance tab:

Load Monitor

Figure 19: SAP HANA Studio Load Monitor

The Blocked Transactions graph shows how many blocked transactions currently exist and existed in the past to a certain extent. See Load Monitoring.

Job Progress Monitor

To further track down the issue, look at the Job Progress monitor. It shows currently running SAP HANA background processes like Delta Table Merge. Since the Delta Table Merge needs to lock tables to proceed, it is a common cause for blocked transactions. Another job display by this monitor is the savepoint write which needs to pull a global database lock in its critical phase. See Job Progress Monitoring and Savepoint

Performance.

Session Monitor

The Session Monitor lists all currently opened SQL sessions (meaning user connections). In the context of blocked transaction troubleshooting, the columns “Blocked by Connection Id” and “Blocks No. of

Transactions” are of special interest. The first tells you whether the session is blocked by another session and identifies the ID of the blocking one. The latter gives you the corresponding information if a session blocks other sessions, and how many transactions are affected. See Session Monitoring.

Figure 20: SAP HANA Session Monitor

Blocked Transaction Monitor

The Blocked Transaction Monitor is the next drill down step. It only lists those transactions that are currently blocked. The ordering is done via a blocking/blocked relation. That means transactions that are blockers are highlighted. Directly beneath the blocked transaction are displayed:

Figure 21: Blocked Transaction Monitor

Example: In the figure above, you see transaction 126 (green) blocking multiple other transactions (red). Note that in this example transaction 126 was initiated by remote transaction 77 on another node. That means transaction 77 is the root of the blocked transaction chain. See Blocked Transaction Monitoring.

Thread Monitor

The Thread Monitor allows the most fine-grained view into the current situation by listing all threads in the system. Note that it is usually not necessary to drill into that level of detail. Threads contributing to a transaction that is currently blocked are marked by a warning sign in the “Status” column. To get additional information about the blocking situation, hover the mouse over the warning sign. See Thread Monitoring.

Figure 22: Threads contributing to Blocked Transactions

Related Information

Load Monitoring [page 140]

Job Progress Monitoring [page 139]

Savepoint Performance [page 50]

Session Monitoring [page 138]

Blocked Transaction Monitoring [page 137]

3.8.1.2 Troubleshooting Blocked Transactions

When troubleshooting blocked transactions, it is helpful to differentiate between situations where only single or a few transactions are blocked from the situation where a high percentage of all transactions are blocked.

3.8.1.2.1 Single or Few Transactions are Blocked

If you identified only a single or a few blocking transactions, there is likely an issue on application side.

A usual pattern is a flaw in the application coding that does not commit a write transaction. Such a transaction will be a blocker for any other transaction that needs to access the same database object. To release the situation you have to close the blocking transaction.

There are several possibilities to achieve this:

● Contact the Application User

The Session Monitor allows you to identify the user of the application. You can find this information in the

“Database User” column or, in case the application has its own user management (for example, SAP BW), in the “Application User” column. Contact the user and ask him whether he can close the application.

● Contact the Application Developer

As a follow-up, the author of the application should be contacted whether such situations can be avoided in the future by changing the application code.

3.8.1.2.1.1 Cancel the Session

If you are not able to contact the user to have them cancel the session, you can also cancel the session in the context menu of the Session Monitor. The current transaction will be rolled back.

The session cancellation may take some time to succeed. If it takes longer than 30 seconds, consider this as a bug and contact development support.

3.8.1.2.1.2 Kill the Client Appication

In case the session cancellation takes too long or does not complete at all, you can kill the client process that opened the session. This will terminate the blocking transaction as well.

As a prerequisite, you have to have access to the client machine. The information needed for this task can be retrieved from the Session Monitor.

See columns “Client IP” and “Client Process ID” to determine the host and process to be killed. Note that killing the client application is safe from a database consistency standpoint, the current transaction will be rolled back gracefully.

3.8.1.2.2 Many Transactions are Blocked

In the case that a large amount of transactions are blocked, the troubleshooting should take a slightly different approach.

First you need to determine whether there is a single or few blocking transactions that block a large amount of other transactions. For this, open the Blocked Transaction Monitor and check the amount of blocking

transactions. If you assess there is only a few blocking transactions, use the techniques described in Single of Few Transactions are Blocked to resolve the situation.

If there are many transactions in a blocking state, you need to find out whether a specific access pattern causes the situation. In case that multiple transactions try to access the same database objects with write operations, they block each other. To check if this situation exists, open the Blocked Transaction Monitor and analyze the “Waiting Schema Name”, “Waiting Object Name” and “Waiting Record Id” columns. If you find a fair amount of blocking transactions that block many other transactions you need to investigate if the following is possible:

● Change the client application(s) to avoid the access pattern

● If a background job is running that issues many write transactions (for example, a data load job):

Reschedule to a period with a low user load

● Partition tables that are accessed frequently to avoid clashes. See the SAP HANA Administration Guide for more details on partitioning.

In case you cannot identify specific transactions or specific database objects that lead to transactions being blocked, you have to assume a problem with the database itself or its configuration. One example is an issue with long savepoint durations. See Savepoint Performance for troubleshooting such issues.

Related Information

Single or Few Transactions are Blocked [page 76]

Savepoint Performance [page 50]

SAP HANA Administration Guide

3.8.2 Troubleshooting Blocked Transaction Issues that Occurred in the Past

Finding the root cause of blocked transaction situations that you have resolved is more difficult than troubleshooting issues that are currently happening. Tools such as the Load Monitor, system views and the SQL Plan cache are available to help you.

First use the Load Monitor to isolate the exact time frame where the issue happened. Using that information, investigate what happened at this specific time frame. You should check the following monitoring and StatisticServer views:

● _SYS_STATISTICS.HOST_BLOCKED_TRANSACTIONS: Analyze the columns

“WAITING_SCHEMA_NAME”, “WAITING_TABLE_NAME” and “WAITING_RECORD_ID” to identify the database objects that lead to blocked transactions

● SYS.M_DELTA_MERGE_STATISTICS: The column “START_TIME” and “EXECUTION_TIME” provide you with the information if there was a Delta Table Merge running A longer history can be found in the StatisticServer table _SYS_STATISTICS.HOST_DELTA_MERGE_STATISTICS

● SYS.SAVEPOINTS: Check if a savepoint was written during the time period. A longer history can be found in _SYS_STATISTICS.HOST_SAVEPOINTS

In addition the SAP HANA studio SQL Plan Cache monitor may be able to provide information about the statements that were involved in the situation:

Figure 23: Plan Cache Monitor

Only check entries that have “TOTAL_LOCK_WAIT_COUNT” > 0. For those entries, compare the column

“MAX_CURSOR_DURATION” against “AVG_CURSOR_DURATION”. If there is a significant difference, there was at least one situation where the transactions took much longer than average. This can be an indication that it was involved in the situation.

3.8.3 Multiversion Concurrency Control (MVCC) Issues

In this section you will learn how to troubleshoot issues arising from MVCC.

Multiversion Concurrency Control (MVCC) is a concept that ensures transactional data consistency by isolating transactions that are accessing the same data at the same time.

To do so, multiple versions of a record are kept in parallel. Issues with MVCC are usually caused by a high number of active versions. Old versions of data records are no longer needed if they are no longer part of a snapshot that can be seen by any running transaction. These versions are obsolete and need to be removed from time to time to free up memory.

This process is called Garbage Collection (GC) or Version Consolidation. It can happen that a transaction is blocking the garbage collection. The consequence is a high number of active versions and that can lead to system slow-down or out of memory issues.

3.8.3.1 Row Store Tables

Garbage collection is triggered after a transaction is committed and also periodically (every hour by default).

A transaction that is currently committing can be identified in the Threads tab (see System Performance Analysis). The Thread type will be “SqlExecutor” and the Thread method “commit”.

The periodic garbage collection can be identified by Thread Type” MVCCGarbageCollector”.

Note that the periodic garbage collection interval can be configured in the indexserver.ini file transaction section with the parameter mvcc_aged_checker_timeout.

Related Information

System Performance Analysis [page 134]

3.8.3.2 MVCC Problem Identification

There are a number of indicators of MVCC problems to check for.

Problems with high number of active versions can be identified by

● Users report an increase of response times

● The indexserver trace contains "There are too many un-collected versions. The transaction blocks the garbage collection of HANA database."

● By checking “Active Versions” in the Load Monitor (see Performance tab)

Figure 24: Load Monitor Showing Garbage Collection

Transactions blocking garbage collection can originate from:

● Long-running or unclosed cursors

● Long-running transactions with isolation mode “serializable” or ”repeatable read”

● Hanging threads

In order to validate there is a problem with MVCC, check the number of active versions in the Row StoreMVCC manager monitoring view. Note that in multihost environment, you have to check the master host.

select * from m_mvcc_tables where host='ld9989' and port='30003' and (name='NUM_VERSIONS' or name='MAX_VERSIONS_PER_RECORD' or

name='TABLE_ID_OF_MAX_NUM_VERSIONS');

Figure 25: MVCC Information on a Healthy System

If the number of active versions (NUM_VERSIONS) is greater than eight million, it is considered a problem and an overall slowdown of the system can be experienced. Similarly, if the maximum number of versions per record (MAX_VERSIONS_PER_RECORD) exceeds 8,000,000, this should be treated as a problem and a slowdown of accesses to a specific table is expected. Use TABLE_ID_OF_MAX_NUM_VERSIONS and join it against the SYS.TABLES system view to determine the table which is having the problem.

Related Information

Performance Trace [page 166]

3.8.3.3 Analysis of MVCC Issues

You have to find which transactions are blocking the garbage collection and to which connection they are related.

The following queries will return the transaction that may block the Garbage Collection. You have to check both.

SELECT top 1 host, port, connection_id, transaction_id, update_transaction_id, primary_transaction_id, transaction_type, isolation_level FROM M_TRANSACTIONS WHERE

MIN_MVCC_SNAPSHOT_TIMESTAMP > 0 order by min_mvcc_snapshot_timestamp desc;

SELECT top 1 host, port, connection_id, transaction_id, update_transaction_id, primary_transaction_id, transaction_type, isolation_level FROM M_TRANSACTIONS WHERE

MIN_MVCC_SNAPSHOT_TIMESTAMP = (SELECT MIN(VALUE) FROM M_MVCC_TABLES WHERE NAME = 'MIN_SNAPSHOT_TS') order by min_mvcc_snapshot_timestamp desc;

Figure 26: User Transaction Possibly Blocking Garbage Collection

In case of a user transaction being the candidate (TRANSACTION_TYPE=’USER TRANSACTION’), you can directly determine the connection ID the transaction belongs to (see an example in the Figure above).

Figure 27: External Transaction Possibly Blocking Garbage Collection

If the candidate’s transaction type is ‘EXTERNAL TRANSACTION’, use the following query to find out which other transaction spawned the candidate and determine its connection ID.

SELECT t.connection_id AS "Kill this connection id",

t.transaction_id AS "Belonging to user transaction id", e.transaction_id AS "To get rid of external transaction id"

FROM m_transactions t JOIN m_transactions e ON

e.primary_transaction_id = t.transaction_id AND e.volume_id = t.volume_id

WHERE e.transaction_type = 'EXTERNAL TRANSACTION' and e.transaction_id = <GC blocker transaction id>;

3.8.3.4 Solution of MVCC Issues

Solving MVCC issues is similar to solving blocked transaction issues. Use the following approaches in the given order for transactions where you know the connection ID.

1. Contact the user to stop his activity

2. Cancel the statement/cancel internal transaction 3. Cancel connection

4. Kill the client application

Note

There is no guaranteed that these measures will stop a transaction which blocks the garbage collection. If that is the case, contact development support immediately to get further help

Related Information

Resolving CPU Related Issues [page 38]

3.8.4 Version Garbage Collection Issues

Alerts 73, 74, and 75 help you to identify and resolve version space overflow issues.

Context

The following steps allow you to check whether or not the issue you have is related to a “version space overflow/skew” situation.

Procedure

1. The first step is to check the alerts.

The following alerts are indicators that could be the problem:

○ Alert 73 Rowstore version space overflow, determines the overflow ratio of the rowstore version space.

○ Alert 74 Metadata version space overflow, determines the overflow ratio of the metadata version space.

○ Alert 75 Rowstore version space skew, determines whether the rowstore version chain is too long.

If you use extended storage, a version space overflow may manifest as an out of space error for the delta dbspace, and will not trigger these alerts. See Resolve Out of Space Errors for Delta Dbspace in SAP HANA Dynamic Tiering: Administration Guide.

2. Identify the statement and connection blocking garbage collection.

a. Identify the connection or transaction that is blocking version garbage collection.

The System Information tab of SAP HANA studio provides a set of tabular views to display blocking connections based on pre-defined SQL queries.

○ The view MVCC Blocker Connection shows connections that may be blocking garbage collection.

○ The view MVCC Blocker Statement shows statements that may be blocking garbage collection.

○ The view MVCC Blocker Transaction shows transactions that may be blocking garbage collection.

In the first row of the table shown below if there is a "global" garbage collection blocker whose IDLE_TIME > 3600 seconds (1 hour) then investigate what the statement is doing and take the necessary corrective action.

In the second row, a table level garbage collection blocker whose IDLE_TIME > 3600 seconds (1 hour) can be seem. In this case you can query the following monitoring view to check how many versions the related table TEST2 has.

SELECT * FROM M_TABLE_SNAPSHOTS WHERE TABLE_NAME = 'TEST2' AND START_MVCC_TIMESTAMP = 142024

If the results of the query shows that VERSION_COUNT > 1 million, the blocking statement can cause a performance drop for table updates. In this case, investigate what the statement is doing and take the necessary corrective action. However, if VERSION_COUNT is less than 10,000 its impact on performance is negligible.

Note that the table only shows selected columns from the results of running MVCC Blocker Statement for the purposes of this example. If there is no blocker in MVCC Blocker Statement, then use MVCC Blocker Transaction. The blocker would be likely one of internal/external/serializable transaction.

Table 8:

GLOBAL 4000 200285 142113 SELECT * FROM

TEST1

Check VER­

SION_COUNT in M_MVCC_OVER­

VIEW

TABLE 5000 200375 142024 SELECT * FROM

TEST2 ORDER

Type IDLE_TIME_SEC CONNEC­

TION_ID

START_MVCC_

TIMESTAMP

STATE­

MENT_STRING INFO

TABLE 100 200478 142029 SELECT * FROM

TEST0 ORDER BY A

Check VER­

SION_COUNT in M_MVCC_TA­

BLE_SNAP-SHOTS

3. Kill the statement and/or connection that is blocking garbage collection.

a. Kill the connection

Check MVCC Blocker Statement to find out which statement is blocking version garbage collection and investigate what the blocker is doing. If the TYPE column is GLOBAL, then it is a global version garbage collection blocker. The column IDLE_TIME_SEC will help your decision. If TYPE is TABLE, it’s a blocker of the specific table. With M_TABLE_SNAPSHOTS, you can find out the number of versions for each table.

To kill the blocking connection use ALTER SYSTEM DISCONNECT SESSION 'CONNECTION_ID' b. Kill the transaction, if killing the connection does not succeed or CONNECTION_ID does not exist..

Check MVCC Blocker Transaction to find out which transaction blocks global version garbage

collection and investigate what the blocker is doing. The column LIFE_TIME_SEC column will help your decision.

To cancel a transaction that you have identified as the cause of the blockage open the Emergency Information tab in the SAP HANA studio, right click on the transaction in the tabular view and choose Cancel Transaction <TRANSACTION ID>.

Note

Be aware that TRANSACTION ID is only unique on a specific node and therefore the ID might be reused. So make sure you are connected to the right node before you cancel the transaction.

You can find a version garbage collection blocker with thread info using these queries.

MVCC Blocker Statement

WITH STORED_MVCC_BLOCKER_STATEMENTS AS (SELECT

CASE

WHEN A.START_MVCC_TIMESTAMP = B.MIN_MVCC_SNAPSHOT_TIMESTAMP THEN 'GLOBAL'

ELSE 'TABLE' END TYPE,

CASE

WHEN A.START_MVCC_TIMESTAMP = B.MIN_MVCC_SNAPSHOT_TIMESTAMP THEN NULL

ELSE 'check M_TABLE_SNAPSHOTS for details' END INFO,

SECONDS_BETWEEN(A.LAST_ACTION_TIME, CURRENT_TIMESTAMP) IDLE_TIME_SEC, A.*

A.START_MVCC_TIMESTAMP > 0 AND A.STATEMENT_STATUS <> 'NONE'

AND A.START_MVCC_TIMESTAMP <= B.MIN_MVCC_SNAPSHOT_TIMESTAMP ORDER BY

IDLE_TIME_SEC DESC) SELECT *

FROM STORED_MVCC_BLOCKER_STATEMENTS AS A, M_SERVICE_THREADS AS B

WHERE

A.CONNECTION_ID = B.CONNECTION_ID;

MVCC Blocker Transaction

WITH STORED_MVCC_BLOCKER_TRANSACTIONS AS (SELECT SECONDS_BETWEEN(A.START_TIME,

CURRENT_TIMESTAMP) LIFE_TIME_SEC, A.*

A.MIN_MVCC_SNAPSHOT_TIMESTAMP > 0

AND A.MIN_MVCC_SNAPSHOT_TIMESTAMP = B.MIN_MVCC_SNAPSHOT_TIMESTAMP ORDER BY

LIFE_TIME_SEC DESC) SELECT *

FROM STORED_MVCC_BLOCKER_TRANSACTIONS AS A, M_SERVICE_THREADS AS B

WHERE

A.CONNECTION_ID = B.CONNECTION_ID;

MVCC Blocker Connection

CONNECTION_ID IN (SELECT

CONNECTION_ID IN (SELECT

In document MONTERREY TECNOLÓGICO DE (página 59-77)

Documento similar