Performance issues usually show up as pages that are slow to load, or as uploads to or downloads from a site that time out. To troubleshoot these:

1. First, check what the direct access experience looks like and confirm that the issue is only seen when going via the NetScaler.

2. Obtain a trace. The key here again is simultaneous traces. Taking them simultaneously will save you time in the long run as you start questioning where in the path the bottleneck is.

Once you have the trace, look for the following:

• MSS issues: MSS (Maximum Segment Size) is the largest TCP segment payload that the party advertising the value is willing to receive. The NetScaler can be configured to advertise different values, but by default it advertises 1,460 bytes. If you see one of the entities advertising a much smaller number, consider it a possible cause of the performance issues:

The MSS is shown in the Info column for the SYN from the client and the SYN/ACK from the server, but you can also look it up in the Options section of the TCP frame:

The Options part in the TCP handshake packets will also show you vital information, such as whether Window scaling is enabled and what the scale factor (multiplier) is. If the application experiencing delays involves large transfers, you can try increasing the scale factor so that the receive windows are expanded to accept more data per acknowledgement.
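The arithmetic behind these two options can be sketched in a few lines of Python. This is only an illustration: it assumes a 1,500-byte Ethernet MTU, 20-byte IPv4 and TCP headers with no options, and a hypothetical 50 ms round-trip time. The function names are made up for the example and are not part of any NetScaler tooling.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def effective_window(advertised: int, scale_shift: int) -> int:
    """Real receive window: the 16-bit window field shifted left by the scale option."""
    return advertised << scale_shift


def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Rough ceiling for one connection: at most one full window per round trip."""
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    print(mss_for_mtu(1500))             # 1460, the default mentioned above
    window = effective_window(65535, 2)  # shift of 2 is a multiplier of 4
    print(window)                        # 262140
    # Hypothetical 50 ms RTT: compare the ceiling without and with scaling.
    print(max_throughput_bps(65535, 0.05) / 1e6, "Mbit/s")
    print(max_throughput_bps(window, 0.05) / 1e6, "Mbit/s")
```

The point of the last two lines is that, on a path with noticeable latency, the window rather than the link speed can cap a single large transfer, which is why a larger scale factor can help.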

• Networking issues: If you are plugging more than one network interface into the same broadcast domain, ensure that you are not introducing a loop.

This can very easily bring performance to its knees. Common issues are misconfigured or missing VLANs, NIC flaps, or MBF-related issues. These are covered in greater detail in Chapter 5, Networking.

• TCP Window issues: Is it possible that the client, the server, or the NetScaler is running out of receive window? Usually, the window that each party advertises starts at 64 KB and shrinks as that party accumulates data it still needs to pass on, either onwards if it's the NetScaler, or to the application if it's the client or the server. If one of the parties is slow in consuming what has already been sent to it, you could see Zero Window situations creeping in:

[Screenshot: zero windows]

Occasional zero windows are not a serious problem, as long as the receiver is able to quickly empty the receive buffer and send out a notification that it has free buffers to accept more data. The problem is when the zero window situation persists long enough that the sender has to give up, or if timeouts are getting hit. Take the following screenshot for example:

Here, the SNIP has advertised a zero window to the server: it cannot accept any further data, and the server is obliged to wait. If the server thinks it has waited long enough, it will even send a probe to see whether the NetScaler is ready to accept more data (you can find such probes using the Wireshark filter tcp.analysis.zero_window_probe). The NetScaler, for its part, waits for a packet from the client indicating that the client is ready to accept more data or that it has processed the data it previously received. That confirmation arrives in the form of an ACK. Following this ACK, the NetScaler SNIP sends out a TCP Window update, telling the server that it is ready to accept more data. The key is whether this recovery happens fast enough; if it doesn't, performance will drop.

Also, a high number of zero windows from the client can cause the NetScaler to reset the connection in order to protect its memory from saturation, because that kind of TCP pattern is characteristic of a known TCP attack (sockstress).

This protection, by the way, is toggled using the command: set ns tcpparam -limitedPersist ENABLED (or DISABLED).
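To judge whether zero windows are occasional or persistent, it helps to measure how long each one lasts. The following is a minimal Python sketch of that idea. It assumes you have already exported, for one direction of one TCP stream, a list of timestamps and advertised window sizes from the trace. The sample values and the 0.2-second threshold are invented for illustration and are not NetScaler recommendations.

```python
from typing import List, Tuple

# (timestamp in seconds, advertised receive window in bytes)
Sample = Tuple[float, int]


def zero_window_stalls(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Return (start, duration) for each period the window stayed at zero.

    A stall that is still open when the samples end is not reported.
    """
    stalls = []
    stall_start = None
    for timestamp, window in samples:
        if window == 0 and stall_start is None:
            stall_start = timestamp  # receiver has just closed its window
        elif window > 0 and stall_start is not None:
            # A window update was seen: the stall is over.
            stalls.append((stall_start, timestamp - stall_start))
            stall_start = None
    return stalls


if __name__ == "__main__":
    # Made-up samples: one long stall and one short one.
    trace = [(0.00, 65535), (0.10, 8192), (0.20, 0), (0.25, 0),
             (0.70, 16384), (0.90, 0), (0.95, 4096)]
    for start, duration in zero_window_stalls(trace):
        verdict = "investigate" if duration > 0.2 else "ok"  # arbitrary threshold
        print(f"stall at {start:.2f}s lasted {duration:.2f}s ({verdict})")
```

Short stalls that clear quickly match the "occasional" case described above; long or repeated ones are the pattern worth chasing.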

• Intermediate single packet drops: A related issue is the situation where an intermediate device (such as a firewall) keeps dropping a particular packet because it sees something suspicious about it. The difference from the firewall issue we talked about earlier is that this isn't a simple 100% drop of packets, which is actually easier to spot. Instead, a large packet or simply an ACK from the client or server is dropped continuously, which causes a retransmission loop until the connection fails.

These issues are best diagnosed by a trace and manually calculating SEQ and ACK numbers to find out whether each receiver is receiving and ACKing what the sender sends and whether that ACK is reaching the sender.

A certain number of retransmissions or DupAcks is inevitable on any busy production network; however, if you are seeing a high number of them in the same TCP stream, that is a cause for concern.

Also, if you are seeing ICMP messages indicating that the packets are too large, enable PMTUD from the list of modes to avoid fragmentation, or drops where a packet cannot be fragmented. We discussed PMTUD in the first chapter.
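The manual SEQ and ACK bookkeeping described above can be sketched in Python. This is a simplified illustration, not a replacement for reading the trace: it assumes relative sequence numbers as Wireshark displays them, ignores sequence wrap-around and SACK, and uses invented sample values. The type and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Segment = Tuple[int, int]  # (relative SEQ, payload length in bytes)


def expected_ack(segment: Segment) -> int:
    """The ACK number that confirms this segment: SEQ plus payload length."""
    seq, length = segment
    return seq + length


def retransmitted(segments: List[Segment]) -> Dict[int, int]:
    """Map SEQ -> number of extra times that data segment was sent."""
    counts = Counter(seq for seq, length in segments if length > 0)
    return {seq: n - 1 for seq, n in counts.items() if n > 1}


def unacknowledged(segments: List[Segment], acks_seen: List[int]) -> List[Segment]:
    """Segments not covered by the highest (cumulative) ACK seen at the sender."""
    highest_ack = max(acks_seen, default=0)
    return sorted({s for s in segments if expected_ack(s) > highest_ack})


if __name__ == "__main__":
    # Made-up sender-side view: the third segment is sent three times
    # and the ACKs never advance past 2921.
    sent = [(1, 1460), (1461, 1460), (2921, 1460), (2921, 1460), (2921, 1460)]
    acks = [1461, 2921, 2921, 2921]
    print(retransmitted(sent))         # {2921: 2}
    print(unacknowledged(sent, acks))  # [(2921, 1460)]
```

Running the same check against the trace taken on the other side of the suspect device shows whether the segment, or the ACK for it, is the packet that keeps going missing.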

• Surge queue building up on the service: In the output that follows, we can see that requests are ending up in the surge queue. In this scenario, which I have set up to demonstrate such a situation, I have set the MaxClient parameter to 1. This tells the NetScaler not to send more requests to a service that is already processing one:

"Is the solution to immediately remove the MaxClient setting?" That depends. The value you configure here protects the servers from getting saturated, which would lead to extremely degraded performance or, worse, a server crash under the load. So a deeper understanding of what the server can handle is needed (working with the server vendor if necessary) to choose an appropriate value.
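To see why the queue builds up and why the right value depends on the server, here is a toy Python model of a connection cap in front of a single service. It is only a sketch under simple assumptions (fixed service time, discrete ticks, invented arrival numbers) and does not claim to reproduce how the NetScaler implements its surge queue.

```python
from collections import deque
from typing import List


def surge_queue_depths(arrivals: List[int], max_client: int, service_ticks: int) -> List[int]:
    """Queue depth at the end of each tick for a service capped at max_client requests."""
    queue = deque()             # requests waiting for a free slot
    in_flight: List[int] = []   # remaining ticks for each request at the server
    depths = []
    for new_requests in arrivals:
        # Requests at the server make progress; finished ones free their slot.
        in_flight = [t - 1 for t in in_flight if t > 1]
        queue.extend(range(new_requests))
        # Hand queued requests to the server while slots are free.
        while queue and len(in_flight) < max_client:
            queue.popleft()
            in_flight.append(service_ticks)
        depths.append(len(queue))
    return depths


if __name__ == "__main__":
    burst = [2, 2, 2, 0, 0, 0]  # hypothetical requests arriving per tick
    print(surge_queue_depths(burst, max_client=1, service_ticks=2))  # [1, 3, 4, 4, 3, 3]
    print(surge_queue_depths(burst, max_client=2, service_ticks=2))  # [0, 2, 2, 2, 0, 0]
```

In this model, raising the cap drains the queue faster, but only because the modelled server is assumed to cope with two requests at once as quickly as with one; a real server may not, which is the reason for sizing the value against what the server can actually handle.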

• NetScaler resource issues: Check how the CPU and memory are doing. The NetScaler is a hardened appliance with a very well-tuned TCP stack and, as such, it can handle millions of connections before it starts becoming a bottleneck. Nevertheless, you can certainly hit situations where the resources on the NetScaler are saturated, for example through memory leaks or CPU spikes. I cover these in detail later in the book, in Chapter 8, Troubleshooting the NetScaler System.
