Performance tuning max-spread-checks By default, haproxy tries to spread the start of health checks across the smallest health check interval of all the servers in a farm. The principle is to avoid hammering services running on the same server. But when using large check intervals (10 seconds or more), the last servers in the farm take some time before starting to be tested, which can be a problem. This parameter is used to enforce an upper bound on delay between the first and the last check, even if the servers' check intervals are larger. When servers run with shorter intervals, their intervals will be respected though. maxconn Sets the maximum per-process number of concurrent connections to . It is equivalent to the command-line argument "-n". Proxies will stop accepting connections when this limit is reached. The "ulimit-n" parameter is automatically adjusted according to this value. See also "ulimit-n". Note: the "select" poller cannot reliably use more than 1024 file descriptors on some platforms. If your platform only supports select and reports "select FAILED" on startup, you need to reduce maxconn until it works (slightly below 500 in general). If this value is not set, it will default to the value set in DEFAULT_MAXCONN at build time (reported in haproxy -vv) if no memory limit is enforced, or will be computed based on the memory limit, the buffer size, memory allocated to compression, SSL cache size, and use or not of SSL and the associated maxsslconn (which can also be automatic). maxconnrate Sets the maximum per-process number of connections per second to . Proxies will stop accepting connections when this limit is reached. It can be used to limit the global capacity regardless of each frontend capacity. It is important to note that this can only be used as a service protection measure, as there will not necessarily be a fair share between frontends when the limit is reached, so it's a good idea to also limit each frontend to some value close to its expected share. Also, lowering tune.maxaccept can improve fairness. maxcomprate Sets the maximum per-process input compression rate to kilobytes per second. For each session, if the maximum is reached, the compression level will be decreased during the session. If the maximum is reached at the beginning of a session, the session will not compress at all. If the maximum is not reached, the compression level will be increased up to tune.comp.maxlevel. A value of zero means there is no limit, this is the default value. maxcompcpuusage Sets the maximum CPU usage HAProxy can reach before stopping the compression for new requests or decreasing the compression level of current requests. It works like 'maxcomprate' but measures CPU usage instead of incoming data bandwidth. The value is expressed in percent of the CPU used by haproxy. In case of multiple processes (nbproc > 1), each process manages its individual usage. A value of 100 disable the limit. The default value is 100. Setting a lower value will prevent the compression work from slowing the whole process down and from introducing high latencies. maxpipes Sets the maximum per-process number of pipes to . Currently, pipes are only used by kernel-based tcp splicing. Since a pipe contains two file descriptors, the "ulimit-n" value will be increased accordingly. The default value is maxconn/4, which seems to be more than enough for most heavy usages. The splice code dynamically allocates and releases pipes, and can fall back to standard copy, so setting this value too low may only impact performance. maxsessrate Sets the maximum per-process number of sessions per second to . Proxies will stop accepting connections when this limit is reached. It can be used to limit the global capacity regardless of each frontend capacity. It is important to note that this can only be used as a service protection measure, as there will not necessarily be a fair share between frontends when the limit is reached, so it's a good idea to also limit each frontend to some value close to its expected share. Also, lowering tune.maxaccept can improve fairness. maxsslconn Sets the maximum per-process number of concurrent SSL connections to

. By default there is no SSL-specific limit, which means that the global maxconn setting will apply to all connections. Setting this limit avoids having openssl use too much memory and crash when malloc returns NULL (since it unfortunately does not reliably check for such conditions). Note that the limit applies both to incoming and outgoing connections, so one connection which is deciphered then ciphered accounts for 2 SSL connections. If this value is not set, but a memory limit is enforced, this value will be automatically computed based on the memory limit, maxconn, the buffer size, memory allocated to compression, SSL cache size, and use of SSL in either frontends, backends or both. If neither maxconn nor maxsslconn are specified when there is a memory limit, haproxy will automatically adjust these values so that 100% of the connections can be made over SSL with no risk, and will consider the sides where it is enabled (frontend, backend, both). maxsslrate Sets the maximum per-process number of SSL sessions per second to . SSL listeners will stop accepting connections when this limit is reached. It can be used to limit the global SSL CPU usage regardless of each frontend capacity. It is important to note that this can only be used as a service protection measure, as there will not necessarily be a fair share between frontends when the limit is reached, so it's a good idea to also limit each frontend to some value close to its expected share. It is also important to note that the sessions are accounted before they enter the SSL stack and not after, which also protects the stack against bad handshakes. Also, lowering tune.maxaccept can improve fairness. maxzlibmem Sets the maximum amount of RAM in megabytes per process usable by the zlib. When the maximum amount is reached, future sessions will not compress as long as RAM is unavailable. When sets to 0, there is no limit. The default value is 0. The value is available in bytes on the UNIX socket with "show info" on the line "MaxZlibMemUsage", the memory used by zlib is "ZlibMemUsage" in bytes. noepoll Disables the use of the "epoll" event polling system on Linux. It is equivalent to the command-line argument "-de". The next polling system used will generally be "poll". See also "nopoll". nokqueue Disables the use of the "kqueue" event polling system on BSD. It is equivalent to the command-line argument "-dk". The next polling system used will generally be "poll". See also "nopoll". nopoll Disables the use of the "poll" event polling system. It is equivalent to the command-line argument "-dp". The next polling system used will be "select". It should never be needed to disable "poll" since it's available on all platforms supported by HAProxy. See also "nokqueue" and "noepoll". nosplice Disables the use of kernel tcp splicing between sockets on Linux. It is equivalent to the command line argument "-dS". Data will then be copied using conventional and more portable recv/send calls. Kernel tcp splicing is limited to some very recent instances of kernel 2.6. Most versions between 2.6.25 and 2.6.28 are buggy and will forward corrupted data, so they must not be used. This option makes it easier to globally disable kernel splicing in case of doubt. See also "option splice-auto", "option splice-request" and "option splice-response". nogetaddrinfo Disables the use of getaddrinfo(3) for name resolving. It is equivalent to the command line argument "-dG". Deprecated gethostbyname(3) will be used. noreuseport Disables the use of SO_REUSEPORT - see socket(7). It is equivalent to the command line argument "-dR". spread-checks <0..50, in="" percent=""> Sometimes it is desirable to avoid sending agent and health checks to servers at exact intervals, for instance when many logical servers are located on the same physical server. With the help of this parameter, it becomes possible to add some randomness in the check interval between 0 and +/- 50%. A value between 2 and 5 seems to show good results. The default value remains at 0. tune.buffers.limit Sets a hard limit on the number of buffers which may be allocated per process. The default value is zero which means unlimited. The minimum non-zero value will always be greater than "tune.buffers.reserve" and should ideally always be about twice as large. Forcing this value can be particularly useful to limit the amount of memory a process may take, while retaining a sane behaviour. When this limit is reached, sessions which need a buffer wait for another one to be released by another session. Since buffers are dynamically allocated and released, the waiting time is very short and not perceptible provided that limits remain reasonable. In fact sometimes reducing the limit may even increase performance by increasing the CPU cache's efficiency. Tests have shown good results on average HTTP traffic with a limit to 1/10 of the expected global maxconn setting, which also significantly reduces memory usage. The memory savings come from the fact that a number of connections will not allocate 2*tune.bufsize. It is best not to touch this value unless advised to do so by an haproxy core developer. tune.buffers.reserve Sets the number of buffers which are pre-allocated and reserved for use only during memory shortage conditions resulting in failed memory allocations. The minimum value is 2 and is also the default. There is no reason a user would want to change this value, it's mostly aimed at haproxy core developers. tune.bufsize Sets the buffer size to this size (in bytes). Lower values allow more sessions to coexist in the same amount of RAM, and higher values allow some applications with very large cookies to work. The default value is 16384 and can be changed at build time. It is strongly recommended not to change this from the default value, as very low values will break some services such as statistics, and values larger than default size will increase memory usage, possibly causing the system to run out of memory. At least the global maxconn parameter should be decreased by the same factor as this one is increased. If HTTP request is larger than (tune.bufsize - tune.maxrewrite), haproxy will return HTTP 400 (Bad Request) error. Similarly if an HTTP response is larger than this size, haproxy will return HTTP 502 (Bad Gateway). tune.chksize Sets the check buffer size to this size (in bytes). Higher values may help find string or regex patterns in very large pages, though doing so may imply more memory and CPU usage. The default value is 16384 and can be changed at build time. It is not recommended to change this value, but to use better checks whenever possible. tune.comp.maxlevel Sets the maximum compression level. The compression level affects CPU usage during compression. This value affects CPU usage during compression. Each session using compression initializes the compression algorithm with this value. The default value is 1. tune.http.cookielen Sets the maximum length of captured cookies. This is the maximum value that the "capture cookie xxx len yyy" will be allowed to take, and any upper value will automatically be truncated to this one. It is important not to set too high a value because all cookie captures still allocate this size whatever their configured value (they share a same pool). This value is per request per response, so the memory allocated is twice this value per connection. When not specified, the limit is set to 63 characters. It is recommended not to change this value. tune.http.maxhdr Sets the maximum number of headers in a request. When a request comes with a number of headers greater than this value (including the first line), it is rejected with a "400 Bad Request" status code. Similarly, too large responses are blocked with "502 Bad Gateway". The default value is 101, which is enough for all usages, considering that the widely deployed Apache server uses the same limit. It can be useful to push this limit further to temporarily allow a buggy application to work by the time it gets fixed. Keep in mind that each new header consumes 32bits of memory for each session, so don't push this limit too high. tune.idletimer Sets the duration after which haproxy will consider that an empty buffer is probably associated with an idle stream. This is used to optimally adjust some packet sizes while forwarding large and small data alternatively. The decision to use splice() or to send large buffers in SSL is modulated by this parameter. The value is in milliseconds between 0 and 65535. A value of zero means that haproxy will not try to detect idle streams. The default is 1000, which seems to correctly detect end user pauses (eg: read a page before clicking). There should be not reason for changing this value. Please check tune.ssl.maxrecord below. tune.lua.forced-yield This directive forces the Lua engine to execute a yield each of instructions executed. This permits interrupting a long script and allows the HAProxy scheduler to process other tasks like accepting connections or forwarding traffic. The default value is 10000 instructions. If HAProxy often executes some Lua code but more reactivity is required, this value can be lowered. If the Lua code is quite long and its result is absolutely required to process the data, the can be increased. tune.lua.maxmem Sets the maximum amount of RAM in megabytes per process usable by Lua. By default it is zero which means unlimited. It is important to set a limit to ensure that a bug in a script will not result in the system running out of memory. tune.lua.session-timeout This is the execution timeout for the Lua sessions. This is useful for preventing infinite loops or spending too much time in Lua. This timeout counts only the pure Lua runtime. If the Lua does a sleep, the sleep is not taked in account. The default timeout is 4s. tune.lua.task-timeout Purpose is the same as "tune.lua.session-timeout", but this timeout is dedicated to the tasks. By default, this timeout isn't set because a task may remain alive during of the lifetime of HAProxy. For example, a task used to check servers. tune.lua.service-timeout This is the execution timeout for the Lua services. This is useful for preventing infinite loops or spending too much time in Lua. This timeout counts only the pure Lua runtime. If the Lua does a sleep, the sleep is not taked in account. The default timeout is 4s. tune.maxaccept Sets the maximum number of consecutive connections a process may accept in a row before switching to other work. In single process mode, higher numbers give better performance at high connection rates. However in multi-process modes, keeping a bit of fairness between processes generally is better to increase performance. This value applies individually to each listener, so that the number of processes a listener is bound to is taken into account. This value defaults to 64. In multi-process mode, it is divided by twice the number of processes the listener is bound to. Setting this value to -1 completely disables the limitation. It should normally not be needed to tweak this value. tune.maxpollevents Sets the maximum amount of events that can be processed at once in a call to the polling system. The default value is adapted to the operating system. It has been noticed that reducing it below 200 tends to slightly decrease latency at the expense of network bandwidth, and increasing it above 200 tends to trade latency for slightly increased bandwidth. tune.maxrewrite Sets the reserved buffer space to this size in bytes. The reserved space is used for header rewriting or appending. The first reads on sockets will never fill more than bufsize-maxrewrite. Historically it has defaulted to half of bufsize, though that does not make much sense since there are rarely large numbers of headers to add. Setting it too high prevents processing of large requests or responses. Setting it too low prevents addition of new headers to already large requests or to POST requests. It is generally wise to set it to about 1024. It is automatically readjusted to half of bufsize if it is larger than that. This means you don't have to worry about it when changing bufsize. tune.pattern.cache-size Sets the size of the pattern lookup cache to entries. This is an LRU cache which reminds previous lookups and their results. It is used by ACLs and maps on slow pattern lookups, namely the ones using the "sub", "reg", "dir", "dom", "end", "bin" match methods as well as the case-insensitive strings. It applies to pattern expressions which means that it will be able to memorize the result of a lookup among all the patterns specified on a configuration line (including all those loaded from files). It automatically invalidates entries which are updated using HTTP actions or on the CLI. The default cache size is set to 10000 entries, which limits its footprint to about 5 MB on 32-bit systems and 8 MB on 64-bit systems. There is a very low risk of collision in this cache, which is in the order of the size of the cache divided by 2^64. Typically, at 10000 requests per second with the default cache size of 10000 entries, there's 1% chance that a brute force attack could cause a single collision after 60 years, or 0.1% after 6 years. This is considered much lower than the risk of a memory corruption caused by aging components. If this is not acceptable, the cache can be disabled by setting this parameter to 0. tune.pipesize Sets the kernel pipe buffer size to this size (in bytes). By default, pipes are the default size for the system. But sometimes when using TCP splicing, it can improve performance to increase pipe sizes, especially if it is suspected that pipes are not filled and that many calls to splice() are performed. This has an impact on the kernel's memory footprint, so this must not be changed if impacts are not understood. tune.rcvbuf.client tune.rcvbuf.server Forces the kernel socket receive buffer size on the client or the server side to the specified value in bytes. This value applies to all TCP/HTTP frontends and backends. It should normally never be set, and the default size (0) lets the kernel autotune this value depending on the amount of available memory. However it can sometimes help to set it to very low values (eg: 4096) in order to save kernel memory by preventing it from buffering too large amounts of received data. Lower values will significantly increase CPU usage though. tune.recv_enough Haproxy uses some hints to detect that a short read indicates the end of the socket buffers. One of them is that a read returns more than bytes, which defaults to 10136 (7 segments of 1448 each). This default value may be changed by this setting to better deal with workloads involving lots of short messages such as telnet or SSH sessions. tune.sndbuf.client tune.sndbuf.server Forces the kernel socket send buffer size on the client or the server side to the specified value in bytes. This value applies to all TCP/HTTP frontends and backends. It should normally never be set, and the default size (0) lets the kernel autotune this value depending on the amount of available memory. However it can sometimes help to set it to very low values (eg: 4096) in order to save kernel memory by preventing it from buffering too large amounts of received data. Lower values will significantly increase CPU usage though. Another use case is to prevent write timeouts with extremely slow clients due to the kernel waiting for a large part of the buffer to be read before notifying haproxy again. tune.ssl.cachesize Sets the size of the global SSL session cache, in a number of blocks. A block is large enough to contain an encoded session without peer certificate. An encoded session with peer certificate is stored in multiple blocks depending on the size of the peer certificate. A block uses approximately 200 bytes of memory. The default value may be forced at build time, otherwise defaults to 20000. When the cache is full, the most idle entries are purged and reassigned. Higher values reduce the occurrence of such a purge, hence the number of CPU-intensive SSL handshakes by ensuring that all users keep their session as long as possible. All entries are pre-allocated upon startup and are shared between all processes if "nbproc" is greater than 1. Setting this value to 0 disables the SSL session cache. tune.ssl.force-private-cache This boolean disables SSL session cache sharing between all processes. It should normally not be used since it will force many renegotiations due to clients hitting a random process. But it may be required on some operating systems where none of the SSL cache synchronization method may be used. In this case, adding a first layer of hash-based load balancing before the SSL layer might limit the impact of the lack of session sharing. tune.ssl.lifetime Sets how long a cached SSL session may remain valid. This time is expressed in seconds and defaults to 300 (5 min). It is important to understand that it does not guarantee that sessions will last that long, because if the cache is full, the longest idle sessions will be purged despite their configured lifetime. The real usefulness of this setting is to prevent sessions from being used for too long. tune.ssl.maxrecord Sets the maximum amount of bytes passed to SSL_write() at a time. Default value 0 means there is no limit. Over SSL/TLS, the client can decipher the data only once it has received a full record. With large records, it means that clients might have to download up to 16kB of data before starting to process them. Limiting the value can improve page load times on browsers located over high latency or low bandwidth networks. It is suggested to find optimal values which fit into 1 or 2 TCP segments (generally 1448 bytes over Ethernet with TCP timestamps enabled, or 1460 when timestamps are disabled), keeping in mind that SSL/TLS add some overhead. Typical values of 1419 and 2859 gave good results during tests. Use "strace -e trace=write" to find the best value. Haproxy will automatically switch to this setting after an idle stream has been detected (see tune.idletimer above). tune.ssl.default-dh-param Sets the maximum size of the Diffie-Hellman parameters used for generating the ephemeral/temporary Diffie-Hellman key in case of DHE key exchange. The final size will try to match the size of the server's RSA (or DSA) key (e.g, a 2048 bits temporary DH key for a 2048 bits RSA key), but will not exceed this maximum value. Default value if 1024. Only 1024 or higher values are allowed. Higher values will increase the CPU load, and values greater than 1024 bits are not supported by Java 7 and earlier clients. This value is not used if static Diffie-Hellman parameters are supplied either directly in the certificate file or by using the ssl-dh-param-file parameter. tune.ssl.ssl-ctx-cache-size Sets the size of the cache used to store generated certificates to entries. This is a LRU cache. Because generating a SSL certificate dynamically is expensive, they are cached. The default cache size is set to 1000 entries. tune.vars.global-max-size tune.vars.proc-max-size tune.vars.reqres-max-size tune.vars.sess-max-size tune.vars.txn-max-size These five tunes help to manage the maximum amount of memory used by the variables system. "global" limits the overall amount of memory available for all scopes. "proc" limits the memory for the process scope, "sess" limits the memory for the session scope, "txn" for the transaction scope, and "reqres" limits the memory for each request or response processing. Memory accounting is hierarchical, meaning more coarse grained limits include the finer grained ones: "proc" includes "sess", "sess" includes "txn", and "txn" includes "reqres".

For example, when "tune.vars.sess-max-size" is limited to 100, "tune.vars.txn-max-size" and "tune.vars.reqres-max-size" cannot exceed 100 either. If we create a variable "txn.var" that contains 100 bytes, all available space is consumed. Notice that exceeding the limits at runtime will not result in an error message, but values might be cut off or corrupted. So make sure to accurately plan for the amount of space needed to store all your variables. tune.zlib.memlevel Sets the memLevel parameter in zlib initialization for each session. It defines how much memory should be allocated for the internal compression state. A value of 1 uses minimum memory but is slow and reduces compression ratio, a value of 9 uses maximum memory for optimal speed. Can be a value between 1 and 9. The default value is 8. tune.zlib.windowsize Sets the window size (the size of the history buffer) as a parameter of the zlib initialization for each session. Larger values of this parameter result in better compression at the expense of memory usage. Can be a value between 8 and 15. The default value is 15.

性能调优

results matching ""

No results matching ""