VMware vFabric RabbitMQ clients cannot connect to vFabric RabbitMQ server (2004788)
Run the command top to determine if RabbitMQ is spinning. Look for a process named beam.smp and check whether it is consuming a large amount of CPU (70% to 100%) while the problem is occurring. Many things can cause high CPU usage. To narrow down the cause:
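The top check above can also be scripted. This is a minimal sketch, assuming a Linux-style ps, that prints the CPU share of beam.smp or a fallback message when the broker is not running on the local host:

```shell
# Print the CPU share of the Erlang VM process (beam.smp) hosting
# RabbitMQ, or a fallback message when no such process exists here.
beam_cpu() {
  ps -eo pcpu,comm | awk '$2 == "beam.smp" {print $1; exit}'
}

cpu=$(beam_cpu)
if [ -n "$cpu" ]; then
  echo "beam.smp CPU: ${cpu}%"
else
  echo "beam.smp not running"
fi
```

Sustained values in the 70% to 100% range while clients fail to connect are the symptom described above.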
Check if your Rabbit clients are processing large messages.
- Run the command:
rabbitmqctl list_[diagnostic]
Where [diagnostic] is one of connections, channels, exchanges, or queues. For complete listings, see http://www.rabbitmq.com/man/rabbitmqctl.1.man.html.
If the command output does not differ greatly from the clients' intended behavior (in the number of exchanges, queues, and unprocessed messages in queues after the CRASH_REPORT), and you still have high CPU usage, proceed to step 2.
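One quick way to apply this check to queue backlogs is to list each queue with its message count. list_queues is a standard rabbitmqctl subcommand; the guard below lets the sketch degrade gracefully on machines where the broker tools are not installed:

```shell
# List every queue with its current message backlog so the output can
# be compared against what the clients are expected to produce.
if command -v rabbitmqctl >/dev/null 2>&1; then
  rabbitmqctl list_queues name messages
else
  echo "rabbitmqctl not on PATH"
fi
```

A queue whose message count keeps growing suggests clients are publishing faster than consumers can process, which can drive the CPU usage described above.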
- As the user that runs Rabbit, run the command:
ulimit -n
Note: Depending on your system, this may need to be run as the rabbitmq user.
It is a problem if raised limits are not making it to the user, shell, or process you want them to reach. Rabbit 2.x has a feature that announces handle limits at startup time with a log message similar to:
=INFO REPORT==== 2-Dec-2010::14:37:38 ===
Limiting to approx 156 file handles (138 sockets)
This example is from a machine where the output of ulimit -n is 256, a value that is likely too low for a broker serving many clients. You might want to increase the maximum open files limit, either with a command or through configuration changes on the operating system. See your operating system documentation for more information on increasing the open files limit.
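As an illustration of how the limit is inherited, this sketch reproduces the 256-handle machine from the log excerpt above inside a subshell (lowering the soft limit never requires privileges, and the parent shell's limit is untouched). A broker started from that shell would inherit the same cap:

```shell
# Reproduce the example machine's low limit in a subshell only.
# 256 mirrors the value from the log excerpt above.
(
  ulimit -n 256   # set the soft open-files limit for this subshell
  ulimit -n       # prints 256 - the value beam.smp would inherit
)
```

To make a higher limit persistent, many Linux distributions read per-user limits from /etc/security/limits.conf; consult your operating system documentation for the mechanism that applies to your platform.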
If you are using a version lower than Rabbit 2.x, consider upgrading. Rabbit 2.x protects itself more aggressively against file descriptor shortages: the acceptor process is implemented more defensively against connection spikes and gracefully refuses to accept new connections until the spike subsides. These improvements mean that although connections are temporarily established more slowly, the broker remains usable and returns to normal when the spike is over.
- Check the logs for memory alarms. Severe memory pressure can cause Rabbit to start refusing connections.
- To determine whether large numbers of existing connections have caused Rabbit to run out of sockets, run the commands:
Check your client and tune how you handle connections. Connections that are opened but never used may need to be re-coded. If you are using a Java-based client, you may also want to tune memory and check garbage collection settings.
- Check the rabbit-sasl.log and the main rabbit.log for any sign that the TCP listener/acceptor process has failed. In the logs, look for CRASH REPORT entries mentioning tcp_acceptor. If a low file descriptor limit is the cause, the safest option is to raise the limit through ulimit.
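The log check can be scripted as below. The paths are examples only (RabbitMQ logs often live under /var/log/rabbitmq/, and the main log may be named after the node), so substitute the locations used by your installation:

```shell
# Search both logs for acceptor failures. Adjust the paths to match
# your installation before running.
for log in /var/log/rabbitmq/rabbit.log /var/log/rabbitmq/rabbit-sasl.log; do
  if [ -f "$log" ]; then
    grep -nE 'CRASH REPORT|tcp_acceptor' "$log" || echo "no matches in $log"
  else
    echo "missing: $log"
  fi
done
```

Any tcp_acceptor crash entry found this way points to the file descriptor shortage discussed in step 2.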
In VMware vFabric RabbitMQ 2.5.0 and later, run this command to generate a full diagnostic report:
rabbitmqctl report > server_report.txt