Starting VMware vRealize Log Insight 2.5 fails with the error: Failed to start Cassandra Server (2107243)

Symptoms

  • Starting VMware vRealize Log Insight (formerly known as vCenter Log Insight) 2.5 fails.
  • In the /storage/core/loginsight/var/runtime.log file, you see entries similar to:

    StartupException(description:com.vmware.loginsight.daemon.LogInsightDaemon$StartupFailedException: Daemon startup failed: Failed to start Cassandra Server: StartupException(description:Unable to connect to Cassandra node at localhost:nnnn: com.vmware.loginsight.cassandra.CassandraException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1 (com.datastax.driver.core.TransportException: [localhost/127.0.0.1] Cannot connect))).)
     at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$waitUntilStarted_result.read(DaemonCommands.java: nnnn)
     at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$Client.recv_waitUntilStarted(DaemonCommands.java: nnnn)
     at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$Client.waitUntilStarted(DaemonCommands.java: nnnn)
     at com.vmware.loginsight.daemon.protocol.commands.DaemonCommands$PooledClient.waitUntilStarted(DaemonCommands.java: nnnn)
     ...
     at com.vmware.loginsight.admintool.LICommandlineAdminTool.main(LICommandlineAdminTool.java: nnnn)
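
To confirm the symptom before starting the procedure, the log can be searched for the error signature. The following is a minimal sketch: the function name has_cassandra_start_failure and the RUNTIME_LOG variable are illustrative and not part of Log Insight, and the default path is the runtime.log location named above.

```shell
#!/bin/sh
# Log location from the Symptoms section. RUNTIME_LOG is a hypothetical
# override for nodes where the log has been copied elsewhere.
RUNTIME_LOG="${RUNTIME_LOG:-/storage/core/loginsight/var/runtime.log}"

# has_cassandra_start_failure FILE
# Returns 0 if FILE contains the error signature from this article,
# non-zero if it does not or if FILE cannot be read.
has_cassandra_start_failure() {
    grep -q 'Failed to start Cassandra Server' "$1" 2>/dev/null
}

# Example use on the appliance:
#   has_cassandra_start_failure "$RUNTIME_LOG" && echo "Cassandra startup failure logged"
```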

Purpose

This article provides the steps to perform a Cassandra repair operation on Log Insight nodes.


Cause

This issue occurs when the Apache Cassandra service within the VMware Log Insight virtual appliance does not start promptly. This may be caused by performance problems in the underlying infrastructure, such as storage latency, or by consistency problems among cluster nodes.

Resolution

To resolve this issue, perform a Cassandra repair operation on the Log Insight cluster.

Note: If this problem is observed on a standalone, non-clustered Log Insight environment, perform all the steps on the single node.

To perform a Cassandra repair operation on the Log Insight cluster:

Determine whether the Log Insight service can be started on every cluster node

The Cassandra service normally runs as part of the Log Insight service. In a clustered environment, some nodes are likely to be operating normally. The steps for manually starting and stopping the Cassandra service are performed only on nodes where the Log Insight service does not start successfully.

  1. Open a console or SSH session to the Log Insight virtual appliance and log in as root.

  2. Determine whether the Log Insight service is running by running this command:

    /etc/init.d/loginsight status

    If the service is not running, the output is:

    daemonNotRunning

  3. Attempt to start the Log Insight service by running this command:

    /etc/init.d/loginsight start

    You see output similar to:

    Starting Log Insight...
    Log Insight started successfully.
    Web application running on port 80.


  4. On each node on which the Log Insight service fails to start with the Failed to start Cassandra Server error, manually start the Cassandra service.
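
When several nodes need to be inspected, the status check in step 2 can be scripted. This sketch assumes only that the status command prints daemonNotRunning when the service is stopped, as shown above. The function name loginsight_is_stopped is hypothetical, and the text printed for a running service is not documented here, so anything else is treated as "not confirmed stopped".

```shell
#!/bin/sh
# loginsight_is_stopped STATUS_TEXT
# Returns 0 when the text captured from the status command contains
# daemonNotRunning, 1 for any other text.
loginsight_is_stopped() {
    case "$1" in
        *daemonNotRunning*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use on the appliance, as root:
#   if loginsight_is_stopped "$(/etc/init.d/loginsight status 2>&1)"; then
#       /etc/init.d/loginsight start
#   fi
```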

Manually start the Cassandra service on remaining cluster nodes

This repair operation requires the Cassandra service to be running on every node and participating in cluster operations. Perform these steps on each node on which the Log Insight service does not start correctly.

  1. Validate service state, set environment variable and switch current directory.

    1. Open a console or SSH session to the Log Insight virtual appliance and log in as root.

    2. Run this command to validate that the Log Insight service is not running:

      /etc/init.d/loginsight status

      The expected output is:

      daemonNotRunning

    3. Set the CASSANDRA_CONF environment variable by running this command:

      export CASSANDRA_CONF=/storage/core/loginsight/cidata/cassandra/config

    4. Switch the current working directory by running this command:

      cd /usr/lib/loginsight/application/lib/apache-cassandra-*/bin

  2. Start the Cassandra service by running this command:

    ./cassandra

    Note: During startup, this command produces approximately 88 lines of output, ending with Node localhost/127.0.0.1 state jump to normal. Press Enter to return to the shell prompt.
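
The cd command above relies on the apache-cassandra-* wildcard matching exactly one directory. The following sketch resolves the wildcard explicitly and fails if there are no matches or more than one. The function name find_cassandra_bin and its optional argument are illustrative assumptions, not part of the appliance.

```shell
#!/bin/sh
# find_cassandra_bin [LIB_DIR]
# Prints the single apache-cassandra-*/bin directory under LIB_DIR
# (default: the library path used in this article). Returns 1 if the
# wildcard matches no directory or more than one.
find_cassandra_bin() {
    lib_dir="${1:-/usr/lib/loginsight/application/lib}"
    # Expand the wildcard into the function's positional parameters.
    set -- "$lib_dir"/apache-cassandra-*/bin
    # With no match the pattern is left unexpanded, so the -d test fails.
    if [ "$#" -ne 1 ] || [ ! -d "$1" ]; then
        echo "expected exactly one apache-cassandra-*/bin under $lib_dir" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Example use on the appliance, as root:
#   export CASSANDRA_CONF=/storage/core/loginsight/cidata/cassandra/config
#   cd "$(find_cassandra_bin)" && ./cassandra
```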

Repair the Cassandra cluster

Validate that the Cassandra service is up and running normally on every node in the cluster, then perform a repair operation. This is a cluster-wide operation. Perform these steps on only a single node of the Log Insight cluster.

  1. Validate service state, set environment variable and switch current directory.

    1. Open a console or SSH session to the Log Insight virtual appliance and log in as root.

    2. Validate that the Log Insight service is not running by running this command:

      /etc/init.d/loginsight status

      The expected output is:

      daemonNotRunning

    3. Set the CASSANDRA_CONF environment variable by running this command:

      export CASSANDRA_CONF=/storage/core/loginsight/cidata/cassandra/config

    4. Switch the current working directory by running this command:

      cd /usr/lib/loginsight/application/lib/apache-cassandra-*/bin

  2. Verify that the cluster status is UP by running this command:

    ./nodetool status

    You see output similar to:

    Datacenter: datacenter1
    =======================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address       Load       Tokens  Owns   Host ID                           Rack
    UN  10.11.12.100  151.99 MB  256     19.6%  75fd0bf8bd0556fcc3b073fed6aefb36  rack1
    UN  10.11.12.101  156.46 MB  256     20.6%  f79845ca3e7fab13448340b4403ad80f  rack1
    UN  10.11.12.102  135.12 MB  256     19.2%  fe90f17b7f8ff0f45df7aed699b50fbf  rack1
    UN  10.11.12.103  134.84 MB  256     20.6%  b1ab7493430c60e9ded8644f66bc4c11  rack1
    UN  10.11.12.104  127.27 MB  256     20.0%  63e4a2884fad9907fe87347c43ba341a  rack1


    Note: There is a row for each cluster node. The U status indicates that the node is Up. The N state indicates that the node is in the Normal state. Run this command repeatedly until every node in the cluster comes online and is shown as Up and Normal. This may take a few minutes after startup.

  3. Flush the commit log to on-disk tables by running this command:

    ./nodetool flush

  4. Repair inconsistencies across all replicas by running this command:

    ./nodetool repair

    Note: The repair operation requires intensive disk I/O and takes some time to run. If a RepairException error is reported, open a support request with VMware Support. For more information, see Filing a Support Request in My VMware (2006985).
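
Step 2 asks for nodetool status to be rerun until every node shows UN. That polling can be sketched as below, assuming the output format shown in the example above, where each node row begins with a two-letter status and state code. The names all_nodes_up_normal, wait_for_cluster and NODETOOL, and the default of 30 attempts at 10-second intervals, are illustrative choices and not values from VMware.

```shell
#!/bin/sh
# Command used to query the cluster. ./nodetool matches the steps above,
# which run from the Cassandra bin directory.
NODETOOL="${NODETOOL:-./nodetool}"

# all_nodes_up_normal: reads `nodetool status` output on stdin.
# Returns 0 only if at least one node row is present and every node
# row starts with UN (Up and Normal).
all_nodes_up_normal() {
    awk '
        $1 ~ /^[UD][NLJM]$/ { rows++; if ($1 != "UN") bad++ }
        END { if (rows > 0 && bad == 0) exit 0; exit 1 }
    '
}

# wait_for_cluster [ATTEMPTS] [INTERVAL_SECONDS]
# Polls the status command until all nodes are UN or attempts run out.
wait_for_cluster() {
    attempts="${1:-30}"
    interval="${2:-10}"
    while [ "$attempts" -gt 0 ]; do
        if "$NODETOOL" status 2>/dev/null | all_nodes_up_normal; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$interval"
    done
    return 1
}

# Example use, following steps 2 to 4:
#   wait_for_cluster && ./nodetool flush && ./nodetool repair
```

Chaining the commands with && means the flush and repair run only after the cluster reports every node as Up and Normal.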

Manually stop the Cassandra service and start Log Insight

After the repair operation completes on the Cassandra cluster, stop the Cassandra service and start the Log Insight service normally on each node where Cassandra was started manually.

  1. Validate service state, set environment variable and switch current directory.

    1. Open a console or SSH session to the Log Insight virtual appliance and log in as root.

    2. Validate that the Log Insight service is not running by running this command:

      /etc/init.d/loginsight status

      The expected output is:

      daemonNotRunning

    3. Set the CASSANDRA_CONF environment variable by running this command:

      export CASSANDRA_CONF=/storage/core/loginsight/cidata/cassandra/config

    4. Switch the current working directory by running this command:

      cd /usr/lib/loginsight/application/lib/apache-cassandra-*/bin

  2. Stop the Cassandra service by running this command:

    ./nodetool stopdaemon

  3. Start the Log Insight service by running this command:

    /etc/init.d/loginsight start

Additional Information

If the issue persists:

  1. The Cassandra database might not be starting correctly. Confirm that DNS resolution is correct for the master node of the Log Insight cluster. For more information, see VMware vRealize Log Insight 2.5 startup fails with the error: Unable to gossip with any seeds (2117471).

  2. The Cassandra database might be taking too long to start. Confirm that the Log Insight cluster node has sufficient resources as outlined in the Getting Started guide, and that the underlying vSphere environment is free of storage and compute contention.

  3. If neither check resolves the issue, open a support request with VMware Support to investigate further. For more information, see How to file a Support Request in My VMware (2006985).
