Cassandra cluster setup in 2 minutes

Cassandra is a popular distributed NoSQL database, known for being scalable, fast, and robust. The steps documented in this post are very basic, and you should tune the configuration before using it for a production-grade cluster. They are, however, good enough to get a cluster running quickly and explore Cassandra's capabilities.

Basic Cluster Configuration:


Step 1: Setting up on a single node.


Replace the download URL with your closest mirror.
Here is a sample command for version 2.0.5; it downloads the archive, extracts it, and renames the folder:

wget http://mirrors.gigenet.com/apache/cassandra/2.0.5/apache-cassandra-2.0.5-bin.tar.gz && tar xvzf apache-cassandra-2.0.5-bin.tar.gz && mv apache-cassandra-2.0.5 cassandra25_node1_dc1
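
If you want something easier to adapt than a one-liner, the same download can be sketched as a small script. The variable names and the fetch_cassandra helper are my own and not part of Cassandra, and the mirror is simply the one from the command above, which may no longer host this release.

```shell
# Assumed values; adjust the mirror and version for your environment.
CASSANDRA_VERSION="2.0.5"
MIRROR="http://mirrors.gigenet.com/apache/cassandra"
NODE_DIR="cassandra25_node1_dc1"

TARBALL="apache-cassandra-${CASSANDRA_VERSION}-bin.tar.gz"
URL="${MIRROR}/${CASSANDRA_VERSION}/${TARBALL}"

# Hypothetical helper: download, extract, and rename,
# stopping at the first step that fails.
fetch_cassandra() {
    wget "$URL" &&
    tar xzf "$TARBALL" &&
    mv "apache-cassandra-${CASSANDRA_VERSION}" "$NODE_DIR"
}

echo "Would download: $URL"
# Uncomment to actually run the download:
# fetch_cassandra
```

Keeping the version in one variable means the URL, the tarball name, and the extracted folder name cannot drift apart when you switch releases.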

Step 2 (Optional): Edit the configuration to change the following settings as per your standards.


conf/cassandra.yaml
data_file_directories:
    - /home/cassandra/data
commitlog_directory: /home/cassandra/data/commitlog
saved_caches_directory: /home/cassandra/saved_caches

conf/log4j-server.properties
log4j.appender.R.File: /home/cassandra/system.log
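
The user that runs Cassandra must be able to write to these locations, so it can help to create them up front. A minimal sketch, assuming the /home/cassandra layout used above; CASSANDRA_BASE and make_cassandra_dirs are hypothetical names for illustration only.

```shell
# Assumed base directory, matching the paths used in this post.
CASSANDRA_BASE="${CASSANDRA_BASE:-/home/cassandra}"

# Hypothetical helper: create the data, commit log, and saved caches directories.
make_cassandra_dirs() {
    base="$1"
    mkdir -p "$base/data" "$base/data/commitlog" "$base/saved_caches"
}

# Example call (run it as the user that will start Cassandra):
# make_cassandra_dirs "$CASSANDRA_BASE"
```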

Repeat Steps 1 and 2 on another machine or VDI.

At this point we have a basic setup, and you should be able to launch each node
independently with the command below. However, the nodes are not yet clustered and cannot communicate with each other.
./bin/cassandra -f

Step 3: Cluster nodes


We need to make a few more changes to the configuration file so that the nodes can form a cluster.
conf/cassandra.yaml
Provide a logical name for your cluster and use the same name on every node, e.g.
cluster_name: 'hari_cassandra_ring'

Seeds - For a Cassandra node to participate in a cluster, it has to know about at least one other node in the datacenter; this is called a "seed" node.
In the Cassandra config file, this can be a comma-separated list of servers. The documentation suggests avoiding a chicken-and-egg reference when defining the seed node:
http://wiki.apache.org/cassandra/GettingStarted

E.g.
seeds: "192.168.0.119"

listen_address - This should be a private address that the other nodes connect to for inter-node communication.
For a simple configuration, we can set this to the IP address or hostname of the node.
listen_address: 192.168.0.108

rpc_address - This is the RPC interface that clients connect to. For a basic configuration, we will set it to the same value as listen_address.
rpc_address: 192.168.0.108

initial_token - This is another important aspect of cluster configuration and governs load distribution across nodes. For the purpose of this demo I will leave it blank; you may refer to the Cassandra documentation on how to define it based on the number of nodes within the datacenter:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configGenTokens_c.html
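
The settings in this step can also be applied with a short script instead of by hand. This is only a sketch: set_cluster_options is a hypothetical helper, and it assumes each key appears once, uncommented, in the form used by the stock 2.0.x cassandra.yaml, where seeds is nested under seed_provider. The demo runs against a throwaway file rather than a real config.

```shell
# Hypothetical helper: rewrite the four cluster settings in a cassandra.yaml.
# Arguments: yaml path, cluster name, seed list, this node's address.
set_cluster_options() {
    yaml="$1"; cluster="$2"; seeds="$3"; addr="$4"
    sed \
        -e "s/^cluster_name:.*/cluster_name: '${cluster}'/" \
        -e "s/^\( *- seeds:\).*/\1 \"${seeds}\"/" \
        -e "s/^listen_address:.*/listen_address: ${addr}/" \
        -e "s/^rpc_address:.*/rpc_address: ${addr}/" \
        "$yaml" > "${yaml}.new" && mv "${yaml}.new" "$yaml"
}

# Demo against a throwaway file that mimics the relevant stock lines.
demo_yaml="$(mktemp)"
cat > "$demo_yaml" <<'EOF'
cluster_name: 'Test Cluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1"
listen_address: localhost
rpc_address: localhost
EOF

set_cluster_options "$demo_yaml" "hari_cassandra_ring" "192.168.0.119" "192.168.0.108"
cat "$demo_yaml"
```

On the second node you would pass the same cluster name and seed list, and only change the last argument to that node's own address. Back up the real conf/cassandra.yaml before pointing the helper at it.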

Step 4: Test cluster setup


You can now fire up one node at a time with "cassandra25_node1_dc1/bin/cassandra -f".

As you bring up more nodes, you should see messages similar to the following, indicating the handshake between cluster nodes.

INFO 22:06:20,974 Handshaking version with /192.168.0.108
 INFO 22:06:23,023 Node /192.168.0.108 is now part of the cluster
 INFO 22:06:23,047 Handshaking version with /192.168.0.108
 INFO 22:06:23,061 InetAddress /192.168.0.108 is now UP
 INFO 22:06:23,207 InetAddress /192.168.0.108 is now DOWN
 INFO 22:06:23,212 Handshaking version with /192.168.0.108
 INFO 22:06:24,037 InetAddress /192.168.0.108 is now UP
 INFO 22:06:53,449 [Stream #6e422a30-99e4-11e3-858d-e535fdb952e8] Received streaming plan for Bootstrap
 INFO 22:06:53,590 [Stream #6e422a30-99e4-11e3-858d-e535fdb952e8] Session with /192.168.0.108 is complete
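
Rather than watching the console, you can pull the interesting lines out of the log. A rough sketch, assuming the message format shown above; joined_nodes is a hypothetical helper, and the demo log is just a few lines copied from this post.

```shell
# Hypothetical helper: print the address of each node the log reports as joined.
joined_nodes() {
    sed -n 's|.*Node /\([0-9.]*\) is now part of the cluster.*|\1|p' "$1"
}

# Demo using lines copied from the output above.
demo_log="$(mktemp)"
cat > "$demo_log" <<'EOF'
 INFO 22:06:20,974 Handshaking version with /192.168.0.108
 INFO 22:06:23,023 Node /192.168.0.108 is now part of the cluster
 INFO 22:06:23,061 InetAddress /192.168.0.108 is now UP
EOF

joined_nodes "$demo_log"   # prints 192.168.0.108
```

On a real node you would point it at the log file configured in Step 2, for example /home/cassandra/system.log.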

Another way to check cluster and node status is the nodetool command:

./cassandra25_node1_dc1/bin/nodetool status

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.0.108  68.61 KB   256     100.0%            005d1cea-aa68-41b0-9a75-0051dd431930  rack1
UN  192.168.0.119  73.14 KB   256     100.0%            8ca40713-2eb5-44df-8a52-6cd838a492e3  rack1
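
If you want to check this from a script, one option is to count the rows that start with UN (Up/Normal). A small sketch, assuming the column layout shown above; count_up_nodes is a hypothetical helper, and the sample input is copied from this post.

```shell
# Hypothetical helper: count nodes reported as UN (Up/Normal) on stdin.
count_up_nodes() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Demo using the status output shown above.
demo_status="$(mktemp)"
cat > "$demo_status" <<'EOF'
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.0.108  68.61 KB   256     100.0%            005d1cea-aa68-41b0-9a75-0051dd431930  rack1
UN  192.168.0.119  73.14 KB   256     100.0%            8ca40713-2eb5-44df-8a52-6cd838a492e3  rack1
EOF

up="$(count_up_nodes < "$demo_status")"
echo "Nodes up and normal: $up"   # prints: Nodes up and normal: 2
```

Against a live cluster the same helper can read from a pipe: ./cassandra25_node1_dc1/bin/nodetool status | count_up_nodes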

