CommitLog, MemTable, and SSTable are three core components of Cassandra. They work in tandem to guarantee Cassandra's durability. This is more or less similar to relational databases, where a commit log is used to replay transactions after a database server crash.
Here is a very basic, high-level life cycle of the data from the time it is written by a Cassandra client to the time it is persisted to SSTables.
The complexity is slightly higher in a cluster setup; however, for starters this is good enough to understand the internals of the Cassandra write and update flow.
Step 1: The request is received by a random node in the cluster.
Step 2: The node writes the data to its local commit log file in a sequential manner.
Step 3: The Memtable, an in-memory structure, is updated with the write.
Step 4: The Memtable is flushed to SSTables periodically; SSTables are the final persistence store for the data.
Step 5: Once the data makes its way to SSTables, the corresponding records in the commit log and Memtable are discarded.
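The steps above can be sketched as a toy, in-memory model. This is only an illustration and not Cassandra's real code: the ToyNode class, its flush_threshold parameter, and the use of plain Python lists and dicts in place of files are all assumptions made for the example.

```python
class ToyNode:
    """Toy model of one node's write path (hypothetical, not Cassandra's code)."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []   # Step 2: append-only record of every write
        self.memtable = {}     # Step 3: in-memory view, latest value per key
        self.sstables = []     # Step 4: immutable tables standing in for disk files
        self.flush_threshold = flush_threshold  # assumed trigger for a flush

    def write(self, key, value):
        # Step 2: append to the commit log first, in arrival order.
        self.commit_log.append((key, value))
        # Step 3: apply the write to the memtable.
        self.memtable[key] = value
        # Step 4: flush once the memtable holds enough keys.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.memtable:
            return
        # Step 4: persist the memtable as a sorted, immutable table.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        # Step 5: flushed records are no longer needed in the log or memtable.
        self.memtable.clear()
        self.commit_log.clear()

    def read(self, key):
        # Newest data first: memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for stored_key, stored_value in table:
                if stored_key == key:
                    return stored_value
        return None


node = ToyNode(flush_threshold=2)
node.write(321, "Pending")
print(len(node.commit_log), len(node.memtable), len(node.sstables))  # 1 1 0
node.write(4321, "Pending")
print(len(node.commit_log), len(node.memtable), len(node.sstables))  # 0 0 1
```

After the second write the memtable reaches the assumed threshold, so both records move into a single table and the log is emptied, yet node.read(321) still returns "Pending".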
Cassandra's "nodetool" utility provides an option to explicitly flush the data held in Memtables (and covered by the commit log) into SSTables.
The tool is intended for maintenance, and it is a handy way to ensure that all pending writes are
flushed out to the SSTables before a node shutdown.
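As a rough sketch, the flush can be scripted from Python by shelling out to nodetool. The helper names below are hypothetical, and the example assumes nodetool is on the PATH (for a tarball install it would typically be ./bin/nodetool); only the "nodetool flush" subcommand with an optional keyspace and table list is taken from the tool itself.

```python
import subprocess


def nodetool_flush_command(keyspace=None, *tables, nodetool="nodetool"):
    """Build the argument list for `nodetool flush [keyspace [table ...]]`."""
    if tables and keyspace is None:
        raise ValueError("a keyspace is required when tables are given")
    command = [nodetool, "flush"]
    if keyspace is not None:
        command.append(keyspace)
    command.extend(tables)
    return command


def flush(keyspace=None, *tables, nodetool="nodetool"):
    """Run the flush; raises CalledProcessError if nodetool exits non-zero."""
    command = nodetool_flush_command(keyspace, *tables, nodetool=nodetool)
    subprocess.run(command, check=True)


# Only builds and prints the command; call flush(...) against a running node.
print(nodetool_flush_command("ecommerce", "orders"))
# ['nodetool', 'flush', 'ecommerce', 'orders']
```

Calling flush() with no arguments would run a plain "nodetool flush", which flushes every keyspace on the node.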
Let us put Cassandra's durability to the test with a real-world ecommerce use case.
Assume we have a keyspace that manages user carts or orders, and that a node fails before its data has been flushed.
We will do the following.
Load some sample data into Cassandra and shut down the database before Cassandra performs a flush to SSTables.
Make sure the Cassandra server is running; you can start it in the foreground with the following command: ./bin/cassandra -f
Once the data is loaded, you can exit cqlsh and check your data directory. Its location is defined by the "data_file_directories" property in ./conf/cassandra.yaml.
E.g. if it is mapped to your home directory, go to ~/cassandra/data/ecommerce/orders and you should not notice any files in this directory. Usually you will find a couple of SSTable-related files in this location once the flush operation is completed.
We can terminate Cassandra at this point to replicate a situation in which the data has not yet been flushed to SSTables and is available only in the commit log and the in-memory Memtable.
Now bring Cassandra back up and you should notice a few interesting log messages indicating a replay of pending records from the commit log to SSTables.
Once this operation is complete, check the ~/cassandra/data/ecommerce/orders folder and you should now see SSTable files holding the data inserted before the server crash.
Completed flushing /home/search/cassandra/data/system/compaction_history/system-compaction_history-jb-1-Data.db (237 bytes) for commitlog position ReplayPosition(segmentId=1395543674692, position=271)
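The crash-and-replay behaviour can be sketched with a small file-backed log. This is a simplified illustration under stated assumptions, not Cassandra's on-disk format: the JSON-lines file, the function names, and the fsync on every write are all choices made for the example (Cassandra's commitlog_sync setting decides when the real log is synced).

```python
import json
import os
import tempfile


def append_to_commit_log(path, key, value):
    """Append one record and force it to disk before returning."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())


def replay_commit_log(path):
    """Rebuild the in-memory table by re-applying every logged record in order."""
    memtable = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                memtable[record["key"]] = record["value"]
    return memtable


with tempfile.TemporaryDirectory() as data_dir:
    log_path = os.path.join(data_dir, "commitlog.jsonl")  # hypothetical file name

    memtable = {}
    for key, value in [(321, "Pending"), (4321, "Pending")]:
        append_to_commit_log(log_path, key, value)  # durable first
        memtable[key] = value                       # then in memory

    memtable = None  # simulated crash: the in-memory state is lost

    recovered = replay_commit_log(log_path)  # what a restart would do
    print(recovered)  # {321: 'Pending', 4321: 'Pending'}
```

Because every write reached the log before the simulated crash, the restart can rebuild the lost in-memory state and then flush it as usual.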
You should be able to query the same data from cqlsh as well.
cqlsh> select * from ecommerce.orders;
orders_id | users_id | emails | first_name | last_name | order_comments | order_log | order_status | order_total | promotions_total | shipping_total | tax_total
-----------+----------+--------------------------------+------------+-----------+------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+--------------+-------------+------------------+----------------+-----------
4321 | 1234 | {'a@gmail.com', 'b@gmail.com'} | hariharan | vadivelu | {'comment_1': '2013-06-13 11:42:12-0500', 'comment_2': '2013-06-13 11:42:12-0500'} | {'created_on': '2013-06-13 11:42:12-0500', 'last_updated': '2013-06-13 11:42:12-0500'} | Pending | 20.3 | 5 | 2 | 1
321 | 123 | {'a@gmail.com', 'b@gmail.com'} | hariharan | vadivelu | {'comment_1': '2013-06-13 11:42:12-0500', 'comment_2': '2013-06-13 11:42:12-0500'} | {'created_on': '2013-06-13 11:42:12-0500', 'last_updated': '2013-06-13 11:42:12-0500'} | Pending | 20.3 | 5 | 2 | 1
(2 rows)
You can also check the data in the SSTables using the sstable2json utility.
./sstable2json /home/search/cassandra/data/ecommerce/orders/ecommerce-orders-jb-1-Data.db
[
{"key": "000010e1","columns": [["1234:","",1395543482883000], ["1234:emails","1234:emails:!",1395543482882999,"t",1395543482], ["1234:emails:6140676d61696c2e636f6d","",1395543482883000], ["1234:emails:6240676d61696c2e636f6d","",1395543482883000], ["1234:first_name","hariharan",1395543482883000], ["1234:last_name","vadivelu",1395543482883000], ["1234:order_comments","1234:order_comments:!",1395543482882999,"t",1395543482], ["1234:order_comments:636f6d6d656e745f31","0000013f3e6a76a0",1395543482883000], ["1234:order_comments:636f6d6d656e745f32","0000013f3e6a76a0",1395543482883000], ["1234:order_log","1234:order_log:!",1395543482882999,"t",1395543482], ["1234:order_log:637265617465645f6f6e","0000013f3e6a76a0",1395543482883000], ["1234:order_log:6c6173745f75706461746564","0000013f3e6a76a0",1395543482883000], ["1234:order_status","Pending",1395543482883000], ["1234:order_total","20.3",1395543482883000], ["1234:promotions_total","5.0",1395543482883000], ["1234:shipping_total","2.0",1395543482883000], ["1234:tax_total","1.0",1395543482883000]]},
{"key": "00000141","columns": [["123:","",1395543482804000], ["123:emails","123:emails:!",1395543482803999,"t",1395543482], ["123:emails:6140676d61696c2e636f6d","",1395543482804000], ["123:emails:6240676d61696c2e636f6d","",1395543482804000], ["123:first_name","hariharan",1395543482804000], ["123:last_name","vadivelu",1395543482804000], ["123:order_comments","123:order_comments:!",1395543482803999,"t",1395543482], ["123:order_comments:636f6d6d656e745f31","0000013f3e6a76a0",1395543482804000], ["123:order_comments:636f6d6d656e745f32","0000013f3e6a76a0",1395543482804000], ["123:order_log","123:order_log:!",1395543482803999,"t",1395543482], ["123:order_log:637265617465645f6f6e","0000013f3e6a76a0",1395543482804000], ["123:order_log:6c6173745f75706461746564","0000013f3e6a76a0",1395543482804000], ["123:order_status","Pending",1395543482804000], ["123:order_total","20.3",1395543482804000], ["123:promotions_total","5.0",1395543482804000], ["123:shipping_total","2.0",1395543482804000], ["123:tax_total","1.0",1395543482804000]]}
]
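The hex strings in the output above are just the raw bytes of the stored values. The short sketch below decodes a few of them; the helper names are hypothetical, and it assumes the keys are big-endian integers, the set and map entries are UTF-8 text, and the map values are timestamps in milliseconds since the Unix epoch, which is consistent with the cqlsh output shown earlier.

```python
from datetime import datetime, timezone


def decode_text(hex_value):
    """Interpret hex-encoded bytes as UTF-8 text."""
    return bytes.fromhex(hex_value).decode("utf-8")


def decode_int(hex_value):
    """Interpret hex-encoded bytes as a big-endian signed integer."""
    return int.from_bytes(bytes.fromhex(hex_value), byteorder="big", signed=True)


def decode_timestamp(hex_value):
    """Interpret hex-encoded bytes as milliseconds since the Unix epoch (UTC)."""
    millis = decode_int(hex_value)
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


print(decode_int("000010e1"))                            # 4321
print(decode_int("00000141"))                            # 321
print(decode_text("6140676d61696c2e636f6d"))             # a@gmail.com
print(decode_text("636f6d6d656e745f31"))                 # comment_1
print(decode_timestamp("0000013f3e6a76a0").isoformat())  # 2013-06-13T16:42:12+00:00
```

The decoded timestamp, 16:42:12 UTC, is the same instant as the 11:42:12-0500 value displayed by cqlsh.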
Further Reading
https://wiki.apache.org/cassandra/MemtableSSTable