TheDeveloperBlog.com


Cassandra vs HBase



The following table lists the main differences between Cassandra and HBase:


| HBase | Cassandra |
|---|---|
| HBase is based on Bigtable (Google). | Cassandra's distribution model is based on Dynamo (Amazon), and its data model borrows from Bigtable. It was initially developed at Facebook by engineers who included a co-author of Amazon's Dynamo paper. This heritage is one reason why Cassandra supports multiple data centers. |
| HBase uses the Hadoop infrastructure (ZooKeeper, NameNode, HDFS). Organizations that deploy it need knowledge of both Hadoop and HBase. | Cassandra started and evolved separately from Hadoop, and its infrastructure and operational knowledge requirements differ from Hadoop's. For analytics, however, many Cassandra deployments use Cassandra + Storm (which uses ZooKeeper) and/or Cassandra + Hadoop. |
| The HBase-Hadoop infrastructure has several "moving parts": ZooKeeper, NameNode, HBase master, and data nodes. ZooKeeper is clustered and naturally fault tolerant. The NameNode needs to be clustered to be fault tolerant. | Cassandra uses a single node type. All nodes are equal and perform all functions. Any node can act as a coordinator, so there is no single point of failure (SPOF). Adding Storm or Hadoop, of course, adds complexity to the infrastructure. |
| HBase is well suited for range-based scans. | With random partitioning, Cassandra does not support range-based scans over row keys, which may be limiting in certain use cases. |
| HBase provides asynchronous replication of an HBase cluster across a WAN. | Cassandra's random partitioning provides replication of a single row across a WAN. |
| HBase only supports ordered partitioning. | Cassandra officially supports ordered partitioning, but it is rarely used in production because of the "hot spots" it creates and the operational difficulties those hot spots cause. |
| Due to ordered partitioning, HBase scales horizontally easily while still supporting row-key range scans. | If data is stored in columns within a row in Cassandra to support range scans, the practical limit on row size is tens of megabytes. |
| HBase supports atomic compare and set. HBase supports transactions within a row. | Cassandra supports atomic compare and set through lightweight transactions (conditional writes with an IF clause, added in Cassandra 2.0). Earlier releases did not support it. |
| HBase does not support read load balancing against a single row. A single row is served by exactly one region server at a time. | Cassandra supports read load balancing against a single row, because any replica of the row can serve the read. |
| Bloom filters can be used in HBase as another form of indexing. | Cassandra uses Bloom filters for key lookup. |
| Triggers are supported by the coprocessor capability in HBase. | Cassandra has no direct equivalent of coprocessors. It offers a more limited, Java-based trigger mechanism. |
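
The partitioning rows in the table can be made concrete with a small sketch. The Python below is not HBase or Cassandra code: the keys, the three-node cluster, the split points, and the hash-modulo placement are all assumptions chosen for illustration (Cassandra actually assigns token ranges on a ring rather than taking a modulo). It only shows why ordered partitioning keeps a key range on few nodes while hash-based placement does not preserve key order.

```python
import bisect
import hashlib

# Hypothetical row keys and cluster size, chosen only for illustration.
KEYS = [f"user{i:03d}" for i in range(12)]
NODE_COUNT = 3

# Ordered layout (assumed): node 0 owns keys below "user004",
# node 1 owns "user004".."user007", node 2 owns "user008" and above.
SPLIT_POINTS = ["user004", "user008"]


def ordered_node(key):
    """Ordered partitioning: each node owns one contiguous key range."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def hashed_node(key):
    """Random partitioning (simplified): the node is picked from a hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NODE_COUNT


def nodes_for_range(start, stop, placement):
    """Return the set of nodes holding keys in the half-open range [start, stop)."""
    return {placement(key) for key in KEYS if start <= key < stop}


def keys_per_node(placement):
    """Count how many of the sample keys each node stores."""
    counts = [0] * NODE_COUNT
    for key in KEYS:
        counts[placement(key)] += 1
    return counts


if __name__ == "__main__":
    print("ordered scan touches nodes:",
          nodes_for_range("user004", "user008", ordered_node))
    print("hashed scan touches nodes:",
          nodes_for_range("user004", "user008", hashed_node))
    print("ordered keys per node:", keys_per_node(ordered_node))
    print("hashed keys per node:", keys_per_node(hashed_node))
```

With the ordered layout, the scan over "user004" to "user007" is answered by node 1 alone, which is what makes range scans cheap. The same property means that a burst of writes to neighbouring keys lands on one node, which is the "hot spot" problem the table mentions. With hashed placement, neighbouring keys have no reason to share a node, so load spreads out but an ordered scan over row keys is no longer a local operation.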
