Apache Cassandra
overview
by Taras Tymoshchuk, software developer at ElifTech
Introduction
What is Apache Cassandra?
Apache Cassandra™ is a free
● Distributed
● High-performance
● Extremely scalable
● Fault-tolerant (i.e. no single point of failure)
post-relational database solution. Cassandra can serve both as a
real-time datastore (the “system of record”) for
online/transactional applications and as a read-intensive
database for business intelligence systems.
Top Use Cases
● Internet of things applications – Cassandra is well suited to ingesting large volumes
of fast incoming data from devices, sensors and similar sources spread across many
different locations.
● Product catalogs and retail apps – Cassandra is the database of choice for many
retailers that need durable shopping cart protection, fast product catalog input and
lookups, and similar retail app support.
● User activity tracking and monitoring – many media and entertainment companies
use Cassandra to track and monitor how users interact with their
movies, music, websites and online applications.
● Messaging – Cassandra serves as the database backbone for numerous mobile
phone and messaging providers’ applications.
● Social media analytics and recommendation engines – many online companies,
websites, and social media providers use Cassandra to ingest and analyze data
and to provide recommendations to their customers.
Key Cassandra Features and Benefits
● Gigabyte to Petabyte scalability
● Linear performance
● No SPOF
● Easy replication / data distribution
● Multi datacenter and cloud capable
● No need for separate caching layer
● Tunable data consistency
● Flexible schema design
● Data compaction
● CQL (an SQL-like query language)
● Support for key languages and platforms
● No need for special hardware or software
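To make the "tunable data consistency" item above more concrete, here is a minimal Python sketch. It assumes the commonly cited rule that a read is guaranteed to overlap the latest acknowledged write when the number of replicas read (R) plus the number of replicas written (W) exceeds the replication factor (RF). The function names `quorum` and `is_strongly_consistent` are hypothetical and are not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # Quorum reads plus quorum writes overlap on at least one replica.
    print(q, is_strongly_consistent(q, q, rf))
    # Reading and writing a single replica each trades consistency for latency.
    print(is_strongly_consistent(1, 1, rf))
```

Lowering R or W below the quorum favors latency and availability; raising them favors consistency. That trade-off is what "tunable" refers to.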
Architecture Overview
In Cassandra, all nodes play an identical role; there is no concept of a master node.
Cassandra’s built-for-scale architecture means that it is capable of handling large
amounts of data and thousands of concurrent users.
Cassandra’s architecture also means that, unlike other master-slave or sharded systems,
it has no single point of failure and therefore is capable of offering true continuous
availability and uptime.
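The masterless design described above can be sketched as a token ring in which every node is a peer. This is a simplified illustration, not Cassandra's implementation: the `Ring` class, the `token` helper and the node names are hypothetical, MD5 stands in for the partitioner's hash function, and each node owns a single token (real clusters typically assign many tokens per node).

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 here, for illustration only)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Peer-to-peer ring: every node owns one token and none is special."""

    def __init__(self, nodes):
        # Sort nodes by token so the ring can be searched with bisect.
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, key: str, replication_factor: int = 1):
        """Walk clockwise from the key's token and collect successive nodes."""
        count = min(replication_factor, len(self._ring))
        start = bisect.bisect_left(self._tokens, token(key))
        return [self._ring[(start + i) % len(self._ring)][1]
                for i in range(count)]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c"])
    # Any node can compute the owners of a key, so no coordinator is required.
    print(ring.replicas("user:42", replication_factor=2))
```

Because ownership is derived from the key itself, losing one node only affects the ranges it held, and the remaining replicas keep serving those keys.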
CQL
Astyanax / Hector API:
SliceQuery<String, String, String> query = ...
query.setKey("x");
query.setColumnFamily("y");
CQL:
SELECT a FROM y WHERE id = 'x';
Cassandra Data Objects
Overview
Cassandra data model
KEY → COL1: VAL1 (TS1), COL2: VAL2 (TS2)
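The data model on this slide (a row key pointing to columns, each with a value and a timestamp) can be sketched in Python as nested dictionaries. The `ColumnFamily` class and its methods are hypothetical names used only for illustration. The sketch assumes last-write-wins reconciliation by timestamp and, as a simplification, lets the later call win when two timestamps are equal.

```python
from typing import Dict, Optional, Tuple

# row key -> column name -> (value, write timestamp)
Row = Dict[str, Tuple[str, int]]


class ColumnFamily:
    """In-memory stand-in for a table of rows keyed by a row key."""

    def __init__(self) -> None:
        self._rows: Dict[str, Row] = {}

    def write(self, key: str, column: str, value: str, timestamp: int) -> None:
        """Store the cell unless a newer timestamp is already present."""
        row = self._rows.setdefault(key, {})
        current = row.get(column)
        if current is None or timestamp >= current[1]:
            row[column] = (value, timestamp)

    def read(self, key: str, column: str) -> Optional[str]:
        """Return the stored value, or None if the cell does not exist."""
        cell = self._rows.get(key, {}).get(column)
        return cell[0] if cell is not None else None


if __name__ == "__main__":
    cf = ColumnFamily()
    cf.write("KEY", "COL1", "VAL1", timestamp=10)
    cf.write("KEY", "COL1", "stale", timestamp=5)  # older write is ignored
    print(cf.read("KEY", "COL1"))
```

The timestamp on each column is what lets replicas that received writes in different orders converge on the same value.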
Writing Data
Reading Data
Pitfalls
● Poorly implemented range scans; Cassandra cannot currently transfer
data;
● Compaction can hold up requests;
● Many settings are made at the cluster level (type, storage strategy,
etc.);
● Counters.
Thank you for your attention!
Find us at eliftech.com
Have a question? Contact us:
info@eliftech.com
