Apache Cassandra - Future without Boundaries (part 3)

August 6, 2015 www.ExigenServices.com
Apache Cassandra – Future without
Boundaries

V. Architecture (part 2)

SEDA Architecture
SEDA – Staged Event-Driven Architecture
1. Every unit of work is split into several
stages that are executed in parallel
threads.
2. Each stage consists of an input and an output
event queue, an event handler, and a stage
controller.

SEDA Architecture advantages
• Keeps the system well conditioned under load.
• Prevents resources from being overcommitted.

SEDA in Cassandra - Usages
1. Read
2. Mutation
3. Gossip
4. Anti-entropy
….

SEDA in Cassandra - Design
• The StageManager holds a map from stage
names to Java 5 thread pool executors.
• Each controller with its queue is represented by a
ThreadPoolExecutor that can be configured
through JMX.
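
The design above can be sketched roughly as follows. This is a Python illustration only (Cassandra itself is written in Java); the StageManager class, the stage names, and the pool sizes are assumptions made for the example, not Cassandra's actual code or defaults.

```python
from concurrent.futures import ThreadPoolExecutor


class StageManager:
    """Hypothetical sketch: maps stage names to thread pool executors."""

    def __init__(self, pool_sizes):
        # One executor per stage; each executor owns its own task queue.
        self._stages = {
            name: ThreadPoolExecutor(max_workers=size, thread_name_prefix=name)
            for name, size in pool_sizes.items()
        }

    def submit(self, stage, handler, *args):
        # Enqueue an event for the stage; a pool thread runs the handler.
        return self._stages[stage].submit(handler, *args)

    def shutdown(self):
        for executor in self._stages.values():
            executor.shutdown(wait=True)


# Illustrative stage names and sizes (assumptions, not Cassandra defaults).
manager = StageManager({"READ": 4, "MUTATION": 4, "GOSSIP": 1})
store = {}


def mutate(key, value):
    store[key] = value
    return key


def read(key):
    return store.get(key)


# The write is handled by the MUTATION stage,
# and the follow-up read is handed to the READ stage.
written_key = manager.submit("MUTATION", mutate, "row1", "val1").result()
read_value = manager.submit("READ", read, written_key).result()
manager.shutdown()
```

Because every stage has its own pool and queue, a slow stage backs up only its own queue instead of starving the others.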

VI. Advanced column types

TTL column attribute
• A TTL column is a column whose value expires after
a given period of time.
• Useful for storing session tokens.
set test[row1][col2] = 'val2' with ttl=60;
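
The expiry behaviour can be modelled with a minimal sketch. The ExpiringColumn class and its fields are hypothetical names used for illustration; the sketch only shows the idea of comparing the write time plus TTL against the current time, not how Cassandra stores or removes expired columns.

```python
import time


class ExpiringColumn:
    """Hypothetical model of a column whose value expires after `ttl` seconds."""

    def __init__(self, name, value, ttl, now=None):
        self.name = name
        self.value = value
        self.ttl = ttl
        # Allow the write time to be injected so the example is deterministic.
        self.written_at = time.time() if now is None else now

    def is_expired(self, now=None):
        current = time.time() if now is None else now
        return current >= self.written_at + self.ttl

    def read(self, now=None):
        # An expired column behaves as if it had been deleted.
        return None if self.is_expired(now) else self.value


col = ExpiringColumn("col2", "val2", ttl=60, now=1000.0)
before_expiry = col.read(now=1030.0)  # 30 seconds after the write
after_expiry = col.read(now=1060.0)   # 60 seconds after the write
```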

Counter column
• In an eventually consistent environment, old versions of
column values are overwritten by new ones, but
counters should be cumulative.
• Counter columns are intended to support
increment/decrement operations in an eventually
consistent environment without losing any of
them.

CounterColumn internals
CounterColumn structure:
name
…….
[
(replicaId1, counter1, logical clock1),
(replicaId2, counter2, logical clock2),
………………..
(replicaId3, counter3, logical clock3)
]

CounterColumn write (before)
UPDATE CounterCF SET count_me = count_me + 2
WHERE key = 'counter1'
[
(A, 10, 2),
(B, 3, 4),
(C, 6, 7)
]

CounterColumn write (after)
• A is the leader, so only A's tuple changes
[
(A, 10 + 2, 2 + 1),
(B, 3, 4),
(C, 6, 7)
]
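
The before/after states of this write can be reproduced with a small sketch. The function name apply_increment is an assumption for illustration; the sketch models only the tuple update shown on the slides, not replication to the other nodes.

```python
def apply_increment(shards, leader, delta):
    """Return new shards with `delta` applied to the leader's tuple only."""
    updated = []
    for replica_id, count, clock in shards:
        if replica_id == leader:
            # The leader adds the delta and advances its logical clock by one.
            updated.append((replica_id, count + delta, clock + 1))
        else:
            updated.append((replica_id, count, clock))
    return updated


before = [("A", 10, 2), ("B", 3, 4), ("C", 6, 7)]
after = apply_increment(before, leader="A", delta=2)
# after == [("A", 12, 3), ("B", 3, 4), ("C", 6, 7)]
```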

CounterColumn Read
All Memtables and SSTables are read using the
following algorithm:
• All tuples with the local replicaId are summed;
for each foreign replica, the tuple with the maximum
logical clock value is chosen.
• Counters of foreign replicas are updated during
read repair, during the replicate-on-write procedure, or
by AES (the anti-entropy service).

CounterColumn read - example
• Memtable – (A, 12, 4) (B, 3, 5) (C, 10, 3)
• SSTable1 – (A, 5, 3) (B, 1, 6) (C, 5, 4)
• SSTable2 – (A, 2, 2) (B, 2, 4) (C, 6, 2)
Result (taking, for each replica, the tuple with the highest logical clock):
(A, 12, 4) + (B, 1, 6) + (C, 5, 4) = 12 + 1 + 5 = 18
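
The arithmetic of this example can be checked with a short sketch. Note the assumption: like the example, it picks the tuple with the highest logical clock for every replica, and it does not model the summing of tuples that belong to the local replica. The function name merge_by_max_clock is illustrative.

```python
def merge_by_max_clock(sources):
    """For each replica, keep the (count, clock) pair with the highest clock."""
    latest = {}
    for shards in sources:
        for replica_id, count, clock in shards:
            if replica_id not in latest or clock > latest[replica_id][1]:
                latest[replica_id] = (count, clock)
    return latest


memtable = [("A", 12, 4), ("B", 3, 5), ("C", 10, 3)]
sstable1 = [("A", 5, 3), ("B", 1, 6), ("C", 5, 4)]
sstable2 = [("A", 2, 2), ("B", 2, 4), ("C", 6, 2)]

latest = merge_by_max_clock([memtable, sstable1, sstable2])
total = sum(count for count, _clock in latest.values())
# latest == {"A": (12, 4), "B": (1, 6), "C": (5, 4)}, total == 18
```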

VII. Working with Cassandra

Installing and launching Cassandra
• Download from
http://cassandra.apache.org/download/

• Launching the server:
bin/cassandra.bat (Windows)
bin/cassandra (Linux/Unix)
– use the "-f" option to run the server in the foreground, so that all of the server
logs are printed to standard output
– the server starts as a single-node cluster called "Test Cluster", listening
on port 9160

• Starting the command-line client interface:
bin/cassandra-cli.bat (Windows)
bin/cassandra-cli (Linux/Unix)
– you will see [username@keyspace] at the beginning of every line

Creating a cluster
In the configuration file cassandra.yaml, specify:
• seeds – the list of seed nodes for the cluster
• rpc_address and listen_address – network
addresses

Creating a cluster
• initial_token – defines the node’s token range
• auto_bootstrap – enables automatic migration of data
to the new node
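
A common way to choose initial_token values is to space them evenly around the ring. The sketch below assumes the RandomPartitioner, whose token space runs from 0 to 2**127; the partitioner is not named on these slides, so treat that as an assumption. The helper name balanced_tokens is illustrative.

```python
def balanced_tokens(node_count, token_space=2**127):
    """Evenly spaced initial_token values, one per node (assumes RandomPartitioner)."""
    return [i * token_space // node_count for i in range(node_count)]


tokens = balanced_tokens(2)
for index, token in enumerate(tokens):
    print(f"node {index}: initial_token = {token}")
```

For two nodes the second token begins with 850705, which matches the truncated value in the nodetool ring output for a cluster where each node owns 50% of the ring.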

nodetool ring
Use nodetool to view the ring configuration:
~$ nodetool -h localhost -p 8080 ring
Address         Status  State   Load      Owns    Range      Ring
                                                  850705…
10.203.71.154   Up      Normal  2.53 KB   50.00   0          |<--|
10.203.55.186   Up      Normal  1.33 KB   50.00   850705…    |-->|

Connecting to server
• Connect from the command line:
connect <HOSTNAME>/<PORT> [<USERNAME> '<PASSWORD>'];
Examples:
connect localhost/9160;
connect 127.0.0.1/9160 user 'password';
• Connect when starting the command-line client:
cassandra-cli
-h, --host <HOSTNAME>
-p, --port <PORT>
-k, --keyspace <KEYSPACE>
-u, --username <USERNAME>
-pw, --password <PASSWORD>

Describing environment
• show cluster name;
• show keyspaces;
• show api version;
• describe cluster;
• describe keyspace [<KEYSPACE>];

Create keyspace
• create keyspace <KEYSPACE>;
• create keyspace <KEYSPACE> with
<ATTR1> = <VAL1> and
<ATTR2> = <VAL2> ...;
• Attributes:
– placement_strategy
– strategy_options
– …

Create keyspace
Example:
create keyspace Keyspace1
with placement_strategy =
'org.apache.cassandra.locator.NetworkTopologyStrategy'
and strategy_options =
[{replication_factor: 4}];

Update keyspace
• Update the attributes of an existing keyspace:
update keyspace <KEYSPACE> with
<ATTR1> = <VAL1> ...;

Switch to keyspace
• use <KEYSPACE>;
• use <KEYSPACE> [<USERNAME> '<PASSWORD>'];
• If you don’t specify a username and password, the
credentials supplied to the ‘connect’ statement will
be used
• If the server doesn’t support authentication, it will
ignore the credentials

Switch to keyspace
• Example:
use Keyspace1 user1 'qwerty123';
Once you switch to a keyspace, you’ll see [user1@Keyspace1] at the
beginning of every line

Create column family
• create column family <COL_FAMILY>;
• create column family <COL_FAMILY> with
<ATTR1> = <VAL1> ...;
• Example:
create column family Users with
column_type = Super and
comparator = UTF8Type and
rows_cached = 1000;

Update column family
• Once a column family is created, you can update its
attributes:
update column family <COL_FAMILY> with
<ATTR1> = <VAL1> ...;

Writing data
• To write data, use the set command:
set Customers['ivan']['name'] = 'Ivan';
set Customers['makar']['info']['age'] = 96;

Reading data
• To read data, use the get command:
get Customers['ivan']['name'];
- this will display 'Ivan'
get Customers['makar'];
- this will display all columns for key 'makar'

Reading data
• To list a range of rows, use the list command:
list Customers;
list Customers[a:];
list Customers[a:c] limit 40;
- you can limit the number of rows displayed (default - 100)

Reading data
• To get the number of columns, use the count command:
count Customers['ivan'];
- this will display the number of columns for key 'ivan'

Deleting data
• To delete a row, a column, or a subcolumn, use the del
command:
del Customers['ivan'];
- this will delete all columns for key 'ivan'
del Customers['ivan']['name'];
- this will delete the column 'name' for key 'ivan'
del Customers['ivan']['accounts']['2312784829312343'];
- this will delete the subcolumn with that account number from the 'accounts'
column for key 'ivan'
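
One way to picture what set, get, count, and del operate on is a map of row keys to columns, where a super column is itself a map of subcolumns. The sketch below is a simplified in-memory model under that assumption. The helpers cli_set and cli_del and the value "open" are hypothetical and exist only for this illustration; the model ignores timestamps, tombstones, and the fact that a real column family is either Standard or Super.

```python
def cli_set(column_family, key, *path_and_value):
    """Mimic `set CF[key][col] = v` and `set CF[key][super][sub] = v`."""
    *path, value = path_and_value
    node = column_family.setdefault(key, {})
    for name in path[:-1]:
        node = node.setdefault(name, {})
    node[path[-1]] = value


def cli_del(column_family, key, *path):
    """Mimic `del CF[key]`, `del CF[key][col]`, and `del CF[key][super][sub]`."""
    if not path:
        column_family.pop(key, None)
        return
    node = column_family.get(key, {})
    for name in path[:-1]:
        node = node.get(name, {})
    node.pop(path[-1], None)


customers = {}
cli_set(customers, "ivan", "name", "Ivan")
cli_set(customers, "ivan", "accounts", "2312784829312343", "open")
cli_set(customers, "makar", "info", "age", 96)

name = customers["ivan"]["name"]        # get Customers['ivan']['name']
column_count = len(customers["ivan"])   # count Customers['ivan']

cli_del(customers, "ivan", "accounts", "2312784829312343")
accounts_after_delete = dict(customers["ivan"]["accounts"])
cli_del(customers, "ivan", "name")
cli_del(customers, "ivan")
```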

Deleting data
• To delete all data in a column family, use the truncate
command:
truncate Customers;

Drop column family or keyspace
drop column family Customers;
drop keyspace Keyspace1;

Comparators and validators
• Comparators – compare column names
• Validators – validate column values

Comparators and validators
• You can specify a comparator for the column family
and for all subcolumns in the column family (one for all)
• You can specify validators for each known
column of the column family
• You can specify a default validator for the column
family, used for columns that have no validator
specified
• You can specify a key validator, which validates
row keys

Attributes of column family
– column_type: can be Standard or Super
(default - Standard)
– comparator: specifies how column names will be
compared for sort order
– column_metadata: defines the validation and indexes
for known columns
– default_validation_class: validator to use for values in
columns which are not listed in the column_metadata.
(default – BytesType)
– key_validation_class: validator for keys

Column metadata
You can define validators for each known column in the
family
create column family User
with column_metadata = [
{column_name: name, validation_class: UTF8Type},
{column_name: age, validation_class: IntegerType},
{column_name: birth, validation_class: UTF8Type}
];
Columns not listed in this section are validated with
default_validation_class
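
The lookup order described here (a per-column validator first, then the default) can be sketched as follows. The validator functions are simplified stand-ins written for this example; they are not the behaviour of Cassandra's UTF8Type, IntegerType, or BytesType classes, which validate serialized bytes.

```python
def utf8_validator(value):
    return isinstance(value, str)


def integer_validator(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def bytes_validator(value):
    return isinstance(value, (bytes, bytearray))


# Mirrors the column_metadata of the User column family above.
column_metadata = {
    "name": utf8_validator,
    "age": integer_validator,
    "birth": utf8_validator,
}
default_validator = bytes_validator  # stands in for the BytesType default


def validate(column_name, value):
    """Use the column's own validator if one is listed, else the default."""
    validator = column_metadata.get(column_name, default_validator)
    return validator(value)


valid_age = validate("age", 96)
invalid_age = validate("age", "ninety-six")
unlisted_column = validate("avatar", b"\x89PNG")
```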

Secondary indexes
• Allow queries by column value:
get users where name = 'Some user';
• Can be created in the background

Creating index
• Define it in the column metadata.
For example, in cassandra-cli:
create column family users with
comparator=UTF8Type and column_metadata=[{
column_name: birth_date,
validation_class: LongType,
index_type: KEYS
}];

Some restrictions
• Cassandra uses hash indexes instead of B-tree
indexes.
Thus, the where clause must contain at least one indexed
column compared with the "=" operator.
So, you can’t use
get users where birth_date > 1970;
but you can use
get users where birth_date = 1990 and karma > 50;
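
Why an equality term is required can be seen in a small model of a hash index. The sketch assumes the index is a plain map from column value to row keys; the sample rows and the query helper are made up for illustration. An equality lookup finds candidate rows directly, and the remaining condition is then applied as a filter; a range condition alone has no hash bucket to start from.

```python
from collections import defaultdict

# Hypothetical sample rows, invented for this example.
rows = {
    "u1": {"birth_date": 1990, "karma": 70},
    "u2": {"birth_date": 1990, "karma": 10},
    "u3": {"birth_date": 1985, "karma": 90},
}

# Hash index on birth_date: value -> set of row keys.
birth_index = defaultdict(set)
for key, columns in rows.items():
    birth_index[columns["birth_date"]].add(key)


def query(birth_date, min_karma):
    """Model of: get users where birth_date = X and karma > Y."""
    candidates = birth_index.get(birth_date, set())  # equality: one hash lookup
    return sorted(k for k in candidates if rows[k]["karma"] > min_karma)


matches = query(1990, 50)
```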

Index types
• KEYS
• BITMAP (will be supported in future releases)
Id  Gender       Bitmaps
                 F  M
1   Female       1  0
2   Male         0  1
3   Male         0  1
4   Unspecified  0  0
5   Female       1  0
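
The table can be reproduced with a short sketch that builds one bitmap per indexed value, with one bit per row. This illustrates the general bitmap-index idea only; the helper names are hypothetical, and it says nothing about how Cassandra would implement the feature.

```python
genders = {1: "Female", 2: "Male", 3: "Male", 4: "Unspecified", 5: "Female"}


def build_bitmap(rows, value):
    """One bit per row, in Id order: 1 if the row holds `value`, else 0."""
    return [1 if rows[row_id] == value else 0 for row_id in sorted(rows)]


def matching_ids(rows, bitmap):
    """Translate set bits back into row Ids."""
    return [row_id for row_id, bit in zip(sorted(rows), bitmap) if bit]


female_bits = build_bitmap(genders, "Female")  # the F column of the table
male_bits = build_bitmap(genders, "Male")      # the M column of the table
female_ids = matching_ids(genders, female_bits)
```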

Resources
• Home of the Apache Cassandra project:
http://cassandra.apache.org/
• Apache Cassandra Wiki: http://wiki.apache.org/cassandra/
• Documentation provided by DataStax:
http://www.datastax.com/docs/0.8/
• A good explanation of creating secondary indexes:
http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html
• Eben Hewitt, "Cassandra: The Definitive Guide", O’Reilly,
2010, ISBN: 978-1-449-39041-9

Authors
• Lev Sivashov - lsivashov@gmail.com
• Andrey Lomakin - lomakin.andrey@gmail.com,
twitter: @Andrey_Lomakin
LinkedIn: http://www.linkedin.com/in/andreylomakin
• Artem Orobets - enisher@gmail.com
twitter: @Dr_EniSh
• Anton Veretennik - tennik@gmail.com
