Cassandra Tutorial
Apache Cassandra in Action




 Jonathan Ellis
 jbellis@datastax.com / @spyced
Why Cassandra?
•
    Relational databases are not designed to
    scale
•
    B-trees are slow
    –
        and require read-before-write
(“The eBay Architecture,” Randy Shoup and Dan Pritchett)
[Diagram: Writer, Reader, Memtable, Commitlog]
(“The Log-Structured Merge-Tree”; “Bigtable: A Distributed Storage System for Structured Data”)
Bigtable, 2006
Dynamo, 2007
OSS, 2008
Incubator, 2009
TLP, 2010
Cassandra in production
•
    Digital Reasoning: NLP + entity analytics
•
    OpenWave: enterprise messaging
•
    OpenX: largest publisher-side ad network in the
    world
•
    Cloudkick: performance data & aggregation
•
    SimpleGEO: location-as-API
•
    Ooyala: video analytics and business intelligence
•
    ngmoco: massively multiplayer game worlds
FUD?
•
    “Cassandra is only appropriate for
    unimportant data.”
Durability
•
    Write to commitlog
    –
        fsync is cheap since it’s append-only
•
    Write to memtable
•
    [amortized] flush memtable to sstable
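The three steps above can be sketched as a toy write path. This is a minimal illustration, not Cassandra's implementation: the class name `TinyStore`, the `flush_threshold` knob, and the tab-separated log format are all assumptions made for the example.

```python
import os
import tempfile


class TinyStore:
    """Toy write path: commit log append, memtable update, occasional flush."""

    def __init__(self, directory, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.memtable = {}
        self.sstables = []  # each flushed table is a sorted list of (key, value)
        self.log = open(os.path.join(directory, "commit.log"), "a")

    def write(self, key, value):
        # 1. Durability first: append to the log and force it to disk.
        self.log.write("%s\t%s\n" % (key, value))
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Then update the in-memory table.
        self.memtable[key] = value
        # 3. The flush cost is amortized over a whole batch of writes.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # A flushed table is written once, sorted by key, and never modified.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def close(self):
        self.log.close()


store = TinyStore(tempfile.mkdtemp())
for name in ["thobbs", "driftx", "zznate", "jbellis"]:
    store.write(name, "password")
store.close()
```

After the third write the memtable is flushed as one sorted batch, and the fourth write starts a fresh memtable. No write ever reads existing data first.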
SSTable format, briefly


       <key 127>
       <key 255>             <row data 0>
       ...                   <row data 1>
                             ...
                             <row data 127>
                             ...
                             <row data 255>
                             ...


                   Sorted [clustered] by row key
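Because rows are sorted by key, a small sampled index is enough to find any row. The sketch below assumes an index entry for every 128th row, which is suggested by the key numbers in the diagram. It does not reproduce the actual on-disk layout, and the function names are made up for the example.

```python
import bisect


def build_index(rows, interval=128):
    """rows: list of (key, data) sorted by key. Returns sampled (key, position) pairs."""
    return [(rows[i][0], i) for i in range(0, len(rows), interval)]


def lookup(rows, index, key, interval=128):
    """Binary-search the sampled index, then scan at most one interval of rows."""
    sampled_keys = [k for k, _ in index]
    slot = bisect.bisect_right(sampled_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first row
    start = index[slot][1]
    for k, data in rows[start:start + interval]:
        if k == key:
            return data
        if k > key:
            break  # sorted order: the key cannot appear later
    return None


rows = [("key%04d" % i, "row data %d" % i) for i in range(300)]
index = build_index(rows)
found = lookup(rows, index, "key0200")
```

The index holds only three entries for 300 rows, so it stays small enough to keep in memory while the row data stays on disk.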
Scaling
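[Ring diagram: nodes A, L, T, W]
[Ring diagram: nodes A, F, L, T, W]
[Ring diagram: nodes A, F, L, T, W; range (A-L] labeled]
[Ring diagram: nodes A, F, L, T, W; ranges (A-F] and (F-L] labeled]
[Ring diagram: nodes A, F, L, T, W; key “C” placed on the ring]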
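The ring diagrams can be read as a lookup rule: a key belongs to the first node at or after it on the ring, so each node owns the range (previous node, itself]. A minimal sketch, assuming single-letter tokens as in the diagrams (real tokens are hash values) and a hypothetical `owner` helper:

```python
import bisect


def owner(key_token, ring):
    """Return the node whose range (previous, node] contains key_token."""
    tokens = sorted(ring)
    i = bisect.bisect_left(tokens, key_token)
    return tokens[i % len(tokens)]  # wrap around past the last token


before = ["A", "L", "T", "W"]
after = ["A", "F", "L", "T", "W"]  # F has joined between A and L

owner_before = owner("C", before)
owner_after = owner("C", after)
```

Adding F only splits the range (A-L] into (A-F] and (F-L]. Keys owned by the other nodes do not move.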
Reliability
•
    No single points of failure
•
    Multiple datacenters
•
    Monitorable
Some headlines
•
    “Resyncing Broken MySQL Replication”
•
    “How To Repair MySQL Replication”
•
    “Fixing Broken MySQL Database Replication”
•
    “Replication on Linux broken after db restore”
•
    “MySQL :: Repairing broken replication”
Good architecture solves multiple
problems at once
•
    Availability in single datacenter
•
    Availability in multiple datacenters
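[Ring diagram: nodes A, F, L, P, T, U, W, Y; key “C”]
[Ring diagram: same ring with key “C”; one node marked X, “hint” label next to T]
[Ring diagram: nodes A, F, L, P, T, U, W, Y]
[Ring diagram: same ring with key “C”]
[Ring diagram: same ring with key “C”]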
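One way to read these diagrams: a key is stored on several consecutive nodes of the ring, and a write that cannot reach a replica leaves a hint to be delivered later. The sketch below assumes a replication factor of 3 and simple clockwise placement. Both are assumptions for illustration, and the function names are hypothetical.

```python
import bisect

RING = ["A", "F", "L", "P", "T", "U", "W", "Y"]


def replicas(key_token, ring=RING, n=3):
    """Walk clockwise from the key and take the first n nodes."""
    tokens = sorted(ring)
    first = bisect.bisect_left(tokens, key_token)
    return [tokens[(first + i) % len(tokens)] for i in range(min(n, len(tokens)))]


def write(key_token, down, ring=RING, n=3):
    """Split a key's replicas into those written now and those needing a hint."""
    targets = replicas(key_token, ring, n)
    written = [node for node in targets if node not in down]
    hinted = [node for node in targets if node in down]
    return written, hinted


all_up = write("C", down=set())
one_down = write("C", down={"L"})
```

With every node up, key “C” lands on F, L, and P. With L down, the write still succeeds on F and P, and L is recorded as owed a hinted write.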
Tuneable consistency
•
    ONE, QUORUM, ALL
•
    R+W>N
•
    Choose availability vs consistency (and latency)
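The R+W>N rule is simple arithmetic: if the number of replicas read (R) plus the number written (W) exceeds the replica count (N), every read set overlaps every write set. A small sketch, with N=3 chosen only as an example:

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def overlaps(r, w, n):
    """True when every read set must intersect every write set (R + W > N)."""
    return r + w > n


N = 3  # example replication factor
LEVELS = {"ONE": 1, "QUORUM": quorum(N), "ALL": N}

quorum_both = overlaps(LEVELS["QUORUM"], LEVELS["QUORUM"], N)
one_both = overlaps(LEVELS["ONE"], LEVELS["ONE"], N)
read_one_write_all = overlaps(LEVELS["ONE"], LEVELS["ALL"], N)
```

QUORUM reads with QUORUM writes overlap and still tolerate one replica being down. ONE with ONE is fastest but may return stale data. ONE with ALL overlaps, but writes fail if any replica is unavailable.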
Monitorable
JMX
OpsCenter
When do you need Cassandra?
•
    Ian Eure: “If you’re deploying memcache on top of your
    database, you’re inventing your own ad-hoc, difficult to
    maintain NoSQL data store”
Not Only SQL
•
    Curt Monash: “ACID-compliant transaction integrity
    commonly costs more in terms of DBMS licenses and many other
    components of TCO (Total Cost of Ownership) than [scalable
    NoSQL]. Worse, it can actually hurt application uptime,
    by forcing your system to pull in its horns and stop functioning in the
    face of failures that a non-transactional system might smoothly work
    around. Other flavors of “complexity can be a bad thing” apply as
    well. Thus, transaction integrity can be more trouble
    than it’s worth.” [Curt’s emphasis]
Keyspaces & ColumnFamilies
•
    Conceptually, like “schemas” and “tables”
Inside CFs, columns are dynamic
•
    Twitter: “Fifteen months ago, it took two
    weeks to perform ALTER TABLE on the
    statuses [tweets] table.”
ColumnFamilies
•
    Static
    –
        Object data
•
    Dynamic
    –
        Precalculated query results
“static” columnfamilies

                      Users
   zznate    Password: *    Name: Nate

   driftx    Password: *   Name: Brandon

   thobbs    Password: *    Name: Tyler

   jbellis   Password: *   Name: Jonathan   Site: riptano.com
“dynamic” columnfamilies

                     Following
zznate    driftx:   thobbs:

driftx

thobbs    zznate:

jbellis   driftx:   mdennis:   pcmanus:   thobbs:   xedin:   zznate:
Inserting
•
    Really “insert or update”
•
    Not a key/value store – update as much of
    the row as you want
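“Insert or update” can be pictured as merging columns into a row, with the newest timestamp winning per column. The sketch below models a column family as nested dicts. It is an illustration only: the `insert` helper is hypothetical, and timestamp ties are resolved in favor of the later call, which is a simplification.

```python
def insert(store, row_key, columns, timestamp):
    """Merge columns into a row; per column, the highest timestamp wins."""
    row = store.setdefault(row_key, {})
    for name, value in columns.items():
        current = row.get(name)
        if current is None or timestamp >= current[1]:
            row[name] = (value, timestamp)


users = {}
insert(users, "jbellis", {"Password": "*", "Name": "Jonathan"}, timestamp=1)
insert(users, "jbellis", {"Site": "riptano.com"}, timestamp=2)      # adds one column
insert(users, "jbellis", {"Name": "Jonathan Ellis"}, timestamp=3)   # overwrites one column
insert(users, "jbellis", {"Name": "stale"}, timestamp=0)            # older write is ignored
```

No call reads the row first, and each call touches only the columns it names, which is what separates this from a plain key/value put.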
Example: twissandra
•
    http://twissandra.com
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username VARCHAR(64),
    password VARCHAR(64)
);

CREATE TABLE following (
    user INTEGER REFERENCES users(id),
    followed INTEGER REFERENCES users(id)
);

CREATE TABLE tweets (
    id INTEGER,
    user INTEGER REFERENCES users(id),
    body VARCHAR(140),
    timestamp TIMESTAMP
);
Cassandrified
create column family users with comparator = UTF8Type
    and column_metadata = [{column_name: password, validation_class: UTF8Type}];

create column family tweets with comparator = UTF8Type
    and column_metadata = [{column_name: body, validation_class: UTF8Type},
                           {column_name: username, validation_class: UTF8Type}];

create column family friends with comparator = UTF8Type;
create column family followers with comparator = UTF8Type;

create column family userline with comparator = LongType
    and default_validation_class = UUIDType;
create column family timeline with comparator = LongType
    and default_validation_class = UUIDType;
Connecting
import pycassa

CLIENT = pycassa.connect_thread_local('Twissandra')
USER = pycassa.ColumnFamily(CLIENT, 'User')
User
RowKey: ericflo
=> (column=password, value=****,
timestamp=1289446382541473)

-------------------
RowKey: jbellis
=> (column=password, value=****,
timestamp=1289446438490709)


uname = 'jericevans'
password = '**********'

columns = {'password': password}

USER.insert(uname, columns)
Natural keys vs surrogate
Friends and Followers
RowKey: ericflo

=> (column=jbellis, value=1289446467611029,
timestamp=1289446467611064)

=> (column=b6n, value=1289446467611031,
timestamp=1289446467611080)

to_uname = 'ericflo'

FRIENDS.insert(uname, {to_uname: time.time()})
FOLLOWERS.insert(to_uname, {uname: time.time()})
zznate    driftx:   thobbs:

driftx

thobbs    zznate:

jbellis   driftx:   mdennis:   pcmanus:   thobbs:   xedin:   zznate:
Tweets
RowKey: 92dbeb50-ed45-11df-a6d0-000c29864c4f

=> (column=body, value=Four score and seven years ago,
timestamp=1289446891681799)

=> (column=username, value=alincoln,
timestamp=1289446891681799)

-------------------
RowKey: d418a66e-edc5-11df-ae6c-000c29864c4f

=> (column=body, value=Do geese see God?,
timestamp=1289501976713199)

=> (column=username, value=pdrome,
timestamp=1289501976713199)
Userline
RowKey: ericflo

=> (column=1289446393708810, value=6a0b4834-ed44-11df-bc31-000c29864c4f, timestamp=1289446393710212)

=> (column=1289446397693831, value=6c6b5916-ed44-11df-bc31-000c29864c4f, timestamp=1289446397694646)

=> (column=1289446891681780, value=92dbeb50-ed45-11df-a6d0-000c29864c4f, timestamp=1289446891685065)

=> (column=1289446897315887, value=96379f92-ed45-11df-a6d0-000c29864c4f, timestamp=1289446897317676)
Userline


zznate    1289847840615: 3f19757a-c89d...   1289847887086: a20fcf52-595c...


driftx

thobbs    1289847887086: a20fcf52-595c...


jbellis   1289847840615: 3f19757a-c89d...   1289847844275: 844e75e2-b546...
Timeline
RowKey: ericflo

=> (column=1289446393708810, value=6a0b4834-ed44-11df-bc31-000c29864c4f, timestamp=1289446393710212)

=> (column=1289446397693831, value=6c6b5916-ed44-11df-bc31-000c29864c4f, timestamp=1289446397694646)

=> (column=1289446891681780, value=92dbeb50-ed45-11df-a6d0-000c29864c4f, timestamp=1289446891685065)

=> (column=1289446897315887, value=96379f92-ed45-11df-a6d0-000c29864c4f, timestamp=1289446897317676)
Adding a tweet
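import time
import uuid

tweet_id = str(uuid.uuid1())
body = '@ericflo thanks for Twissandra, it helps!'
ts = int(time.time() * 1e6)  # microseconds; used as the column name

# Store the tweet itself, keyed by its id
columns = {'username': uname, 'body': body}
TWEET.insert(tweet_id, columns)

# Index the tweet in the author's userline and timeline
columns = {ts: tweet_id}
USERLINE.insert(uname, columns)

TIMELINE.insert(uname, columns)
# Fan out to each follower's timeline
for follower_uname in FOLLOWERS.get(uname, column_count=5000):
    TIMELINE.insert(follower_uname, columns)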
Reads
timeline = USERLINE.get(uname, column_reversed=True)
tweets = TWEET.multiget(timeline.values())


start = request.GET.get('start')
limit = NUM_PER_PAGE

timeline = TIMELINE.get(uname, column_start=start,
                        column_count=limit, column_reversed=True)
tweets = TWEET.multiget(timeline.values())
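The paging parameters used above can be modeled in memory to show what a reversed slice returns. This is a sketch of the semantics only, not pycassa itself: `get_slice` is a hypothetical stand-in, the inclusive `column_start` is this sketch's choice, and the row data is the ericflo userline shown earlier.

```python
def get_slice(row, column_start=None, column_count=100, column_reversed=False):
    """Return up to column_count (name, value) pairs in comparator order."""
    names = sorted(row, reverse=column_reversed)
    if column_start is not None:
        if column_reversed:
            names = [n for n in names if n <= column_start]  # start is inclusive
        else:
            names = [n for n in names if n >= column_start]
    return [(n, row[n]) for n in names[:column_count]]


timeline = {
    1289446393708810: "6a0b4834-ed44-11df-bc31-000c29864c4f",
    1289446397693831: "6c6b5916-ed44-11df-bc31-000c29864c4f",
    1289446891681780: "92dbeb50-ed45-11df-a6d0-000c29864c4f",
    1289446897315887: "96379f92-ed45-11df-a6d0-000c29864c4f",
}

# Newest two tweets first, then the next page starting at an older column.
page1 = get_slice(timeline, column_count=2, column_reversed=True)
page2 = get_slice(timeline, column_start=1289446397693831,
                  column_count=2, column_reversed=True)
```

Because column names are timestamps and the comparator keeps them sorted, “the latest N tweets” is a single contiguous slice of one row.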
Programmatically
•
    Don't use thrift directly
•
    Higher level clients have a lot of features you
    want
    –
        Knowledge about data types
    –
        Connection pooling
    –
        Automatic retries
    –
        Logging
Raw thrift API: Connecting
def get_client(host='127.0.0.1', port=9170):
    socket = TSocket.TSocket(host, port)
    transport = TTransport.TBufferedTransport(socket)
    transport.open()
    protocol = TBinaryProtocol.TBinaryProtocolAccelerated(transport)
    client = Cassandra.Client(protocol)
    return client
Raw thrift API: Inserting
data = {'id': useruuid, ...}
columns = [Column(k, v, time.time())
           for (k, v) in data.items()]
mutations = [Mutation(ColumnOrSuperColumn(column=c))
             for c in columns]
rows = {useruuid: {'User': mutations}}

client.batch_mutate('Twissandra', rows, ConsistencyLevel.ONE)
API layers
    libpq    Thrift
    JDBC     Hector
    JPA      Hector object-mapper
Running twissandra
•
    Login: notroot/notroot
    –
        (root/riptano)


•
    cd twissandra
•
    python manage.py runserver &
•
    Navigate to http://127.0.0.1:8000
•
    Login as jim/jim, tom/tom, or create your own
One more thing
•
    !PUBLIC! userline
Exercise 1
•
    $ cassandra-cli --host localhost
•
    ] use twissandra;
    ] help;
    ] help list;
    ] help get;
    ] help del;
•
    Delete the most recent tweet
    –
        How would you find this w/o looking at the UI?
Exercise 2
•
    User jim is following user tom, but
    twissandra doesn't populate Timeline with
    tweets from before the follow action.
•
    Insert a tweet from tom before the follow
    action into jim's timeline
Secondary (column) indexes
Exercise 3
•
    Add a state column to the Tweet column
    family definition, with an index (index_type
    KEYS).
    –
        Hint: a no-op update column family on Tweet would be
        update column family Tweet with
        column_metadata=[{column_name:body,
        validation_class:UTF8Type}, {column_name:username,
        validation_class:UTF8Type}]
•
    Set the state column on several tweets to TX.
    Select them using get … where.
Language support
•
    Python
    –
        pycassa
    –
        telephus
•
    Ruby
    –
        Speed is a negative
•
    Java
    –
        Hector
•
    PHP
    –
        phpcassa
Done yet?
•
    Still doing 1+N queries per page
•
    Solution: Supercolumns
Applying SuperColumns to Twissandra

jbellis   1289847840615       1289847844275       1289847887086
          Id:                 Id:                 Id:
          3f19757a-c89d...    844e75e2-b546...    a20fcf52-595c...
          uname:              uname:              uname:
          zznate              driftx              zznate
          body:               body:               body:
          O stone be not so   Rise to vote sir    I prefer pi
          ...                 ...                 ...
Supercolumns: limitations
•
    Requires reading an entire SC (not the entire
    row) from disk even if you just want one
    subcolumn
UUIDs
•
    Column names should be uuids, not longs,
    to avoid collisions
•
    Version 1 UUIDs can be sorted by time
    (“TimeUUID”)
•
    Any UUID can be sorted by its raw bytes
    (“LexicalUUID”)
    –
        Usually Version 4
    –
        Slightly less overhead
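The difference between the two orderings can be seen with Python's standard `uuid` module. In a version 1 UUID the low bits of the timestamp come first in the byte layout, so sorting by raw bytes does not generally give time order; a time-aware comparison has to use the embedded timestamp. A minimal sketch:

```python
import uuid

# Version 1 UUIDs embed a 60-bit timestamp, exposed as the .time attribute.
ids = [uuid.uuid1() for _ in range(3)]

by_time = sorted(ids, key=lambda u: u.time)    # what a time-based comparator does
by_bytes = sorted(ids, key=lambda u: u.bytes)  # what a raw-bytes comparator does

# Version 4 UUIDs are random; only a raw-bytes ordering is meaningful for them.
random_id = uuid.uuid4()
```

Using a UUID rather than a bare long as the column name keeps two tweets created in the same microsecond from overwriting each other.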
Lucandra
•
    What documents contain term X?
    –
        … and term Y?
    –
        … or start with Z?
Fields and Terms

<doc>
  <field name="title">apache talk</field>
  <field name="date">20110201</field>
</doc>


   field     term       freq     position
   title    apache       1          0
   title      talk       1          1
   date    20110201      1          0
Lucandra ColumnFamilies
create column family documents with comparator = BytesType;

create column family terminfo with column_type = Super and
comparator = BytesType and subcomparator = BytesType;
Lucandra data
Document Key      col name         value
"documentId" => { fieldName , value }

Term Key          col name         value
"field/term" => { documentId , position vector }
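The term layout above is an inverted index: one row per "field/term", one column per document, with the positions as the value. The sketch below builds that shape from the example document using plain dicts. The document id "doc1" and the helper names are made up for illustration, and this is not Lucandra's code.

```python
from collections import defaultdict


def index_document(terminfo, doc_id, fields):
    """Add one document to a {"field/term": {doc_id: [positions]}} index."""
    for field, text in fields.items():
        for position, term in enumerate(text.split()):
            terminfo[field + "/" + term].setdefault(doc_id, []).append(position)


def docs_with_all(terminfo, *keys):
    """Documents that contain every listed field/term key (term X and term Y)."""
    sets = [set(terminfo.get(k, {})) for k in keys]
    return set.intersection(*sets) if sets else set()


terminfo = defaultdict(dict)
index_document(terminfo, "doc1", {"title": "apache talk", "date": "20110201"})

both = docs_with_all(terminfo, "title/apache", "title/talk")
neither = docs_with_all(terminfo, "title/apache", "title/missing")
```

“What documents contain term X?” becomes a single row read, and “X and Y” is an intersection of two rows' column names. A prefix query (“start with Z”) needs a scan over a range of row keys instead.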
Lucandra queries
•
    get_slice
•
    get_range_slices
•
    No silver bullet
FAQ: counting
•
    UUIDs + batch process
•
    column-per-app-server
•
    counter API (after 1.0 is out)
Locking
•
    Zookeeper
•
    Cages: http://code.google.com/p/cages/
•
    Not suitable for multi-DC
UUIDs

counter1   672e34a2-ba33...   b681a0b1-58f2...


counter2   3f19757a-c89d...   844e75e2-b546...   a20fcf52-595c...




counter1    aggregated: 27


counter2    aggregated: 42
Column per appserver

counter1   672e34a2-ba33: 12    b681a0b1-58f2: 4   1872c1c2-38f1: 9


counter2   3f19757a-c89d: 7    844e75e2-b546: 11
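The column-per-app-server layout can be sketched as follows: each server writes only its own column, and a read sums the row. The data is taken from the diagram above; the helper names are hypothetical, and the in-memory dict stands in for the column family.

```python
def increment(counters, counter, server_id, amount=1):
    """Each app server only ever writes its own column, so writers never conflict."""
    row = counters.setdefault(counter, {})
    row[server_id] = row.get(server_id, 0) + amount


def read(counters, counter):
    """The counter's value is the sum of all per-server columns."""
    return sum(counters.get(counter, {}).values())


counters = {
    "counter1": {"672e34a2-ba33": 12, "b681a0b1-58f2": 4, "1872c1c2-38f1": 9},
    "counter2": {"3f19757a-c89d": 7, "844e75e2-b546": 11},
}

increment(counters, "counter2", "844e75e2-b546")
total1 = read(counters, "counter1")
total2 = read(counters, "counter2")
```

In a real deployment each server would track its own running count locally and overwrite its column, so no read-before-write against the cluster is needed.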
Counter API

 key   counter1: (14, 13, 9)   counter2: (11, 15, 17)
General Tips
●
    Start with queries, work backwards
●
    Avoid storing extra “timestamp” columns
●
    Insert instead of check-then-insert
●
    Use client-side clock to your advantage
●
    use TTL
●
    Learn to love wide rows