Cassandra Codebase 2011
Codebase 2011: Getting to know the codebase
Gary Dusbabek (@gdusbabek)
Questions?
Outline
  • How to contribute
  • Internals
  • Some thoughts
How to Contribute
How to Contribute
  • http://wiki.apache.org/cassandra/HowToContribute
  • JIRA: “lhf” label (low-hanging fruit)
  • Scratch your itch
How to Contribute
  • Run the tests
      • ant test
      • nosetests test/system/test_thrift_server.py
How to Contribute
  • http://wiki.apache.org/cassandra/CodeStyle
  • Avoid:
      • Reformatting white space
      • Renaming things everywhere
      • Unrelated changes
How to Contribute
  • Use git
  • Attach patches
      • git format-patch output as JIRA attachments
      • Group them sensibly
How to Contribute
  • Someone will review your code
      • Usually a committer
  • Persistence helps
  • Don’t get your feelings hurt
  • It usually takes a few rounds
How to Contribute
  • Participate!
      • #cassandra-dev on freenode
      • dev@cassandra.apache.org
Internals
Services
  • Ring operations (StorageService)
  • Storage operations (StorageProxy)
Startup Sequence
  • bin/cassandra
  • Finds cassandra.in.sh
      • $CLASSPATH (mandatory)
      • $CASSANDRA_HOME
      • $CASSANDRA_CONF (mandatory)
  • Executes $CASSANDRA_CONF/cassandra-env.sh
      • Sets heap sizes (GC tuning goes here!)
o.a.c.thrift.CassandraDaemon
AbstractCassandraDaemon
  • ACD.setup():
      • Reads configuration: DatabaseDescriptor
      • Loads schema: DD.loadSchemas()
      • Scrubs directories
      • Initializes storage (keyspaces + CFs)
      • Commit log recovery: CL.recover()
      • StorageService.initServer() -> StorageService.joinTokenRing()
Attn Tinkerers!
  • Abstracted initialization of transport.
  • Handy if you’re experimenting with transports/RPC.
  • Just extend AbstractCassandraDaemon and make sure that class is started up via bin/cassandra.
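The extension point described above can be sketched as a template method: shared storage setup in a base class, with only the transport left to the subclass. This is a minimal sketch under stated assumptions. SketchDaemon, EchoTransportDaemon, startTransport() and activate() are hypothetical names for illustration and are not the real AbstractCassandraDaemon API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AbstractCassandraDaemon: storage setup is shared,
// only the transport is left to subclasses.
abstract class SketchDaemon {
    protected final List<String> log = new ArrayList<>();

    // Shared initialization, in the order listed on the AbstractCassandraDaemon slide.
    final void setup() {
        log.add("read config");
        log.add("load schema");
        log.add("scrub directories");
        log.add("init storage");
        log.add("recover commit log");
        log.add("join ring");
    }

    // Subclasses plug in their own RPC transport here (assumed hook name).
    abstract void startTransport();

    // Runs the shared setup, then starts whatever transport the subclass provides.
    final List<String> activate() {
        setup();
        startTransport();
        return log;
    }
}

// An experimental transport: only this class changes.
final class EchoTransportDaemon extends SketchDaemon {
    @Override
    void startTransport() {
        log.add("start echo transport");
    }
}
```

Calling new EchoTransportDaemon().activate() runs the six shared steps and then the custom transport step.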
o.a.c.thrift.CassandraServer
  • Implements Thrift interface methods (the API).
  • Start here when trying to understand the read/write path and RPC.
Configuration
  • DatabaseDescriptor
      • Side effect of ACD.setup()
      • Reads config settings from YAML
      • Defines system tables
      • Changes regularly
      • I hate this code. Please fix it.
Main Singletons
  • StorageService
  • StorageProxy
  • MessagingService
  • CompactionManager
  • StageManager
  • MigrationManager
Did you just say ‘Singletons?’
Main Singletons
  • StorageService
  • StorageProxy
  • MessagingService
  • CompactionManager
  • StageManager
  • MigrationManager
JMX MBeans
  • Tooling supplied by MBeans
  • Anything that does measurable/configurable work is tooled
      • Thread pools
      • Compaction
      • Hinted handoff
      • Streaming
      • Storage
      • Commit log
StorageService
  • initServer() -> joinTokenRing()
      • Starts gossip
      • Starts MessagingService
      • Negotiates bootstrap
  • Many ring operations live here.
  • Repository of ring topology
      • TokenMetadata (quasi-singleton via SS.tokenMetadata_)
      • Partitioner instance is also here
MessagingService
  • Verb handlers live here (initialized from SS).
      • Main event handlers, haven’t changed much.
  • Socket listener
      • 2 threads per ring node
  • Message gateway
      • Emitted from MessageProducer impls
      • MS.sendRR()
      • MS.sendOneWay()
      • MS.receive()
  • Messages are versioned now (0.8)
      • IncomingTCPConnection
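The verb-handler idea can be sketched as a registry that routes each incoming message to the handler registered for its verb. This is a minimal sketch, assuming a simplified message shape (sender plus raw body). VerbDispatchSketch, its Verb enum and the VerbHandler interface are hypothetical and do not reproduce the real MessagingService signatures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of verb dispatch; not the real MessagingService API.
final class VerbDispatchSketch {
    enum Verb { MUTATION, READ, GOSSIP }

    // One handler per verb; the body is left as raw bytes for simplicity.
    interface VerbHandler {
        void doVerb(String from, byte[] body);
    }

    private final Map<Verb, VerbHandler> handlers = new HashMap<>();

    // Per the slide, handlers are registered from StorageService at startup.
    void registerVerbHandler(Verb verb, VerbHandler handler) {
        handlers.put(verb, handler);
    }

    // Route an incoming message to the handler registered for its verb.
    void receive(Verb verb, String from, byte[] body) {
        VerbHandler handler = handlers.get(verb);
        if (handler == null) {
            throw new IllegalStateException("no handler for " + verb);
        }
        handler.doVerb(from, body);
    }
}
```

Adding a new message type then means adding a verb and registering one handler, which may be part of why this code "hasn't changed much".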
StorageProxy
  • Top level of all read/write operations
  • Called from o.a.c.thrift.CassandraServer
  • Write path changed because of counters
      • Notion of WritePerformer
  • Eventually to Table and ColumnFamilyStore
  • Further, to SSTable and related classes.
StageManager
  • Fancy Java ThreadPoolExecutor
  • SEDA: http://www.eecs.harvard.edu/~mdw/papers/seda-sosp01.pdf
  • Consumes callables from a queue.
  • Manages concurrency.
  • Hasn’t changed much.
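The staged design can be sketched with one ThreadPoolExecutor per stage, each draining callables from its own queue. This is a minimal sketch using only java.util.concurrent. StageSketch, its Stage enum and the pool sizes are assumptions for illustration, not the real StageManager.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of SEDA-style stages; not the real StageManager.
final class StageSketch {
    enum Stage { READ, MUTATION }

    private final Map<Stage, ExecutorService> stages = new EnumMap<>(Stage.class);

    StageSketch(int readThreads, int writeThreads) {
        stages.put(Stage.READ, newPool(readThreads));
        stages.put(Stage.MUTATION, newPool(writeThreads));
    }

    // Fixed-size pool with an unbounded queue: the thread count caps
    // concurrency for the stage, and excess work waits in the queue.
    private static ExecutorService newPool(int threads) {
        return new ThreadPoolExecutor(threads, threads, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // Hand a unit of work to the named stage.
    <T> Future<T> submit(Stage stage, Callable<T> task) {
        return stages.get(stage).submit(task);
    }

    // Stop accepting work and let queued tasks finish.
    void shutdown() {
        for (ExecutorService pool : stages.values()) {
            pool.shutdown();
        }
    }
}
```

Keeping reads and writes in separate pools means a backlog in one stage does not starve the other.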
Adding API Methods
  • Define method + structures in IDL
      • interface/cassandra.thrift
  • Regenerate files
      • ant gen-thrift-java gen-thrift-py
  • Implement stubs:
      • o.a.c.thrift.CassandraServer
  • Create a system test
      • test/system/test_thrift_server.py
Reading
  • Socket -> CassandraServer
      • Permissions
      • Request validation
      • Marshalling
  • ReadCommands created in CS.multigetSliceInternal, passed to StorageProxy
      • 1 per key
Reading
  • StorageProxy.read(), fetchRows()
  • For each ReadCommand
      • Determine endpoints
      • Local & remote branches
Reading
  • StorageProxy local
      • READ stage executes a LocalReadRunnable
      • True read vs digest
  • Table, ColumnFamilyStore
  • CFS.getTopLevelColumns
      • Make QueryFilter
      • Query Memtables
      • Query SSTables
      • Coalesce in iterators
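The "query memtables, query SSTables, coalesce" step can be sketched as a merge in which, for each column name, the version with the highest timestamp wins. This is a minimal sketch under simplifying assumptions: it merges eagerly rather than through iterators, and it ignores tombstones and timestamp ties. ReadMergeSketch and its Column type are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of read-time reconciliation; not ColumnFamilyStore code.
final class ReadMergeSketch {
    // A value plus the write timestamp used to pick a winner.
    static final class Column {
        final String value;
        final long timestamp;

        Column(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Each source is one memtable or SSTable view of the row (column name -> column).
    // The result holds, per column name, the version with the highest timestamp.
    static Map<String, Column> coalesce(List<Map<String, Column>> sources) {
        Map<String, Column> merged = new TreeMap<>();
        for (Map<String, Column> source : sources) {
            for (Map.Entry<String, Column> e : source.entrySet()) {
                Column current = merged.get(e.getKey());
                if (current == null || e.getValue().timestamp > current.timestamp) {
                    merged.put(e.getKey(), e.getValue());
                }
            }
        }
        return merged;
    }
}
```

Because the winner is chosen by timestamp, the order in which sources are visited does not change the result (apart from ties, which this sketch does not handle).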
Reading
  • StorageProxy remote
      • Read command
      • Response handler
      • Send to remote nodes
  • Read repair happens in SP.fetchRows().
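The "true read vs digest" distinction and its link to read repair can be sketched as follows: one replica returns the full row, the others return only a hash of it, and a mismatch signals that the replicas disagree. This is a minimal sketch, assuming the row is already serialized to a string. DigestReadSketch and needsRepair() are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch of digest comparison; not the real read-repair code.
final class DigestReadSketch {
    // MD5 hash of the serialized row, standing in for a digest response.
    static byte[] digest(String rowData) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(rowData.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // True when a replica's digest disagrees with the full data response,
    // which is the cue to reconcile the replicas.
    static boolean needsRepair(String dataResponse, byte[] replicaDigest) {
        return !Arrays.equals(digest(dataResponse), replicaDigest);
    }
}
```

Sending digests keeps most replica responses small; the full data only has to be fetched from the other replicas when a mismatch shows up.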
Writing
  • CS.doInsert()
      • Marshalling, creates RMs (RowMutations)
  • StorageProxy
      • Local/remote branch
      • SP.sendToHintedEndpoints()
  • RowMutation
      • One key per RowMutation (spanning several CFs)
  • ColumnFamily
      • Collection of column modifications
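The shape of a RowMutation, one row key carrying changes for several column families, can be sketched as a nested map. This is a minimal sketch, assuming string names and values. RowMutationSketch and its add() method are hypothetical and far simpler than the real class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a row mutation; not the real RowMutation class.
final class RowMutationSketch {
    // The single row key this mutation applies to.
    final String key;
    // column family name -> (column name -> new value)
    final Map<String, Map<String, String>> modifications = new LinkedHashMap<>();

    RowMutationSketch(String key) {
        this.key = key;
    }

    // Record one column change under the given column family.
    RowMutationSketch add(String columnFamily, String column, String value) {
        modifications
                .computeIfAbsent(columnFamily, cf -> new LinkedHashMap<>())
                .put(column, value);
        return this;
    }
}
```

For example, new RowMutationSketch("k1").add("Users", "name", "gary").add("Timeline", "t1", "hello") builds one mutation that touches two column families under the same key.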
Writing
  • RM.apply -> Table.apply
      • Write to the commit log (CL)
      • Iterate over RM CFs
      • CFS.apply()
  • Overwrites results on pre-existing column families
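The local write order above (commit log first, then each column family's in-memory store) can be sketched as follows. This is a minimal sketch, assuming an in-memory list in place of the commit log and plain maps in place of memtables. WritePathSketch is a hypothetical name and nothing here is flushed to disk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the local write order; not Cassandra's Table/CFS code.
final class WritePathSketch {
    // Stand-in for the commit log: one entry per applied mutation.
    final List<String> commitLog = new ArrayList<>();
    // column family name -> ("rowKey/column" -> value), standing in for memtables
    final Map<String, Map<String, String>> memtables = new HashMap<>();

    // mutation: column family name -> (column name -> new value), all for one row key
    void apply(String rowKey, Map<String, Map<String, String>> mutation) {
        // 1. Append to the commit log first, so the write can be replayed after a crash.
        commitLog.add(rowKey + " " + mutation);
        // 2. Then apply each column family's changes to its in-memory store.
        for (Map.Entry<String, Map<String, String>> cf : mutation.entrySet()) {
            Map<String, String> memtable =
                    memtables.computeIfAbsent(cf.getKey(), name -> new HashMap<>());
            for (Map.Entry<String, String> column : cf.getValue().entrySet()) {
                // A later write to the same column replaces the earlier value.
                memtable.put(rowKey + "/" + column.getKey(), column.getValue());
            }
        }
    }
}
```

Applying two mutations to the same column leaves two commit log entries but a single, latest value in the in-memory store.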
Writing
  • RM is serialized into a Message and sent to other nodes
  • Waits for ACKs depending on the consistency level (CL)
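Waiting for ACKs according to the consistency level can be sketched with a latch sized by the number of replies required. This is a minimal sketch covering only ONE, QUORUM and ALL. AckWaitSketch, blockFor() and onAck() are hypothetical names, not the real response handler API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of blocking on replica ACKs; not the real write handler.
final class AckWaitSketch {
    enum ConsistencyLevel { ONE, QUORUM, ALL }

    // Number of ACKs to wait for, given the replication factor.
    static int blockFor(ConsistencyLevel cl, int replicationFactor) {
        switch (cl) {
            case ONE:
                return 1;
            case QUORUM:
                return replicationFactor / 2 + 1; // a strict majority of replicas
            case ALL:
                return replicationFactor;
            default:
                throw new IllegalArgumentException("unsupported: " + cl);
        }
    }

    private final CountDownLatch latch;

    AckWaitSketch(ConsistencyLevel cl, int replicationFactor) {
        latch = new CountDownLatch(blockFor(cl, replicationFactor));
    }

    // Called once for each replica that acknowledges the write.
    void onAck() {
        latch.countDown();
    }

    // Returns true if enough ACKs arrived before the timeout.
    boolean await(long timeoutMillis) throws InterruptedException {
        return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

With a replication factor of 3, QUORUM waits for 2 ACKs, so the write can succeed while one replica is slow or down.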
Challenges
Challenges
  • Having an in-depth understanding of everything
  • Hard for hobbyists/part-timers
  • Outside of DataStax, little support for full-timers
  • Still changing fast
      • Keeping up
Challenge: Lines of Code
  • 0.4 (Sep 2009): 52 kloc
  • 0.5 (Jan 2010): 59 kloc
  • 0.6 (Apr 2010): 73 kloc
  • 0.7 (Jan 2011): 122 kloc
  • 0.8 (Jun 2011): 146 kloc
  • Trunk (yesterday): 149 kloc
  • Average: 4,500 lines per month
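The average can be roughly checked from the release figures above. This is a minimal sketch, assuming the span from 0.4 (Sep 2009) to 0.8 (Jun 2011) counts as 21 months; the trunk figure is left out because its exact date is not given. GrowthSketch is a hypothetical helper.

```java
// Hypothetical helper for checking the growth rate on the slide.
final class GrowthSketch {
    // Average lines added per month between two kloc measurements.
    static long linesPerMonth(int startKloc, int endKloc, int months) {
        return Math.round((endKloc - startKloc) * 1000.0 / months);
    }

    public static void main(String[] args) {
        // 0.4 (52 kloc) to 0.8 (146 kloc), assumed to be 21 months apart.
        System.out.println(linesPerMonth(52, 146, 21));
    }
}
```

Under that assumption the result is about 4,476 lines per month, consistent with the rounded 4,500 on the slide.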
Challenges
  • Codewise:
      • Growing pains
      • Software maturity
      • Decisions made early on

Editor's Notes

  • #2 Who was here last year? Very good presentations on data modeling and capacity planning.
  • #3 Turn it around. Ask questions first.
  • #16 Transport still not initialized though. DD getting loaded is just a side effect.
  • #19 This is actually a good exercise.
  • #23 Good place to extend and experiment on your own.