Building Distributed Systems in Scala

A presentation to Emerging Technologies for the Enterprise
April 8, 2010 – Philadelphia, PA

About @al3x
‣ At Twitter since 2007
‣ Working on the Web since 1995
‣ Co-author of Programming Scala (O’Reilly, 2009)
‣ Into programming languages, distributed systems.

About Twitter
‣ Social messaging – a new way to communicate
‣ Launched in mid-2006
‣ Hit the mainstream in 2008
‣ 50+ million tweets per day (600+ per second)
‣ Millions of users worldwide

Technologies Used At Twitter
Languages
‣ Ruby, JavaScript
‣ Scala
‣ lil’ bit of C, Python, Java

Frameworks
‣ Rails
‣ jQuery

Data Storage
‣ MySQL
‣ Cassandra
‣ HBase (Hadoop)

Misc.
‣ memcached
‣ ZooKeeper
‣ Jetty
‣ so much more!

Why Scala?
‣ A language that’s both fun and productive.
‣ Great performance (on par with Java).
‣ Object-oriented and functional programming,
together.
‣ Ability to reuse existing Java libraries.
‣ Flexible concurrency (Actors, threads, events).
‣ A smart community with infectious momentum.

Hawkwind
A case study in (re)building
a distributed system in Scala.

Requirements
‣ Search for people by name, username, eventually
by other attributes.
‣ Order the results some sensible way (ex: by
number of followers).
‣ Offer suggestions for misspellings/alternate names.
‣ Handle case-folding and other text normalization
concerns on the query string.
‣ Return results in about a second, preferably less.
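
The case-folding and normalization requirement can be sketched roughly as below. This is a minimal illustration, not the Hawkwind code: `QueryNormalizer` and its rules (strip accents, keep letters, digits and underscores, collapse whitespace, lowercase) are assumptions chosen for the example. It relies only on the JDK's `java.text.Normalizer`.

```scala
import java.text.Normalizer
import java.util.Locale

// Hypothetical helper; the name and the exact rules are assumptions.
object QueryNormalizer {
  // Combining marks (accents) that NFD decomposition splits off from base letters.
  private val CombiningMarks = "\\p{M}+".r
  // Anything that is not a letter, a decimal digit, an underscore, or whitespace.
  private val Punctuation = "[^\\p{L}\\p{Nd}_\\s]+".r
  private val Whitespace = "\\s+".r

  def normalize(raw: String): String = {
    // Decompose accented characters into base letter + combining mark.
    val decomposed = Normalizer.normalize(raw, Normalizer.Form.NFD)
    // Drop the marks, turn punctuation into spaces, then collapse runs of spaces.
    val unaccented = CombiningMarks.replaceAllIn(decomposed, "")
    val cleaned = Punctuation.replaceAllIn(unaccented, " ")
    Whitespace.replaceAllIn(cleaned, " ").trim.toLowerCase(Locale.ROOT)
  }
}
```

Applying the same function to the indexed attributes and to the incoming query string is what makes lookups line up; which characters to keep is a product decision this sketch only guesses at.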

Finding People on Twitter

[Screenshots of people search on Twitter, annotated with callouts: “results”, “suggestion”, “speedy!”]

First Attempt: acts_as_solr
‣ Crunched on time, so we wanted the fastest
route to working user search.
‣ Uses the Solr distribution/platform from Apache
Lucene.
‣ Tries to make Rails integration straightforward
and idiomatic.
‣ Easy to get running, hard to operationalize.

In the Interim: A Move to SOA
‣ Stopped thinking of our architecture as just a
Rails app and the components that orbit it.
‣ Started building isolated services that
communicate with the rest of the system via
Thrift (an RPC and server framework).
‣ Allows us freedom to change the underlying
implementation of services without modifying the
rest of the system.

Thrift Example
struct Results {
  1: list<i64> people
  2: string suggestion
  3: i32 processingTime /* milliseconds */
  4: list<i32> timings
  5: i32 totalResults
}

service NameSearch {
  Results find(1: string name, 2: i32 maxResults, 3: bool wantSuggestion)

  Results find_with_ranking(1: string name, 2: i32 maxResults, 3: bool wantSuggestion, 4: Ranker ranking)
}

Second Attempt: Hawkwind 1
‣ A quick (three weeks) bespoke Scala project to
“stop the bleeding”.
‣ Vertically but not horizontally scalable: no
sharding, no failover, machine-level redundancy.
‣ Ran into memory and disk space limits.
‣ Reused Java code but didn’t offer nice Scala
wrappers or rewrites.
‣ Still, planned to grow 10x, grew 25x!

Goals for Hawkwind 2
‣ Horizontally scalable: sharded corpus,
replication of shards, easy to grow the service.
‣ Faster.
‣ Higher-quality results.
‣ Better use of Scala (language features,
programming style).
‣ Maintainable code base, make it easy to add
features.

High-Level Concepts
‣ Shards: pieces of the user corpus.
‣ Replicas: copies of shards.
‣ Document Servers.
‣ Merge Servers.
‣ Every machine gets the same code, can be
either a Document Server or a Merge Server.

Hawkwind 2 High-Level Architecture

Internet
↓ queries for users, API requests
Rails Cluster
↓ Thrift call to semi-random Merge Server
Merge Server, Merge Server, Merge Server
↓ Thrift calls to semi-random replica of each shard
Doc Servers: Shard 1 (two replicas), Shard 2 (two replicas), Shard 3 (two replicas)
↑ periodic deliveries of sharded user corpus
Hadoop (HBase)

Taking Care of Data
‣ A Hadoop job gathers up the user data and slices it
into shards.
‣ A cron job fetches these data dumps several times
per day.
‣ To load a new corpus on a Document Server, simply
restart the process.
‣ Redundancy and staggered scheduling keep the
system from running too hot while restarts are in
progress.

What a Document Server does
‣ On startup, load Thrift serialized User objects.
‣ Populate an Inverted Index, Map, and Trie with
normalized attributes of those User objects.
‣ Once ready, listen for queries.
‣ Answering a query basically means looking
stuff up in those pre-populated data structures.
‣ Maintains a connection pool for Thrift requests,
wrapping org.apache.commons.pool.
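
As a rough illustration of the startup-then-lookup flow above, here is a small sketch of an inverted index, an id map, and a trie built from user records. Everything in it is an assumption made for the example: the `User` fields, the tokenization (username plus whitespace-split name, lowercased), and ranking by follower count. The real service loads Thrift-serialized objects, which this sketch replaces with a plain case class.

```scala
object DocumentIndexSketch {
  // Hypothetical stand-in for the Thrift-generated User type.
  final case class User(id: Long, username: String, name: String, followers: Int)

  // Immutable trie: children keyed by character, plus the ids of users
  // whose token ends exactly at this node.
  final case class Trie(children: Map[Char, Trie] = Map.empty, ids: Set[Long] = Set.empty) {
    def insert(token: String, id: Long): Trie =
      if (token.isEmpty) copy(ids = ids + id)
      else {
        val child = children.getOrElse(token.head, Trie())
        copy(children = children.updated(token.head, child.insert(token.tail, id)))
      }

    // All ids stored at or below the node reached by following `prefix`.
    def withPrefix(prefix: String): Set[Long] =
      if (prefix.isEmpty) ids ++ children.values.flatMap(_.withPrefix(""))
      else children.get(prefix.head).map(_.withPrefix(prefix.tail)).getOrElse(Set.empty[Long])
  }

  final class Index(users: Seq[User]) {
    private def tokens(u: User): Seq[String] =
      (u.username +: u.name.split("\\s+").toSeq).map(_.toLowerCase).filter(_.nonEmpty)

    // Map: user id -> user record.
    val byId: Map[Long, User] = users.map(u => u.id -> u).toMap

    // Inverted index: token -> ids of users carrying that token.
    val inverted: Map[String, Set[Long]] =
      users.flatMap(u => tokens(u).map(t => t -> u.id))
        .groupBy(_._1)
        .map { case (token, pairs) => token -> pairs.map(_._2).toSet }

    // Trie over the same tokens, for prefix lookups.
    val trie: Trie = users.foldLeft(Trie()) { (t, u) =>
      tokens(u).foldLeft(t)((acc, token) => acc.insert(token, u.id))
    }

    def exact(term: String): Seq[User] = rank(inverted.getOrElse(term.toLowerCase, Set.empty[Long]))
    def prefix(term: String): Seq[User] = rank(trie.withPrefix(term.toLowerCase))

    // Most-followed first, matching the "sensible ordering" requirement.
    private def rank(ids: Set[Long]): Seq[User] =
      ids.toSeq.map(byId).sortBy(u => -u.followers)
  }
}
```

Because all three structures are built once at startup and never mutated, answering a query is read-only, which fits the "restart to load a new corpus" model.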

What a Merge Server does
‣ Gets queries.
‣ Fans out queries to Document Servers.
‣ Waits for queries to come back using a custom
ParallelFuture class, which wraps a number of
java.util.concurrent classes.
‣ Merges together the result sets, re-ranks them,
and ships ‘em back to the requesting client.
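
The fan-out and merge steps might look something like the following. Note the substitution: the talk's custom ParallelFuture class is not shown in the slides, so this sketch uses the standard library's `scala.concurrent.Future` instead. `Hit`, `fanOut`, and the function-per-shard signature are all hypothetical names for illustration.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.FiniteDuration
import scala.util.{Success, Try}

object MergeSketch {
  // Hypothetical result row: a user id plus one ranking signal.
  final case class Hit(userId: Long, followers: Int)

  // Each element of `shards` stands in for "a Thrift call to one replica of that shard".
  def fanOut(shards: Seq[String => Seq[Hit]], query: String, maxResults: Int, timeout: FiniteDuration)
            (implicit ec: ExecutionContext): Seq[Hit] = {
    // Capture each failure as a value so one bad shard cannot fail the whole batch.
    val calls: Seq[Future[Try[Seq[Hit]]]] =
      shards.map(shard => Future(Try(shard(query))))

    // Blocks until every call has settled; in this sketch a shard slower than
    // `timeout` still fails the whole request with a TimeoutException.
    val settled = Await.result(Future.sequence(calls), timeout)

    settled.collect { case Success(hits) => hits }
      .flatten
      .distinct
      .sortBy(hit => -hit.followers) // re-rank the merged set
      .take(maxResults)
  }
}
```

Returning partial results when a shard is down is a design choice made here for illustration; the slides do not say how the real Merge Server treats failed shards.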

How to model a distributed system?
‣ Literal decomposition: classes for all
architectural components (Shard, Replica, etc.).
‣ Each component knows/does as little as
possible.
‣ Isolate mutable state, test carefully.
‣ Cleanly delegate calls.

Literal Decomposition: Replica
case class Replica(shard: Shard, server: Server) {
  private val log = Logger.get
  // Stats counter registered under the name "backend_timeout".
  val BACKEND_ERROR = Stats.getCounter("backend_timeout")

  // Query this replica's server, timing the call as "replica-query".
  def query(q: Query): DocResults = w3c.time("replica-query") {
    server.thriftCall { client =>
      // logic goes here
    }
  }

  // Health check: the result of a Thrift ping against the server.
  def ping(): Boolean = server.thriftCall { client =>
    log.debug("calling ping via thrift for %s", server)
    val rv = client.ping()
    log.debug("ping returned %s from %s", rv, server)
    rv
  }
}

Literal Decomposition: Server
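case class Server(hostname: String, port: Int) {
  // One Thrift connection pool per server.
  val pool = ConnectionPool(hostname, port)
  private val log = Logger.get

  // Borrow a client from the pool, apply f to it, and return f's result.
  def thriftCall[A](f: Client => A) = {
    log.debug("making thriftCall for server %s", this)
    pool.withClient { client => f(client) }
  }

  // The Replica pairing this server with its shard, looked up in the ShardMap.
  def replica: Replica = {
    Replica(ShardMap.serversToShards(this), this)
  }
}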
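
The `pool.withClient { client => ... }` call is an instance of the loan pattern: borrow a resource, run a function, always give the resource back. The slides do not show ConnectionPool itself (it wraps org.apache.commons.pool), so the sketch below swaps in a fixed-size pool built on the JDK's `ArrayBlockingQueue`. `SimplePool` and its parameters are hypothetical.

```scala
import java.util.concurrent.{ArrayBlockingQueue, TimeUnit}

object PoolSketch {
  // Hypothetical fixed-size pool; `create` builds one client.
  final class SimplePool[C <: AnyRef](size: Int, timeoutMillis: Long)(create: () => C) {
    private val idle = new ArrayBlockingQueue[C](size)
    (1 to size).foreach(_ => idle.put(create()))

    // Number of clients currently not on loan.
    def available: Int = idle.size

    // Loan pattern: borrow a client, run `f`, and return the client even if `f` throws.
    def withClient[A](f: C => A): A =
      Option(idle.poll(timeoutMillis, TimeUnit.MILLISECONDS)) match {
        case None => throw new IllegalStateException("no client available")
        case Some(client) =>
          try f(client)
          finally idle.put(client)
      }
  }
}
```

A production pool would also validate or discard a client after a failed call; this sketch simply returns it to the queue.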

Hawkwind 2 Query Call Graph

MergeLayer.query
→ ShardMap.query
→ shard.replicaManager ! query
→ shard.query
→ randomReplica()
→ replica.query
→ server.thriftCall
→ NameSearchDocumentLayerClient.find

ShardMap: Isolating Mutable State
‣ A singleton and an Actor.
‣ Contains a map from Servers to their
corresponding Shards.
‣ Also contains a map from Shards to the Replicas
of those shards.
‣ Responsible for populating and managing
those maps.
‣ Send it a message to evict or reinsert a Replica.
‣ Fans out queries to Shards.
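
One way to picture "isolating mutable state" is to keep the two maps in an immutable snapshot and allow changes only through messages. The sketch below is not an Actor: it swaps in a `java.util.concurrent.atomic.AtomicReference` holding the snapshot, so it runs on a current Scala without an actor library. The message names (`Evict`, `Reinsert`) and all types are assumptions made for illustration.

```scala
import java.util.concurrent.atomic.AtomicReference

object ShardMapSketch {
  final case class Server(hostname: String, port: Int)
  final case class Shard(id: Int)

  // The only ways to change the map.
  sealed trait Message
  final case class Evict(server: Server) extends Message
  final case class Reinsert(server: Server) extends Message

  // Immutable snapshot: the fixed server -> shard assignment,
  // plus the servers currently considered live for each shard.
  final case class State(serversToShards: Map[Server, Shard], live: Map[Shard, Set[Server]]) {
    def handle(msg: Message): State = msg match {
      case Evict(s) =>
        serversToShards.get(s).fold(this) { shard =>
          copy(live = live.updated(shard, live.getOrElse(shard, Set.empty[Server]) - s))
        }
      case Reinsert(s) =>
        serversToShards.get(s).fold(this) { shard =>
          copy(live = live.updated(shard, live.getOrElse(shard, Set.empty[Server]) + s))
        }
    }
  }

  final class ShardMap(assignments: Map[Server, Shard]) {
    private val state = new AtomicReference(
      State(assignments, assignments.groupBy(_._2).map { case (shard, m) => shard -> m.keySet }))

    // All mutation goes through here, one atomic state transition per message.
    def send(msg: Message): Unit = { state.updateAndGet(s => s.handle(msg)); () }

    def replicasOf(shard: Shard): Set[Server] =
      state.get.live.getOrElse(shard, Set.empty[Server])
  }
}
```

Because `handle` is a pure function from one snapshot to the next, it can be tested without any concurrency at all, which addresses part of the testing difficulty that shared mutable state creates.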

ReplicaHealthChecker
‣ Much like the ShardMap, a singleton and an
Actor.
‣ Maintains mutable lists of unhealthy Replicas
(“the penalty box”).
‣ Constantly checking to see if evicted Replicas
are healthy again (back online).
‣ Sends messages to itself – an effective Actor
technique.
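
The penalty box idea can be sketched as follows. Again this is a stand-in, not the Actor from the talk: a scheduled task on a `ScheduledExecutorService` plays the role of the actor sending itself a "check again" message. `PenaltyBox`, `ping`, and `reinsert` are hypothetical names.

```scala
import java.util.concurrent.{ScheduledExecutorService, TimeUnit}

object HealthCheckSketch {
  // `ping` probes a replica; `reinsert` tells the rest of the system it is back.
  final class PenaltyBox[R](ping: R => Boolean, reinsert: R => Unit) {
    private var unhealthy = Set.empty[R] // guarded by `this`

    def evict(replica: R): Unit = synchronized { unhealthy += replica }
    def evicted: Set[R] = synchronized { unhealthy }

    // One pass: ping every evicted replica and release the ones that answer.
    // A ping that throws counts as "still unhealthy".
    def recheck(): Unit = synchronized {
      val recovered = unhealthy.filter(r => scala.util.Try(ping(r)).getOrElse(false))
      unhealthy --= recovered
      recovered.foreach(reinsert)
    }

    // Stand-in for an actor messaging itself: rerun `recheck` on a fixed delay.
    def start(scheduler: ScheduledExecutorService, everyMillis: Long): Unit = {
      scheduler.scheduleWithFixedDelay(() => recheck(), everyMillis, everyMillis, TimeUnit.MILLISECONDS)
      ()
    }
  }
}
```

Keeping `recheck()` callable on its own means the logic can be tested deterministically, with the timer wired up only in production.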

Challenges, Large and Small
‣ Fast importing of huge serialized Thrift object
dumps.
‣ Testing the ShardMap and ReplicaHealthChecker
(mutable state wants to hurt you).
‣ Efficient accent normalization and filtering for
special characters.
‣ Working with the Apache Commons object pool.
‣ Breaking out different ranking mechanisms in a
clean, reusable way.
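
For the last challenge, one possible shape for pluggable ranking is to treat each mechanism as an `Ordering` and compose them. This is only a sketch of the idea: the `Ranker` alias here is hypothetical and is not the `Ranker` type named in the Thrift IDL, whose definition the slides do not show.

```scala
object RankingSketch {
  final case class Candidate(userId: Long, name: String, followers: Int)

  // A ranking mechanism is just an Ordering over candidates.
  type Ranker = Ordering[Candidate]

  // Most-followed first.
  val byFollowers: Ranker = Ordering.by[Candidate, Int](_.followers).reverse

  // Exact (case-insensitive) name matches first; false sorts before true.
  def byExactMatch(query: String): Ranker =
    Ordering.by[Candidate, Boolean](c => !c.name.equalsIgnoreCase(query))

  // Compose two rankers: use `first`, and fall back to `second` on ties.
  def thenBy(first: Ranker, second: Ranker): Ranker = new Ordering[Candidate] {
    def compare(a: Candidate, b: Candidate): Int = {
      val c = first.compare(a, b)
      if (c != 0) c else second.compare(a, b)
    }
  }

  def rank(candidates: Seq[Candidate], ranker: Ranker): Seq[Candidate] =
    candidates.sorted(ranker)
}
```

New mechanisms then become new values of the same type, and callers choose or combine them without touching the lookup code.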

Libraries & Tools
Things that make working in Scala
way more productive.

sbt – the Simple Build Tool
‣ Scala’s answer to Ant and Maven.
‣ Sets up new projects.
‣ Maintains project configuration, build tasks,
and dependencies in pure Scala. Totally open-ended.
‣ Interactive console.
‣ Will run tasks as soon as files in your project
change – automatically compile and run tests!

Ostrich
‣ Gather statistics about your application.
‣ Counters, gauges, and timings.
‣ Share stats via JMX, a plain-text socket, a web
interface, or log files.
‣ Ex:
Stats.time("foo") {
  timeConsumingOperation()
}

Configgy
‣ Manages configuration files and logging.
‣ Flexible file format, can include files in other files.
‣ Inheritance, variable substitution.
‣ Tunable logging, logging with Scribe.
‣ Subscription API: push and validate
configuration changes to running processes.
‣ Ex:
val foo = config.getString("foo")

Specs + xrayspecs
‣ A behavior-driven development (BDD) testing
framework for Scala.
‣ Elegant, readable, fun-to-write tests.
‣ Support for several mocking frameworks (we
like Mockito).
‣ Test concurrent operations, time, much more.
‣ Ex:
"suggestion with a List of null does not blow up" in {
  MergeLayer.suggestion("steve", List(null)) mustEqual None
}

Questions? Follow me at
twitter.com/al3x

Learn with us at engineering.twitter.com
Work with us at jobs.twitter.com
