Solr 4
The NoSQL Search Server
Yonik Seeley
May 30, 2013
NoSQL Databases
• Wikipedia says:
A NoSQL database provides a mechanism for storage and retrieval of data that
use looser consistency models than traditional relational databases in order to
achieve horizontal scaling and higher availability. Some authors refer to them as
"Not only SQL" to emphasize that some NoSQL systems do allow SQL-like query
language to be used.
• Non-traditional data stores
• Doesn’t use / isn’t designed around SQL
• May not give full ACID guarantees
• Offers other advantages such as greater scalability as a
tradeoff
• Distributed, fault-tolerant architecture
Solr Cloud Design Goals
• Automatic Distributed Indexing
• HA for Writes
• Durable Writes
• Near Real-time Search
• Real-time get
• Optimistic Concurrency
Solr Cloud
• Distributed Indexing designed from the ground up to
accommodate desired features
• CAP Theorem
• Consistency, Availability, Partition Tolerance (saying goes “choose 2”)
• Reality: Must handle P – the real choice is tradeoffs between C and A
• Ended up with a CP system (roughly)
• Value Consistency over Availability
• Eventual consistency is incompatible with optimistic concurrency
• Closest to MongoDB in architecture
• We still do well with Availability
• All N replicas of a shard must go down before we lose writability for that
shard
• For a network partition, the “big” partition remains active (i.e. Availability
isn’t “on” or “off”)
Solr 4
Solr 4 at a glance
• Document Oriented NoSQL Search Server
• Data-format agnostic (JSON, XML, CSV, binary)
• Schema-less options (more coming soon)
• Distributed
• Multi-tenanted
• Fault Tolerant
• HA + No single points of failure
• Atomic Updates
• Optimistic Concurrency
• Near Real-time Search
• Full-Text search + Hit Highlighting
• Tons of specialized queries: Faceted search, grouping,
pseudo-join, spatial search, functions
The desire for these features drove some of the “SolrCloud” architecture
Quick Start
1.  Unzip the binary distribution (.ZIP file)
Note: no “installation” required
2.  Start Solr
$ cd example
$ java -jar start.jar
3.  Go!
Browse to http://localhost:8983/solr for the new admin interface
  
New admin UI
Add and Retrieve document
$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[
{ "id" : "book1",
"title" : "American Gods",
"author" : "Neil Gaiman"
}
]'
$ curl http://localhost:8983/solr/get?id=book1
{
  "doc": {
    "id" : "book1",
    "author": "Neil Gaiman",
    "title" : "American Gods",
    "_version_": 1410390803582287872
  }
}
Note: no type of “commit” is necessary to retrieve documents via /get (real-time get)
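The add step above can also be sketched from Python. This is a minimal sketch that only builds the request, assuming the default local URL shown on the slide; the helper name `build_update_request` is illustrative and not part of Solr.

```python
import json
import urllib.request

SOLR_UPDATE_URL = "http://localhost:8983/solr/update"  # URL taken from the slide


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build (but do not send) a JSON update request for a list of documents."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,  # supplying data makes urllib issue a POST
        headers={"Content-type": "application/json"},
    )


request = build_update_request(
    [{"id": "book1", "title": "American Gods", "author": "Neil Gaiman"}]
)
# To actually send it to a running Solr instance:
#   urllib.request.urlopen(request)
```

Nothing touches the network until `urlopen` is called, so the payload can be inspected first.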
Simplified JSON Delete Syntax
• Single delete-by-id
{"delete":"book1"}
• Multiple delete-by-id
{"delete":["book1","book2","book3"]}
• Delete with optimistic concurrency
{"delete":{"id":"book1", "_version_":123456789}}
• Delete by Query
{"delete":{"query":"tag:category1"}}
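The four delete forms above can be generated programmatically. A minimal sketch, assuming the JSON shapes shown on this slide; the function names `delete_by_id` and `delete_by_query` are hypothetical helpers, not a Solr client API.

```python
import json


def delete_by_id(*ids, version=None):
    """Return the JSON body for a delete-by-id command."""
    if not ids:
        raise ValueError("at least one id is required")
    if version is not None:
        if len(ids) != 1:
            raise ValueError("a _version_ applies to exactly one id")
        return json.dumps({"delete": {"id": ids[0], "_version_": version}})
    # A single id is sent as a bare string, several ids as a list.
    target = ids[0] if len(ids) == 1 else list(ids)
    return json.dumps({"delete": target})


def delete_by_query(query):
    """Return the JSON body for a delete-by-query command."""
    return json.dumps({"delete": {"query": query}})


print(delete_by_id("book1"))
print(delete_by_id("book1", "book2", "book3"))
print(delete_by_id("book1", version=123456789))
print(delete_by_query("tag:category1"))
```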
  
Atomic Updates
$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[
 {"id"        : "book1",
  "pubyear_i" : { "add" : 2001 },
  "ISBN_s"    : { "add" : "0-380-97365-1"}
 }
]'
  
$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[
 {"id"       : "book1",
  "copies_i" : { "inc" : 1},
  "cat"      : { "add" : "fantasy"},
  "ISBN_s"   : { "set" : "0-380-97365-0"},
  "remove_s" : { "set" : null } }
]'
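To make the modifiers concrete, here is a small Python model of how "add", "inc", and "set" (including "set" to null) change a stored document. It is a sketch of the semantics shown on this slide, not Solr's implementation; the starting values of `copies_i`, `cat`, and `remove_s` are made up for illustration, and treating "add" on a missing field as a plain assignment is a simplifying assumption.

```python
def apply_atomic_update(doc, update):
    """Return a new document with the modifiers in `update` applied to `doc`."""
    result = dict(doc)
    for field, change in update.items():
        if field == "id" or not isinstance(change, dict):
            continue  # the id only selects which document to update
        op, value = next(iter(change.items()))
        if op == "set":
            if value is None:
                result.pop(field, None)  # {"set": null} removes the field
            else:
                result[field] = value
        elif op == "inc":
            result[field] = result.get(field, 0) + value
        elif op == "add":
            if field not in result:
                result[field] = value  # assumption: first value stored as-is
            else:
                existing = result[field]
                values = existing if isinstance(existing, list) else [existing]
                result[field] = values + [value]
        else:
            raise ValueError(f"unknown modifier: {op}")
    return result


# Hypothetical stored document (values chosen for illustration only).
book = {"id": "book1", "copies_i": 4, "cat": "fiction",
        "ISBN_s": "0-380-97365-1", "remove_s": "x"}
updated = apply_atomic_update(book, {
    "id": "book1",
    "copies_i": {"inc": 1},
    "cat": {"add": "fantasy"},
    "ISBN_s": {"set": "0-380-97365-0"},
    "remove_s": {"set": None},
})
print(updated)
```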
  
Optimistic Concurrency
•  Conditional update based on document version
[Diagram: request loop between client and Solr]
1. /get document
2. Modify document, retaining _version_
3. /update resulting document
4. Go back to step #1 if fail code=409
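The four-step loop can be sketched in Python against an in-memory stand-in. `FakeSolr` and `VersionConflict` are hypothetical names that mimic /get, /update, and the 409 response so the example runs without a server; a real client would make HTTP calls instead.

```python
class VersionConflict(Exception):
    """Stands in for an HTTP 409 (Conflict) response."""


class FakeSolr:
    """In-memory stand-in for /get and /update (illustration only)."""

    def __init__(self):
        self.docs = {}
        self.clock = 1  # assigned versions start above 1

    def get(self, doc_id):
        return dict(self.docs[doc_id])

    def update(self, doc):
        current = self.docs.get(doc["id"])
        supplied = doc.get("_version_", 0)
        # Only the "> 1 must match exactly" rule is modelled here.
        if supplied > 1 and (current is None or current["_version_"] != supplied):
            raise VersionConflict(doc["id"])
        self.clock += 1
        self.docs[doc["id"]] = {**doc, "_version_": self.clock}


def update_with_retry(solr, doc_id, modify, max_attempts=5):
    for _ in range(max_attempts):
        doc = solr.get(doc_id)       # 1. /get document
        modify(doc)                  # 2. modify, retaining _version_
        try:
            solr.update(doc)         # 3. /update resulting document
            return solr.get(doc_id)
        except VersionConflict:
            continue                 # 4. back to step 1 on a conflict
    raise RuntimeError("gave up after repeated version conflicts")


def check_out(doc):
    doc["copiesIn_i"] -= 1
    doc["copiesOut_i"] += 1


solr = FakeSolr()
solr.update({"id": "book2", "copiesIn_i": 7, "copiesOut_i": 3})
result = update_with_retry(solr, "book2", check_out)
print(result)
```

The `modify` callback is re-run on every attempt, so it should depend only on the freshly fetched document.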
Version semantics
_version_   Update Semantics
> 1         Document version must exactly match supplied _version_
1           Document must exist
< 0         Document must not exist
0           Don’t care (normal overwrite if exists)
•  Specifying _version_ on any update invokes optimistic concurrency
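The table translates directly into a small predicate. This is a sketch of the rules as listed above, with a hypothetical function name; `current` is the stored `_version_`, or `None` when the document does not exist.

```python
def version_check_passes(supplied, current):
    """Apply the _version_ table: True if the update may proceed."""
    if supplied > 1:
        return current == supplied      # must match exactly
    if supplied == 1:
        return current is not None      # document must exist
    if supplied < 0:
        return current is None          # document must not exist
    return True                         # 0: don't care


print(version_check_passes(123456789, 123456789))
print(version_check_passes(54321, 1408814192853516288))
print(version_check_passes(-1, None))
```

A failed check corresponds to the HTTP 409 (Conflict) response.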
Optimistic Concurrency Example
Get the document:
$ curl http://localhost:8983/solr/get?id=book2
{ "doc" : {
    "id":"book2",
    "title":["Neuromancer"],
    "author":"William Gibson",
    "copiesIn_i":7,
    "copiesOut_i":3,
    "_version_":123456789 }}
Modify and resubmit, using the same _version_:
$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[ {
    "id":"book2",
    "title":["Neuromancer"],
    "author":"William Gibson",
    "copiesIn_i":6,
    "copiesOut_i":4,
    "_version_":123456789 }
]'
Alternately, specify the _version_ as a request parameter:
$ curl http://localhost:8983/solr/update?_version_=123456789 -H 'Content-type:application/json' -d […]
Optimistic Concurrency Errors
• HTTP Code 409 (Conflict) returned on version mismatch
$ curl -i http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[{"id":"book1", "author":"Mr Bean", "_version_":12345}]'
HTTP/1.1 409 Conflict
Content-Type: text/plain;charset=UTF-8
Transfer-Encoding: chunked

{
  "responseHeader":{
    "status":409,
    "QTime":1},
  "error":{
    "msg":"version conflict for book1 expected=12345 actual=1408814192853516288",
    "code":409}}
  
Schema
Schema REST API
• Restlet is now integrated with Solr
• Get a specific field
curl http://localhost:8983/solr/schema/fields/price
{"field":{
    "name":"price",
    "type":"float",
    "indexed":true,
    "stored":true }}
• Get all fields
curl http://localhost:8983/solr/schema/fields
• Get Entire Schema!
curl http://localhost:8983/solr/schema
  
	
  
Dynamic Schema
• Add a new field (Solr 4.4)
curl -XPUT http://localhost:8983/solr/schema/fields/strength -d '
{"type":"float", "indexed":"true"}
'
  
• Works in distributed (cloud) mode too!
• Schema must be managed & mutable (not currently the default)
<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
  	
  
Schemaless
• “Schemaless” usually just means that the client(s) have an implicit schema
• “No Schema” is impossible for anything based on Lucene
•  A field must be indexed the same way across documents
• Dynamic fields: convention over configuration
•  Only pre-define types of fields, not fields themselves
•  No guessing. Any field name ending in _i is an integer
• “Guessed Schema” or “Type Guessing”
•  For previously unknown fields, guess using JSON type as a hint
•  Coming soon (4.4?) based on the Dynamic Schema work
• Many disadvantages to guessing
•  Lose ability to catch field naming errors
•  Can’t optimize based on types
•  Guessing incorrectly means having to start over
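The contrast between naming conventions and type guessing can be illustrated in a few lines of Python. This is a sketch only: the `_i` rule comes from the slide, the `_s` rule is an assumption based on the `ISBN_s` field used earlier, and the type labels returned by `guess_type` are hypothetical names rather than Solr field types.

```python
# Suffix convention: "_i" is from the slide; "_s" as string is an assumption.
DYNAMIC_SUFFIXES = {"_i": "int", "_s": "string"}


def type_from_name(field_name):
    """Dynamic fields: the type follows from the name, with no guessing."""
    for suffix, field_type in DYNAMIC_SUFFIXES.items():
        if field_name.endswith(suffix):
            return field_type
    return None  # unknown name: a strict schema can reject this as an error


def guess_type(value):
    """Type guessing: use the JSON value's type as a hint (labels are made up)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "string"


print(type_from_name("pubyear_i"))   # decided by the name alone
print(guess_type(2001))              # decided by the first value seen
print(guess_type("2001"))            # same data sent as a string guesses differently
```

The last two calls show the risk: the guessed type depends on how the first client happened to encode the value.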
Solr Cloud
Solr Cloud
[Diagram: a query to http://.../solr/collection1/query?q=awesome fans out as load-balanced sub-requests to shard1 and shard2, each with replica1, replica2, and replica3. A ZooKeeper quorum of five ZK nodes holds the tree below.]
/configs
  /myconf
    solrconfig.xml
    schema.xml
/clusterstate.json
/aliases.json
/livenodes
  server1:8983/solr
  server2:8983/solr
/collections
  /collection1
    configName=myconf
    /shards
      /shard1
        server1:8983/solr
        server2:8983/solr
      /shard2
        server3:8983/solr
        server4:8983/solr
ZooKeeper holds cluster state
•  Nodes in the cluster
•  Collections in the cluster
•  Schema & config for each collection
•  Shards in each collection
•  Replicas in each shard
•  Collection aliases
Distributed Indexing
[Diagram: an update sent to http://.../solr/collection1/update is routed to shard1 or shard2]
•  Update sent to any node
•  Solr determines which shard the document belongs to, and forwards it to the shard leader
•  Shard leader versions the document and forwards it to all other shard replicas
•  HA for updates (if one leader fails, another takes its place)
Collections API
l  Create a new document collection
http://localhost:8983/solr/admin/collections?	
  
	
  action=CREATE	
  	
  
	
  &name=mycollection	
  
	
  &numShards=4	
  
	
  &replicationFactor=3	
  
	
  
l  Delete a collection
http://localhost:8983/solr/admin/collections?	
  
	
  action=DELETE	
  
	
  &name=mycollection	
  
	
  
l  Create an alias to a collection (or a group of collections)
http://localhost:8983/solr/admin/collections?	
  
	
  action=CREATEALIAS	
  
	
  &name=tri_state	
  
	
  &collections=NY,NJ,CT	
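The URLs above are ordinary query strings, so they can be assembled with the standard library. A minimal sketch, assuming the local base URL from the slide; `collections_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

COLLECTIONS_URL = "http://localhost:8983/solr/admin/collections"  # from the slide


def collections_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    # safe="," keeps comma-separated lists such as NY,NJ,CT readable.
    query = urlencode({"action": action, **params}, safe=",")
    return f"{COLLECTIONS_URL}?{query}"


create = collections_url("CREATE", name="mycollection",
                         numShards=4, replicationFactor=3)
delete = collections_url("DELETE", name="mycollection")
alias = collections_url("CREATEALIAS", name="tri_state", collections="NY,NJ,CT")
print(create)
print(delete)
print(alias)
```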
  
http://localhost:8983/solr/#/~cloud
Distributed Query Requests
l  Distributed query across all shards in the collection
http://localhost:8983/solr/collection1/query?q=foo	
  
	
  
l  Explicitly specify node addresses to load-balance across
shards=localhost:8983/solr|localhost:8900/solr,	
  
	
  	
  	
  	
  	
  	
  	
  localhost:7574/solr|localhost:7500/solr	
  
l  A list of equivalent nodes are separated by “|”
l  Different phases of the same distributed request use the same node
l  Specify logical shards to search across
shards=NY,NJ,CT	
  
	
  
l  Specify multiple collections to search across
collection=collection1,collection2	
  
	
  
l  public	
  CloudSolrServer(String	
  zkHost)	
  
l  ZK aware SolrJ Java client that load-balances across all nodes in cluster
l  Calculate where document belongs and directly send to shard leader (new)
Durable Writes
• Lucene flushes writes to disk on a “commit”
• Uncommitted docs are lost on a crash (at the Lucene level)
• Solr 4 maintains its own transaction log
• Contains uncommitted documents
• Services real-time get requests
• Recovery (log replay on restart)
• Supports distributed “peer sync”
• Writes forwarded to multiple shard replicas
• A replica can go away forever w/o collection data loss
• A replica can do a fast “peer sync” if it’s only slightly out of date
• A replica can do a full index replication (copy) from a peer
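The role of the transaction log can be modelled in a few lines. This is a toy sketch with hypothetical names: the log is assumed to survive a crash, uncommitted index changes are not, and the log is simply cleared on commit, which is a simplification.

```python
class DurableIndex:
    """Toy model of an index with a transaction log (illustration only)."""

    def __init__(self):
        self.committed = {}  # on disk: survives a crash
        self.pending = {}    # uncommitted index changes: lost on a crash
        self.tlog = []       # transaction log: assumed durable

    def add(self, doc):
        self.tlog.append(doc)            # log the update first
        self.pending[doc["id"]] = doc

    def realtime_get(self, doc_id):
        """Serve the latest version, even if it has not been committed."""
        for doc in reversed(self.tlog):
            if doc["id"] == doc_id:
                return doc
        return self.committed.get(doc_id)

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()
        self.tlog.clear()                # simplification: drop the log on commit

    def crash_and_recover(self):
        self.pending.clear()             # crash: uncommitted index state is gone
        for doc in self.tlog:            # recovery: replay the log
            self.pending[doc["id"]] = doc


index = DurableIndex()
index.add({"id": "book1", "title": "American Gods"})
print(index.realtime_get("book1"))   # visible without a commit
index.crash_and_recover()
index.commit()
print(index.committed["book1"])
```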
Near Real Time (NRT) softCommit
• softCommit opens a new view of the index without
flushing + fsyncing files to disk
• Decouples update visibility from update durability
• commitWithin now implies a soft commit
• Current autoCommit defaults from solrconfig.xml:
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- <autoSoftCommit>
       <maxTime>5000</maxTime>
     </autoSoftCommit> -->
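The separation of visibility from durability can be shown with a toy model. This sketch uses hypothetical names and ignores the transaction log: a soft commit only changes what searches see, while a hard commit with `openSearcher=false` only changes what is flushed to disk.

```python
class NrtIndex:
    """Toy model separating search visibility from on-disk durability."""

    def __init__(self):
        self.docs = []        # every update accepted so far
        self.visible = set()  # what searches can see
        self.durable = set()  # what has been flushed and fsynced

    def add(self, doc_id):
        self.docs.append(doc_id)

    def soft_commit(self):
        self.visible = set(self.docs)      # new view of the index, no fsync

    def hard_commit(self, open_searcher=False):
        self.durable = set(self.docs)      # flush + fsync
        if open_searcher:
            self.visible = set(self.docs)


index = NrtIndex()
index.add("book1")
index.hard_commit()          # durable, but not yet searchable
index.add("book2")
index.soft_commit()          # both searchable, only book1 durable
print(sorted(index.visible), sorted(index.durable))
```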
  
Document Routing
[Diagram: hash ring with numShards=4, router=compositeId]
shard1: 80000000-bfffffff
shard2: c0000000-ffffffff
shard3: 00000000-3fffffff
shard4: 40000000-7fffffff
id = BigCo!doc5
hashes (MurmurHash3) to 9f27 3c71, which falls in shard1
q=my_query
shard.keys=BigCo!
maps to the hash range 9f27 0000 to 9f27 ffff, which falls in shard1
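The lookup on the hash ring can be sketched without reimplementing the hash itself. The example below takes the 32-bit hash value from the slide as given; the pairing of shard names with ranges is read off the diagram and should be treated as an assumption, and the function names are hypothetical.

```python
# Inclusive hash ranges for numShards=4 (shard-to-range pairing is assumed).
SHARD_RANGES = {
    "shard1": (0x80000000, 0xBFFFFFFF),
    "shard2": (0xC0000000, 0xFFFFFFFF),
    "shard3": (0x00000000, 0x3FFFFFFF),
    "shard4": (0x40000000, 0x7FFFFFFF),
}


def shard_for_hash(doc_hash):
    """Find the shard whose range contains a 32-bit document hash."""
    for shard, (low, high) in SHARD_RANGES.items():
        if low <= doc_hash <= high:
            return shard
    raise ValueError("hash outside the 32-bit ring")


def shards_for_prefix(prefix_hash):
    """Shards that can hold ids sharing a routing prefix such as "BigCo!".

    The top 16 bits come from the prefix; the low 16 bits vary per document.
    """
    low = prefix_hash & 0xFFFF0000
    high = low | 0x0000FFFF
    return sorted(shard for shard, (lo, hi) in SHARD_RANGES.items()
                  if lo <= high and low <= hi)


print(shard_for_hash(0x9F273C71))      # hash of BigCo!doc5 from the slide
print(shards_for_prefix(0x9F270000))   # range queried for shard.keys=BigCo!
```

Because every id with the same prefix shares the top 16 bits, a query restricted with `shard.keys` only needs to visit the shards overlapping that 16-bit-wide range.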
Seamless Online Shard Splitting
[Diagram: Shard1, Shard2, and Shard3, each with a leader and a replica; an update arrives at the Shard2 leader, and Shard2 is split into Shard2_0 and Shard2_1]
1.  http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=Shard2
2.  New sub-shards created in “construction” state
3.  Leader starts forwarding applicable updates, which are buffered by the sub-shards
4.  Leader index is split and installed on the sub-shards
5.  Sub-shards apply buffered updates then become “active” leaders and old shard becomes “inactive”
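One way to picture the split is as halving the parent shard's hash range. The sketch below assumes an even split and borrows a hypothetical range for Shard2; neither detail is stated on this slide, and the function names are illustrative.

```python
def split_range(low, high):
    """Split an inclusive hash range into two equal inclusive halves."""
    midpoint = low + (high - low + 1) // 2
    return (low, midpoint - 1), (midpoint, high)


def sub_shard_for(doc_hash, sub_ranges):
    """Decide which sub-shard should buffer a forwarded update."""
    for name, (low, high) in sub_ranges.items():
        if low <= doc_hash <= high:
            return name
    return None  # not applicable to this parent shard


parent = (0xC0000000, 0xFFFFFFFF)  # hypothetical range for Shard2
first, second = split_range(*parent)
sub_ranges = {"Shard2_0": first, "Shard2_1": second}
print({name: (hex(lo), hex(hi)) for name, (lo, hi) in sub_ranges.items()})
print(sub_shard_for(0xE1234567, sub_ranges))
```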
Stay in touch
https://twitter.com/LucidWorks
http://www.linkedin.com/company/lucidworks
http://plus.google.com/u/0/b/112313059186533721298/
