KEMBAR78
Schema Design with MongoDB | KEY
open-source, high-performance,
  document-oriented database
Schema Design Basics
    roger@10Gen.com
This talk
This talk
‣Intro
  -Terms / Definitions
This talk
‣Intro
  -Terms / Definitions
This talk
‣Intro
  -Terms / Definitions
‣Getting a flavor
 -Creating a Schema
 -Indexes
 -Evolving the Schema
This talk
‣Intro
  -Terms / Definitions
‣Getting a flavor
 -Creating a Schema
 -Indexes
 -Evolving the Schema
‣Data modeling
 -DBRef
 -Single Table Inheritance
 -Many - Many
 -Trees
 -Lists / Queues / Stacks
Document Oriented

Basic unit of data: JSON Documents
Not Relational, Key Value

Not OODB
 - Associations implied by Document Structure
 - but your database schema != your program schema
Terms
Table           -> Collection

Row(s)          -> JSON Document

Index           -> Index

Join            -> Embedding and Linking
                   across documents

Partition       -> Shard
Partition Key   -> Shard Key
Considerations

What are the requirements ?
 - Functionality to be supported
 - Access Patterns ?
 - Data Life Cycle (insert, update, deletes)
 - Expected Performance / Workload ?

Capabilities of the database ?
DB Considerations
How can we manipulate this data ?
 Dynamic Queries
 Secondary Indexes
 Atomic Updates
 Map Reduce

Access Patterns ?
  Read / Write Ratio
  Types of updates
  Types of queries

Considerations
 No Joins
 Single Document Transactions only
Design Session
Use Rich Design Documents

 post = {author: “kyle”,
      date: new Date(),
      text: “my blog post...”,
      tags: [“mongodb”, “intro”]}

 >db.post.save(post)
>db.posts.find()

 { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
   author : "kyle",
   date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
   text : "My first blog",
   tags : [ "mongodb", "intro" ] }

Notes:
 - ID is unique, but can be anything you’d like
Secondary index for “author”

 // 1 means ascending, -1 means descending

 >db.posts.ensureIndex({author: 1})

 >db.posts.find({author: 'kyle'})

 { _id    : ObjectId("4c4ba5c0672c685e5e8aabf3"),
   author : "kyle",
   ... }
Verifying indexes exist
 >db.system.indexes.find()

   // Index on ID
   { name : "_id_",
     ns : "test.posts",
     key : { "_id" : 1 } }
Verifying indexes exist
 >db.system.indexes.find()

   // Index on ID
   { name : "_id_",
     ns : "test.posts",
     key : { "_id" : 1 } }

   // Index on author
   { _id : ObjectId("4c4ba6c5672c685e5e8aabf4"),
     ns : "test.posts",
     key : { "author" : 1 },
     name : "author_1" }
Query operators
Conditional operators:
 $ne, $in, $nin, $mod, $all, $size, $exists, $type, ..
 $lt, $lte, $gt, $gte, $ne,

  // find posts with any tags
  >db.posts.find({tags: {$exists: true}})
Query operators
Conditional operators:
 $ne, $in, $nin, $mod, $all, $size, $exists, $type, ..
 $lt, $lte, $gt, $gte, $ne,

  // find posts with any tags
  >db.posts.find({tags: {$exists: true}})

Regular expressions:
   // posts where author starts with k
   >db.posts.find({author: /^k*/i })
Query operators
Conditional operators:
 $ne, $in, $nin, $mod, $all, $size, $exists, $type, ..
 $lt, $lte, $gt, $gte, $ne,

  // find posts with any tags
  >db.posts.find({tags: {$exists: true}})

Regular expressions:
   // posts where author starts with k
   >db.posts.find({author: /^k*/i })

Counting:
   // posts written by mike
  >db.posts.find({author: “mike”}).count()
Extending the Schema
 comment = {author: “fred”,
            date: new Date(),
            text: “super duper”}

 update = { ‘$push’: {comments: comment},
            ‘$inc’: {comments_count: 1}}

 >db.posts.update({_id: “...” }, update)
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
      author : "kyle",
      date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
      text : "My first blog",
      tags : [ "mongodb", "intro" ],
      comments_count: 1,
      comments : [

      {

      
 author : "Fred",

      
 date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",

      
 text : "Super Duper"

      }
      ]}
// create index on nested documents:
>db.posts.ensureIndex({"comments.author": 1})

>db.posts.find({comments.author:”Fred”})
// create index on nested documents:
>db.posts.ensureIndex({"comments.author": 1})

>db.posts.find({comments.author:”kyle”})

// find last 5 posts:
>db.posts.find().sort({date:-1}).limit(5)
// create index on nested documents:
>db.posts.ensureIndex({"comments.author": 1})

>db.posts.find({comments.author:”kyle”})

// find last 5 posts:
>db.posts.find().sort({date:-1}).limit(5)

// most commented post:
 >db.posts.find().sort({comments_count:-1}).limit(1)

When sorting, check if you need an index
Map Reduce
Aggregation and batch manipulation

 Collection in, Collection out

 Parallel in sharded environments
Map reduce
mapFunc = function () {
  this.tags.forEach(function (z) {emit(z, {count:1});});
}

reduceFunc = function (k, v) {
  var total = 0;
  for (var i = 0; i < v.length; i++) { total += v[i].count; }
  return {count:total}; }

res = db.posts.mapReduce(mapFunc, reduceFunc)

>db[res.result].find()
  { _id : "intro", value : { count : 1 } }
  { _id : "mongodb", value : { count : 1 } }
Review
So Far:
- Started out with a simple schema
- Queried Data
- Evolved the schema
- Queried / Updated the data some more
Wordnik
9B   records, 100M queries / week, 1,2TB
{

     entry : {

     
   header: { id: 0,

     
   
     headword: "m",

     
   
     sourceDictionary: "GCide",

     
   
     textProns : [

     
   
       {text: "(em)",

     
   
        seq:0}

     
   
     ],

     
   
     syllables: [

     
   
     
    {id: 0,

     
   
         text: "m"}

     
   
     ],

     
   

     
   
     sourceDictionary: "1913 Webster",

     
   
     headWord: "m",

     
   
     id: 1,

     
   
     definitions: : [

     
   
       {text: "M, the thirteenth letter..."},

     
   
       {text: "As a numeral, M stands for 1000"}]

     
   
     }

     }
}
Review
So Far:
- Started out with a simple schema
- Queried Data
- Evolved the schema
- Queried / Updated the data some more

Observations:
- Using Rich Documents works well
- Simplify relations by embedding them
- Iterative development is easy with MongoDB
Single Table Inheritance
>db.shapes.find()

 { _id: ObjectId("..."), type: "circle", area: 3.14, radius: 1}
 { _id: ObjectId("..."), type: "square", area: 4,    d: 2}
 { _id: ObjectId("..."), type: "rect",   area: 10, length: 5, width: 2}

// find shapes where radius > 0
>db.shapes.find({radius: {$gt: 0}})

// create index
>db.shapes.ensureIndex({radius: 1})
One to Many
- Embedded Array / Array Keys
  - slice operator to return subset of array
  - hard to find latest comments across all documents
One to Many
- Embedded Array / Array Keys
  - slice operator to return subset of array
  - hard to find latest comments across all documents

- Embedded tree
  - Single document
  - Natural
  - Hard to query
One to Many
- Embedded Array / Array Keys
  - slice operator to return subset of array
  - hard to find latest comments across all documents

- Embedded tree
  - Single document
  - Natural
  - Hard to query

- Normalized (2 collections)
  - most flexible
  - more queries
Many - Many
Example:

- Product can be in many categories
- Category can have many products



  Products     id | product_id | category_id   Category
products:
  { _id: ObjectId("4c4ca23933fb5941681b912e"),
    name: "Sumatra Dark Roast",
    category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                    ObjectId("4c4ca25433fb5941681b92af”]}
products:
  { _id: ObjectId("4c4ca23933fb5941681b912e"),
    name: "Sumatra Dark Roast",
    category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                    ObjectId("4c4ca25433fb5941681b92af”]}

categories:
  { _id: ObjectId("4c4ca25433fb5941681b912f"),
    name: "Indonesia",
    product_ids: [ ObjectId("4c4ca23933fb5941681b912e"),
                   ObjectId("4c4ca30433fb5941681b9130"),
                   ObjectId("4c4ca30433fb5941681b913a"]}
products:
  { _id: ObjectId("4c4ca23933fb5941681b912e"),
    name: "Sumatra Dark Roast",
    category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                    ObjectId("4c4ca25433fb5941681b92af”]}

categories:
  { _id: ObjectId("4c4ca25433fb5941681b912f"),
    name: "Indonesia",
    product_ids: [ ObjectId("4c4ca23933fb5941681b912e"),
                   ObjectId("4c4ca30433fb5941681b9130"),
                   ObjectId("4c4ca30433fb5941681b913a"]}

//All categories for a given product
>db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")})
products:
  { _id: ObjectId("4c4ca23933fb5941681b912e"),
    name: "Sumatra Dark Roast",
    category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                    ObjectId("4c4ca25433fb5941681b92af”]}

categories:
  { _id: ObjectId("4c4ca25433fb5941681b912f"),
    name: "Indonesia",
    product_ids: [ ObjectId("4c4ca23933fb5941681b912e"),
                   ObjectId("4c4ca30433fb5941681b9130"),
                   ObjectId("4c4ca30433fb5941681b913a"]}

//All categories for a given product
>db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")})

//All products for a given category
>db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})
Alternative
products:
  { _id: ObjectId("4c4ca23933fb5941681b912e"),
    name: "Sumatra Dark Roast",
    category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                    ObjectId("4c4ca25433fb5941681b92af”]}

categories:
  { _id: ObjectId("4c4ca25433fb5941681b912f"),
    name: "Indonesia"}
Alternative
products:
  { _id: ObjectId("4c4ca23933fb5941681b912e"),
    name: "Sumatra Dark Roast",
    category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                    ObjectId("4c4ca25433fb5941681b92af”]}

categories:
  { _id: ObjectId("4c4ca25433fb5941681b912f"),
    name: "Indonesia"}

// All products for a given category
>db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})
Alternative
products:
  { _id: ObjectId("4c4ca23933fb5941681b912e"),
    name: "Sumatra Dark Roast",
    category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                    ObjectId("4c4ca25433fb5941681b92af”]}

categories:
  { _id: ObjectId("4c4ca25433fb5941681b912f"),
    name: "Indonesia"}

// All products for a given category
>db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})

// All categories for a given product
product = db.products.find(_id : some_id)
>db.categories.find({_id : {$in : product.category_ids}})
Trees
Full Tree in Document

{ comments: [
     { author: “rpb”, text: “...”,
       replies: [
                  {author: “Fred”, text: “...”,
                   replies: []}
       ]}
   ]}

  Pros: Single Document, Performance, Intuitive
  Cons: Hard to search, Partial Results, 4MB limit
Trees
Parent Links
- Each node is stored as a document
- Contains the id of the parent

Child Links
- Each node contains the id’s of the children
- Can support graphs (multiple parents / child)
Array of Ancestors
- Store Ancestors of a node
 {   _id:   "a" }
 {   _id:   "b", ancestors: [ "a" ], parent: "a" }
 {   _id:   "c", ancestors: [ "a", "b" ], parent: "b" }
 {   _id:   "d", ancestors: [ "a", "b" ], parent: "b" }
 {   _id:   "e", ancestors: [ "a" ], parent: "a" }
 {   _id:   "f", ancestors: [ "a", "e" ], parent: "e" }
 {   _id:   "g", ancestors: [ "a", "b", "d" ], parent: "d" }
Array of Ancestors
- Store Ancestors of a node
 {   _id:   "a" }
 {   _id:   "b", ancestors: [ "a" ], parent: "a" }
 {   _id:   "c", ancestors: [ "a", "b" ], parent: "b" }
 {   _id:   "d", ancestors: [ "a", "b" ], parent: "b" }
 {   _id:   "e", ancestors: [ "a" ], parent: "a" }
 {   _id:   "f", ancestors: [ "a", "e" ], parent: "e" }
 {   _id:   "g", ancestors: [ "a", "b", "d" ], parent: "d" }

//find all descendants of b:
>db.tree2.find({ancestors: ‘b’})
Array of Ancestors
- Store Ancestors of a node
 {   _id:   "a" }
 {   _id:   "b", ancestors: [ "a" ], parent: "a" }
 {   _id:   "c", ancestors: [ "a", "b" ], parent: "b" }
 {   _id:   "d", ancestors: [ "a", "b" ], parent: "b" }
 {   _id:   "e", ancestors: [ "a" ], parent: "a" }
 {   _id:   "f", ancestors: [ "a", "e" ], parent: "e" }
 {   _id:   "g", ancestors: [ "a", "b", "d" ], parent: "d" }

//find all descendants of b:
>db.tree2.find({ancestors: ‘b’})

//find all ancestors of f:
>ancestors = db.tree2.findOne({_id:’f’}).ancestors
>db.tree2.find({_id: { $in : ancestors})
findAndModify
Queue example

//Example: grab highest priority job and mark

job = db.jobs.findAndModify({
          query: {inprogress: false},
          sort:   {priority: -1),
          update: {$set: {inprogress: true,
                          started: new Date()}},
          new: true})
More Cool Stuff

• Aggregation
• Capped collections
• GridFS
• Geo
Learn More
 Kyle’s presentation + video:
http://www.slideshare.net/kbanker/mongodb-schema-design
http://www.blip.tv/file/3704083

 Dwight’s presentation
http://www.slideshare.net/mongosf/schema-design-with-mongodb-dwight-
merriman

 Documentation
Trees:   http://www.mongodb.org/display/DOCS/Trees+in+MongoDB
Queues: http://www.mongodb.org/display/DOCS/findandmodify+Command
Aggregration: http://www.mongodb.org/display/DOCS/Aggregation
Capped Col. : http://www.mongodb.org/display/DOCS/Capped+Collections
Geo: http://www.mongodb.org/display/DOCS/Geospatial+Indexing
GridFS: http://www.mongodb.org/display/DOCS/GridFS+Specification
Thank You :-)
Download
       MongoDB

and let us know what you think
          @mongodb
   http://www.mongodb.org
DBRef
DBRef
 {$ref: collection, $id: id_value}

- Think URL
- YDSMV: your driver support may vary

Sample Schema:
  nr = {note_refs: [{"$ref" : "notes", "$id" : 5}, ... ]}

Dereferencing:
  nr.forEach(function(r) {
    printjson(db[r.$ref].findOne({_id: r.$id}));
  }
BSON
Mongodb stores data in BSON internally

 Lightweight, Traversable, Efficient encoding

 Typed
   boolean, integer, float, date, string, binary, array...

Schema Design with MongoDB

  • 1.
    open-source, high-performance, document-oriented database
  • 2.
    Schema Design Basics roger@10Gen.com
  • 3.
  • 4.
    This talk ‣Intro -Terms / Definitions
  • 5.
    This talk ‣Intro -Terms / Definitions
  • 6.
    This talk ‣Intro -Terms / Definitions ‣Getting a flavor -Creating a Schema -Indexes -Evolving the Schema
  • 7.
    This talk ‣Intro -Terms / Definitions ‣Getting a flavor -Creating a Schema -Indexes -Evolving the Schema ‣Data modeling -DBRef -Single Table Inheritance -Many - Many -Trees -Lists / Queues / Stacks
  • 8.
    Document Oriented Basic unitof data: JSON Documents Not Relational, Key Value Not OODB - Associations implied by Document Structure - but your database schema != your program schema
  • 9.
    Terms Table -> Collection Row(s) -> JSON Document Index -> Index Join -> Embedding and Linking across documents Partition -> Shard Partition Key -> Shard Key
  • 10.
    Considerations What are therequirements ? - Functionality to be supported - Access Patterns ? - Data Life Cycle (insert, update, deletes) - Expected Performance / Workload ? Capabilities of the database ?
  • 11.
    DB Considerations How canwe manipulate this data ? Dynamic Queries Secondary Indexes Atomic Updates Map Reduce Access Patterns ? Read / Write Ratio Types of updates Types of queries Considerations No Joins Single Document Transactions only
  • 12.
    Design Session Use RichDesign Documents post = {author: “kyle”, date: new Date(), text: “my blog post...”, tags: [“mongodb”, “intro”]} >db.post.save(post)
  • 13.
    >db.posts.find() { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "kyle", date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)", text : "My first blog", tags : [ "mongodb", "intro" ] } Notes: - ID is unique, but can be anything you’d like
  • 14.
    Secondary index for“author” // 1 means ascending, -1 means descending >db.posts.ensureIndex({author: 1}) >db.posts.find({author: 'kyle'}) { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "kyle", ... }
  • 15.
    Verifying indexes exist >db.system.indexes.find() // Index on ID { name : "_id_", ns : "test.posts", key : { "_id" : 1 } }
  • 16.
    Verifying indexes exist >db.system.indexes.find() // Index on ID { name : "_id_", ns : "test.posts", key : { "_id" : 1 } } // Index on author { _id : ObjectId("4c4ba6c5672c685e5e8aabf4"), ns : "test.posts", key : { "author" : 1 }, name : "author_1" }
  • 17.
    Query operators Conditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, // find posts with any tags >db.posts.find({tags: {$exists: true}})
  • 18.
    Query operators Conditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, // find posts with any tags >db.posts.find({tags: {$exists: true}}) Regular expressions: // posts where author starts with k >db.posts.find({author: /^k*/i })
  • 19.
    Query operators Conditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, // find posts with any tags >db.posts.find({tags: {$exists: true}}) Regular expressions: // posts where author starts with k >db.posts.find({author: /^k*/i }) Counting: // posts written by mike >db.posts.find({author: “mike”}).count()
  • 20.
    Extending the Schema comment = {author: “fred”, date: new Date(), text: “super duper”} update = { ‘$push’: {comments: comment}, ‘$inc’: {comments_count: 1}} >db.posts.update({_id: “...” }, update)
  • 21.
    { _id :ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "kyle", date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)", text : "My first blog", tags : [ "mongodb", "intro" ], comments_count: 1, comments : [ { author : "Fred", date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)", text : "Super Duper" } ]}
  • 22.
    // create indexon nested documents: >db.posts.ensureIndex({"comments.author": 1}) >db.posts.find({comments.author:”Fred”})
  • 23.
    // create indexon nested documents: >db.posts.ensureIndex({"comments.author": 1}) >db.posts.find({comments.author:”kyle”}) // find last 5 posts: >db.posts.find().sort({date:-1}).limit(5)
  • 24.
    // create indexon nested documents: >db.posts.ensureIndex({"comments.author": 1}) >db.posts.find({comments.author:”kyle”}) // find last 5 posts: >db.posts.find().sort({date:-1}).limit(5) // most commented post: >db.posts.find().sort({comments_count:-1}).limit(1) When sorting, check if you need an index
  • 25.
    Map Reduce Aggregation andbatch manipulation Collection in, Collection out Parallel in sharded environments
  • 26.
    Map reduce mapFunc =function () { this.tags.forEach(function (z) {emit(z, {count:1});}); } reduceFunc = function (k, v) { var total = 0; for (var i = 0; i < v.length; i++) { total += v[i].count; } return {count:total}; } res = db.posts.mapReduce(mapFunc, reduceFunc) >db[res.result].find() { _id : "intro", value : { count : 1 } } { _id : "mongodb", value : { count : 1 } }
  • 27.
    Review So Far: - Startedout with a simple schema - Queried Data - Evolved the schema - Queried / Updated the data some more
  • 28.
    Wordnik 9B records, 100M queries / week, 1,2TB { entry : { header: { id: 0, headword: "m", sourceDictionary: "GCide", textProns : [ {text: "(em)", seq:0} ], syllables: [ {id: 0, text: "m"} ], sourceDictionary: "1913 Webster", headWord: "m", id: 1, definitions: : [ {text: "M, the thirteenth letter..."}, {text: "As a numeral, M stands for 1000"}] } } }
  • 29.
    Review So Far: - Startedout with a simple schema - Queried Data - Evolved the schema - Queried / Updated the data some more Observations: - Using Rich Documents works well - Simplify relations by embedding them - Iterative development is easy with MongoDB
  • 31.
    Single Table Inheritance >db.shapes.find() { _id: ObjectId("..."), type: "circle", area: 3.14, radius: 1} { _id: ObjectId("..."), type: "square", area: 4, d: 2} { _id: ObjectId("..."), type: "rect", area: 10, length: 5, width: 2} // find shapes where radius > 0 >db.shapes.find({radius: {$gt: 0}}) // create index >db.shapes.ensureIndex({radius: 1})
  • 32.
    One to Many -Embedded Array / Array Keys - slice operator to return subset of array - hard to find latest comments across all documents
  • 33.
    One to Many -Embedded Array / Array Keys - slice operator to return subset of array - hard to find latest comments across all documents - Embedded tree - Single document - Natural - Hard to query
  • 34.
    One to Many -Embedded Array / Array Keys - slice operator to return subset of array - hard to find latest comments across all documents - Embedded tree - Single document - Natural - Hard to query - Normalized (2 collections) - most flexible - more queries
  • 35.
    Many - Many Example: -Product can be in many categories - Category can have many products Products id | product_id | category_id Category
  • 36.
    products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]}
  • 37.
    products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia", product_ids: [ ObjectId("4c4ca23933fb5941681b912e"), ObjectId("4c4ca30433fb5941681b9130"), ObjectId("4c4ca30433fb5941681b913a"]}
  • 38.
    products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia", product_ids: [ ObjectId("4c4ca23933fb5941681b912e"), ObjectId("4c4ca30433fb5941681b9130"), ObjectId("4c4ca30433fb5941681b913a"]} //All categories for a given product >db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")})
  • 39.
    products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia", product_ids: [ ObjectId("4c4ca23933fb5941681b912e"), ObjectId("4c4ca30433fb5941681b9130"), ObjectId("4c4ca30433fb5941681b913a"]} //All categories for a given product >db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")}) //All products for a given category >db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})
  • 40.
    Alternative products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia"}
  • 41.
    Alternative products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia"} // All products for a given category >db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})
  • 42.
    Alternative products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia"} // All products for a given category >db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")}) // All categories for a given product product = db.products.find(_id : some_id) >db.categories.find({_id : {$in : product.category_ids}})
  • 43.
    Trees Full Tree inDocument { comments: [ { author: “rpb”, text: “...”, replies: [ {author: “Fred”, text: “...”, replies: []} ]} ]} Pros: Single Document, Performance, Intuitive Cons: Hard to search, Partial Results, 4MB limit
  • 44.
    Trees Parent Links - Eachnode is stored as a document - Contains the id of the parent Child Links - Each node contains the id’s of the children - Can support graphs (multiple parents / child)
  • 45.
    Array of Ancestors -Store Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" }
  • 46.
    Array of Ancestors -Store Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" } //find all descendants of b: >db.tree2.find({ancestors: ‘b’})
  • 47.
    Array of Ancestors -Store Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" } //find all descendants of b: >db.tree2.find({ancestors: ‘b’}) //find all ancestors of f: >ancestors = db.tree2.findOne({_id:’f’}).ancestors >db.tree2.find({_id: { $in : ancestors})
  • 48.
    findAndModify Queue example //Example: grabhighest priority job and mark job = db.jobs.findAndModify({ query: {inprogress: false}, sort: {priority: -1), update: {$set: {inprogress: true, started: new Date()}}, new: true})
  • 49.
    More Cool Stuff •Aggregation • Capped collections • GridFS • Geo
  • 50.
    Learn More Kyle’spresentation + video: http://www.slideshare.net/kbanker/mongodb-schema-design http://www.blip.tv/file/3704083 Dwight’s presentation http://www.slideshare.net/mongosf/schema-design-with-mongodb-dwight- merriman Documentation Trees: http://www.mongodb.org/display/DOCS/Trees+in+MongoDB Queues: http://www.mongodb.org/display/DOCS/findandmodify+Command Aggregration: http://www.mongodb.org/display/DOCS/Aggregation Capped Col. : http://www.mongodb.org/display/DOCS/Capped+Collections Geo: http://www.mongodb.org/display/DOCS/Geospatial+Indexing GridFS: http://www.mongodb.org/display/DOCS/GridFS+Specification
  • 51.
  • 52.
    Download MongoDB and let us know what you think @mongodb http://www.mongodb.org
  • 53.
    DBRef DBRef {$ref: collection,$id: id_value} - Think URL - YDSMV: your driver support may vary Sample Schema: nr = {note_refs: [{"$ref" : "notes", "$id" : 5}, ... ]} Dereferencing: nr.forEach(function(r) { printjson(db[r.$ref].findOne({_id: r.$id})); }
  • 54.
    BSON Mongodb stores datain BSON internally Lightweight, Traversable, Efficient encoding Typed boolean, integer, float, date, string, binary, array...

Editor's Notes