KEMBAR78
From mysql to MongoDB(MongoDB2011北京交流会) | PDF
from   MySQL

       to

               mongoDB
                                  nightsailer
                 March 3, 2011 / Mongo Beijing
About.me


       @nightsailer
     nightsailer # gmail.com
    http://nightsailer.com/
http://github.com/nightsailer
Mysql data type
      char / varchar
        Int/BigInt
  Float/Double/Decimal
        Blob/Text
            .....
BSON   [bee · sahn]




   Byte              Object Id
Int32/Int64            Array
  Double              Boolean
                  Javascript code
                 Embed document
                        .....

                                    http://bsonspec.org/
BSON

                  Object Id

   0    1 2 3     4 5 6      7 8 9 10 11
       utc time    machine    pid inc/random

  Enforce collection unique: _id
  Should global/cluster unique
BSON

         Date / Timestamp

       Perl: DateTime / Mongo::Timestamp
       PHP: MongoDate / MongoDatetime


  *             Date         Int64
BSON

            Binary Data
   Perl: string
                  my $default_avatar = "xFFxFExFF";
                  $collection->insert({"avatar_file" => $string});


   PHP:MongoBinData

            * 4Mb/8Mb(1.8+)
Create Database ....     No!
   Create Table ....      No!
      Alter Table ....         No!
Table => Collection
Row => Document
            Embed document

Embed document : Array/List/Hash
_id
           Collection

                   Array/List

Object Id => perl: MongoDB::OID   php: MongoId
Insert/Save
perl:
    $db->user->create({name=>’ns’});
  $db->user->save({_id=>5,name=>’ns’});

        SQL Injection             .
Update
>db.user.update({_id:5},{name:’ns’,‘email’:‘xxx’});




      mysql
Upsert



>db.user.update({_id:5},{_id:5,name:’ns’,email:‘xxx’},true)



                                   _id,
In-place Updates
$inc $push $pushAll $pull $pullAll $addToSet
                   $set
                    ....
                 Atomic
     modifier
       Array/Hash              $set)
Upsert & modifier
               $modifier                                           _id

  >db.art_track.update({ art_id:25,d:20110303 },
{ $addToSet:{follower:234},$inc:{views:1}},true);




 > db.art_track.find()
 { "_id" : ObjectId("4d6df20cb7fc9b3c1329c917"), "art_id" : 25, "d" : 20110303, "follower" :
 [ 234 ], "views" : 2 }
Select => Find
>db.user.find({city:‘beijing’})
          .sort({created_on:-1})
                .limit(20);
Cursor
  when find & run command
        cursor->next

OP_QEURY           OP_GETMORE


                       cursor
Order by => Sort
Sort Limit   Cursor
Joins ? No!



              ...
Query Modifier
                                    :
$ne, $in, $nin, $mod, $all, $size, $exists, $type, ..
             $lt, $lte, $gt, $gte, $ne,
                          ...
Command
                                             :
     count,map/reduce, group,...
    >db.$cmd.findOne(cmd_query_obj);


db.$cmd.findOne(‘user’) == db.user.count()
FindAndModify
db.runCommand( { findAndModify : <collection>,
              query: { filter },
                 sort : { },
                 update: {},
                  field: {},
            new/remove: true/false
                      });
FindAndModify
#perl
my $job = db_find_and_modify {
         query => { state => 0 },
         update => { '$set' => { state => 1, ts =>
time } },
         sort => { _id => 1 }
    };
#    Mysql Auto increment
<?php
$db->command(array(
    ‘findandmodify’=>‘sequences’,
    ‘query’ => array(‘_id’=>‘user_id’),
    ‘update’ => array(‘$inc’=>array(‘val’=>1)),
    ‘new’ => true,
    ‘upsert’ => true,
));
?>
                                          *update/sort,   !
List commands
> db.runCommand({listCommands:1})


        Quick reference card
 http://www.10gen.com/reference
Index
  db.foo.ensureIndex({a:1})
db.foo.ensureIndex({a:1,b:-1})

    collection        64    index
     query           1     index
Multikeys
              Array/Object

               >db.art.ensureIndex({tags:1});
>db.art.find({tags:’    ’}).sort({created_on:-1}).limit(10);

         >db.user.ensureIndex({‘profile.age’:1});
         >db.user.find({"profile.age":{$gte:18}});
Special index / 1.8+
Sparse Indexes
   >db.user.ensureIndex({sina_account:1},{sparse:true});
   >db.user.findOne({sina_account:‘xxx’});

Covered Indexes
   >db.user.ensureIndex({_id:1,passport:1,state:1});
   >db.user.findOne({passport:‘xxx’},{_id:0,passport:
   1,state:1});
Like %mongoDB% ?

 > db.count({title:/mongodb/i});



   Regex => index
collection
     db.fulltext.save({
index_name: [ word1,word2]
   {                    }
            ...
             })
index_name
db.fullext.find({index_name:{$all:
          [w1,w2,w3]}})
      .sort({updated_on:-1})
Sum/Group by => ?
Group command          +         db
  Map/Reduce: good,   sharding
Alternate approach
Live: counter field =>$inc
     Backend service:
                  => Gearman: workers
ORM,easy.
PHP: Zend/Symfony/CI ...
               Ruby:Mongoid

           Perl: Mongoose MongoDBx
<?php
//        model
class Lgk_Model_Art extends Lgk_Core_Model_Base {
    protected $collection = "art";
    protected $schema = array(
         'category_id' => 0,
         'tags' => array(),
         'fav_tags' => array(),
         ....
    );
    protected $required_fields = array('user_id','name');
    protected $int_fields = array('user_id','category_id','deleted','published','approved','private');
    protected $created_timestamp_fields = array('created_on','updated_on');
    protected $joins = array(
         'user' =>    array('user_id' => 'Lgk_Core_Model_User'),
         'assets' => array('assets' => 'Lgk_Core_Model_Asset'),
         'category' => array('category_id' => 'Lgk_Core_Model_Category'),
         'thumbnail_asset' => array('thumb_asset_id' => 'Lgk_Core_Model_Asset'),
	     );
    protected function extra_extend_model_row(&$row) {
    }
?>
//
public function load_joins($row) {
       ...
         foreach ($this->joins as $attribute => $definition) {
             list($pk_name,$model_class) = each($definition);
             if (isset($row[$pk_name])) {
                 if (is_array($row[$pk_name])) {
                      $row[$attribute] = &DoggyX_Model_Mapper::load_model_list($row[$pk_name],$model_class);
                 }
                 else {
                      $row[$attribute] = &DoggyX_Model_Mapper::load_model($row[$pk_name],$model_class);
                 }
             }
         }
         return $row;
    }
}
GridFS
 Just specification.
Not best but good.
Write-once, read many
     db.assets:{                  db.fs.file: {
    _id:ObjectId,              _id: ObjectId,
  file_id: ObjectId,           r: (ref counter)
thumb_id:ObjectId,             hash:‘xxxxx’
          ...                          }
           }


        db.fs.files     1
                                       r
                       files         file_id
GridFS

Nginx module ?
Plack/Twiggy    AnyEvent


         Nginx                    Todo                Node.js


Varnish/Squid    proxy store

                                    Plack/Starman       /



                               ETag: file _id
Replication
           local.system.replset
      local.oplog.rs => oplog/capped



* local.usr.xxx =>                     collection


                     collection
mySQL: mmm



          ReplicaSet
      1 Primary + 2 Secondary + 1
                Arbiter


                      * Master/Slave
“SlaveOk”
<?php
$con =new Mongo(‘mongodb://s1:27017’,array
(‘replicaSet’=>true);
$con->setSlaveOkay(true);
?>
#perl
my $con = MongoDB::Connection->new(host=>‘mongodb://
localhost’,w =>2,find_master => 1);
$MongoDB::Cursor::slave_okay = 1;
my $cursor = $con->user->find;
$cursor->slave_okay = 1;


                             Driver
Auto Shard
:             :    db

                          : mongos
    shard_key:        ,
             counting
WTF?
VPN

    => VPN

&     =>

     oplog   /replay,
db.repairDatabase,

:   primary,                 !
               oplog
RAID10
XFS filesystem
kill -9 or     = Crash!



                          Durability?
                          --syncdelay     ? (60=>5)
                                    ‘w’
             db.runCommand( { getlasterror : 1 , w : 2 } )


             ReplicaSet       2   secondary
                                          -dur upgrade to 1.8!
Count is slow!

    counter
Type: string vs int

           “2” != 2

<php? $user_id = (int) $user_id; ?>
OOM Killer

> db.serverStatus()


 Swap             > Memory
Ordered hash

Perl: IxHash
               ({a=>1, b=>-1}) => IxHash->new


Ruby (<1.9): BSON::OrderedHash
PHP: Array is OK
Cursor
         last_row_id = ObjectId(‘....’);
db.activity_stream->find({_id:{$lt: last_row_id },
    user_id:20 } ).sort( {_id:-1} ).limit(10);
PHP/5.2.x: front-end
Perl 5.2.10+ / Plack / Gearman:
Daemon, large-file upload, data
process, job queue service ...

           MongoDB        :
             session store
             business data
             GridFS: Media files (user uploaded
             files, thumbnails ..)
use MongoX (host=>‘mongodb://127.0.0.1’,db=>‘test’);
use_collection 'task_queue';
db_update { state => 1, queue => 'sina_tweet' }, {
        '$set' => { state => 0 },
        '$inc' => { tries => 1 },
        },{ multiple => 1 };
db_remove { state => 1, queue => 'sina_tweet', 'tries' =>
{ '$gt' => 3 } };
db_inc {}

//
my $cnt = 0;
for_connections {
   for_dbs {
        for_collections {
          $cnt += db_count;
        } ‘art’
     } ‘db1’,‘db2’
} ‘c_1’ ,‘c_2’,‘c_arch1’,‘c_arch2’;



     http://github.com/nightsailer/mongo-x/
:
http://czone.chinavisual.com/
2009-6 rebuild on mongodb
          < 1.0
http://cvurl.cn/
     :
http://tu.chinavisual.com/
mongoDB:
                 :             GridFS: hight resolution
http://down.chinavisual.com/
                               images,photos/medium
                               size files(60mb~500mb)
http://idea.chinavisual.com/
PHP 开发      师
        job-php@chinavisual.com
 Modern Perl 开发      师
         job-perl@chinavisual.com

   mongoDB / Git / Catalyst /
Ubuntu / Xiapian / Nginx / Plack /
Moose / Node.js / Titanium / Redis

From mysql to MongoDB(MongoDB2011北京交流会)

  • 1.
    from MySQL to mongoDB nightsailer March 3, 2011 / Mongo Beijing
  • 2.
    About.me @nightsailer nightsailer # gmail.com http://nightsailer.com/ http://github.com/nightsailer
  • 3.
    Mysql data type char / varchar Int/BigInt Float/Double/Decimal Blob/Text .....
  • 4.
    BSON [bee · sahn] Byte Object Id Int32/Int64 Array Double Boolean Javascript code Embed document ..... http://bsonspec.org/
  • 5.
    BSON Object Id 0 1 2 3 4 5 6 7 8 9 10 11 utc time machine pid inc/random Enforce collection unique: _id Should global/cluster unique
  • 6.
    BSON Date / Timestamp Perl: DateTime / Mongo::Timestamp PHP: MongoDate / MongoDatetime * Date Int64
  • 7.
    BSON Binary Data Perl: string my $default_avatar = "xFFxFExFF"; $collection->insert({"avatar_file" => $string}); PHP:MongoBinData * 4Mb/8Mb(1.8+)
  • 8.
    Create Database .... No! Create Table .... No! Alter Table .... No!
  • 9.
    Table => Collection Row=> Document Embed document Embed document : Array/List/Hash
  • 10.
    _id Collection Array/List Object Id => perl: MongoDB::OID php: MongoId
  • 11.
    Insert/Save perl: $db->user->create({name=>’ns’}); $db->user->save({_id=>5,name=>’ns’}); SQL Injection .
  • 12.
  • 13.
  • 14.
    In-place Updates $inc $push$pushAll $pull $pullAll $addToSet $set .... Atomic modifier Array/Hash $set)
  • 15.
    Upsert & modifier $modifier _id >db.art_track.update({ art_id:25,d:20110303 }, { $addToSet:{follower:234},$inc:{views:1}},true); > db.art_track.find() { "_id" : ObjectId("4d6df20cb7fc9b3c1329c917"), "art_id" : 25, "d" : 20110303, "follower" : [ 234 ], "views" : 2 }
  • 16.
    Select => Find >db.user.find({city:‘beijing’}) .sort({created_on:-1}) .limit(20);
  • 17.
    Cursor whenfind & run command cursor->next OP_QEURY OP_GETMORE cursor
  • 18.
    Order by =>Sort Sort Limit Cursor
  • 19.
  • 20.
    Query Modifier : $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, ...
  • 21.
    Command : count,map/reduce, group,... >db.$cmd.findOne(cmd_query_obj); db.$cmd.findOne(‘user’) == db.user.count()
  • 22.
    FindAndModify db.runCommand( { findAndModify: <collection>, query: { filter }, sort : { }, update: {}, field: {}, new/remove: true/false });
  • 23.
    FindAndModify #perl my $job =db_find_and_modify { query => { state => 0 }, update => { '$set' => { state => 1, ts => time } }, sort => { _id => 1 } }; # Mysql Auto increment <?php $db->command(array( ‘findandmodify’=>‘sequences’, ‘query’ => array(‘_id’=>‘user_id’), ‘update’ => array(‘$inc’=>array(‘val’=>1)), ‘new’ => true, ‘upsert’ => true, )); ?> *update/sort, !
  • 24.
    List commands > db.runCommand({listCommands:1}) Quick reference card http://www.10gen.com/reference
  • 25.
  • 26.
    Multikeys Array/Object >db.art.ensureIndex({tags:1}); >db.art.find({tags:’ ’}).sort({created_on:-1}).limit(10); >db.user.ensureIndex({‘profile.age’:1}); >db.user.find({"profile.age":{$gte:18}});
  • 27.
    Special index /1.8+ Sparse Indexes >db.user.ensureIndex({sina_account:1},{sparse:true}); >db.user.findOne({sina_account:‘xxx’}); Covered Indexes >db.user.ensureIndex({_id:1,passport:1,state:1}); >db.user.findOne({passport:‘xxx’},{_id:0,passport: 1,state:1});
  • 28.
    Like %mongoDB% ? > db.count({title:/mongodb/i}); Regex => index
  • 29.
    collection db.fulltext.save({ index_name: [ word1,word2] { } ... })
  • 30.
    index_name db.fullext.find({index_name:{$all: [w1,w2,w3]}}) .sort({updated_on:-1})
  • 31.
    Sum/Group by =>? Group command + db Map/Reduce: good, sharding
  • 32.
    Alternate approach Live: counterfield =>$inc Backend service: => Gearman: workers
  • 33.
    ORM,easy. PHP: Zend/Symfony/CI ... Ruby:Mongoid Perl: Mongoose MongoDBx
  • 34.
    <?php // model class Lgk_Model_Art extends Lgk_Core_Model_Base { protected $collection = "art"; protected $schema = array( 'category_id' => 0, 'tags' => array(), 'fav_tags' => array(), .... ); protected $required_fields = array('user_id','name'); protected $int_fields = array('user_id','category_id','deleted','published','approved','private'); protected $created_timestamp_fields = array('created_on','updated_on'); protected $joins = array( 'user' => array('user_id' => 'Lgk_Core_Model_User'), 'assets' => array('assets' => 'Lgk_Core_Model_Asset'), 'category' => array('category_id' => 'Lgk_Core_Model_Category'), 'thumbnail_asset' => array('thumb_asset_id' => 'Lgk_Core_Model_Asset'), ); protected function extra_extend_model_row(&$row) { } ?> // public function load_joins($row) { ... foreach ($this->joins as $attribute => $definition) { list($pk_name,$model_class) = each($definition); if (isset($row[$pk_name])) { if (is_array($row[$pk_name])) { $row[$attribute] = &DoggyX_Model_Mapper::load_model_list($row[$pk_name],$model_class); } else { $row[$attribute] = &DoggyX_Model_Mapper::load_model($row[$pk_name],$model_class); } } } return $row; } }
  • 35.
  • 36.
    Write-once, read many db.assets:{ db.fs.file: { _id:ObjectId, _id: ObjectId, file_id: ObjectId, r: (ref counter) thumb_id:ObjectId, hash:‘xxxxx’ ... } } db.fs.files 1 r files file_id
  • 37.
  • 38.
    Plack/Twiggy AnyEvent Nginx Todo Node.js Varnish/Squid proxy store Plack/Starman / ETag: file _id
  • 39.
    Replication local.system.replset local.oplog.rs => oplog/capped * local.usr.xxx => collection collection
  • 40.
    mySQL: mmm ReplicaSet 1 Primary + 2 Secondary + 1 Arbiter * Master/Slave
  • 41.
    “SlaveOk” <?php $con =new Mongo(‘mongodb://s1:27017’,array (‘replicaSet’=>true); $con->setSlaveOkay(true); ?> #perl my$con = MongoDB::Connection->new(host=>‘mongodb:// localhost’,w =>2,find_master => 1); $MongoDB::Cursor::slave_okay = 1; my $cursor = $con->user->find; $cursor->slave_okay = 1; Driver
  • 42.
    Auto Shard : : db : mongos shard_key: , counting
  • 43.
  • 44.
    VPN => VPN & => oplog /replay,
  • 45.
    db.repairDatabase, : primary, ! oplog
  • 46.
  • 47.
    kill -9 or = Crash! Durability? --syncdelay ? (60=>5) ‘w’ db.runCommand( { getlasterror : 1 , w : 2 } ) ReplicaSet 2 secondary -dur upgrade to 1.8!
  • 48.
  • 49.
    Type: string vsint “2” != 2 <php? $user_id = (int) $user_id; ?>
  • 50.
  • 51.
    Ordered hash Perl: IxHash ({a=>1, b=>-1}) => IxHash->new Ruby (<1.9): BSON::OrderedHash PHP: Array is OK
  • 52.
    Cursor last_row_id = ObjectId(‘....’); db.activity_stream->find({_id:{$lt: last_row_id }, user_id:20 } ).sort( {_id:-1} ).limit(10);
  • 53.
    PHP/5.2.x: front-end Perl 5.2.10+/ Plack / Gearman: Daemon, large-file upload, data process, job queue service ... MongoDB : session store business data GridFS: Media files (user uploaded files, thumbnails ..)
  • 54.
    use MongoX (host=>‘mongodb://127.0.0.1’,db=>‘test’); use_collection'task_queue'; db_update { state => 1, queue => 'sina_tweet' }, { '$set' => { state => 0 }, '$inc' => { tries => 1 }, },{ multiple => 1 }; db_remove { state => 1, queue => 'sina_tweet', 'tries' => { '$gt' => 3 } }; db_inc {} // my $cnt = 0; for_connections { for_dbs { for_collections { $cnt += db_count; } ‘art’ } ‘db1’,‘db2’ } ‘c_1’ ,‘c_2’,‘c_arch1’,‘c_arch2’; http://github.com/nightsailer/mongo-x/
  • 55.
  • 56.
    http://cvurl.cn/ : http://tu.chinavisual.com/
  • 57.
    mongoDB: : GridFS: hight resolution http://down.chinavisual.com/ images,photos/medium size files(60mb~500mb)
  • 58.
  • 59.
    PHP 开发 师 job-php@chinavisual.com Modern Perl 开发 师 job-perl@chinavisual.com mongoDB / Git / Catalyst / Ubuntu / Xiapian / Nginx / Plack / Moose / Node.js / Titanium / Redis