-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Description
Data loss on tables with binary primary key
Hello!
We found bug on our production environment when migrate tables with binary columns as primary key.
It raises occationally, only at some updated rows, when gh-ost apply changes from binlog.
Usually, only few rows affected, looks like there are rows with special symbols at primary key.
How to reproduce
- Create table
CREATE TABLE `testorders` (
`UUId` binary(16) NOT NULL,
`Version` int(11) NOT NULL DEFAULT '1',
`ClientUUId` binary(16) DEFAULT NULL,
`CreatedAtUtc` datetime NOT NULL,
`ModifiedAtUtc` datetime DEFAULT NULL,
PRIMARY KEY (`UUId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;- Run gh-ost
./gh-ost \
--assume-rbr \
--assume-master-host="localhost" \
--master-user="ghost_user" \
--master-password="PASSWORD" \
--ssl \
--ssl-allow-insecure \
--max-load=Threads_running=32 \
--max-lag-millis=1500 \
--critical-load=Threads_running=500 \
--chunk-size=5000 \
--host="localhost" \
--user="ghost_user" \
--password="PASSWORD" \
--alter="ADD INDEX IDX_ClientUUId(ClientUUId)" \
--database="testghost" \
--table="testorders" \
--skip-strict-mode \
--debug \
--stack \
--postpone-cut-over-flag-file="/tmp/ghost.cutover.testorders.flag" \
--panic-flag-file="/tmp/ghost.panic.testorders.flag" \
--allow-on-master \
--execute- Insert rows into the table
# bad UUId
insert into testorders (UUId, Version, ClientUUId, CreatedAtUtc)
values(0x150B7B5B516F87125E8597FFB44BF200, 1, 0xE53C111AFEF8A2D3B9A3484108D08F2B, '2020-12-31 07:18:00');
# good UUId
insert into testorders (UUId, Version, ClientUUId, CreatedAtUtc)
values(0x002E363F305044DE79E9B3D0105BC23E, 1, 0xE53C111AFEF8A2D3B9A3484108D08F2E, '2020-12-31 07:19:00');- Update rows
# bad UUId - 0 rows affected at gh-ost log
update testorders
set Version = 2,
ModifiedAtUtc='2020-12-31 07:20:00'
where UUId = 0x150B7B5B516F87125E8597FFB44BF200
and Version = 1;
# good UUId - 1 rows affected
update testorders
set Version = 2,
ModifiedAtUtc='2020-12-31 07:20:00'
where UUId = 0x002E363F305044DE79E9B3D0105BC23E
and Version = 1;- Unpostpone migration
echo unpostpone | nc -U /tmp/gh-ost.testghost.testorders.sockCheck testorders and _testorders_del tables, and we can see, that at testorders table, row with UUId 0x150B7B5B516F87125E8597FFB44BF200 has Version with 1 value, but expected 2.
Update for row with UUId 0x150B7B5B516F87125E8597FFB44BF200 was not applied, 0 rows affected.
Also, you may see this data loss even before unpostpone, on _testorders_gho table.
select 'testorders' as TableName, UUId, Version, ClientUUId, CreatedAtUtc
from testorders
union all
select '_testorders_del' as TableName, UUId, Version, ClientUUId, CreatedAtUtc
from _testorders_del
order by UUId, TableName;
#
# +---------------+----------------------------------+-------+----------------------------------+-------------------+
# |TableName |UUId |Version|ClientUUId |CreatedAtUtc |
# +---------------+----------------------------------+-------+----------------------------------+-------------------+
# |testorders |0x002E363F305044DE79E9B3D0105BC23E|2 |0xE53C111AFEF8A2D3B9A3484108D08F2E|2020-12-31 07:19:00|
# |_testorders_del|0x002E363F305044DE79E9B3D0105BC23E|2 |0xE53C111AFEF8A2D3B9A3484108D08F2E|2020-12-31 07:19:00|
# |testorders |0x150B7B5B516F87125E8597FFB44BF200|1 |0xE53C111AFEF8A2D3B9A3484108D08F2B|2020-12-31 07:18:00|
# |_testorders_del|0x150B7B5B516F87125E8597FFB44BF200|2 |0xE53C111AFEF8A2D3B9A3484108D08F2B|2020-12-31 07:18:00|
# +---------------+----------------------------------+-------+----------------------------------+-------------------+- Gh-ost log
$ ./gh-ost \
> --assume-rbr \
> --assume-master-host="localhost" \
> --master-user="ghost_user" \
> --master-password="PASSWORD" \
> --ssl \
> --ssl-allow-insecure \
> --max-load=Threads_running=32 \
> --max-lag-millis=1500 \
> --critical-load=Threads_running=500 \
> --chunk-size=5000 \
> --host="localhost" \
> --user="ghost_user" \
> --password="PASSWORD" \
> --alter="ADD INDEX IDX_ClientUUId(ClientUUId)" \
> --database="testghost" \
> --table="testorders" \
> --skip-strict-mode \
> --debug \
> --stack \
> --postpone-cut-over-flag-file="/tmp/ghost.cutover.testorders.flag" \
> --panic-flag-file="/tmp/ghost.panic.testorders.flag" \
> --allow-on-master \
> --execute
2020-12-31 07:44:49 INFO starting gh-ost 1.1.0
2020-12-31 07:44:49 INFO Migrating `testghost`.`testorders`
2020-12-31 07:44:49 INFO connection validated on localhost:3306
2020-12-31 07:44:49 INFO User has REPLICATION CLIENT, REPLICATION SLAVE privileges, and has ALL privileges on `testghost`.*
2020-12-31 07:44:49 INFO binary logs validated on localhost:3306
2020-12-31 07:44:49 INFO Inspector initiated on 7ad1731d2e8e:3306, version 5.7.32-log
2020-12-31 07:44:49 INFO Table found. Engine=InnoDB
2020-12-31 07:44:49 DEBUG Estimated number of rows via STATUS: 0
2020-12-31 07:44:49 DEBUG Validated no foreign keys exist on table
2020-12-31 07:44:49 DEBUG Validated no triggers exist on table
2020-12-31 07:44:49 INFO Estimated number of rows via EXPLAIN: 1
2020-12-31 07:44:49 DEBUG Potential unique keys in testorders: [PRIMARY: [UUId]; has nullable: false]
2020-12-31 07:44:49 INFO Master forced to be localhost:3306
2020-12-31 07:44:49 INFO log_slave_updates validated on localhost:3306
2020-12-31 07:44:49 INFO connection validated on localhost:3306
2020-12-31 07:44:49 DEBUG Streamer binlog coordinates: bin.000003:3196903
2020-12-31 07:44:49 INFO Connecting binlog streamer at bin.000003:3196903
[2020/12/31 07:44:49] [info] binlogsyncer.go:133 create BinlogSyncer with config {99999 mysql localhost 3306 ghost_user false false 0xc000582300 false UTC true 0 0s 0s 0 false}
[2020/12/31 07:44:49] [info] binlogsyncer.go:354 begin to sync binlog from position (bin.000003, 3196903)
[2020/12/31 07:44:49] [info] binlogsyncer.go:203 register slave for master server localhost:3306
2020-12-31 07:44:49 DEBUG Beginning streaming
2020-12-31 07:44:49 INFO rotate to next log from bin.000003:0 to bin.000003
[2020/12/31 07:44:49] [info] binlogsyncer.go:723 rotate to (bin.000003, 3196903)
2020-12-31 07:44:49 INFO connection validated on localhost:3306
2020-12-31 07:44:49 INFO connection validated on localhost:3306
2020-12-31 07:44:49 INFO will use time_zone='SYSTEM' on applier
2020-12-31 07:44:49 INFO Examining table structure on applier
2020-12-31 07:44:49 INFO Applier initiated on 7ad1731d2e8e:3306, version 5.7.32-log
2020-12-31 07:44:49 INFO Dropping table `testghost`.`_testorders_ghc`
2020-12-31 07:44:49 INFO Table dropped
2020-12-31 07:44:49 INFO Creating changelog table `testghost`.`_testorders_ghc`
2020-12-31 07:44:49 INFO Changelog table created
2020-12-31 07:44:49 INFO Creating ghost table `testghost`.`_testorders_gho`
2020-12-31 07:44:49 INFO Ghost table created
2020-12-31 07:44:49 INFO Altering ghost table `testghost`.`_testorders_gho`
2020-12-31 07:44:49 DEBUG ALTER statement: alter /* gh-ost */ table `testghost`.`_testorders_gho` ADD INDEX IDX_ClientUUId(ClientUUId)
2020-12-31 07:44:49 INFO Ghost table altered
2020-12-31 07:44:49 INFO Intercepted changelog state GhostTableMigrated
2020-12-31 07:44:49 INFO Waiting for ghost table to be migrated. Current lag is 0s
2020-12-31 07:44:49 DEBUG ghost table migrated
2020-12-31 07:44:49 INFO Handled changelog state GhostTableMigrated
2020-12-31 07:44:49 DEBUG Potential unique keys in _testorders_gho: [PRIMARY: [UUId]; has nullable: false]
2020-12-31 07:44:49 INFO Chosen shared unique key is PRIMARY
2020-12-31 07:44:49 INFO Shared columns are UUId,Version,ClientUUId,CreatedAtUtc,ModifiedAtUtc
2020-12-31 07:44:49 INFO Listening on unix socket file: /tmp/gh-ost.testghost.testorders.sock
2020-12-31 07:44:49 DEBUG Reading migration range according to key: PRIMARY
2020-12-31 07:44:49 INFO Migration min values: [<nil>]
2020-12-31 07:44:49 DEBUG Reading migration range according to key: PRIMARY
2020-12-31 07:44:49 INFO Migration max values: [<nil>]
2020-12-31 07:44:49 INFO Waiting for first throttle metrics to be collected
2020-12-31 07:44:49 INFO First throttle metrics collected
2020-12-31 07:44:49 DEBUG Operating until row copy is complete
# Migrating `testghost`.`testorders`; Ghost table is `testghost`.`_testorders_gho`
# Migrating 7ad1731d2e8e:3306; inspecting 7ad1731d2e8e:3306; executing on dev-mysql
# Migration started at Thu Dec 31 07:44:49 +0000 2020
2020-12-31 07:44:49 DEBUG Getting nothing in the write queue. Sleeping...
2020-12-31 07:44:49 DEBUG No rows found in table. Rowcopy will be implicitly empty
# chunk-size: 5000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=32; critical-load: Threads_running=500; nice-ratio: 0.000000
2020-12-31 07:44:49 INFO Row copy complete
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Migrating `testghost`.`testorders`; Ghost table is `testghost`.`_testorders_gho`
# postpone-cut-over-flag-file: /tmp/ghost.cutover.testorders.flag [set]
# panic-flag-file: /tmp/ghost.panic.testorders.flag
# Serving on unix socket: /tmp/gh-ost.testghost.testorders.sock
# Migrating 7ad1731d2e8e:3306; inspecting 7ad1731d2e8e:3306; executing on dev-mysql
# Migration started at Thu Dec 31 07:44:49 +0000 2020
# chunk-size: 5000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=32; critical-load: Threads_running=500; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# postpone-cut-over-flag-file: /tmp/ghost.cutover.testorders.flag [set]
# panic-flag-file: /tmp/ghost.panic.testorders.flag
# Serving on unix socket: /tmp/gh-ost.testghost.testorders.sock
Copy: 0/1 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: bin.000003:3199085; Lag: 0.03s, State: migrating; ETA: N/A
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: bin.000003:3199085; Lag: 0.03s, State: migrating; ETA: due
2020-12-31 07:44:49 DEBUG checking for cut-over postpone
2020-12-31 07:44:50 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 0s(copy); streamer: bin.000003:3204196; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:44:51 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 0s(copy); streamer: bin.000003:3208658; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:44:52 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 3s(total), 0s(copy); streamer: bin.000003:3213116; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:44:53 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 4s(total), 0s(copy); streamer: bin.000003:3217581; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:44:54 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 5s(total), 0s(copy); streamer: bin.000003:3222044; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:44:55 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 6s(total), 0s(copy); streamer: bin.000003:3226507; Lag: 0.12s, State: postponing cut-over; ETA: due
2020-12-31 07:44:56 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 7s(total), 0s(copy); streamer: bin.000003:3230970; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:44:57 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 8s(total), 0s(copy); streamer: bin.000003:3235435; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:44:58 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 9s(total), 0s(copy); streamer: bin.000003:3239898; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:44:59 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 10s(total), 0s(copy); streamer: bin.000003:3244360; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:00 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 11s(total), 0s(copy); streamer: bin.000003:3248825; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:01 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 12s(total), 0s(copy); streamer: bin.000003:3253287; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:02 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 13s(total), 0s(copy); streamer: bin.000003:3257753; Lag: 0.11s, State: postponing cut-over; ETA: due
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 14s(total), 0s(copy); streamer: bin.000003:3262534; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:03 DEBUG ApplyDMLEventQueries() applied 1 events in one transaction
2020-12-31 07:45:03 DEBUG ApplyDMLEventQueries() applied 1 events in one transaction
2020-12-31 07:45:03 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 0/1000; Time: 15s(total), 0s(copy); streamer: bin.000003:3267967; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:04 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 0/1000; Time: 16s(total), 0s(copy); streamer: bin.000003:3272431; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:05 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 0/1000; Time: 17s(total), 0s(copy); streamer: bin.000003:3276893; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:06 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 0/1000; Time: 18s(total), 0s(copy); streamer: bin.000003:3281359; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:07 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 0/1000; Time: 19s(total), 0s(copy); streamer: bin.000003:3285825; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:08 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 0/1000; Time: 20s(total), 0s(copy); streamer: bin.000003:3290289; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:09 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 0/1000; Time: 21s(total), 0s(copy); streamer: bin.000003:3294755; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:10 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 0/1000; Time: 22s(total), 0s(copy); streamer: bin.000003:3299215; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:11 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 0/1000; Time: 23s(total), 0s(copy); streamer: bin.000003:3303679; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:12 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 2; Backlog: 1/1000; Time: 24s(total), 0s(copy); streamer: bin.000003:3308513; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:13 DEBUG 0 rows affected. query=
update /* gh-ost `testghost`.`_testorders_gho` */
`testghost`.`_testorders_gho`
set
`UUId`=?, `Version`=?, `ClientUUId`=?, `CreatedAtUtc`=?, `ModifiedAtUtc`=?
where
((`UUId` = ?))
; args=[
{[Qo�^����K� 2 �<��ӹ�HЏ+ 2020-12-31 07:18:00 2020-12-31 07:20:00
{[Qo�^����K�]
2020-12-31 07:45:13 DEBUG ApplyDMLEventQueries() applied 1 events in one transaction
2020-12-31 07:45:13 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 3; Backlog: 0/1000; Time: 25s(total), 0s(copy); streamer: bin.000003:3312977; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:14 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 3; Backlog: 0/1000; Time: 26s(total), 0s(copy); streamer: bin.000003:3317443; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:15 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 3; Backlog: 0/1000; Time: 27s(total), 0s(copy); streamer: bin.000003:3321905; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:16 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 3; Backlog: 0/1000; Time: 28s(total), 0s(copy); streamer: bin.000003:3326365; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:17 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 3; Backlog: 0/1000; Time: 29s(total), 0s(copy); streamer: bin.000003:3330831; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:18 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 3; Backlog: 0/1000; Time: 30s(total), 0s(copy); streamer: bin.000003:3335295; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:19 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 3; Backlog: 0/1000; Time: 31s(total), 0s(copy); streamer: bin.000003:3339758; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:20 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 3; Backlog: 1/1000; Time: 32s(total), 0s(copy); streamer: bin.000003:3344589; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:21 DEBUG ApplyDMLEventQueries() applied 1 events in one transaction
2020-12-31 07:45:21 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 33s(total), 0s(copy); streamer: bin.000003:3349430; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:22 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 34s(total), 0s(copy); streamer: bin.000003:3353892; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:23 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 35s(total), 0s(copy); streamer: bin.000003:3358354; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:24 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 36s(total), 0s(copy); streamer: bin.000003:3362820; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:25 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 37s(total), 0s(copy); streamer: bin.000003:3367286; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:26 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 38s(total), 0s(copy); streamer: bin.000003:3371750; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:27 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 39s(total), 0s(copy); streamer: bin.000003:3376214; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:28 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 40s(total), 0s(copy); streamer: bin.000003:3380680; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:29 DEBUG Getting nothing in the write queue. Sleeping...
# Migrating `testghost`.`testorders`; Ghost table is `testghost`.`_testorders_gho`
# Migrating 7ad1731d2e8e:3306; inspecting 7ad1731d2e8e:3306; executing on dev-mysql
# Migration started at Thu Dec 31 07:44:49 +0000 2020
# chunk-size: 5000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=32; critical-load: Threads_running=500; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# postpone-cut-over-flag-file: /tmp/ghost.cutover.testorders.flag [set]
# panic-flag-file: /tmp/ghost.panic.testorders.flag
# Serving on unix socket: /tmp/gh-ost.testghost.testorders.sock
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 41s(total), 0s(copy); streamer: bin.000003:3384745; Lag: 0.11s, State: postponing cut-over; ETA: due
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 41s(total), 0s(copy); streamer: bin.000003:3385628; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:30 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 42s(total), 0s(copy); streamer: bin.000003:3390294; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:31 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 43s(total), 0s(copy); streamer: bin.000003:3394756; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:32 DEBUG Getting nothing in the write queue. Sleeping...
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 44s(total), 0s(copy); streamer: bin.000003:3399219; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:33 DEBUG Getting nothing in the write queue. Sleeping...
# Migrating `testghost`.`testorders`; Ghost table is `testghost`.`_testorders_gho`
# Migrating 7ad1731d2e8e:3306; inspecting 7ad1731d2e8e:3306; executing on dev-mysql
# Migration started at Thu Dec 31 07:44:49 +0000 2020
# chunk-size: 5000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=32; critical-load: Threads_running=500; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# postpone-cut-over-flag-file: /tmp/ghost.cutover.testorders.flag [set]
# panic-flag-file: /tmp/ghost.panic.testorders.flag
# Serving on unix socket: /tmp/gh-ost.testghost.testorders.sock
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 45s(total), 0s(copy); streamer: bin.000003:3402490; Lag: 0.11s, State: postponing cut-over; ETA: due
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 45s(total), 0s(copy); streamer: bin.000003:3404170; Lag: 0.11s, State: postponing cut-over; ETA: due
2020-12-31 07:45:34 DEBUG checking for cut-over postpone: complete
2020-12-31 07:45:34 INFO Grabbing voluntary lock: gh-ost.53.lock
2020-12-31 07:45:34 INFO Setting LOCK timeout as 6 seconds
2020-12-31 07:45:34 INFO Looking for magic cut-over table
2020-12-31 07:45:34 INFO Creating magic cut-over table `testghost`.`_testorders_del`
2020-12-31 07:45:34 DEBUG Getting nothing in the write queue. Sleeping...
2020-12-31 07:45:34 INFO Magic cut-over table created
2020-12-31 07:45:34 INFO Locking `testghost`.`testorders`, `testghost`.`_testorders_del`
2020-12-31 07:45:34 INFO Tables locked
2020-12-31 07:45:34 INFO Session locking original & magic tables is 53
2020-12-31 07:45:34 INFO Writing changelog state: AllEventsUpToLockProcessed:1609400734582131956
2020-12-31 07:45:34 INFO Intercepted changelog state AllEventsUpToLockProcessed
2020-12-31 07:45:34 INFO Handled changelog state AllEventsUpToLockProcessed
2020-12-31 07:45:34 INFO Waiting for events up to lock
Copy: 0/0 100.0%; Applied: 4; Backlog: 1/1000; Time: 46s(total), 0s(copy); streamer: bin.000003:3409894; Lag: 0.11s, State: migrating; ETA: due
2020-12-31 07:45:35 INFO Waiting for events up to lock: got AllEventsUpToLockProcessed:1609400734582131956
2020-12-31 07:45:35 INFO Done waiting for events up to lock; duration=986.674396ms
# Migrating `testghost`.`testorders`; Ghost table is `testghost`.`_testorders_gho`
# Migrating 7ad1731d2e8e:3306; inspecting 7ad1731d2e8e:3306; executing on dev-mysql
# Migration started at Thu Dec 31 07:44:49 +0000 2020
2020-12-31 07:45:35 DEBUG Getting nothing in the write queue. Sleeping...
# chunk-size: 5000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=32; critical-load: Threads_running=500; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# postpone-cut-over-flag-file: /tmp/ghost.cutover.testorders.flag [set]
# panic-flag-file: /tmp/ghost.panic.testorders.flag
# Serving on unix socket: /tmp/gh-ost.testghost.testorders.sock
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 46s(total), 0s(copy); streamer: bin.000003:3410768; Lag: 0.11s, State: migrating; ETA: due
2020-12-31 07:45:35 INFO Setting RENAME timeout as 3 seconds
2020-12-31 07:45:35 INFO Session renaming tables is 60
2020-12-31 07:45:35 INFO Issuing and expecting this to block: rename /* gh-ost */ table `testghost`.`testorders` to `testghost`.`_testorders_del`, `testghost`.`_testorders_gho` to `testghost`.`testorders`
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 47s(total), 0s(copy); streamer: bin.000003:3415015; Lag: 0.11s, State: migrating; ETA: due
2020-12-31 07:45:36 DEBUG Getting nothing in the write queue. Sleeping...
2020-12-31 07:45:36 INFO Found atomic RENAME to be blocking, as expected. Double checking the lock is still in place (though I don't strictly have to)
2020-12-31 07:45:36 INFO Checking session lock: gh-ost.53.lock
2020-12-31 07:45:36 INFO Connection holding lock on original table still exists
2020-12-31 07:45:36 INFO Will now proceed to drop magic table and unlock tables
2020-12-31 07:45:36 INFO Dropping magic cut-over table
2020-12-31 07:45:36 INFO Releasing lock from `testghost`.`testorders`, `testghost`.`_testorders_del`
2020-12-31 07:45:36 INFO Tables unlocked
2020-12-31 07:45:36 INFO Tables renamed
2020-12-31 07:45:36 INFO Lock & rename duration: 2.111127134s. During this time, queries on `testorders` were blocked
[2020/12/31 07:45:36] [info] binlogsyncer.go:164 syncer is closing...
[2020/12/31 07:45:36] [error] binlogstreamer.go:77 close sync with err: sync is been closing...
2020-12-31 07:45:36 INFO Closed streamer connection. err=<nil>
[2020/12/31 07:45:36] [info] binlogsyncer.go:179 syncer is closed
2020-12-31 07:45:36 INFO Dropping table `testghost`.`_testorders_ghc`
2020-12-31 07:45:36 DEBUG Done streaming
2020-12-31 07:45:36 INFO Table dropped
2020-12-31 07:45:36 INFO Am not dropping old table because I want this operation to be as live as possible. If you insist I should do it, please add `--ok-to-drop-table` next time. But I prefer you do not. To drop the old table, issue:
2020-12-31 07:45:36 INFO -- drop table `testghost`.`_testorders_del`
2020-12-31 07:45:36 INFO Done migrating `testghost`.`testorders`
2020-12-31 07:45:36 INFO Removing socket file: /tmp/gh-ost.testghost.testorders.sock
2020-12-31 07:45:36 INFO Tearing down inspector
2020-12-31 07:45:36 INFO Tearing down applier
2020-12-31 07:45:36 DEBUG Tearing down...
2020-12-31 07:45:36 INFO Tearing down streamer
2020-12-31 07:45:36 INFO Tearing down throttler
2020-12-31 07:45:36 DEBUG Tearing down...
# DoneGh-host status and upostpone command output:
~ $ echo status | nc -U /tmp/gh-ost.testghost.testorders.sock
# Migrating `testghost`.`testorders`; Ghost table is `testghost`.`_testorders_gho`
# Migrating 7ad1731d2e8e:3306; inspecting 7ad1731d2e8e:3306; executing on dev-mysql
# Migration started at Thu Dec 31 07:44:49 +0000 2020
# chunk-size: 5000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=32; critical-load: Threads_running=500; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# postpone-cut-over-flag-file: /tmp/ghost.cutover.testorders.flag [set]
# panic-flag-file: /tmp/ghost.panic.testorders.flag
# Serving on unix socket: /tmp/gh-ost.testghost.testorders.sock
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 41s(total), 0s(copy); streamer: bin.000003:3384745; Lag: 0.11s, State: postponing cut-over; ETA: due
~ $ echo unpostpone | nc -U /tmp/gh-ost.testghost.testorders.sock
Unpostponed
# Migrating `testghost`.`testorders`; Ghost table is `testghost`.`_testorders_gho`
# Migrating 7ad1731d2e8e:3306; inspecting 7ad1731d2e8e:3306; executing on dev-mysql
# Migration started at Thu Dec 31 07:44:49 +0000 2020
# chunk-size: 5000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=32; critical-load: Threads_running=500; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# postpone-cut-over-flag-file: /tmp/ghost.cutover.testorders.flag [set]
# panic-flag-file: /tmp/ghost.panic.testorders.flag
# Serving on unix socket: /tmp/gh-ost.testghost.testorders.sock
Copy: 0/0 100.0%; Applied: 4; Backlog: 0/1000; Time: 45s(total), 0s(copy); streamer: bin.000003:3402490; Lag: 0.11s, State: postponing cut-over; ETA: dueSome ideas
- We analyzed logs, looks like special symbols not escaped - https://github.com/go-sql-driver/mysql/blob/ffa70d4981de905bced351b5703851f8e8755225/utils.go#L606-L652 For example, if gh-ost copy existing rows from table with binary primary key, we can see such good (?) query with
_binaryprefix:
insert /* gh-ost `testghost`.`testorders` */ ignore into `testghost`.`_testorders_gho` (`UUId`, `Version`, `ClientUUId`, `CreatedAtUtc`, `ModifiedAtUtc`)
(select `UUId`, `Version`, `ClientUUId`, `CreatedAtUtc`, `ModifiedAtUtc` from `testghost`.`testorders` force index (`PRIMARY`)
where (((`UUId` > _binary'\0.6?0PD�y���[�>') or ((`UUId` = _binary'\0.6?0PD�y���[�>'))) and ((`UUId` < _binary'��{[Qo��^����K�\0') or ((`UUId` = _binary'��{[Qo��^����K�\0')))) lock in share mode
)We may see difference at values at Update query - [Qo�^����K�], and Insert query - _binary'��{[Qo��^����K�\0'). Last 00 at 0x150B7B5B516F87125E8597FFB44BF200 escaped to \0.
- Not sure, but may be gh-ost should rollback migration, if affected rows count is 0.
Additional info
Scripts to create users and schema.
CREATE SCHEMA testghost;
CREATE USER 'tester'@'%' IDENTIFIED BY 'PASSWORD';
GRANT ALL PRIVILEGES ON testghost.* TO 'tester'@'%';
CREATE USER 'ghost_user'@'%' IDENTIFIED BY 'PASSWORD';
GRANT ALL PRIVILEGES ON testghost.* TO 'ghost_user'@'%';
GRANT REPLICATION SLAVE ON *.* TO 'ghost_user'@'%';
GRANT REPLICATION CLIENT ON *.* TO 'ghost_user'@'%';
SET PASSWORD FOR 'ghost_user'@'%'='PASSWORD';Run MySql in docker
docker run --name r-mysql -v /ghost/mysql:/etc/mysql/conf.d -v /ghost/mysql/datadir:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -p 3306:3306 -d mysql:5.7Mysql conf
[mysqld]
server-id=1
log-bin=bin.log
log-bin-index=bin-log.index
max_binlog_size=100M
binlog_format=row
binlog_row_image=full
socket=mysql.sock
log_slave_updates=onWe are using slightly patched gh-ost version from 8ae02ef69c0ec886e12e4361d136f1e5a88d7552 commit. We added debug message with rows affected into this line:
Line 1025 in 8ae02ef
| if _, err := tx.Exec(buildResult.query, buildResult.args...); err != nil { |
res, err := tx.Exec(buildResult.query, buildResult.args...)
if err != nil {
err = fmt.Errorf("%s; query=%s; args=%+v", err.Error(), buildResult.query, buildResult.args)
return rollback(err)
}
rowsAffected, err := res.RowsAffected()
if rowsAffected == 0 {
this.migrationContext.Log.Debugf("0 rows affected. query=%s; args=%+v", buildResult.query, buildResult.args)
}
if err != nil {
err = fmt.Errorf("RowsAffected return error. %s; query=%s; args=%+v", err.Error(), buildResult.query, buildResult.args)
return rollback(err)
}Hope, it will be easy to fix.
Best wishes and Happy New Year!