
Fix copy % count inaccuracy by changing the heuristics  #600

@ggunson

gh-ost's statistics, when run on a large, busy table with a lot of inserts, can become inaccurate over time. Internally we've sometimes seen cutovers become available when the migration is reported as 90% complete or less.

Copy: 1045877600/1264045717 82.7%; Applied: 884999578; Backlog: 12/1000; Time: 460h0m0s(total), 459h59m53s(copy); streamer: mysql-bin.012574:687201905; State: migrating; ETA: 95h57m17s

The 1045877600/1264045717 82.7% above is the count of rows in the _gho table divided by the count of rows in the original table. Compared to the actual counts, the new _gho table was 11% larger than reported, so the true completion was 93%. (The count of rows in the original table was quite accurate, if not exact.)
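As a quick check of the figures above, the sketch below recomputes both percentages. It assumes that "11% greater" means the reported count fell 11% short of the actual _gho row count (reported = 0.89 × actual); that reading is an assumption, chosen because it reproduces the 93% figure. Reading it as actual = 1.11 × reported would give about 91.8% instead. The function name `percent` is made up for illustration.

```go
package main

import "fmt"

// percent returns copied/total expressed as a percentage.
func percent(copied, total float64) float64 {
	return 100 * copied / total
}

func main() {
	const reportedCopied = 1045877600.0 // numerator from the status line
	const originalRows = 1264045717.0   // denominator from the status line

	// Progress as reported in the status line.
	fmt.Printf("reported: %.1f%%\n", percent(reportedCopied, originalRows))

	// Assumption: the reported count is 11% below the actual _gho count,
	// i.e. reported = 0.89 * actual.
	actualCopied := reportedCopied / 0.89
	fmt.Printf("actual:   %.1f%%\n", percent(actualCopied, originalRows))
}
```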

Currently gh-ost determines the table counts by:

  1. Getting the row count at the beginning (either the exact row count or an estimate).
  2. Parsing the binlogs: +1 for inserts and -1 for deletes.
  3. Getting the rows_affected from the insert into _gho select ... from original_table.

The counts are likely inaccurate because row copying and binlog parsing/applying run in concurrent threads. We don't know whether a binlog INSERT results in a net increase of one row (it is applied as a REPLACE, so it might cancel out as a delete plus insert), and we don't know whether a DELETE actually deletes a row (the copy thread might not have copied that row yet). Since gh-ost is erring on the side of too-low counts, @shlomi-noach guesses that the parsed delete counts are the bigger cause of the discrepancy.

If gh-ost instead derived all the _gho row counts from the rows_affected of its writes to the _gho table, the statistics should be more accurate. E.g.

if _, err := tx.Exec(buildResult.query, buildResult.args...); err != nil {
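A minimal sketch of the idea, assuming each binlog event is applied as a single-row statement and that the result of `Exec` is kept rather than discarded. The names `netRowDelta`, `dmlKind`, and `fakeResult` are hypothetical and do not come from gh-ost; `fakeResult` only stands in for a real `sql.Result` so the example runs without a database. It relies on MySQL's documented behaviour that a single-row REPLACE reports 1 affected row for a plain insert and one more for each row it deleted first.

```go
package main

import (
	"database/sql"
	"fmt"
)

// fakeResult is a stand-in for a driver's sql.Result (illustration only).
type fakeResult struct{ affected int64 }

func (r fakeResult) LastInsertId() (int64, error) { return 0, nil }
func (r fakeResult) RowsAffected() (int64, error) { return r.affected, nil }

// dmlKind is a hypothetical tag for the statement that was executed.
type dmlKind int

const (
	dmlReplace dmlKind = iota
	dmlDelete
)

// netRowDelta turns rows_affected into the net change in _gho row count
// for a single-row statement.
func netRowDelta(kind dmlKind, res sql.Result) (int64, error) {
	affected, err := res.RowsAffected()
	if err != nil {
		return 0, err
	}
	switch kind {
	case dmlReplace:
		// 1 affected row: plain insert (+1).
		// 2 affected rows: one old row deleted, one inserted (net 0).
		return 2 - affected, nil
	case dmlDelete:
		// 0 affected rows: the row had not been copied yet (net 0).
		return -affected, nil
	}
	return 0, fmt.Errorf("unknown DML kind %d", kind)
}

func main() {
	events := []struct {
		kind dmlKind
		res  sql.Result
	}{
		{dmlReplace, fakeResult{1}}, // new row
		{dmlReplace, fakeResult{1}}, // new row
		{dmlReplace, fakeResult{2}}, // row already copied: delete + insert
		{dmlDelete, fakeResult{0}},  // row not copied yet: nothing deleted
		{dmlDelete, fakeResult{1}},  // row present: one row deleted
	}
	var ghoRows int64
	for _, e := range events {
		delta, err := netRowDelta(e.kind, e.res)
		if err != nil {
			panic(err)
		}
		ghoRows += delta
	}
	fmt.Println("net _gho rows:", ghoRows) // prints "net _gho rows: 1"
}
```

In real code the `sql.Result` would come from the `tx.Exec` call itself. Statements that touch several rows at once would need their own accounting.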
