As per #26, the solution of atomically swapping the tables is unsafe in the event of connection death.
I've been considering some other solutions. I've reached a dead end with all of them, but hopefully someone else can pick up where I stopped and find a solution.
An idea based on views, derived from my earlier work on http://code.openark.org/blog/mysql/auto-caching-tables, is as follows:
Views & SLEEP(), dead-end on the SLEEP() part
- We assume we want to alter table `tbl`
- Ghost table is called `ghost`
- We issue: `create or replace view ghost_view as select * from ghost where sleep(600) >= 0` (this view is updatable, and can receive `INSERT`, `UPDATE`, `DELETE`)
- To begin the swap, we issue: `rename table tbl to tbl_old, ghost_view to tbl`. `RENAME` works just as well on views as it does on tables (the full sequence is sketched after this list)
- Queries are now executing on the view, but are stalling for a long enough period
- We now have time to apply changes onto `ghost`
- Problem/Dead-end: we would like to `rename table tbl to tbl_old_view, ghost to tbl`, but the `rename` is blocked until all those hanging queries complete. There's no way to migrate those already-running queries onto the renamed ghost table.
- I tried two levels of views (a view calling a view calling the table) and then swapping the mid-layered view. Still does not work. The `rename` hangs until the queries complete, which is not what we want.
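For readability, here is the whole SLEEP()-based flow written out as SQL. This is only a sketch of the steps above, using the same `tbl`/`ghost`/`ghost_view` names:

```sql
-- Updatable view over the ghost table; any query against it stalls
-- for up to 600 seconds before touching rows:
CREATE OR REPLACE VIEW ghost_view AS
  SELECT * FROM ghost WHERE SLEEP(600) >= 0;

-- Swap the view in place of the original table (RENAME works on views):
RENAME TABLE tbl TO tbl_old, ghost_view TO tbl;

-- Incoming queries now stall on the view while changes are applied onto
-- ghost. The dead-end: this final rename blocks until every query still
-- sleeping inside the view has completed:
RENAME TABLE tbl TO tbl_old_view, ghost TO tbl;
```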
Any ideas?
Views & GET_LOCK(), spaghetti on GET_LOCK()
Before I realized the spaghetti was there, I found this solution to be appealing:
- True, it uses `GET_LOCK`, so the same "what if the connection dies" problem is still there
- However I reduce the number of risky connections from 2 to 1
- And then make it possible to have n connections share the risk: it would take the death of all n connections to cause a premature `rename` (as opposed to the death of any single connection)
And then I found the spaghetti.
The solution is similar to the above, but:
- Connection #1 issues `GET_LOCK('ding', 0)` and succeeds
- We `create or replace view ghost_view as select * from ghost where get_lock('ding', 600) >= 0`
- We `rename table tbl to tbl_old, ghost_view to tbl`; queries are blocked on `tbl` (which is now a view)
- We complete writing events onto `ghost`
- We `rename table tbl to tbl_old_view, ghost to tbl` - this blocks due to the already existing queries
- We `RELEASE_LOCK('ding')` in connection #1
- Problem: queries are released, but are now blocking each other!!!
- `rename` unblocks and we have our migrated table in place (the full sequence is sketched after this list)
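Again a sketch of the above, with connection boundaries marked in comments:

```sql
-- Connection #1: take the lock up front and hold it.
SELECT GET_LOCK('ding', 0);

-- Migration connection: the view blocks on the lock instead of on SLEEP():
CREATE OR REPLACE VIEW ghost_view AS
  SELECT * FROM ghost WHERE GET_LOCK('ding', 600) >= 0;

RENAME TABLE tbl TO tbl_old, ghost_view TO tbl;
-- ... complete writing events onto ghost ...

-- Migration connection: blocks behind the queries stalled on the view:
RENAME TABLE tbl TO tbl_old_view, ghost TO tbl;

-- Connection #1: release the lock; stalled queries resume (each one grabs
-- and holds 'ding' in turn; this is the Problem), and the rename above
-- eventually completes:
SELECT RELEASE_LOCK('ding');
```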
Before discussing the Problem, note that we can vary this as follows:
- Have connections #1 .. #n issue `SELECT GET_LOCK('ding#1', 600)` through `SELECT GET_LOCK('ding#n', 600)` (we take n locks)
- Vary the view: `create or replace view ghost_view as select * from ghost where get_lock('ding#1',600) + get_lock('ding#2',600) + ... + get_lock('ding#n',600) >= 0`

Such that it would take the death of all n connections to cause a premature rename (written out for a concrete n below).
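For concreteness, the variant view written out for n = 3 (a sketch; lock names follow the pattern above):

```sql
-- Variant view for n = 3: a query blocks on each lock in turn, so it only
-- proceeds once all three locks are released (or the 600-second timeouts
-- fire). All n holding connections must die for a premature release.
CREATE OR REPLACE VIEW ghost_view AS
  SELECT * FROM ghost
  WHERE GET_LOCK('ding#1', 600)
      + GET_LOCK('ding#2', 600)
      + GET_LOCK('ding#3', 600) >= 0;
```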
So, back to the problem. Each query will issue `get_lock` and queries will block each other, leading to a really scary workload. We can hack around this by doing crazy stuff like: `create view... select where release_lock(concat('ding', if(get_lock('ding',600) >= 0, '', ''))) >= 0`. This will make sure to release any acquired lock (written out in full below).
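The hack in full, as a sketch that assumes the same view shape as above:

```sql
-- Each query blocks inside GET_LOCK(). The IF(...) always evaluates to ''
-- and exists only to force GET_LOCK() to run before RELEASE_LOCK(); once
-- the lock is acquired it is released immediately, so queries stop
-- serializing behind one another after the cut-over lock is freed.
CREATE OR REPLACE VIEW ghost_view AS
  SELECT * FROM ghost
  WHERE RELEASE_LOCK(CONCAT('ding',
          IF(GET_LOCK('ding', 600) >= 0, '', ''))) >= 0;
```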
This still scares me because of the crazy locking contention.
Ideas?