You definitely have to pick your approach based on the storage engine: optimizing for MyISAM is different from optimizing for InnoDB.
We recently ran a benchmark comparing different ways to insert data, measuring the time from just before the insertion until all indexes were fully rebuilt. The table started out empty, and we inserted up to 10 million rows.
MyISAM with LOAD DATA INFILE and ALTER TABLE ... ENABLE/DISABLE KEYS won hands down in our test (on a Windows 7 system, MySQL 5.5.27 - now we're trying it on a Linux system).
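As a rough sketch of the MyISAM sequence described above: the table name `bulk_target`, the file path and the CSV field options are hypothetical placeholders, not what we actually used in the benchmark.

```sql
-- Hypothetical names: table `bulk_target`, input file '/tmp/rows.csv'.
-- MyISAM only: stop maintaining non-unique indexes during the load.
ALTER TABLE bulk_target DISABLE KEYS;

LOAD DATA INFILE '/tmp/rows.csv'
INTO TABLE bulk_target
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- Rebuild the non-unique indexes in one pass after the load.
ALTER TABLE bulk_target ENABLE KEYS;
```

The timing in our test covered everything from the first statement until ENABLE KEYS returned.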
ENABLE KEYS and DISABLE KEYS do not work for InnoDB; they are MyISAM only. For InnoDB, use SET AUTOCOMMIT = 0; SET FOREIGN_KEY_CHECKS = 0; SET UNIQUE_CHECKS = 0; if you are sure your data doesn't contain duplicates (don't forget to COMMIT and to set them back to 1 after the upload is complete).
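For InnoDB, the session settings could be wrapped around the load roughly like this. Again a sketch only: `bulk_target` and the file path are made-up names, and it assumes the data really has no duplicate keys or broken foreign key references.

```sql
-- Hypothetical names: table `bulk_target`, input file '/tmp/rows.csv'.
SET autocommit = 0;          -- one transaction instead of a commit per statement
SET unique_checks = 0;       -- only safe if the data has no duplicate keys
SET foreign_key_checks = 0;  -- only safe if all references are valid

LOAD DATA INFILE '/tmp/rows.csv'
INTO TABLE bulk_target
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';

COMMIT;                      -- needed because autocommit is off

-- Restore the defaults for the rest of the session.
SET unique_checks = 1;
SET foreign_key_checks = 1;
SET autocommit = 1;
```

These are session-level settings, so they only affect the connection that does the import.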
I don't think you need OPTIMIZE TABLE after a bulk insert - MySQL rows are ordered by insertion and the index is rebuilt anyway. There's no "extra fragmentation" by doing a bulk insert.
Feel free to comment if I made factual errors.
UPDATE: According to our more recent and complete test results, the advice to DISABLE / ENABLE keys is wrong.
A coworker had a program run multiple different tests: InnoDB and MyISAM tables, prefilled and empty; select and insert speeds with LOAD DATA LOCAL, INSERT INTO, REPLACE INTO and UPDATE; "dense" and "fragmented" tables (I'm not quite sure how the fragmentation was produced, I think it was along the lines of DELETE FROM ... ORDER BY RAND() LIMIT ... with a fixed seed so it's still comparable); and enabled and disabled indices.
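To illustrate what I mean by "fragmenting" a table, something like the following would do it. This is my guess at the idea, not the statement my coworker actually ran; the table name, the seed 42 and the row count are invented for the example.

```sql
-- Hypothetical: table `bulk_target`, seed 42, 100000 rows removed.
-- RAND(N) with a constant seed yields a repeatable sequence, so the
-- same rows are picked on each run as long as the table contents and
-- the row scan order are the same.
DELETE FROM bulk_target
ORDER BY RAND(42)
LIMIT 100000;
```

Deleting scattered rows like this leaves holes in the data file, which is the "fragmented" state the tests were then run against.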
We tested it with many different MySQL versions (5.0.27, 5.0.96, 5.1.something, 5.5.27, 5.6.2) on Windows and Linux (not the same versions on both OS, though). MyISAM only won when the table was empty. InnoDB was faster when data was already present and generally performed better (except for disk space: MyISAM is smaller on disk).
Still, to really benefit from this, you have to test it yourself, with different versions, different configuration settings and a lot of patience, especially regarding weird inconsistencies (5.0.97 was a lot faster than 5.5.27 with the same config; we're still searching for the cause). What we did find was that DISABLE KEYS and ENABLE KEYS are next to worthless, and sometimes harmful, if you don't start with an empty table.