I have an import utility sitting on the same physical server as my SQL Server instance. Using a custom IDataReader, it parses flat files and inserts them into a database using SqlBulkCopy.
. A typical file has about 6M qualified rows, averaging 5 columns of decimal and short text, about 30 bytes per row.
Given this scenario, I found a batch size of 5,000 to be the best compromise of speed and memory consumption. I started with 500 and experimented with larger values. I found 5,000 to be about 2.5x faster, on average, than 500: inserting the 6 million rows takes about 30 seconds with a batch size of 5,000 and about 80 seconds with a batch size of 500.
10,000 was not measurably faster. Moving up to 50,000 improved the speed by a few percentage points, but it wasn't worth the increased load on the server. Going above 50,000 showed no further improvement in speed.
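For reference, here is a minimal sketch of the kind of setup described above. The connection string, destination table name, and MyFlatFileDataReader (standing in for the custom IDataReader) are placeholders, not the actual implementation:

```csharp
using System.Data;
using System.Data.SqlClient;

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.ImportedRows"; // placeholder table name
        bulkCopy.BatchSize = 5000;        // the sweet spot found above
        bulkCopy.BulkCopyTimeout = 0;     // no timeout for the full multi-million-row load

        // The custom IDataReader streams parsed rows from the flat file,
        // so the whole file never has to sit in memory at once.
        using (IDataReader reader = new MyFlatFileDataReader(filePath))
        {
            bulkCopy.WriteToServer(reader);
        }
    }
}
```

BatchSize is the only knob being tuned here; everything else is standard SqlBulkCopy usage.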
This isn't a formula, but it's another data point for you to use.