To answer your direct question: (1) filesystems tend to use power-of-2 block sizes, so you want to do the same; (2) the larger your working buffer, the less effect any mis-sizing will have.
As you say, if you allocate 4100 bytes and the actual block size is 4096, you'll need two reads to fill the buffer. If, instead, you have a 1,000,000-byte buffer, then being one block high or low doesn't matter (because it takes roughly 245 4096-byte blocks to fill that buffer). Moreover, the larger buffer means that the OS has a better chance to order the reads.
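To make that arithmetic concrete, here is a small sketch. The class and method names are made up for illustration, and the 4096-byte block size is just the figure used above, not something queried from a real filesystem.

```java
class BlockMath {
    // Number of blocks (rounding up) needed to fill a buffer of bufferSize bytes.
    static long blocksToFill(long bufferSize, long blockSize) {
        return (bufferSize + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        System.out.println(blocksToFill(4100, 4096));      // prints 2
        System.out.println(blocksToFill(1_000_000, 4096)); // prints 245
    }
}
```

With the small buffer, a 4-byte mis-sizing doubles the number of blocks touched; with the large one, an extra block is about 0.4% of the work.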
That said, I wouldn't use NIO for this. Instead, I'd use a simple BufferedInputStream, with maybe a 1k buffer for my read()s.
The main benefit of NIO is keeping data out of the Java heap. If you're reading and writing a file, for example, using an InputStream means that the OS reads the data into a JVM-managed buffer, the JVM copies that into an on-heap buffer, then copies it again to an off-heap buffer, then the OS reads that off-heap buffer to write the actual disk blocks (and typically adds its own buffers). In this case, NIO will eliminate those copies between native memory and the heap.
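For comparison, a file-to-file copy that stays out of the Java heap might look like the sketch below. It uses FileChannel.transferTo; the class name and the choice of open options are my own, and whether the transfer really avoids intermediate copies depends on the OS and the JVM.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelCopy {
    // Copies source to target without staging the bytes in a byte[] on the heap.
    static void copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long size = in.size();
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
        }
    }
}
```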
However, to compute a hash, you need to have the data in the Java heap, and the Mac SPI will move it there. So you don't get the benefit of NIO keeping the data off-heap, and IMO the "old IO" is easier to write.
Just don't forget that InputStream.read() is not guaranteed to read all the bytes you ask for.
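A minimal sketch of the "old IO" approach, with a read loop that copes with short reads, could look like this. I'm assuming SHA-256 via MessageDigest for illustration (a javax.crypto.Mac would be fed the same way through update()); the class name, the 64 KiB stream buffer, and the 1k chunk are arbitrary choices, not requirements.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StreamHasher {
    // Digests everything the stream yields, however the reads happen to be split up.
    static byte[] sha256(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] chunk = new byte[1024];
        int n;
        // read() may return fewer bytes than chunk.length; only -1 means end of stream.
        while ((n = in.read(chunk)) != -1) {
            digest.update(chunk, 0, n); // feed only the bytes actually read
        }
        return digest.digest();
    }

    // Convenience overload: wraps the file in a BufferedInputStream (64 KiB, assumed size).
    static byte[] sha256(Path file) throws IOException, NoSuchAlgorithmException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file), 1 << 16)) {
            return sha256(in);
        }
    }
}
```

The key point is using the return value of read() as the length passed to update(), rather than assuming the chunk was filled.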