
Here are a couple of methods for creating a report on the performance of random disk access on a single file. I ran this test on a dual 1.8 GHz G5, but I wouldn’t mind getting some help building a database of performance results for other setups. I posted a summary of my results as tab-delimited table data at RandomDiskAccessSingleFileTestResults. The table data was generated by the method tabDelimitedTableDataFromReport and saved to a file in the method randomDiskAccessTest. The summary table shows that on my system you can read blocks of up to about 65k before the average seek times start to increase as a function of the block size. I’m wondering how much this threshold varies from system to system. The test was run on a file larger than 500MB, so the only requirement for the test is a file larger than 500MB.
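For anyone who wants to try the same kind of measurement without Cocoa, here is a minimal Python sketch of the idea. It is not the code behind my posted results: the function names random_read_test and tab_delimited_table, the column headers and the default of 100 reads per block size are all assumptions made for illustration.

```python
import os
import random
import time


def random_read_test(path, block_sizes, reads_per_size=100, seed=None):
    """Return {block_size: average seconds per random read} for one file.

    Hypothetical helper: names and defaults are illustrative only.
    """
    rng = random.Random(seed)
    file_size = os.path.getsize(path)
    report = {}
    # buffering=0 opens a raw file object, so Python adds no read-ahead
    # buffer of its own (the OS cache is still in play).
    with open(path, "rb", buffering=0) as f:
        for block_size in block_sizes:
            if block_size > file_size:
                continue  # skip block sizes that cannot fit in the file
            total = 0.0
            for _ in range(reads_per_size):
                # Pick an offset so the whole block lies inside the file.
                offset = rng.randrange(0, file_size - block_size + 1)
                start = time.perf_counter()
                f.seek(offset)
                f.read(block_size)
                total += time.perf_counter() - start
            report[block_size] = total / reads_per_size
    return report


def tab_delimited_table(report):
    """Format the report as tab-delimited rows, one per block size."""
    lines = ["block size (bytes)\taverage read time (ms)"]
    for block_size in sorted(report):
        lines.append(f"{block_size}\t{report[block_size] * 1000:.3f}")
    return "\n".join(lines)
```

To use it, call something like random_read_test("/path/to/big/file", [4096, 16384, 65536, 262144]) on a file larger than 500MB and pass the result to tab_delimited_table. The timings include the seek call plus the read, so they are only comparable between runs made the same way.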

If you would like to help out, please add your test results to RandomDiskAccessSingleFileTestResults. –zootbobbalu

I noticed that OS X caches large amounts of disk data in RAM, so if the test file has been touched recently (e.g. it has just been copied or moved across volumes), you will get inaccurate results. The only way to be sure that the test file is not in RAM is to reboot or to copy a file larger than the amount of installed RAM (the first choice is probably better).


After plotting the data for two different setups, I noticed how well the curves match the reported specs for the two drives. Both drives have an average seek time of 8.5 ms and an average sustained transfer rate of around 60MB/s. These two values show up nicely in the test results. Since an average sustained transfer rate of 60MB/s is about 60k/ms, this would explain why 65k block reads were showing up as a threshold in both test results. I guess a general rule of thumb for figuring out the maximum block size to read in a random disk access is to multiply the sustained transfer rate by ten percent of the average seek time. I’m using ten percent because I figure this is where the average seek time becomes a second-order influence and the sustained transfer rate becomes a first-order influence.
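The rule of thumb is easy to work through with the figures quoted for these two drives. This is only a sketch of the arithmetic: the function name max_random_read_block_bytes is made up for illustration, and it assumes 1MB means 1,000,000 bytes.

```python
def max_random_read_block_bytes(transfer_rate_bytes_per_s, avg_seek_ms,
                                seek_fraction=0.10):
    """Rule of thumb: sustained transfer rate times a fraction of the seek time.

    Hypothetical helper; seek_fraction=0.10 is the ten percent guess above.
    """
    seek_seconds = avg_seek_ms / 1000.0
    return transfer_rate_bytes_per_s * seek_seconds * seek_fraction


# Figures quoted for the two drives: 60MB/s sustained, 8.5 ms average seek.
# Assumes 1MB = 1,000,000 bytes.
block_bytes = max_random_read_block_bytes(60_000_000, 8.5)
print(round(block_bytes))
```

With those inputs the rule gives roughly 51,000 bytes, which is in the same ballpark as the 65k threshold seen in both sets of results.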