
BLU Discuss list archive



Asynchronous File I/O on Linux



On May 18, 2010, at 8:58 AM, Mark Woodward wrote:
> 
> Wait, even from a pedantic perspective, asynchronous I/O is the ability to issue disk I/O requests without blocking the process or thread. I am merely attempting to use this ability to optimize a particular type of operation on a file.

No, you're not. Really. Try this: open() a file handle with the async I/O option, then try to read() from it and see what happens. Experiment, because the results are not what you seem to expect.
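A minimal sketch of that experiment, assuming the "async I/O option" means the O_ASYNC open flag and that the target is a regular file. The helper names and the temporary file are hypothetical, added only for illustration. As far as the Linux open(2) man page describes it, O_ASYNC requests signal-driven I/O for terminals, sockets, pipes and FIFOs, so on a regular file the read() below should simply return the data like any ordinary blocking read.

```python
import os
import tempfile

# Hypothetical helper: open a regular file with O_ASYNC and issue a plain read().
def read_with_o_async(path, nbytes):
    # O_ASYNC is not defined on every platform; fall back to no extra flag.
    flags = os.O_RDONLY | getattr(os, "O_ASYNC", 0)
    fd = os.open(path, flags)
    try:
        # os.read() returns only once data (or EOF) is available.
        return os.read(fd, nbytes)
    finally:
        os.close(fd)

def demo_o_async():
    payload = b"x" * 4096  # illustrative size, not from the thread
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        data = read_with_o_async(path, len(payload))
    finally:
        os.unlink(path)
    # The call came back with file contents already in hand.
    return len(data) > 0 and payload.startswith(data)
```

If demo_o_async() returns True, the read completed synchronously: nothing was queued and no completion had to be collected later.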

> With tagged queuing on SATA and SCSI before it, a driver is able to issue multiple requests simultaneously to the device and the device is supposed to be able to get requested blocks in cache and return them over the device I/O bus.

This is concurrent I/O.
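One way to picture concurrent I/O from user space is to have several threads each issue an ordinary blocking read, so that more than one request can be outstanding at a time. The sketch below is an assumption-laden illustration, not anything from the thread: the block size, worker count, and function names are made up, and on a freshly written temporary file the reads will most likely be served from the page cache rather than the device.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # illustrative block size

def read_blocks_concurrently(path, block_numbers, workers=4):
    fd = os.open(path, os.O_RDONLY)
    try:
        def read_one(n):
            # pread() takes an explicit offset, so the threads do not
            # race on a shared file position.
            return os.pread(fd, BLOCK, n * BLOCK)

        with ThreadPoolExecutor(max_workers=workers) as pool:
            # map() returns results in the order of block_numbers.
            return list(pool.map(read_one, block_numbers))
    finally:
        os.close(fd)

def demo_concurrent():
    # Build an 8-block file where block i is filled with byte value i.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for i in range(8):
            tmp.write(bytes([i]) * BLOCK)
        path = tmp.name
    try:
        return read_blocks_concurrently(path, [5, 0, 3])
    finally:
        os.unlink(path)
```

Each individual pread() still blocks its own thread; the concurrency comes from having several of them in flight, which is what gives the driver and the device something to queue.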

> A few points: (1) The disk block I am requesting will never be in cache unless I have requested it.

Yup.

> (2) A "good database" (i.e., not MySQL; something like Oracle, DB2, or PostgreSQL) does its own caching and manages its own data access. Oracle still has its own device-level access system.

Which is primarily used for raw devices that bypass the filesystem driver. If the database is on disk, then even Oracle relies on the filesystem driver and kernel buffers. And in practice, when the database is on a high-performance storage system, going through the filesystem is faster than Oracle's own raw device I/O options, for reasons previously mentioned.

> (3) A "few" (4) milliseconds shaved off a function that is run half a million to a million times is between 1/2 hour and an hour of wall clock time. That is important.

If you stripe across three spindles, you cut your access times by approximately 33% without having to code anything. But if you code it anyway, your code will probably run *slower*, because you're wasting CPU cycles trying to optimize for seek timings that are completely different from what you expect of a single spindle.
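For reference, the arithmetic behind point (3) in the quoted message can be checked in a few lines. This is only a sanity check of the quoted numbers (4 ms per call, half a million to a million calls); the function name is hypothetical.

```python
def saved_minutes(ms_per_call, calls):
    # milliseconds -> seconds -> minutes
    return ms_per_call * calls / 1000 / 60

low = saved_minutes(4, 500_000)     # about 33 minutes
high = saved_minutes(4, 1_000_000)  # about 67 minutes
```

So the quoted "between 1/2 hour and an hour" is roughly right, erring slightly low at the upper end.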

> (4) If asynchronous I/O is not used, then I will *always* have the worst case scenario of purely sequential reads.

Again, try my suggestion above and see what happens.

--Rich P.








BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org