
BLU Discuss list archive



Asynchronous File I/O on Linux



On 05/18/2010 02:06 PM, Bill Bogstad wrote:
> On Tue, May 18, 2010 at 1:19 PM, Richard Pieri <richard.pieri-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote:
>
>> On May 18, 2010, at 1:01 PM, Bill Bogstad wrote:
>>
>>> He also doesn't want to open the file multiple times which causes a
>>> problem since a file descriptor can only have one lseek() position at
>>> a time.   I can imagine scenarios (for example a library might be
>>>
>> This is why you create a bunch of threads, each with its own file handle,
>> each file handle with its own unique pointer.  You have to open the file
>> multiple times and run each read() in its own thread to truly parallelize
>> the read operations.
>>
> Please re-read the end of my last message.   Take a look at pread()
> (POSIX) and readahead() (Linux only).
> It turns out you do not need separate file handles.    Threads may
> still be required to make it non-blocking.
>
>
>>> Err, CPU cycles are practically free compared to disk seeks.   That's
>>> why disk schedulers implement things like elevator algorithms rather
>>> than FIFO:
>>>
>> They're not free if your poll/select loop is waiting for input from the
>> device.  In other words, you're blocking on the I/O anyway, you're just
>> shifting where you block from the kernel into a loop in your program.
>>
> Not necessarily.   I would like random chunks of data from this file
> (perhaps NEED it at some specific computation point in the future),
> but I have some other computation I can do in the meantime.   Please
> start the disk IO now.  Don't make me create multiple FDs for a single
> file.    At my option, I would like you to:
>
> 1. signal() me in some way when the data is available.
> 2. have a non-blocking operation I could use to check when the data is
>    ready.
>
> Bonus would be if I could submit multiple such requests to the kernel
> with a single system call so I can be sure the disk scheduler's
> algorithm has all my requests as soon as possible.
>
> I don't see why this is an absurd way to want to do computation.  It
> may not be possible in a Linux/POSIX environment, but I don't see why
> it wasn't worth thinking about how to come close to it.   If I hadn't,
> I would have never learned about pread()/readahead() which look to be
> very useful.   If you allow threads, it would appear that just about
> everything except the bonus part is doable.
>
>
Thanks for the heads up on readahead(2). Since most of my systems
programming has been on commercial Unix (Tru64, HP-UX) and FreeBSD, I
never encountered that.
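
For anyone else who has not used it, here is a rough sketch of the pread()
idea from the quoted message: several reads at different offsets through one
descriptor, without moving its lseek() position. It is written in Python only
because os.pread() is a thin wrapper over pread(2). The names read_chunks and
demo are made up for illustration, and posix_fadvise() with
POSIX_FADV_WILLNEED is my stand-in for the Linux-only readahead(), which
Python's os module does not expose.

```python
import os
import tempfile


def read_chunks(fd, requests):
    """Read (offset, length) chunks from a single descriptor using pread()."""
    # Best-effort hint that these ranges will be wanted soon. This is an
    # assumed substitute for readahead(2), not the same call.
    if hasattr(os, "posix_fadvise"):
        for offset, length in requests:
            try:
                os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
            except OSError:
                pass  # advice is optional; ignore filesystems that refuse it
    # pread() takes the offset explicitly, so the shared file position of
    # the descriptor is never consulted or changed.
    return [os.pread(fd, length, offset) for offset, length in requests]


def demo():
    with tempfile.TemporaryFile() as f:
        f.write(b"0123456789abcdef")
        f.flush()
        fd = f.fileno()
        os.lseek(fd, 0, os.SEEK_SET)
        chunks = read_chunks(fd, [(10, 4), (0, 3)])
        # The descriptor's own position is still where lseek() left it.
        position = os.lseek(fd, 0, os.SEEK_CUR)
    return chunks, position
```

Note that each pread() here still blocks the caller until its data arrives;
the hint only asks the kernel to start fetching early.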

I think the signal in question is SIGIO. When you issue a non-blocking
read and the I/O cannot be completed immediately, the call fails with
errno set to EWOULDBLOCK (EAGAIN). You would have to set up your signal
handling beforehand so that your handler can tell you when the data is
available.
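
A small sketch of the "would block" half of that, again in Python and with
made-up names (try_read, demo). It uses a pipe rather than a disk file on
purpose: as far as I know, Linux treats regular files as always ready, so
O_NONBLOCK has no effect on them. The SIGIO setup itself is left out, and
select() stands in as the non-blocking readiness check.

```python
import errno
import os
import select


def try_read(fd, nbytes):
    """Return the data if some is ready, or None if the read would block."""
    try:
        return os.read(fd, nbytes)
    except BlockingIOError as exc:
        # On Linux EAGAIN and EWOULDBLOCK are the same errno value.
        if exc.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
            raise
        return None


def demo():
    r, w = os.pipe()
    try:
        os.set_blocking(r, False)      # same effect as setting O_NONBLOCK
        first = try_read(r, 16)        # nothing written yet: would block
        os.write(w, b"ready")
        # Check readiness without committing to a blocking read.
        readable, _, _ = select.select([r], [], [], 1.0)
        second = try_read(r, 16) if readable else None
    finally:
        os.close(r)
        os.close(w)
    return first, second
```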

Certainly the use of pthreads can provide parallelization, but threads
can block and cause you some timing issues.
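
As a hedged illustration of the thread approach, the sketch below hands
pread() calls on one shared descriptor to a small worker pool, does other
work, and then collects the results. Each worker thread does block inside
its pread(), but the submitting thread does not until it asks for a result.
The helper names (start_reads, demo) and the pool size are arbitrary choices
of mine, and this is not kernel-level asynchronous I/O.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def start_reads(pool, fd, requests):
    """Queue one pread() per (offset, length) request; return the futures."""
    return [pool.submit(os.pread, fd, length, offset)
            for offset, length in requests]


def demo():
    with tempfile.TemporaryFile() as f:
        f.write(bytes(range(256)) * 16)          # 4096 bytes of test data
        f.flush()
        with ThreadPoolExecutor(max_workers=4) as pool:
            futures = start_reads(pool, f.fileno(),
                                  [(0, 4), (1024, 4), (4095, 1)])
            other_work = sum(range(1000))        # computation in the meantime
            # done() is a non-blocking check; its answer depends on timing.
            finished_early = [fut.done() for fut in futures]
            # result() waits only if that particular read is still running.
            results = [fut.result() for fut in futures]
    return other_work, results
```

Because every request carries its own offset, the workers never compete over
a shared file position, which is what makes a single descriptor sufficient.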

-- 
Jerry Feldman <gaf-mNDKBlG2WHs at public.gmane.org>
Boston Linux and Unix
PGP key id: 537C5846
PGP Key fingerprint: 3D1B 8377 A3C0 A5F2 ECBB  CA3B 4607 4319 537C 5846








BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org