
Thread: Python - Reading from pipe with file object iterator

  1. #1
    Join Date
    Feb 2007
    Location
    New York
    Beans
    894
    Distro
    Ubuntu 9.10 Karmic Koala

    [Python] Reading from pipe with file object iterator

    I've got syslog writing to a named pipe, and I'm trying to read it line-by-line in Python. This works, printing out each line as it comes:

    Code:
    f = open("test_pipe")
    while True:
        line = f.readline()
        print line,
    The recommended method is to use a file as an iterator, but this does not work. It dumps out all the lines at once only when I restart syslog (presumably because it's sending an EOF):

    Code:
    f = open("test_pipe")
    for line in f:
        print line,
    Whyyyy?


    According to PEP 234 - Iterators, these are equivalent:

    A:
    Code:
    while 1:
        line = file.readline()
        if not line:
            break
        ...
    B:
    Code:
    for line in iter(file.readline, ""):
        ...
    C:
    Code:
    for line in file:
        ...
    So I really don't understand why A and B work on a pipe that has been sent newlines and then flushed, while C only works when the pipe has been closed.
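    The behaviour described for A and B can be reproduced without syslog. The following is a minimal sketch, assuming an anonymous pipe from os.pipe() stands in for the named pipe and written so that it also runs on Python 3 (the thread itself uses Python 2). The names reader, writer, line_before_close and remaining are illustrative only.

```python
import os

# An anonymous pipe stands in for the named pipe in the thread (assumption).
read_fd, write_fd = os.pipe()
reader = os.fdopen(read_fd, "r")
writer = os.fdopen(write_fd, "w")

# The writer sends one line and flushes, but keeps the pipe open.
writer.write("first\n")
writer.flush()

# readline() returns as soon as a complete line is available;
# it does not wait for the writer to close the pipe.
line_before_close = reader.readline()

# Form B from the post: call readline() until it returns "" (EOF).
writer.write("second\n")
writer.close()
remaining = list(iter(reader.readline, ""))
reader.close()

print(repr(line_before_close))
print(remaining)
```

    Only after writer.close() does readline() return "", which is what ends the iter() loop.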
    Last edited by Endolith; September 12th, 2008 at 05:27 PM. Reason: Change title to [Python]
    "Please remember to do things the Ubuntu way. There is always more than one solution to a problem, choose the one you think will be the easiest for the user. ... Try to think as a green user and choose the simplest solution." — Code of Conduct

  2. #2
    Join Date
    Feb 2007
    Location
    New York
    Beans
    894
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Python - Reading from pipe with file object iterator

    Aha. The last method does return values without closing the file, but only if I've sent a large number of lines. This behaves the same way (doesn't work):

    Code:
    for line in file("pipe"):
        print line,
    I guess open() uses the file() type, so there isn't much difference here.

    Is this some kind of pre-buffering problem? This doesn't work, either:

    Code:
    for line in file("pipe","r",1):
        print line,
    Where 1 is buffering:

    Code:
    If the buffering argument is given, 0 means unbuffered, 1 means line
    buffered, and larger numbers specify the buffer size.
    It also doesn't work if I open both for reading and writing with 1.

    Nor does this:

    Code:
    for line in f.readlines():
        print line,
    even though this should be equivalent to

    Code:
    for line in iter(f.readline, ""):
        print line,
    which does work.
    Last edited by Endolith; September 12th, 2008 at 05:30 PM.

  3. #3
    Join Date
    Feb 2007
    Location
    New York
    Beans
    894
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Python - Reading from pipe with file object iterator

    Actually, no.
    Code:
    for line in f.readlines():
    would read in the whole file first (or a large chunk of the file?) and then split it into a list with one element per line, while
    Code:
    for line in iter(f.readline, ""):
    calls f.readline() repeatedly. Each call reads until it hits a newline and returns that line as the next item; no list is built. This continues until EOF, which causes readline() to return "", at which point the loop quits. If the EOF doesn't exist yet (pipe not closed), and it has read all the characters but not found a newline yet, it just sits and waits?

    So that would explain what I'm seeing. So does "for line in f:" mean "read in the entire file and then split it into a list with an element for each line"?
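    The laziness of the two-argument form of iter() can be checked directly. This is a small sketch, assuming an in-memory io.StringIO in place of the pipe; CountingReader is a hypothetical helper that only counts how often readline() is called.

```python
import io

class CountingReader(object):
    """Hypothetical wrapper that counts readline() calls on a file-like object."""
    def __init__(self, f):
        self.f = f
        self.calls = 0

    def readline(self):
        self.calls += 1
        return self.f.readline()

reader = CountingReader(io.StringIO(u"one\ntwo\nthree\n"))

# iter(callable, sentinel) calls the callable once per item, on demand.
it = iter(reader.readline, u"")
first = next(it)
calls_after_first = reader.calls   # only one readline() so far

rest = list(it)                    # drains the remaining lines
total_calls = reader.calls         # includes the final call that returned ""

print(first, calls_after_first, rest, total_calls)
```

    By contrast, f.readlines() has to reach EOF before it can return its list, so on a pipe it cannot return until the writer closes its end.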

  4. #4
    Join Date
    Dec 2007
    Location
    .
    Beans
    Hidden!
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: [Python] Reading from pipe with file object iterator

    Quote Originally Posted by Endolith View Post
    I've got syslog writing to a named pipe, and I'm trying to read it line-by-line in Python. This works, printing out each line as it comes:

    Code:
    f = open("test_pipe")
    while True:
        line = f.readline()
        print line,
    The recommended method is to use a file as an iterator, but this does not work. It dumps out all the lines at once only when I restart syslog (presumably because it's sending an EOF):

    Code:
    f = open("test_pipe")
    for line in f:
        print line,
    Whyyyy?

    Maybe you are not closing the file properly in the program that writes to it.

  5. #5
    Join Date
    Feb 2007
    Location
    New York
    Beans
    894
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: [Python] Reading from pipe with file object iterator

    Quote Originally Posted by days_of_ruin View Post
    Maybe you are not closing the file properly in the program that writes to it.
    I don't have any control over the program that writes to the pipe, and it's a pipe, so it's not supposed to be closed, is it? Programs continuously write to a pipe without closing it.
    Last edited by Endolith; September 12th, 2008 at 06:53 PM.

  6. #6
    Join Date
    Sep 2006
    Beans
    2,914

    Re: Python - Reading from pipe with file object iterator

    See here similar

  7. #7
    Join Date
    Feb 2007
    Location
    New York
    Beans
    894
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Python - Reading from pipe with file object iterator

    Quote Originally Posted by ghostdog74 View Post
    See here similar
    Yes, I've read that. It's helpful, but doesn't answer my question.

  8. #8
    Join Date
    Dec 2005
    Beans
    4

    Re: Python - Reading from pipe with file object iterator

    I ran into the same problem today. The reason seems to be that the next() method on files uses an internal buffer (see the Python docs for file.next()) and only returns if the pipe is closed.
    The following simple custom iterator using file.readline() internally works fine for me:
    Code:
    class FileIterator:
        def __init__(self, file):
            self.file = file
    
        def __iter__(self):
            return self
    
        def next(self):
            l = self.file.readline()
            if l=='':
                raise StopIteration
            return l
    
    
    f = file('pipe', 'r')
    for l in FileIterator(f):
        print l,
    f.close()
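    The same idea can be written more compactly as a generator. This is only a sketch of an alternative to the class above, not a replacement the poster proposed; iter_lines is a hypothetical name, and io.StringIO stands in for the pipe so the example is self-contained.

```python
import io

def iter_lines(f):
    """Yield lines using readline(), so each line is handed over as soon as it is read."""
    while True:
        line = f.readline()
        if not line:      # readline() returns an empty string at EOF
            return
        yield line

# In-memory stand-in for the pipe (assumption for the sake of a runnable example).
sample = io.StringIO(u"alpha\nbeta\n")
collected = list(iter_lines(sample))
print(collected)
```

    On a real pipe the loop would be `for line in iter_lines(open("pipe")):`, which behaves like `iter(f.readline, "")` from earlier in the thread.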

  9. #9
    Join Date
    Feb 2007
    Location
    New York
    Beans
    894
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Python - Reading from pipe with file object iterator

    Quote Originally Posted by tmcg View Post
    I ran into the same problem today. The reason seems to be that the next() method on files uses an internal buffer (see the Python docs for file.next())
    http://www.python.org/doc/2.5/lib/bl...e-objects.html

    In order to make a for loop the most efficient way of looping over the lines of a file (a very common operation), the next() method uses a hidden read-ahead buffer. As a consequence of using a read-ahead buffer, combining next() with other file methods (like readline()) does not work right.
    and only returns if the pipe is closed.
    Or if there's enough data to fill the buffer?

    The following simple custom iterator using file.readline() internally works fine for me:
    Yeah, we really shouldn't have to implement our own classes for this, though. Seems like a bug of shortsightedness to me. I'm going to file a bug and see if it sticks.

  10. #10
    Join Date
    Dec 2005
    Beans
    4

    Re: Python - Reading from pipe with file object iterator

    According to the docs and PEP 234, the file iterator is intended to allow fast reading of files. To achieve this it uses a buffer, reading larger blocks at once.
    If you want to read lines from a pipe as soon as they are present, you need to read byte by byte to check for a newline (as readline() does). However, this would slow down reading ordinary files.
    Hence the only bug I can see is that the file iterator blocks completely until the pipe is closed. But even if this were fixed, it would be unsuitable due to the use of a buffer.
    A nice solution would be to be able to turn off the buffer or to adjust its size.
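    One way to sidestep the file object's buffering altogether is to read from the file descriptor with os.read() and split lines by hand. The following is a rough sketch under that assumption, not something proposed in the thread; lines_from_fd and chunk_size are hypothetical names, and an anonymous pipe is used so the example runs on its own.

```python
import os

def lines_from_fd(fd, chunk_size=4096):
    """Yield complete lines (as bytes) from a file descriptor as data arrives."""
    pending = b""
    while True:
        # os.read() returns whatever is available, up to chunk_size bytes,
        # and returns b"" once the write end has been closed.
        chunk = os.read(fd, chunk_size)
        if not chunk:
            break
        pending += chunk
        while b"\n" in pending:
            line, pending = pending.split(b"\n", 1)
            yield line + b"\n"
    if pending:
        yield pending     # trailing data without a final newline

read_fd, write_fd = os.pipe()
os.write(write_fd, b"a\nb\nc")
os.close(write_fd)
received = list(lines_from_fd(read_fd))
os.close(read_fd)
print(received)
```

    Because os.read() does not wait for a full chunk, each line can be yielded as soon as its newline arrives, while still reading in blocks rather than one byte at a time.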
