
Thread: [SOLVED] Python, urllib: Why does this hang?

  1. #1
    Join Date
    Aug 2006
    Location
    60°27'48"N 24°48'18"E
    Beans
    3,458

    [SOLVED] Python, urllib: Why does this hang?

    Could someone please try out the below from somewhere else and tell me if it gets stuck for some reason at httplib's _ssl.read? (Yes, reading from SSL sockets seems to hang at the end of data for some reason, probably end of stream is not recognized or something... I chose to terminate reading at the html close tag and things work nicely...)
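
    For illustration only, the "stop reading at the html close tag" workaround described above might look roughly like the sketch below. It is not the snipped code: it is written for Python 3, and the function name `read_until_marker`, its parameters, and the `io.BytesIO` stand-in for a response object are all assumptions made for the example.

```python
import io


def read_until_marker(stream, marker=b"</html>", chunk_size=4096):
    """Read from a file-like object until `marker` is seen or the stream ends.

    Hypothetical helper: `marker` must be given in lowercase, and the
    match is case-insensitive. Anything after the marker is discarded.
    """
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # Genuine end of stream: return whatever was collected.
            break
        # Begin the search slightly before the new data so that a marker
        # split across two chunks is still found.
        start = max(0, len(buf) - len(marker) + 1)
        buf.extend(chunk)
        pos = bytes(buf[start:]).lower().find(marker)
        if pos != -1:
            end = start + pos + len(marker)
            return bytes(buf[:end])
    return bytes(buf)


if __name__ == "__main__":
    # io.BytesIO stands in for an HTTP response object here.
    fake_response = io.BytesIO(b"<html><body>hi</body></HTML>junk after the page")
    print(read_until_marker(fake_response, chunk_size=5))
```

    On a live connection a blocking read() may still wait for a full chunk near the end of the page, so a small `chunk_size` is probably the safer choice there.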

    I really don't understand what happened to this, I was happily making it multithreaded (at "process()") and it stopped working. The current version here has been returned to single-thread so it's not a threading issue. It fetches a couple of URLs and then just grinds to a halt. My own suspicion is that the server just chooses not to produce the response for me, but I'm a bit baffled by my ability to hit the same URLs by browser without trouble...

    Code: (snipped) I don't think I'll leave my code here for Google.. ;)
    EDIT: Actually, it does not hang, but almost... it's awfully slow, while Firefox is completely snappy. Another question -- is urllib threadsafe? How about urllib2?
    Last edited by CptPicard; January 11th, 2008 at 04:42 PM.
    LambdaGrok. | #ubuntu-programming on FreeNode

  2. #2
    Join Date
    Jul 2007
    Beans
    Hidden!
    Distro
    Ubuntu 8.04 Hardy Heron

    Re: Python, urllib: Why does this hang?

    I don't know if it has anything to do with your problem, but what happened to '0', 'e' and 'f' in hexdigits?

    hexdigits = ['1','2','3','4','5','6','7','8','9','a','b','c','d']
    ch
    In Switzerland we make it other
    with apologies to Gerard Hoffnung


  3. #3
    Join Date
    Aug 2006
    Location
    60°27'48"N 24°48'18"E
    Beans
    3,458

    Re: Python, urllib: Why does this hang?

    They are as they're supposed to be... they are not quite hexdigits, but close. If you format the URL wrong, the server gives a very explicit error...

  4. #4
    Join Date
    Oct 2007
    Beans
    130
    Distro
    Ubuntu 8.04 Hardy Heron

    Re: Python, urllib: Why does this hang?

    Why are you using a lock ("mutex") on those two lists? And what part of the code are you starting in a new thread?

    I normally use thread rather than threading, but what I can see happening in process() is that the system has to acquire a lock, and release it, every time it gets a new item. process() is only being called by one function in one thread, so the lock isn't needed.

    Then the next function (fetch) is being called in the same thread as the main function, so we have to wait for that before getting the locks for the other list.

  5. #5
    Join Date
    Aug 2006
    Location
    60°27'48"N 24°48'18"E
    Beans
    3,458

    Re: Python, urllib: Why does this hang?

    The lock is there to protect the generator and the Set when running multithreaded, but as I said in the OP, this version has been returned to single-threaded in order to examine the slowness of the HTTP fetch in particular. process() is dead code in the version presented.
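
    To make the locking concrete, here is a rough Python 3 sketch of one lock guarding both a shared generator and a "seen" set. It is not the snipped code: `SharedWork`, `run_workers`, and the `handle` callback are hypothetical names, and the real fetch is replaced by whatever function is passed in. The lock matters because a generator cannot be advanced from two threads at once (Python raises "ValueError: generator already executing").

```python
import threading


class SharedWork:
    """Hypothetical wrapper: one lock guards both the generator and the set."""

    def __init__(self, urls):
        self._urls = iter(urls)
        self._seen = set()
        self._lock = threading.Lock()

    def next_url(self):
        """Return the next URL not handed out before, or None when exhausted."""
        with self._lock:
            for url in self._urls:
                if url not in self._seen:
                    self._seen.add(url)
                    return url
            return None


def run_workers(urls, handle, n_threads=4):
    """Process each distinct URL once, using n_threads worker threads."""
    work = SharedWork(urls)
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            url = work.next_url()
            if url is None:
                return
            value = handle(url)  # the slow part runs outside the lock
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    urls = ("u%d" % (i % 10) for i in range(100))
    print(sorted(run_workers(urls, str.upper)))
```

    The lock is held only while picking the next item, so the fetch itself can overlap across threads.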

    I am mostly interested in whether I'm doing something wrong with urllib in a single thread in the first place (and as a follow-up, whether it's thread-safe).

    Anyway, it seems like it might have something to do with using readlines() and it not knowing when to terminate: with SSL, the reader apparently can't tell when it has reached the end of the stream...
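
    If the end-of-stream theory is right, one way to keep a read from blocking indefinitely is a socket timeout. The sketch below is only an illustration in Python 3, not the snipped code: `read_all_with_timeout` is a hypothetical helper, and a local `socket.socketpair()` stands in for a server that sends its data but never closes the connection.

```python
import socket


def read_all_with_timeout(sock, timeout=0.5, chunk_size=4096):
    """Read until the peer closes or no data arrives for `timeout` seconds.

    Returns (data, timed_out). Hypothetical helper for illustration.
    """
    sock.settimeout(timeout)
    chunks = []
    timed_out = False
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except socket.timeout:
            # Nothing arrived in time: give up instead of hanging.
            timed_out = True
            break
        if not chunk:
            # Empty read: the peer closed the connection cleanly.
            break
        chunks.append(chunk)
    return b"".join(chunks), timed_out


if __name__ == "__main__":
    server, client = socket.socketpair()
    server.sendall(b"<html>page</html>")  # data sent, connection left open
    print(read_all_with_timeout(client, timeout=0.2))
    server.close()
    client.close()
```

    With urllib itself, the Python 3 call `urllib.request.urlopen(url, timeout=...)` accepts a timeout in seconds, which may be the simpler route.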
