I have some Python code that downloads a page with urllib2 and then calls read() on the resulting object. To view the page, I have been printing it, redirecting the output to a file, and watching that file with tail -f. For some reason, the page is never complete. Sometimes it is displayed up to just before the footer; other times it stops at an earlier point in the page. From the urllib2 documentation and all of the examples I have found, I don't believe this should happen unless urllib2 somehow thought there was an end of file (EOF) inside the page itself.
I have a method that sets the request variable to a urllib2.Request.
Code:
request = urllib2.Request(self.pageURL)
Another method is then passed this request and does the following (sans try/except block):
Code:
response = urllib2.urlopen(request)
...
# Save the HTML of the resulting page
self.resultPage = response.read()
response.close()  # parentheses are needed to actually call close()
After that, I have been printing out the resultPage for debugging. Any idea why this would be happening? The HTML is intact if I download the page with Firefox.
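To narrow this down, here is a minimal sketch that checks whether read() really returned a short body, or whether the missing tail is only an artifact of how the output is viewed. When stdout is redirected to a file, Python buffers it in blocks, so tail -f can show an incomplete page until the buffer is flushed or the program exits. The helper names read_fully and is_truncated are hypothetical, not part of urllib2, and the io.BytesIO object only stands in for a real response so the sketch runs without a network.
Code:

```python
import io


def read_fully(response, chunk_size=8192):
    """Read a file-like object until read() returns an empty result."""
    chunks = []
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break  # an empty read means the stream is exhausted
        chunks.append(chunk)
    return b"".join(chunks)


def is_truncated(body, content_length):
    """Return True if a Content-Length was given and the body is shorter."""
    if content_length is None:
        # No header to compare against (for example, chunked transfer)
        return False
    return len(body) < int(content_length)


# Stand-in for the object returned by urlopen (assumption for illustration)
fake_response = io.BytesIO(b"<html>" + b"x" * 20000 + b"</html>")
body = read_fully(fake_response)
print("read %d bytes" % len(body))
```

With urllib2 this would presumably be called as body = read_fully(response), with the expected size taken from response.info().getheader('Content-Length'). If the lengths match, the download is complete and the cut-off is more likely in the printing step; calling sys.stdout.flush() after the print, or writing the page straight to a file and closing it, should then show the whole page.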