Filling in web forms using a Python bot



-grubby
June 10th, 2008, 06:14 AM
I'm interested in bots, and I'm trying to make one that connects to a website and fills in forms. I'm using a manual on urllib2 (http://www.voidspace.org.uk/python/articles/urllib2.shtml), but it doesn't seem to be working out. My source code is as follows:



import urllib
import urllib2

url = 'http://grubbn.com/fluxbb/post.php?tid=180'

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'

values = {
    # req_username, req_email, req_message
    # Guest name:, Guest e-mail:, Write message:
    'req_username' : 'Grubbot',
    'req_email' : 'nathangrubb@grubbn.com',
    'req_message' : 'this is a test message from a bot'
}
headers = { 'User-Agent' : user_agent }

data = urllib.urlencode(values)
# Pass the headers along; otherwise the User-Agent above is never sent.
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()

fred.reichbier
June 10th, 2008, 12:20 PM
It works for me on http://www.rentbayarea.com/post_test.php. Maybe you have forgotten a hidden field or something?

-grubby
June 12th, 2008, 12:52 AM
Still to no avail. I'm not sure I'm getting the fields right, though. What part of the HTML markup is supposed to be the name of the form in this?

Wybiral
June 12th, 2008, 01:52 AM
I don't have time to dive too deeply into the markup, but I see, at least:


<input type="hidden" name="form_sent" value="1" />
<input type="hidden" name="form_user" value="Guest" />
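The key a server looks for is each field's name attribute, and hidden inputs are submitted along with the visible ones. A minimal sketch of listing every field name in a page, assuming Python 3 (where html.parser is in the standard library); the FormFieldCollector class, the collect_form_fields helper, and the visible fields in the sample markup are made up for illustration, only the two hidden inputs come from the page:

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs from <input> and <textarea> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag not in ("input", "textarea"):
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # A field without a value attribute is stored as an empty string.
            self.fields[name] = attributes.get("value") or ""


def collect_form_fields(html):
    """Return a dict mapping each form field name to its default value."""
    parser = FormFieldCollector()
    parser.feed(html)
    return parser.fields


# The two hidden inputs quoted above, plus an illustrative visible input
# and textarea (assumed markup, not copied from the real page).
sample = '''
<input type="hidden" name="form_sent" value="1" />
<input type="hidden" name="form_user" value="Guest" />
<input type="text" name="req_username" />
<textarea name="req_message"></textarea>
'''

fields = collect_form_fields(sample)
print(fields)
```

Any name that shows up here but is missing from the values dict is a candidate for the "forgotten hidden field" problem.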

-grubby
June 12th, 2008, 02:22 AM
I'm pretty sure that it's either the name or value attribute. I'll try both and tell you how it goes.

Edit: Maybe I'm not so sure. Those are hidden input items, and there aren't even enough of them for all 3 input areas. I'll just delve into the source code and see what I can find.

-grubby
June 12th, 2008, 02:49 AM
Ok, well I tried some different methods:


import urllib
import urllib2

url = 'http://grubbn.com/fluxbb/post.php?tid=180'

values = {
    # req_username, req_email, req_message
    # Guest name:, Guest e-mail:, Write message:
    # form_sent, form_user
    # 1, Guest
    'Guest name:' : 'Grubbot',
    'Guest e-mail:' : 'nathangrubb@grubbn.com',
    'Write message:' : 'this is a test message from a bot'
}

data = urllib.urlencode(values)
resp = urllib2.urlopen(url, data)
response = urllib2.urlopen(resp)
response.read


which resulted in the following python output:


Traceback (most recent call last):
  File "bot.py", line 18, in <module>
    response = urllib2.urlopen(resp)
  File "/usr/lib/python2.5/urllib2.py", line 124, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.5/urllib2.py", line 373, in open
    protocol = req.get_type()
AttributeError: addinfourl instance has no attribute 'get_type'


Any ideas?

Can+~
June 12th, 2008, 03:19 AM
Try with:


import urllib
import urllib2

url = 'http://grubbn.com/fluxbb/post.php?tid=180'

values = {
    # req_username, req_email, req_message
    # Guest name:, Guest e-mail:, Write message:
    # form_sent, form_user
    # 1, Guest
    'req_username' : 'Grubbot',
    'req_email' : 'nathangrubb@grubbn.com',
    'req_message' : 'this is a test message from a bot'
}

data = urllib.urlencode(values)
resp = urllib2.urlopen(url, data)

print resp.read()

It posted back the same page.
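For reference, urlopen accepts either a URL string or a Request object, which is why handing it the response object raised the AttributeError above. Below is a rough sketch of building the same POST with the two hidden fields included. It is written for Python 3, where urllib2's pieces live in urllib.request and urllib.parse, and it only builds the request without sending it. Whether these two hidden fields are all the forum needs is an assumption.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

url = 'http://grubbn.com/fluxbb/post.php?tid=180'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'

values = {
    # Hidden fields seen in the page source (assumed to be required).
    'form_sent': '1',
    'form_user': 'Guest',
    # Visible fields, keyed by their name attributes.
    'req_username': 'Grubbot',
    'req_email': 'nathangrubb@grubbn.com',
    'req_message': 'this is a test message from a bot',
}

# In Python 3 the POST body must be bytes, so encode the query string.
data = urlencode(values).encode('utf-8')

# Supplying data makes this a POST; the headers dict carries the User-Agent.
req = Request(url, data, headers={'User-Agent': user_agent})

print(req.get_method())
print(data.decode('utf-8'))
```

To actually send it, pass req to urllib.request.urlopen and read the response once.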

-grubby
June 12th, 2008, 03:36 AM
Indeed, it posted back the source code. But how does this help me?

-grubby
June 16th, 2008, 09:34 PM
I'm going to bump this; I still haven't figured it out.

skeeterbug
June 16th, 2008, 11:48 PM
<div class="main-content message">
<p>You do not have permission to access this page.</p>
</div>

*EDIT*
I guess I will elaborate. First you need to log in. urllib2 has support for cookies, so logging in first and then doing the rest should work fine.
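A minimal sketch of the cookie idea, assuming Python 3 (urllib2's HTTPCookieProcessor now lives in urllib.request, and cookielib became http.cookiejar). The make_session and post_form helpers are illustrative names, and the login URL and field names in the commented usage are placeholders, not taken from the actual forum:

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, build_opener


def make_session():
    """Build an opener that stores cookies and resends them on later requests."""
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


def post_form(opener, url, values):
    """POST a dict of form values through the given opener."""
    data = urlencode(values).encode('utf-8')
    return opener.open(url, data)


opener, jar = make_session()

# Hypothetical usage (placeholder URLs and field names, not run here):
# post_form(opener, LOGIN_URL, {'username_field': 'Grubbot', 'password_field': '...'})
# page = post_form(opener, POST_URL, values).read()
```

Because both requests go through the same opener, any session cookie set by the login response is sent again with the post.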

-grubby
June 16th, 2008, 11:59 PM
Erm, sorry about any confusion. My forum settings used to let guests post when I was testing this bot, but now they don't; I forgot about that. Thanks for the alert. I'll make it so guests can post.