Python Count HTML Tags



matmatmat
June 13th, 2009, 03:51 PM
How would you count how many opening HTML tags, and then how many closing tags, there are in a line?
e.g.


for line in something:
    for match in re.match("opening_regex_here", line):
        count += 1
    for match in re.match("closing_regex_here", line):
        count2 += 1

Bodsda
June 13th, 2009, 05:59 PM
I just tried this out in an interpreter. I don't know if it's exactly what you want, but here goes; this is how I would do it.



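line = "<your><line><with><html><tags>"
# Splitting on a delimiter yields one more piece than there are delimiters.
tag_o = line.split(">")
tag_c = line.split("<")
# Note: this counts ">" and "<" characters, not <tag> versus </tag>.
print("%s Opening tags, %s Closing tags.\n" % (len(tag_o) - 1, len(tag_c) - 1))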


When I tried it, the list returned by split contained one extra "" entry, hence the -1.

Hope this helps, Regards,

Bodsda

benj1
June 13th, 2009, 06:16 PM
opening tag <[A-Za-z0-9=".? -]*>
closing tag </[A-Za-z0-9=".? -]*>

Be careful if you're using re.match(): it only matches at the start of the string and returns at most one match, so you'd be better off with re.findall().
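
A minimal sketch of the re.findall() approach, using the two patterns above as they were posted. The function name count_tags and the sample line are made up for illustration, and the patterns only cover simple tags (they will miss things like self-closing tags or attributes containing characters outside the class).

```python
import re

# Patterns copied from the post above; they only handle simple tags.
OPENING = re.compile(r'<[A-Za-z0-9=".? -]*>')
CLOSING = re.compile(r'</[A-Za-z0-9=".? -]*>')


def count_tags(line):
    """Return (opening, closing) tag counts for one line of text.

    count_tags is a hypothetical helper name, not from the thread.
    """
    # findall scans the whole string, unlike re.match, which only
    # looks at the very start of it.
    return len(OPENING.findall(line)), len(CLOSING.findall(line))


sample = '<p class="x">Hello <b>world</b></p>'  # made-up input
opening, closing = count_tags(sample)
print("%d opening tags, %d closing tags" % (opening, closing))
```

The opening pattern does not allow "/" inside the brackets, so it skips closing tags and the two counts stay separate.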

ghostdog74
June 13th, 2009, 06:25 PM
I think (if I'm not wrong) the OP wants to count matching tags...
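
If matching pairs are what's wanted, here is one possible sketch using the standard library's HTMLParser (Python 3 import path) and a stack. The names TagMatcher and count_matched are made up for illustration, and this assumes a pair only counts when the closing tag matches the most recently opened one. Unclosed tags such as a bare <br> stay on the stack and will block matches for whatever was opened before them.

```python
from html.parser import HTMLParser  # Python 3 standard library


class TagMatcher(HTMLParser):
    """Counts properly nested open/close pairs using a stack.

    TagMatcher is a hypothetical name, not from the thread.
    """

    def __init__(self):
        super().__init__()
        self.stack = []    # tags opened but not yet closed
        self.matched = 0   # open/close pairs found so far

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Count a pair only when the close matches the latest open tag.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            self.matched += 1


def count_matched(line):
    """Return (matched pairs, tags still left open) for one line."""
    parser = TagMatcher()
    parser.feed(line)
    parser.close()
    return parser.matched, len(parser.stack)


print(count_matched("<p><b>hi</b></p>"))  # (2, 0)
```

For a line like "<p><b>hi</p>" the same function reports no matched pairs and two tags left open, because the </p> does not match the innermost <b>.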