Python regex problem (åäö) - swedish spec chars



wconstantine
July 20th, 2010, 01:06 PM
I'm trying to make my regex match a word with Swedish special characters in it.

This is a line from a file.

690382r3<räkneord>fyrtiofjärde

My regex looks like this:
p = re.compile(r"\w+<[a-z\såäö]+>", re.IGNORECASE)

Why doesn't it match? I suspect it has something to do with the special characters å, ä and ö.

surfer
July 20th, 2010, 01:13 PM
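#!/usr/bin/python
# -*- coding: utf-8 -*-

import re

swe = "690382r3<räkneord>fyrtiofjärde"

# Raw string, so the backslashes in \w and \s reach the regex engine unchanged.
p = re.compile(r"\w+<[a-z\såäö]+>", re.IGNORECASE)

res = p.search(swe)
# print(...) with a single argument behaves the same in Python 2 and 3.
print(res.group(0))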


...this works for me.

wconstantine
July 21st, 2010, 11:36 AM
You are quite right. I think the fault lies elsewhere in the code now. Thank you.

Hellkeepa
July 21st, 2010, 12:57 PM
HELLo!

Make sure that the line from the file and the script itself are both encoded in UTF-8. An encoding mismatch between the two is a common problem when using non-ASCII characters (code points above 127), like you've done.
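
Here's a rough sketch of what an encoding mismatch looks like in practice. It assumes Python 3 (where str is Unicode), and the names PATTERN and find_tag are just made up for this example, not from the original script. The same line is stored once as UTF-8 and once as Latin-1, then decoded before matching:

```python
import re

# Raw string; in Python 3 a str pattern is Unicode-aware by default.
PATTERN = re.compile(r"\w+<[a-z\såäö]+>", re.IGNORECASE)

def find_tag(raw_bytes, encoding="utf-8"):
    """Decode one line of input and return the matched text, or None."""
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising.
    line = raw_bytes.decode(encoding, errors="replace")
    match = PATTERN.search(line)
    return match.group(0) if match else None

line = "690382r3<räkneord>fyrtiofjärde"

# The same text as it might sit on disk in two different encodings.
utf8_bytes = line.encode("utf-8")
latin1_bytes = line.encode("latin-1")

print(find_tag(utf8_bytes))               # decoded with the right codec
print(find_tag(latin1_bytes))             # wrong codec: ä is lost, no match
print(find_tag(latin1_bytes, "latin-1"))  # right codec for these bytes
```

If the second call is the one that resembles your situation, the regex is probably fine and the file is simply being read with the wrong encoding.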

Happy codin'!