Python XML help



dbbolton
June 6th, 2008, 10:20 PM
I have an XML file containing this:



<entry name="font" mtime="1211869682" type="string">
<stringvalue>FONTNAME</stringvalue>
</entry>


How can I get Python to read the line in the middle into a string? I know how to get a line containing a search string, but the "stringvalue" element appears in the file several times, and "FONTNAME" changes. The only string I could search for is "font", but it occurs on the line before the one I want. Any suggestions?

[h2o]
June 6th, 2008, 10:58 PM
Take a look at one of the XML parsers that ship with Python. I have only tried "minidom" (http://docs.python.org/lib/module-xml.dom.minidom.html) but it works nicely.

Or you could iterate through the lines and find all lines that contain the '<entry name="font"' string and then extract the value from the next line:



extract_data = False
for line in lines:
    # Use "in" here: str.index raises ValueError when the text is absent
    if '<entry name="font"' in line:
        extract_data = True
    elif extract_data:
        # Extract the data from the tag using split or whatever
        # Then use the data in some way...
        extract_data = False
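
For the minidom route mentioned above, here is a minimal sketch, not a definitive implementation. The sample document is an assumption: the <gconf> root element and the "title" entry are made up for illustration, and get_entry_string is a hypothetical helper name. It looks up the <entry> by its "name" attribute and returns the text of its <stringvalue> child.

```python
from xml.dom import minidom

# Hypothetical sample; the <gconf> root and the "title" entry are assumed.
SAMPLE = """<?xml version="1.0"?>
<gconf>
  <entry name="title" mtime="1211869682" type="string">
    <stringvalue>Terminal</stringvalue>
  </entry>
  <entry name="font" mtime="1211869682" type="string">
    <stringvalue>FONTNAME</stringvalue>
  </entry>
</gconf>"""


def get_entry_string(xml_text, entry_name):
    """Return the <stringvalue> text of the named <entry>, or None."""
    doc = minidom.parseString(xml_text)
    for entry in doc.getElementsByTagName("entry"):
        if entry.getAttribute("name") != entry_name:
            continue
        values = entry.getElementsByTagName("stringvalue")
        # An empty <stringvalue/> has no text node, so guard firstChild.
        if values and values[0].firstChild is not None:
            return values[0].firstChild.data
    return None


font = get_entry_string(SAMPLE, "font")
print(font)
```

For a real file, minidom.parse(path) would take the place of parseString.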

pmasiar
June 6th, 2008, 11:17 PM
Do you have just this small snippet, or is it part of a big file?

Small texts with a fixed structure you can split on text literals; bigger or more flexible XML files are much harder. Don't try to parse big XML using re. Use ElementTree, or BeautifulSoup for HTML.
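
As a rough illustration of the ElementTree suggestion, here is a minimal sketch. The sample text (including the <gconf> root and the "title" entry) and the function name find_font are assumptions made for the example, not taken from the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the <gconf> root and the "title" entry are assumed.
SAMPLE = """<gconf>
  <entry name="title" mtime="1211869682" type="string">
    <stringvalue>Terminal</stringvalue>
  </entry>
  <entry name="font" mtime="1211869682" type="string">
    <stringvalue>FONTNAME</stringvalue>
  </entry>
</gconf>"""


def find_font(xml_text):
    """Return the <stringvalue> text of the entry named "font", or None."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nesting depth does not matter.
    for entry in root.iter("entry"):
        if entry.get("name") == "font":
            return entry.findtext("stringvalue")
    return None


print(find_font(SAMPLE))
```

To read from disk instead, ET.parse(path).getroot() would give the same kind of root element.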

nick_h
June 6th, 2008, 11:17 PM
You could use the expat (http://www.python.org/doc/current/lib/module-xml.parsers.expat.html) XML parser.

Here is a modified version of the example code:

import xml.parsers.expat

# Three handler functions; expat calls them as it walks the document
def start_element(name, attrs):
    print('Start element:', name, attrs)

def end_element(name):
    print('End element:', name)

def char_data(data):
    print('Character data:', repr(data))

p = xml.parsers.expat.ParserCreate()

p.StartElementHandler = start_element
p.EndElementHandler = end_element
p.CharacterDataHandler = char_data

# The second argument marks this as the final chunk of input
p.Parse("""<?xml version="1.0"?>
<entry name="font" mtime="1211869682" type="string">
<stringvalue>FONTNAME</stringvalue>
</entry>""", 1)
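
The handlers above only print what they see. To actually pick out the font value, one option is to keep a little state between callbacks. This is a sketch under assumptions: the FontFinder class and find_font_expat function are hypothetical names, and the sample document (with its <gconf> root and "title" entry) is invented for the example.

```python
import xml.parsers.expat

# Hypothetical sample; the <gconf> root and the "title" entry are assumed.
SAMPLE = """<?xml version="1.0"?>
<gconf>
  <entry name="title" mtime="1211869682" type="string">
    <stringvalue>Terminal</stringvalue>
  </entry>
  <entry name="font" mtime="1211869682" type="string">
    <stringvalue>FONTNAME</stringvalue>
  </entry>
</gconf>"""


class FontFinder(object):
    """Collects the <stringvalue> text inside <entry name="font">."""

    def __init__(self):
        self.in_font_entry = False
        self.in_value = False
        self.parts = []
        self.font = None

    def start_element(self, name, attrs):
        if name == "entry" and attrs.get("name") == "font":
            self.in_font_entry = True
        elif name == "stringvalue" and self.in_font_entry:
            self.in_value = True

    def end_element(self, name):
        if name == "stringvalue" and self.in_value:
            # Character data can arrive in several pieces, so join them.
            self.font = "".join(self.parts)
            self.in_value = False
        elif name == "entry":
            self.in_font_entry = False

    def char_data(self, data):
        if self.in_value:
            self.parts.append(data)


def find_font_expat(xml_text):
    finder = FontFinder()
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = finder.start_element
    parser.EndElementHandler = finder.end_element
    parser.CharacterDataHandler = finder.char_data
    parser.Parse(xml_text, True)
    return finder.font


print(find_font_expat(SAMPLE))
```

This is more code than the tree-based parsers need, but it never holds the whole document in memory, which may matter for very large files.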

dbbolton
June 7th, 2008, 01:21 AM
Here is what I've tried:


if exists(os.path.expanduser("~/.gconf/apps/gnome-terminal/profiles/Default/%25gconf.xml")):
    extract_data = False
    for line in open(os.path.expanduser("~/.gconf/apps/gnome-terminal/profiles/Default/%25gconf.xml")):
        if line.index('<entry name="font"') >= 0:
            extract_data = True
        elif extract_data:
            termfont = line
            break
else:
    termfont = "fghsrtfyhfgh"


termfont ends up as "fghsrtfyhfgh". What did I do wrong?
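
Two things may be worth checking in an attempt like the one above, offered as guesses rather than a diagnosis. First, str.index raises ValueError when the substring is absent (it never returns a negative number), so a ">= 0" test cannot take the false branch; the "in" operator or str.find avoids that. Second, GConf normally names its files %gconf.xml on disk, and %25 is the URL-encoded form of %, so the exists() check may be failing on the path, which would explain the fallback value. Below is a minimal sketch of the scan using "in". The sample lines and the helper name value_after_marker are hypothetical.

```python
def value_after_marker(lines, marker):
    """Return the element text on the line after the first line containing marker."""
    found = False
    for line in lines:
        if found:
            # Naive extraction: text between the first '>' and the next '<'.
            # Assumes the value sits on a single line, as in the snippet.
            return line.partition(">")[2].partition("<")[0]
        if marker in line:
            found = True
    return None


# Hypothetical sample lines; the "title" entry is made up for illustration.
SAMPLE_LINES = [
    '<entry name="title" mtime="1211869682" type="string">',
    '<stringvalue>Terminal</stringvalue>',
    '</entry>',
    '<entry name="font" mtime="1211869682" type="string">',
    '<stringvalue>FONTNAME</stringvalue>',
    '</entry>',
]

termfont = value_after_marker(SAMPLE_LINES, '<entry name="font"')
print(termfont)
```

Passing an open file object instead of the list works the same way, since the function only iterates over lines.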


Do you have just this small snippet, or is it part of a big file?

Small texts with a fixed structure you can split on text literals; bigger or more flexible XML files are much harder. Don't try to parse big XML using re. Use ElementTree, or BeautifulSoup for HTML.

It is a large gconf XML file.

days_of_ruin
June 7th, 2008, 01:38 AM
What exactly are you trying to do?
I have been working with gconf files lately, and I used the Python
gconf module.

>>> import gconf
>>> c = gconf.client_get_default()
>>> c.get_value("/desktop/gnome/interface/font_name")
'Liberation Sans 10'

dbbolton
June 7th, 2008, 01:45 AM
That is exactly what I want to do! I wasn't aware of the gconf module. Thank you very much :guitar:

pmasiar
June 7th, 2008, 03:20 AM
import gconf... reminds me of this cartoon:

http://imgs.xkcd.com/comics/new_pet.png

LaRoza
June 7th, 2008, 03:25 AM
import gconf... reminds me of this cartoon:


I thought it was a joke myself...

days_of_ruin
June 7th, 2008, 04:00 AM
I thought it was a joke myself...
Huh? What about that cartoon isn't a joke?

nick_h
June 7th, 2008, 12:21 PM
import gconf... reminds me of this cartoon:
which reminded me of this one:

http://imgs.xkcd.com/comics/python.png

pmasiar
June 8th, 2008, 04:39 AM
Huh? What about that cartoon isn't a joke?

The joke candidate was not the cartoon but "import gconf", which looked really close to the linked cartoons. Possibly you may want to upgrade your sense of humor :-)

LaRoza
June 8th, 2008, 04:41 AM
Huh? What about that cartoon isn't a joke?

I thought the initial "import gconf" was a parody on that.

days_of_ruin
June 8th, 2008, 05:05 AM
The joke candidate was not the cartoon but "import gconf", which looked really close to the linked cartoons. Possibly you may want to upgrade your sense of humor :-)

I wasn't making a joke when I posted "import gconf".
Considering that "gconf" isn't an English word or anything, I don't see
how that is comparable to "import soul".

LaRoza
June 8th, 2008, 05:07 AM
I wasn't making a joke when I posted "import gconf".
Considering that "gconf" isn't an English word or anything, I don't see
how that is comparable to "import soul".

Well, someone says they want to use Python to edit gconf, and it just so happens to have a module named "gconf" that is easily imported.

It is the same principle.

Wybiral
June 8th, 2008, 05:12 AM
I wasn't making a joke when I posted "import gconf".
Considering that "gconf" isn't an English word or anything, I don't see
how that is comparable to "import soul".

Well, it's funny because it's a cliché for Python.

Person #1 "I'm trying to write *thing* in python"

Person #2 "Oh, don't waste your time, just use: import *thing*"

Happens all the time :)

pmasiar
June 9th, 2008, 01:48 PM
Happens all the time :)

It even has its own name: "Guido's little time machine". It works like this:

You need some functionality. You post about it to a forum or mailing list. Guido reads it, jumps into his LTM, goes one or two versions of Python back, and adds the module you need, so by the time he is back, Python has that module or functionality by default (in core) or as a downloadable module.

Guido's LTM is one of the best features Python has :-)