python substring to instance [i should know this]



Hume's doona
December 23rd, 2009, 02:38 AM
I know I should know this. I have a large XML file to convert to plain text, and I just have a simple question about substrings. Given the example:


<greeting>hello world</greeting><greeting>hello mars</greeting><greeting>hello jupiter</greeting><greeting>hello saturn</greeting><greeting>hello neptune</greeting>

How do I extract the second substring between the same <greeting> tags?

For the first, I know I use:


y = str(x)[str(x).find("<greeting>") + len("<greeting>"):str(x).find("</greeting>")]


How do I then extract the second <greeting> to assign it to an instance of the object greeting()?

I feel really stoopid asking this :oops:

Thanks for any help and happy holidays.

ghostdog74
December 23rd, 2009, 02:51 AM
Split on "</greeting>", and the second element (index 1) will hold your second instance. It will still start with "<greeting>", so strip that off.
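
A minimal sketch of the split approach, assuming the input is a plain string shaped like the example in the question (the variable names here are only illustrative):

```python
x = ("<greeting>hello world</greeting><greeting>hello mars</greeting>"
     "<greeting>hello jupiter</greeting>")

# Split on the closing tag; each non-empty piece still starts with "<greeting>".
pieces = x.split("</greeting>")

# The last piece is an empty string because the input ends with a closing tag.
# Remove the leading opening tag from the second piece (index 1).
second = pieces[1].replace("<greeting>", "", 1)
print(second)
```

This only holds up while the tags are written exactly like this, with no attributes and no nesting.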

Can+~
December 23rd, 2009, 02:53 AM
Or use one of the many XML parsing modules available for Python.

But if it's a one-shot procedure, then I agree with ghostdog.
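
For the parser route, here is a rough sketch using xml.etree.ElementTree from the standard library. It assumes the greetings have no single enclosing element, as in the example above, so it wraps them in a made-up <root> tag before parsing; a real file with its own root element would not need that step.

```python
import xml.etree.ElementTree as ET

fragment = ("<greeting>hello world</greeting><greeting>hello mars</greeting>"
            "<greeting>hello jupiter</greeting>")

# Assumption: the snippet lacks a root element, so add a placeholder one.
root = ET.fromstring("<root>" + fragment + "</root>")

# Collect the text of every direct <greeting> child, in document order.
greetings = [g.text for g in root.findall("greeting")]
print(greetings[1])
```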

DaithiF
December 23rd, 2009, 09:39 AM
I would use a regex:

>>> import re
>>> example_string = "<greeting>hello world</greeting><greeting>hello mars</greeting><greeting>hello jupiter</greeting><greeting>hello saturn</greeting><greeting>hello neptune</greeting>"
>>> matches = re.findall(r'<greeting>(.*?)</greeting>', example_string)
>>> matches[1]
'hello mars'
>>> matches
['hello world', 'hello mars', 'hello jupiter', 'hello saturn', 'hello neptune']
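
To get from the matched strings to objects, something like the following might work. The Greeting class here is a hypothetical stand-in, since the thread never shows how the original greeting() class is defined or what its constructor takes.

```python
import re


class Greeting:
    """Hypothetical stand-in for the original poster's greeting() class."""

    def __init__(self, text):
        self.text = text


example_string = ("<greeting>hello world</greeting><greeting>hello mars</greeting>"
                  "<greeting>hello jupiter</greeting>")

# Non-greedy match so each <greeting>...</greeting> pair is captured separately.
texts = re.findall(r"<greeting>(.*?)</greeting>", example_string)

# Build one object per captured string.
greetings = [Greeting(t) for t in texts]
print(greetings[1].text)
```

Note that "." does not match newlines by default, so greetings that span several lines would need the re.DOTALL flag.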