Text Manipulation



Catsworth
April 20th, 2009, 09:20 AM
Hey Guys :)

I've got some HTML source that contains elements for a drop-down selection box on a page.

Each selectable option is delimited in the following fashion:


<option>This is Option 1</option>

At present all of the options are on one long line in the source. I would like to split them so that each option sits on its own line in a new text file called by_line.txt.

Anybody got any ideas where I could start with this?

Thanks :)

Catsworth
April 20th, 2009, 09:21 AM
Forgot to say, happy to do this either in shell or Python, whichever is easiest/best :)

ghostdog74
April 20th, 2009, 09:36 AM
# more file
<option>This is Option 1</option><option>This is Option 2</option><option>This is Option 3</option><option>This is Option 4</option>

# awk 'BEGIN{RS="</option>"} RT{print $0 RT}' file
<option>This is Option 1</option>
<option>This is Option 2</option>
<option>This is Option 3</option>
<option>This is Option 4</option>
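
Since Python was mentioned as an option too, here is a rough sketch of the same split using a regular expression. The function names split_options and write_by_line are made up for illustration, and it assumes the options are never nested and that the source is UTF-8 text. For messier HTML a real parser would be a safer bet than a regex.

```python
import re

# Non-greedy match so each <option>...</option> pair is captured on its own.
# DOTALL lets an option's text contain line breaks; IGNORECASE covers <OPTION>.
OPTION_RE = re.compile(r"<option\b[^>]*>.*?</option>", re.DOTALL | re.IGNORECASE)


def split_options(html):
    """Return every <option> element found in html as a list of strings."""
    return OPTION_RE.findall(html)


def write_by_line(src_path, dst_path="by_line.txt"):
    """Read src_path and write one <option> element per line to dst_path."""
    # Assumption: the input file is UTF-8 encoded.
    with open(src_path, encoding="utf-8") as src:
        options = split_options(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(option + "\n" for option in options)
    return options
```

Calling write_by_line("file") on the sample input above should leave the four options in by_line.txt, one per line. Anything in the source that is not inside an option element is simply dropped.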

Catsworth
April 20th, 2009, 09:48 AM
Awesome thanks :)