Thread: sed help

  1. #1
    Join Date
    Dec 2012
    Beans
    7

    sed help

    Hi, I am stuck on this and have spent hours trying to find an answer :/

    I am trying to use sed to strip away content in .html files. Sometimes the HTML files have each tag on its own line and sometimes not, as in the examples below:

    Example a.html:
    Code:
    <html>
    <head>
    <title>Example A</title>
    </head>
    <body style="styleA">
    <p>Some text
    </body>
    </html>
    Example b.html:
    Code:
    <html><head><title>Example B</title></head><body style="styleB"><p>Some text</body></html>
    With the following sed command I can add text (here a newline) after a tag, regardless of whether the tag is already followed by a newline:

    Code:
    tobbe@virtualbox:~/$ sed 's/<[\/]*body[^>]*>/&\
    /g' a.html
    <html>
    <head>
    <title>Example A</title>
    </head>
    <body style="styleA">
    
    
    <p>Some text
    </body>
    
    
    </html>
    tobbe@virtualbox:~/$ 
    tobbe@virtualbox:~/$ 
    tobbe@virtualbox:~/$ sed 's/<[\/]*body[^>]*>/&\
    /g' b.html
    <html><head><title>Example B</title></head><body style="styleB">
    <p>Some text</body>
    </html>
    BUT I only want to add the newline when the tag is NOT already followed by a newline. So I am trying the same command with the search criteria above plus "NOT a newline", which I thought was '^$', but then nothing changes, i.e.:

    Code:
    tobbe@virtualbox:~/$ sed 's/<[\/]*body[^>]*>^$/&\
    /g' b.html
    <html><head><title>Example B</title></head><body style="styleB"><p>Some text</body></html>

  2. #2
    Join Date
    Mar 2010
    Location
    India
    Beans
    8,171
    Distro
    Ubuntu 12.04 Precise Pangolin

    Re: sed help

    Thread moved to Programming Talk.
    ------------------------------------

    A nice reference for sed usage: http://www.grymoire.com/Unix/Sed.html
    I'm no expert, but with the help of that guide and some bash references I managed to do everything I wanted with a very complex mix of HTML files.

    Perhaps someone here can give you a better link or maybe even an elegant solution to what you want.
    Varun

  3. #3
    Join Date
    Jul 2007
    Location
    Poland
    Beans
    4,364
    Distro
    Ubuntu 10.04 Lucid Lynx

    Re: sed help

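    . is "not a newline": sed works on one line at a time with the trailing newline stripped, so . after the tag only matches when another character follows the tag on the same line. ^ and $ are anchors for the start and end of a line, not a negation of newline, which is why '^$' after the tag never matched.

    Try this: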
    Code:
    sed -r 's/(<[\/]*body[^>]*>)(.)/\1\n\2/g'
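    A quick way to check this against the two sample files from post #1. This is only a sketch: it assumes GNU sed (-r and \n in the replacement are GNU extensions), and the scratch directory and the fix_body helper name are made up for the demo.

```shell
# Scratch copies of the two sample files from post #1 (paths are hypothetical).
tmpdir=$(mktemp -d)
printf '%s\n' '<html>' '<head>' '<title>Example A</title>' '</head>' \
    '<body style="styleA">' '<p>Some text' '</body>' '</html>' > "$tmpdir/a.html"
printf '%s\n' '<html><head><title>Example B</title></head><body style="styleB"><p>Some text</body></html>' > "$tmpdir/b.html"

# fix_body is a hypothetical helper name. It adds a newline after <body ...>
# or </body> only when another character follows the tag on the same line.
# Assumes GNU sed: -r and \n in the replacement are GNU extensions.
fix_body() {
    sed -r 's/(<[\/]*body[^>]*>)(.)/\1\n\2/g' "$1"
}

out_a=$(fix_body "$tmpdir/a.html")   # tags already end their lines: unchanged
out_b=$(fix_body "$tmpdir/b.html")   # split after both body tags

printf '%s\n' "$out_a" "$out_b"
rm -r "$tmpdir"
```

    a.html should come out unchanged and b.html should be split into three lines. One caveat: the (.) consumes the character after the tag, so if two body tags sit directly next to each other (e.g. <body></body></html>), the second tag is not matched in the same pass.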
