
Thread: Grep problem in html file.

  1. #1
    Join Date
    Feb 2007
    Location
    Sebring, Florida USA
    Beans
    184

    Grep problem in html file.

    I have a grep routine for inserting some text into a large quantity of htm files. I am doing okay, with one exception. I don't seem to be able to grep and find the following string to insert some text just prior to it:

    <div class='header'>

    Here's a function that works to insert the contents of a text file just before the </head> in a group of htm files:

    Code:
    for file in *.htm ; do
    while read line
    do
      grep -q "</head>" <<<$line && cat jsinsert.txt >> "$file".tmp
      echo $line >> "$file".tmp
    done < "$file"
    done
    However, if I try to insert the contents of a file just before the <div class='header'> in each file (there is only one instance of it per file), it does not work. I wonder whether the single quotes in the string are the cause.

    Here's the target line I am searching for and my code that fails to find <div class='header'> in that line.
    Code:
    <div class="recipe"><img src="pics/13.jpg"><div class='header'><p class='Title'><span class='label'>Title:</span> Almond Puff Coffeecake</p>
    
    This is my routine that fails:
    
    for file in *.htm ; do
    while read line
    do
      grep -q "<div class='header'>" <<<$line && cat divname.txt >> "$file".tmp
      echo $line >> "$file".tmp
    done < "$file"
    done
    divname.txt contains a single line: <div id="print_div1">
    I get the temp file, but it is an exact duplicate of the original file. Nothing is inserted just before <div class='header'>.

    ?????

  2. #2
    Join Date
    May 2007
    Location
    Leeds, UK
    Beans
    1,675
    Distro
    Ubuntu

    Re: Grep problem in html file.

    I didn't really dig into your script but wouldn't it be easier to use sed?

    Code:
    sed -e "s/<div class='header'>/<div id=\"print_div1\"><div class='header'>/g"
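    The part between the first and second slash, <div class='header'>, is the matched string; the part between the second and third slash is the replacement. The 'g' means all occurrences on each line. Because this substitutes inside the line, it also works when the target sits in the middle of a line, as it does in your sample.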

    Lots of good examples of sed here ...

    http://www.grymoire.com/Unix/Sed.html#uh-0
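
    As a rough sketch, the sed command above could be run over every .htm file in a directory, writing each result to a .tmp copy the way the original loop does. The function name insert_print_div is hypothetical and not from this thread; it assumes the files are in a single directory and that the target string appears exactly as quoted.

```shell
#!/usr/bin/env bash
# Sketch: insert <div id="print_div1"> just before <div class='header'>
# in every .htm file of a directory, writing results to "<file>.tmp".
# insert_print_div is a hypothetical helper name, not from the thread.
insert_print_div() {
    local dir="${1:-.}"
    local file
    for file in "$dir"/*.htm; do
        # Skip the unexpanded glob when no .htm files exist.
        [ -e "$file" ] || continue
        sed -e "s/<div class='header'>/<div id=\"print_div1\"><div class='header'>/g" \
            "$file" > "$file.tmp"
    done
}

# Example call (the path is a placeholder):
# insert_print_div /path/to/htm/files
```

    The originals are left untouched, so the .tmp files can be compared with diff before anything is overwritten.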

  3. #3
    Join Date
    Jun 2006
    Location
    Antarctica
    Beans
    500
    Distro
    Kubuntu 12.04 Precise Pangolin

    Re: Grep problem in html file.

    Or an in-place replacement with perl (dangerous: test it on a spare file first):
    Code:
    perl -pi -w -e "s/<div class='header'>/<div id=\"print_div1\"><div class='header'>/;" FileName
    More generally you want to read this.
