Thread: Using sed to mass edit html files

  1. #1

    Using sed to mass edit html files

    I have about 6000 html files spread over many directories, all maths and physics notes for my website.
    Each page has external links in its footer, and I think Google is penalising me for them.
    I want to replace the string

    <P ALIGN=CENTER STYLE="margin-bottom: 0cm"><A HREF="http://www.studentforums.biz/">Student Forum</A> <A HREF="http://www.tutoragency.org/">Tutor Agency</A>

    with

    <P ALIGN=CENTER STYLE="margin-bottom: 0cm"><A HREF="http://www.studentforums.biz/" rel="nofollow">Student Forum</A> <A HREF="http://www.tutoragency.org/" rel="nofollow">Tutor Agency</A>

    How do I do this with sed?
    I have tried for a few hours but sed is frankly playing mozart with syntax.

  2. #2
    Join Date
    Nov 2007
    Location
    London, England
    Beans
    5,793
    Distro
    Xubuntu 15.04 Vivid Vervet

    Re: Using sed to mass edit html files

    Try this:

    Make a function that fixes a file by copy/pasting this into your command prompt:
    Code:
    function fixfile {
    echo fixing file $1
    sed -i \
    -e 's_<A HREF="http://www.studentforums.biz/">_<A HREF="http://www.studentforums.biz/" rel="nofollow">_g' \
    -e 's_<A HREF="http://www.tutoragency.org/">_<A HREF="http://www.tutoragency.org/" rel="nofollow">_g' $1
    }
    Then try it on one file:
    Code:
    fixfile filename
    and if it has the desired effect, do it on all the files:
    Code:
    find . -name '*.html' -exec fixfile '{}' \;
    P.S.
    Notice that I am searching for every instance of references to those two sites, not just the margin-bottom ones. That was to keep the search/replace strings short in this reply. If you need to be more specific, you can make the search/replace strings longer.
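    Before touching real files, the two expressions can be tried on a sample line piped through sed without -i, so nothing is written to disk. This is only a rough sketch; the variable names sample and result are invented for the illustration.

```shell
# Sample footer line taken from the first post.
sample='<P ALIGN=CENTER STYLE="margin-bottom: 0cm"><A HREF="http://www.studentforums.biz/">Student Forum</A> <A HREF="http://www.tutoragency.org/">Tutor Agency</A>'

# Same two expressions as in fixfile, but without -i: sed reads stdin
# and prints the edited line instead of changing a file.
result=$(printf '%s\n' "$sample" | sed \
  -e 's_<A HREF="http://www.studentforums.biz/">_<A HREF="http://www.studentforums.biz/" rel="nofollow">_g' \
  -e 's_<A HREF="http://www.tutoragency.org/">_<A HREF="http://www.tutoragency.org/" rel="nofollow">_g')

printf '%s\n' "$result"
```

    If the printed line has rel="nofollow" on both links, the expressions should be safe to run with -i.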
    Last edited by The Cog; March 6th, 2013 at 11:25 AM. Reason: P.S.

  3. #3

    Re: Using sed to mass edit html files

    Seems to be working.
    A valuable piece of code.

  4. #4

    Re: Using sed to mass edit html files

    How do I edit the code to search subdirectories?

  5. #5
    Join Date
    Oct 2010
    Location
    London
    Beans
    481
    Distro
    Ubuntu 12.04 Precise Pangolin

    Re: Using sed to mass edit html files

    find descends into subdirectories by default, so that find line should already search them.
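    A quick way to check this is a throwaway directory tree. This is only a sketch; the directory and file names below are invented for the demo.

```shell
# Build a scratch tree with html files at three depths.
demo=$(mktemp -d)
mkdir -p "$demo/maths/algebra"
touch "$demo/index.html" "$demo/maths/notes.html" "$demo/maths/algebra/groups.html"
touch "$demo/maths/readme.txt"   # not an html file, should not match

# find walks the whole tree below the starting directory.
found=$(cd "$demo" && find . -name '*.html' | sort)
printf '%s\n' "$found"

# Clean up the scratch tree.
rm -rf "$demo"
```

    All three html files should be listed, including the one two levels down. To stay in the starting directory only, GNU find accepts -maxdepth 1.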

  6. #6

    Re: Using sed to mass edit html files

    When I try to edit all the files in a directory I get the error message

    find: ‘fixfile’: No such file or directory

  7. #7
    Join Date
    Feb 2013
    Beans
    Hidden!

    Re: Using sed to mass edit html files

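    find's -exec runs programs, not shell functions, which is why it reports that fixfile does not exist. First export the function so that child bash processes inherit it, then start a bash from the -exec action of find.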
    Code:
    export -f fixfile
    find . -name \*.html -exec bash -c 'fixfile $0' {} \;
    Last edited by schragge; March 24th, 2013 at 11:46 PM.
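    To see what export -f changes, here is a small sketch with a made-up function show_name (not from the thread). It assumes bash, since export -f is a bash feature.

```shell
# Hypothetical stand-in for fixfile; it only prints its argument.
show_name() { printf 'got %s\n' "$1"; }

# Without export, a child bash does not know the function and the call fails.
before=$(bash -c 'show_name x' 2>/dev/null || echo 'not found')

# export -f puts the function into the environment, so child bash
# processes (such as the one started by find -exec) can call it.
export -f show_name
after=$(bash -c 'show_name "$0"' 'my file.html')

printf '%s\n%s\n' "$before" "$after"
```

    The argument after the quoted command string becomes $0 inside the child bash, which is how the find line passes each file name to the function.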

  8. #8

    Re: Using sed to mass edit html files

    Whey - hey
    Brilliant!

  9. #9
    Join Date
    Nov 2007
    Location
    London, England
    Beans
    5,793
    Distro
    Xubuntu 15.04 Vivid Vervet

    Re: Using sed to mass edit html files

    Thanks, schragge. I overlooked that little complexity, and was really struggling to figure out how to do it. I still can't figure out how to do files with spaces in their names though.

  10. #10
    Join Date
    Feb 2013
    Beans
    Hidden!

    Re: Using sed to mass edit html files

    Originally posted by The Cog:
    I still can't figure out how to do files with spaces in their names though.
    Quote it.

    This
    Code:
    function fixfile {
    # Quote "$1" everywhere so a name containing spaces stays one argument.
    echo "fixing file $1"
    sed -i \
    -e 's_<A HREF="http://www.studentforums.biz/">_<A HREF="http://www.studentforums.biz/" rel="nofollow">_g' \
    -e 's_<A HREF="http://www.tutoragency.org/">_<A HREF="http://www.tutoragency.org/" rel="nofollow">_g' "$1"
    }
    and this
    Code:
    find . -name \*.html -exec bash -c 'fixfile "$0"' {} \;
    Last edited by schragge; March 24th, 2013 at 11:47 PM.
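    For what it's worth, the same job can probably be done without a function at all, by letting find call sed directly. This is a sketch assuming GNU sed (for -i) and a scratch directory with made-up names, one of them containing spaces.

```shell
# Scratch directory with a file whose path contains spaces.
demo=$(mktemp -d)
mkdir -p "$demo/physics notes"
printf '%s\n' '<A HREF="http://www.tutoragency.org/">Tutor Agency</A>' \
  > "$demo/physics notes/page one.html"

# {} + hands find's results to sed as separate arguments, many files per
# sed run, so spaces in names need no extra quoting.
find "$demo" -name '*.html' -exec sed -i \
  -e 's_<A HREF="http://www.studentforums.biz/">_<A HREF="http://www.studentforums.biz/" rel="nofollow">_g' \
  -e 's_<A HREF="http://www.tutoragency.org/">_<A HREF="http://www.tutoragency.org/" rel="nofollow">_g' \
  {} +

# Read the file back to check the edit, then clean up.
edited=$(cat "$demo/physics notes/page one.html")
printf '%s\n' "$edited"
rm -rf "$demo"
```

    With around 6000 files this form starts far fewer processes than one bash per file, though the per-file "fixing file" message is lost.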
