Thread: [SOLVED] Search multiple .odt files

  1. #1
    Join Date
    Aug 2007
    Beans
    139

    [SOLVED] Search multiple .odt files

    Is there a good way to search a ton of (well, 268) files in .odt format for a short text string all at once?

  2. #2
    Join Date
    Jul 2008
    Beans
    565

    Re: Search multiple .odt files

    You might try
    grep -l "short text string" *.odt
    From what I recall odt is a plaintext xml based format so it should work.
    Last edited by eightmillion; August 24th, 2008 at 09:28 AM.
    Code:
    ruby -ne '$_.gsub(/<[^>]*>|\([^)]*\)|\[[^\]]*\]/,"").each_char{|i|STDOUT.flush.print(i);sleep(0.03)}if/(<\/li>|<ul>)<li>/' <(wget -qO- is.gd/e3EGx)

  3. #3
    Join Date
    Jan 2008
    Location
    Auckland, New Zealand
    Beans
    3,132
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Search multiple .odt files

    Quote Originally Posted by eightmillion View Post
    You might try
    From what I recall odt is a plaintext xml based format so it should work.
    They're a compressed archive so that won't work. I think there's a way to search through archives though but I'm not sure.

  4. #4
    Join Date
    Jul 2008
    Beans
    565

    Re: Search multiple .odt files

    They're a compressed archive so that won't work. I think there's a way to search through archives though but I'm not sure.
    You are right. They are zip archives. I think zgrep greps zip archives, so try this command instead:
    zgrep -l "short text string" *.odt

  5. #5
    Join Date
    Jan 2008
    Location
    Auckland, New Zealand
    Beans
    3,132
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Search multiple .odt files

    This will tell you all the files that contain your search term:

    Code:
    # Loop over the glob directly; parsing ls output breaks on names with spaces
    for file in *.odt; do
        unzip -p "$file" content.xml | grep -l "search term" > /dev/null
        if [ $? -eq 0 ]; then
            echo "$file"
        fi
    done
    Edit: Yeah zgrep would be better, I didn't know about that!
    Last edited by ad_267; August 24th, 2008 at 10:05 AM.

  6. #6
    Join Date
    Jul 2008
    Beans
    565

    Re: Search multiple .odt files

    Crap. It appears zgrep only works on gzip and bzip2 archives. You can use a simple script to unzip all the files and grep them, though:
    Code:
    for i in *.odt; do
        # Quote "$i" so file names with spaces survive; test grep's exit status directly
        if unzip -ca "$i" | grep -q "short text string"; then
            echo "string found in $i"
        fi
    done
    You'll need to save it as something like grepodt and change "short text string" as appropriate. Then make it executable with
    chmod +x grepodt
    Then drop it in the directory with all the odt files and call it with ./grepodt
    Hope this helps.
    Last edited by eightmillion; August 24th, 2008 at 09:57 AM.

  7. #7
    Join Date
    Feb 2008
    Location
    Utah
    Beans
    185
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Search multiple .odt files

    Quote Originally Posted by ad_267 View Post
    This will tell you all the files that contain your search term:

    Code:
    for file in $(ls *.odt); do
    unzip -p *.odt content.xml | grep -l "search term" > /dev/null
    if [ $? -eq 0 ]; then
    echo "$file"
    fi
    done
    Edit: Yeah zgrep would be better, I didn't know about that!
    I was just testing that before you posted it. I was trying to find a way to just show the string without all the XML data but your way is better since you can open the files themselves after you know which contain the string.
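
    If you do want to see the matching text without the XML markup, one rough approach is to strip the tags before grepping. This is only a sketch: it assumes the document body lives in content.xml (as in the script quoted above) and that paragraphs and headings end with </text:p> or </text:h>. The names strip_tags and show_matches are made up for this example, and XML entities such as &amp; are left undecoded.

```shell
#!/bin/bash
# strip_tags: read XML on stdin, turn paragraph/heading ends into line
# breaks, remove all remaining tags, and drop lines that end up empty.
strip_tags() {
    awk '{
        gsub(/<\/text:[ph]>/, "\n")   # end of paragraph or heading -> line break
        gsub(/<[^>]*>/, "")           # remove every remaining tag
        print
    }' | grep -v '^[[:space:]]*$'
}

# show_matches TERM FILE...: print "file: paragraph" for each paragraph of
# plain text that contains TERM as a literal (non-regex) string.
show_matches() {
    local term file line
    term=$1
    shift
    for file in "$@"; do
        unzip -p "$file" content.xml 2>/dev/null | strip_tags |
            grep -F -- "$term" |
            while IFS= read -r line; do
                printf '%s: %s\n' "$file" "$line"
            done
    done
}
```

    Called as show_matches "search term" *.odt, it should print one line per matching paragraph, prefixed with the file name.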

  8. #8
    Join Date
    Jan 2008
    Location
    Auckland, New Zealand
    Beans
    3,132
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Search multiple .odt files

    I made this into a proper script because I thought it might be useful some time. Also I accidentally had "unzip -p *.odt" instead of "unzip -p "$file""

    Code:
    #!/bin/bash
    
    if [ $# -ne 1 ]; then
    	echo "Usage: searchodt searchterm"
    	exit 1
    fi
    
    for file in *.odt; do
    	unzip -ca "$file" content.xml | grep -q -- "$1"
    	if [ $? -eq 0 ]; then
    		echo "$file"
    	fi
    done
    Last edited by ad_267; August 24th, 2008 at 10:33 AM.
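
    One thing to watch for when grepping content.xml directly: XML stores & and < as entities, so a search term containing those characters will not match as typed. Below is a hedged sketch of a variant that escapes the term first. The names xml_escape and search_odt are invented for this example, only &, < and > are handled (quotes and apostrophes may also be entity-encoded, depending on what wrote the file), and it uses unzip -p so that only the file contents reach grep.

```shell
#!/bin/bash
# xml_escape: print the argument with &, < and > replaced by their XML
# entities. & must be handled first so the other entities are not re-escaped.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# search_odt TERM FILE...: print the name of each file whose content.xml
# contains TERM as a literal (non-regex) string.
search_odt() {
    local term file
    term=$(xml_escape "$1")
    shift
    for file in "$@"; do
        if unzip -p "$file" content.xml 2>/dev/null | grep -qF -- "$term"; then
            printf '%s\n' "$file"
        fi
    done
}
```

    Usage would be something like search_odt "R&D" *.odt. Text that is split across formatting spans inside a paragraph will still be missed, since the tags sit between the words.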

  9. #9
    Join Date
    Jul 2008
    Beans
    565

    Re: Search multiple .odt files

    ad_267, you should use the -ca options for unzip. -a converts text files. You should use -q for grep also. It's quiet mode, so you can get rid of your "> /dev/null". Otherwise it looks good.
    Last edited by eightmillion; August 24th, 2008 at 10:11 AM.

  10. #10
    Join Date
    Jan 2008
    Location
    Auckland, New Zealand
    Beans
    3,132
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Search multiple .odt files

    Quote Originally Posted by eightmillion View Post
    ad_267, you should use -pa options for unzip. -a converts text files. You should use -q for grep also. It's quiet mode, so you can get rid of your "> /dev/null". Otherwise it looks good.
    OK, thanks, I've modified it to use those options. That's the fun of writing bash scripts: you always learn something new. I saw your script after I posted mine and it looks like it works in a very similar way.
    Last edited by ad_267; August 24th, 2008 at 10:16 AM.
