Page 1 of 2
Results 1 to 10 of 15

Thread: Mission impossible?

  1. #1
    Join Date
    Aug 2006
    Beans
    445

    Mission impossible?

    I have a bunch of files in a bunch of sub-directories which may or may not contain specific text.

    For instance D001, D002, D003 ... may be in the files.

    I need to output a list of which TEXT(s) exist. I don't want to know which file they are in, only that they exist.

    So I would look for "D0" in the directory and the output would be D001, D002, D003 etc.

    I have fiddled around with find and grep to no avail.

    Can anyone help?

    Much appreciation in anticipation

  2. #2
    Join Date
    Nov 2007
    Location
    London, England
    Beans
    7,704

    Re: Mission impossible?

    I assume they are all D followed by three digits. If so, maybe this would do the trick?
    Code:
    grep -R -o 'D[0-9][0-9][0-9]' | sort -u
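A minimal sketch of how that pipeline behaves, run against a throwaway directory. The file names and contents are made up for illustration, and `-h` is an addition here: when grep searches more than one file it prefixes each match with the file name, which would stop `sort -u` from collapsing duplicates found in different files.

```shell
# Build a small scratch tree (hypothetical file names and contents).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'part D001 and D002\n' > "$demo/a.txt"
printf 'again D002, then D003\n' > "$demo/sub/b.txt"

# -R recurses into sub-directories, -o prints only the matched text,
# -h suppresses the file-name prefix; sort -u removes repeated codes.
codes=$(grep -R -o -h 'D[0-9][0-9][0-9]' "$demo" | sort -u)
printf '%s\n' "$codes"

# Clean up the scratch tree.
rm -rf "$demo"
```

With these sample files the list contains D001, D002 and D003, each once, even though D002 appears in both files.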

  3. #3
    Join Date
    Aug 2006
    Beans
    445

    Re: Mission impossible?

    Thanks for your prompt reply.

    I guess I misled you. Sorry.

    The sought text could be any length (D0010, D00022, D00045, ...) BUT all starting with D00 ...

    So think:

    look for "star", output start, started, starlight ...

  4. #4
    Join Date
    Mar 2011
    Location
    U.K.
    Beans
    Hidden!
    Distro
    Ubuntu 22.04 Jammy Jellyfish

    Re: Mission impossible?

    I suggest ripgrep-all (https://itsfoss.com/ripgrep-all/), using wildcards.

    There is also a GUI, Recoll, but RipGrep is easier for me.

  5. #5
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: Mission impossible?

    Originally posted by The Cog:
    I assume they are all D followed by three digits. If so, maybe this would do the trick?
    Code:
    grep -R -o 'D[0-9][0-9][0-9]' | sort -u
    Originally posted by Langstracht:
    The sought text could be any length (D0010, D00022, D00045, ...) BUT all starting with D00 ...
    Modified:
    Code:
    grep -E 'D00*[0-9][0-9][0-9]' ./* | sort -u
    Note: Only one of 'very many ways'.

  6. #6
    Join Date
    Nov 2007
    Location
    London, England
    Beans
    7,704

    Re: Mission impossible?

    Both of these would match the given string definition. It's a very flexible definition we've been given.
    Code:
    grep -E -r -o 'D00.+' | sort -u
    grep -E -r -o 'D00[0-9]+' | sort -u
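To see how differently the two patterns behave, here is a small sketch on a single made-up line (the sample text is an assumption, not from the files in question). `.+` is greedy and runs to the end of the line, while `[0-9]+` stops at the first non-digit.

```shell
# One hypothetical input line holding two codes.
line='ref D0010 is older than D00022.'

# 'D00.+' swallows everything from the first D00 to the end of the line.
greedy=$(printf '%s\n' "$line" | grep -E -o 'D00.+')

# 'D00[0-9]+' stops at the first non-digit, so each code is reported separately.
digits=$(printf '%s\n' "$line" | grep -E -o 'D00[0-9]+')

printf 'greedy: %s\n' "$greedy"
printf 'digits:\n%s\n' "$digits"
```

The first pattern returns one long match starting at D0010; the second returns D0010 and D00022 on separate lines.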

  7. #7
    Join Date
    Aug 2006
    Beans
    445

    Re: Mission impossible?

    Your suggested commands "work" - thank you - but the output is not what I want/need.

    As I tried to explain earlier, I need the output to be just the sought-after text - D00111 or whatever. Not the complete line on which it occurs.

  8. #8
    Join Date
    Nov 2007
    Location
    London, England
    Beans
    7,704

    Re: Mission impossible?

    You haven't defined the "sought after text". It seems to start with "D00" but "could be any length". Actually, with your current definition, everything from "D00" to the end of the file is a valid result.

  9. #9
    Join Date
    Aug 2006
    Beans
    445

    Re: Mission impossible?

    I'm afraid I don't really understand the problem that you are having.

    But let me try to do this again.

    Any file in a folder may, or may not, contain the words star, startle, started, starlight.

    I would like to discover which of these words are in a file and output them - the words NOT the line they appear on.

    So I would command

    grep -r "star"

    and this WILL locate and output the line(s) on which any of these words appear.

    So how do I get it to just output the found words?

  10. #10
    Join Date
    Nov 2007
    Location
    London, England
    Beans
    7,704

    Re: Mission impossible?

    Define what constitutes the start and end of the search text. With words, we can perhaps assume that a non-alpha character (space, comma etc) marks the end of the word. With egrep, we could search for '\bstar\w*\b' except that grep thinks that \w (meaning letters in a word) includes underscore, so that stars_and_stripes would be seen as one word. That may or may not matter to you. Would "superstar" be a match for you?
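A quick sketch of those edge cases, assuming GNU grep (which supports `\b` and `\w`) and three made-up sample words. The `[[:alpha:]]` variant is one possible alternative if underscores should end a word; it is an illustration, not the only way to do it.

```shell
# Three hypothetical sample words, one per line.
sample=$(printf '%s\n' starlight superstar stars_and_stripes)

# \w also matches underscore, so stars_and_stripes comes back whole;
# superstar is skipped because there is no word boundary before "star".
words=$(printf '%s\n' "$sample" | grep -E -o '\bstar\w*\b')

# [[:alpha:]] matches letters only, so the match stops at the first underscore.
alpha=$(printf '%s\n' "$sample" | grep -E -o '\bstar[[:alpha:]]*')

printf 'with \\w:\n%s\n' "$words"
printf 'with [[:alpha:]]:\n%s\n' "$alpha"
```

Dropping the leading `\b` would let "superstar" contribute a match as well, if that is wanted.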

    But you have described wanting to search for "D00" followed by an indeterminate length something. Where does it start or end? Messing around, I found a file here that contains "MAP_RESERVED0080". Does that match? What about "D0080x994-w pop"?

    This will find all strings starting with D00 followed by one or more digits. If it sees "MAP_RESERVED0080xyz" it will output "D0080".
    Code:
    grep -r -E -o -h 'D00[0-9]+' | sort -u
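Here is a rough end-to-end sketch of that command on a scratch tree, with invented file names and contents. It also shows one possible tightening, a leading `\b` (a GNU grep extension), for the case where a hit inside a longer identifier such as MAP_RESERVED0080xyz is not wanted. `LC_ALL=C` is added only to make the sort order predictable.

```shell
# Scratch tree with hypothetical files; one line embeds D0080 in a longer name.
demo=$(mktemp -d)
mkdir -p "$demo/logs"
printf 'ids D0010 and D00022\n' > "$demo/one.txt"
printf 'D00022 again\n#define MAP_RESERVED0080xyz\n' > "$demo/logs/two.txt"

# As posted: every D00 followed by digits, wherever it occurs.
loose=$(grep -r -E -o -h 'D00[0-9]+' "$demo" | LC_ALL=C sort -u)

# Variant: require a word boundary before the D, which drops MAP_RESERVED0080xyz.
strict=$(grep -r -E -o -h '\bD00[0-9]+' "$demo" | LC_ALL=C sort -u)

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"

# Clean up the scratch tree.
rm -rf "$demo"
```

The loose list includes D0080 from inside the identifier; the strict list keeps only D00022 and D0010.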
