
Thread: Very particular deletions

  1. #1
    Join Date
    Jul 2010
    Beans
    9

    Very particular deletions

    I have large data files (32,000 lines) that I need to cut down. My problem is (1) I need to remove every row that contains "B" in column 22 and (2) I don't know anything about programming.
    How do I accomplish this?

    (If you can only solve problem 1 that's fine)

  2. #2
    Join Date
    Oct 2006
    Location
    Tucson, AZ
    Beans
    1,420
    Distro
    Xubuntu 10.04 Lucid Lynx

    Re: Very particular deletions

    Quote: Originally posted by ludwigvan1988
    I have large data files (32,000 lines) that I need to cut down. My problem is (1) I need to remove every row that contains "B" in column 22 and (2) I don't know anything about programming.
    How do I accomplish this?

    (If you can only solve problem 1 that's fine)
    1. In a terminal window:
    Code:
    egrep -v "^.{21}B" file.txt > newfile.txt
    where "file.txt" is the filename to be processed, and "newfile.txt" is the new file to be created without those lines.

    2. Take a class in programming

    For an explanation of what that command does: 'egrep' is a version of the 'grep' program that handles "extended regexes" (it is equivalent to running 'grep -E'). Don't worry about what these are for now - trust me - they're needed for this.

    'grep' is a utility program that searches through files to locate lines that contain certain things, as specified by a "pattern". In this case, we want everything *but* the lines that match, so we give it the "-v" option to invert the match (show only non-matching lines, rather than the default of showing only matching lines).

    The pattern "^.{21}B" breaks down like this: the "^" means "start of line". The "." says "match any single character". The "{21}" says "match the previous item exactly 21 times" (you can get the same results with "^.....................B", but the form I provided is a little easier to handle). The "B" says "match a literal character B". So in total: "match any line that has a B in the 22nd column" (that is, as its 22nd character).
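    If you want to see the pattern at work before touching real data, here is a minimal sketch on a made-up three-line file. The names "sample.txt" and "filtered.txt" and the sample lines are my own inventions for illustration, and it assumes "column 22" means the 22nd character of each line (not the 22nd tab- or comma-separated field).

```shell
# Build a tiny sample file. The leading digits act as a ruler, so the
# character right after "...8901" sits in column 22 on lines 1 and 2.
# Line 2 has "B" in column 22; line 3 has "B" only in column 5.
printf '%s\n' \
  '123456789012345678901Axyz' \
  '123456789012345678901Bxyz' \
  '1234B67890123456789012xyz' > sample.txt

# Without -v: show only the lines that DO have "B" in column 22.
grep -E '^.{21}B' sample.txt

# With -v: keep only the lines that do NOT have "B" in column 22.
grep -E -v '^.{21}B' sample.txt > filtered.txt

cat filtered.txt
```

    The first grep should show only the second line; "filtered.txt" should end up with the first and third lines. Note that the "B" in column 5 of the third line does not count, because the pattern only looks at column 22.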

    'grep' normally sends its output to stdout (usually the screen). The "> newfile.txt" tells it to take what would normally be displayed on the screen, and send it to a file called "newfile.txt" instead.
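    A quick way to convince yourself the redirect did what you wanted is to count lines before and after. This is only a sketch: "demo_in.txt" and "demo_out.txt" are hypothetical names standing in for your own files, and the three sample lines are made up.

```shell
# Hypothetical sample input; substitute your own data file.
printf '%s\n' \
  '123456789012345678901Akeep' \
  '123456789012345678901Bdrop' \
  '123456789012345678901Ckeep' > demo_in.txt

# ">" writes grep's output into demo_out.txt instead of the terminal,
# replacing any existing file of that name. demo_in.txt is left untouched.
grep -E -v '^.{21}B' demo_in.txt > demo_out.txt

# Sanity check: lines removed plus lines kept should equal the original count.
removed=$(grep -E -c '^.{21}B' demo_in.txt)   # -c counts matching lines
kept=$(wc -l < demo_out.txt)
total=$(wc -l < demo_in.txt)
echo "removed=$((removed)) kept=$((kept)) total=$((total))"
```

    Because the original file is never modified, you can always rerun the command if the result isn't what you expected.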

    Lloyd B.
    Don't tell me to get a life.
    I had one once.
    It sucked.
