Page 1 of 2
Results 1 to 10 of 11

Thread: Modifying lines in a text file

  1. #1
    Join Date
    Apr 2007
    Beans
    402
    Distro
    Ubuntu 10.04 Lucid Lynx

    Modifying lines in a text file

    Ok, so I've got a .csv file which was generated from an Excel document. It looks like this:

    Code:
    "field1"," field2 "," field3 "
    Now, unfortunately this is not exactly the format I want it in. The second and third fields seem to have a leading and trailing space, on ALMOST every line, which I want to remove. In addition, I want to add a number as the first field. So, the corrected line will look like this:

    Code:
    1,"field1","field2","field3"
    2,"field1a","field2a","field3a"
    etc.
    Now I know there's probably a million and one ways to do this. I was trying to fix this up in a hurry so I just whipped together a quick and (very!!) dirty C program to do it for me. I'm not too proud of it, but it was the best I could do at the time.

    I tried looking up documentation on tr, awk and sed, but alas I don't have nearly enough experience with any of them (read: none) to be able to do anything with them.

    I'm trying to get better with shell scripting, so I was hoping someone could give me a hand on how to do this with pure bash scripting (i.e. no Python, Ruby, or any other languages, please).

  2. #2
    Join Date
    Jul 2006
    Beans
    209

    Re: Modifying lines in a text file

    maybe there's a cleaner solution, but I think this works:
    Code:
    cat inputfile | sed -e 's/"[[:space:]]*\([^[:space:]"]*\)[[:space:]]*"/"\1"/g' | awk '//{printf("%d,%s\n",NR,$0)}'
    EDIT: there may be problems if there's a field that looks like
    '" field with spaces between words "'
    Last edited by croto; August 1st, 2009 at 11:06 PM.

  3. #3
    Join Date
    Apr 2009
    Location
    Germany
    Beans
    2,134
    Distro
    Ubuntu Development Release

    Re: Modifying lines in a text file

    should also work but even messier
    Code:
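    #!/bin/bash
    # Usage: ./script inputfile
    c=0
    while IFS= read -r line
    do
        c=$((c+1))
        trim_str=""
        # Split the line on commas (breaks if a field itself contains a comma)
        IFS=',' read -r -a toks <<< "$line"
        for tok in "${toks[@]}"
        do
            # Strip the surrounding quotes and any whitespace next to them
            tok=$(printf '%s\n' "$tok" | sed -e 's/^[[:space:]]*"[[:space:]]*//' -e 's/[[:space:]]*"[[:space:]]*$//')
            if [ -z "$trim_str" ]; then
                trim_str="\"$tok\""
            else
                trim_str="$trim_str,\"$tok\""
            fi
        done
        echo "$c,$trim_str"
    done < "$1" # input file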
    Last edited by MadCow108; August 1st, 2009 at 11:09 PM.

  4. #4
    Join Date
    Jul 2006
    Beans
    209

    Re: Modifying lines in a text file

    This solution seems to not have the problem my previous solution had
    Code:
    cat lala | sed -e 's/"[[:space:]]*/"/g' -e 's/[[:space:]]*"/"/g' | awk '//{printf("%d,%s\n",NR,$0)}'
    EDIT:
    even better:
    Code:
    cat lala | sed -e 's/[[:space:]]*"[[:space:]]*/"/g' | awk '//{printf("%d,%s\n",NR,$0)}'
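    To see the difference between the first pattern and the last one, here is a small sketch that runs both on a field with a space between words. The function names trim_strict and trim_loose are made up for illustration; the sed expressions are the ones posted in this thread.

```shell
# trim_strict / trim_loose are hypothetical names; the patterns come from the posts above.
trim_strict() {
    # Only matches fields that contain no whitespace between the words
    sed -e 's/"[[:space:]]*\([^[:space:]"]*\)[[:space:]]*"/"\1"/g'
}

trim_loose() {
    # Removes whitespace on either side of every double quote
    sed -e 's/[[:space:]]*"[[:space:]]*/"/g'
}

line='"field1"," fie ld2 "'
printf '%s\n' "$line" | trim_strict   # second field is left untrimmed
printf '%s\n' "$line" | trim_loose    # prints "field1","fie ld2"
```

    Note that the looser pattern also eats whitespace outside the quotes (e.g. around the commas), which is probably what you want here anyway.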
    Last edited by croto; August 1st, 2009 at 11:12 PM.

  5. #5
    Join Date
    Dec 2007
    Location
    Iowa
    Beans
    127
    Distro
    Ubuntu 9.04 Jaunty Jackalope

    Re: Modifying lines in a text file

    A simple awk script can do the job:

    Code:
    awk '{gsub(/ *" */, "\""); print FNR "," $0}' file
    Cheers,
    Dill

  6. #6
    Join Date
    Jun 2008
    Location
    California
    Beans
    2,271
    Distro
    Ubuntu 10.10 Maverick Meerkat

    Re: Modifying lines in a text file

    I'm just learning, but it works:

    Code:
    nl -s "," -w 1 file | sed 's/ "/"/g;s/" /"/g'
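    One thing to watch with nl: by default it only numbers non-empty lines, so if the file has blank lines the numbering will differ from the awk NR approach. A minimal sketch, assuming you want every line numbered (number_all is just a made-up wrapper name):

```shell
# number_all is a hypothetical wrapper name.
number_all() {
    # -b a   : number all lines, blank ones included
    # -s "," : put a comma between the number and the line
    # -w 1   : minimum number width of 1, so no left padding
    nl -b a -s "," -w 1 | sed 's/ "/"/g;s/" /"/g'
}

printf '%s\n' '"a"," b "' '' '"c"," d "' | number_all
```

    Without -b a the blank line in the middle would not get a number of its own.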

  7. #7
    Join Date
    Sep 2006
    Beans
    2,914

    Re: Modifying lines in a text file

    Code:
    # more file
    "field1"," field2 "," field3 "
    "field1"," fie ld2 "," fie ld3 "
    # awk -F"," '{for(i=1;i<=NF;i++){gsub(/^\" | \"/,"\"",$i)}{print NR,$0}}' OFS="," file
    1,"field1","field2","field3"
    2,"field1","fie ld2","fie ld3"

  8. #8
    Join Date
    Mar 2007
    Location
    Finland
    Beans
    256
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Modifying lines in a text file

    Very nice solutions, Dill and ghostdog! They show why awk is my favourite for this kind of thing. @Volt9000: note that none of the examples above is "pure bash"; in fact, awk is as much of a programming language as Python or Perl. Awk is generally present on *nix systems, so it doesn't really make a difference. However, you can't really do much with just bash (try installing just bash on Windows, and even "ls" won't work); you need to call the system commands. See the bash tutorial in ghostdog's signature for more info.
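    That said, this particular job (trim around the quotes, prepend a line number) can be sketched with bash builtins alone. This is only a rough sketch, assuming bash with the extglob option available; number_and_trim is a made-up name, and like the sed versions it does not understand commas or quotes inside a field.

```shell
#!/bin/bash
# number_and_trim is a hypothetical name; only bash builtins are used.
shopt -s extglob   # enables the *(...) "zero or more" pattern used below

number_and_trim() {
    local line n=0 q='"'
    # The "|| [ -n ... ]" part keeps a last line that has no trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        # Replace every quote, plus any whitespace touching it, with a bare quote
        line=${line//*([[:space:]])$q*([[:space:]])/$q}
        printf '%d,%s\n' "$n" "$line"
    done
}

number_and_trim <<'EOF'
"field1"," field2 "," field3 "
"field1a"," fie ld2a "," field3a "
EOF
```

    For a real file you would call it as number_and_trim < file. It will be slower than sed or awk on big files, so it is more of a learning exercise than a recommendation.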

  9. #9
    Join Date
    Apr 2007
    Beans
    402
    Distro
    Ubuntu 10.04 Lucid Lynx

    Re: Modifying lines in a text file

    Wow, talk about more than one way to skin a cat...
    Thanks, guys!

    Code:
    nl -s "," -w 1 file | sed 's/ "/"/g;s/" /"/g'
    Wow that's awesome, I didn't even know about that nl utility!

    note that none of the examples above is "pure bash", in fact awk is as much of a programming language as python or perl.
    Yes I suppose you're right...
    Although these solutions are much shorter (albeit more complex) than my little C program.

    Ok, I just realized something else: sometimes field3 will have embedded into it the following text:

    Code:
    &#xxx
    where xxx is a 3-digit number. This may appear any number of times (including none) in field 3. I need to add a semicolon after this. So

    Code:
    hello&#xxxworld&#yyyfoo
    will become

    Code:
    hello&#xxx;world&#yyy;foo
    I'm not sure how to do this because it's not just a straightforward replacement, since I have to search for the pound sign, and add the semicolon 3 characters after it's found.
    Last edited by Volt9000; August 2nd, 2009 at 11:30 AM.

  10. #10
    Join Date
    Jul 2006
    Beans
    209

    Re: Modifying lines in a text file

    It is a simple replacement, after all.
    Code:
    sed -e 's/[[:space:]]*"[[:space:]]*/"/g' -e 's/\(&#[0-9]\{3\}\)/\1;/g' file | awk '//{printf("%d,%s\n",NR,$0)}'
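    If some of the entities might already have their semicolon, a slightly more defensive variant could look like the sketch below. It assumes the entity is always "&#" followed by exactly three digits, as described above; add_semicolons is a made-up name.

```shell
# add_semicolons is a hypothetical name.
# Assumes every entity is "&#" followed by exactly three digits.
add_semicolons() {
    # The optional ";" is swallowed by the match and written back once,
    # so entities that already end in ";" are not doubled up.
    sed -E 's/(&#[0-9]{3});?/\1;/g'
}

printf '%s\n' 'hello&#123world&#456;foo' | add_semicolons
# prints hello&#123;world&#456;foo
```

    Running it a second time on its own output changes nothing, which is handy if the script gets run on the same file twice by accident.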
