October 11th, 2007, 02:11 PM
I have a large tab-delimited file and I would like to delete all the even-numbered fields (columns) because they contain redundant information. Is there a way to do this with shell scripting? I've thought about awk, but I don't know it very well.

Thanks a lot!

October 11th, 2007, 03:03 PM
in python:

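f = open("filename")
lines = f.read().splitlines()  # a file object has no split(), so read it first

for x in lines:
    # you might need to replace split("\t") with split() if the fields are not all separated by tabs
    thisline = x.split("\t")
    # indices 0, 2, 4, ... are fields 1, 3, 5, ...; the even-numbered fields are skipped
    kept = [thisline[y] for y in range(0, len(thisline), 2)]
    print("\t".join(kept))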

I'm sure someone here knows a better way to format output in columns (I wouldn't mind a quick example either).
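
One possible way to line output up in columns is to pad each field to the width of the widest entry in its column. This is only a sketch: the helper name format_columns and the gap parameter are made up for illustration, and it assumes all rows fit in memory and are printed in a fixed-width font.

```python
def format_columns(rows, gap=2):
    """Return one string per row, with fields padded so the columns line up."""
    ncols = max(len(row) for row in rows)
    widths = [0] * ncols
    # First pass: find the widest field in each column.
    for row in rows:
        for i, field in enumerate(row):
            widths[i] = max(widths[i], len(field))
    # Second pass: left-justify every field to its column width plus a gap.
    lines = []
    for row in rows:
        padded = [field.ljust(widths[i] + gap) for i, field in enumerate(row)]
        lines.append("".join(padded).rstrip())
    return lines


rows = [["id", "name"], ["1", "alice"], ["22", "bob"]]
for line in format_columns(rows):
    print(line)
```

Padding needs two passes over the data because the column widths are not known until every row has been seen, so for a very large file plain tab-separated output is probably the more practical choice.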

October 11th, 2007, 05:08 PM
awk -F'\t' '
{
    for (i = 1; i <= NF; i += 2)
        printf "%s%s", (i > 1 ? "\t" : ""), $i
    print ""
}
' "file"

in python:

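for line in open("file"):
    # split into fields first; slicing the raw string would step over characters, not columns
    print('\t'.join(line.rstrip('\n').split('\t')[0::2]))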

October 13th, 2007, 06:52 PM
Wow, very lean. I ended up using python but with somewhat longer code (using the csv module).
I had completely forgotten about stepping with slices. Guess I'll go fix my code... Thanks a lot!
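
For anyone curious, a csv-module version combined with the slice step might look roughly like the sketch below. It is not the code from the post above, which was never shown. The function name keep_odd_fields and the sample data are invented for illustration, and it assumes Python 3.

```python
import csv
import io


def keep_odd_fields(src, dst):
    """Copy tab-delimited rows from src to dst, keeping fields 1, 3, 5, ..."""
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row[0::2])  # the step-2 slice drops the even-numbered fields


# Small in-memory demonstration; real files should be opened with newline="".
sample = io.StringIO("a\t1\tb\t2\tc\n" "d\t3\te\t4\tf\n")
result = io.StringIO()
keep_odd_fields(sample, result)
print(result.getvalue(), end="")
```

The csv module only earns its extra lines if some fields may be quoted or contain embedded tabs; for plain tab-separated data the split-and-slice one-liner does the same job.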