View Full Version : [ubuntu] Split fields in a tab delimited file
ashkot
August 8th, 2012, 08:34 PM
Hi all,
I have a tab delimited file with 8 columns, in one of the column i have data such as
1:1000
2:2000
3:3000
What i want to do is to split the entire file such that a new column is created as follows
1 [tab] 1000
2 [tab] 2000
How can this be done?
Thanks in advance.
A
Lars Noodén
August 8th, 2012, 09:20 PM
Would something like this work?
awk 'FS="[:\t]"{printf "%s\t%s\t%s\n", $1, $2, $3}' /tmp/x.csv
You'd have to pad it out for the appropriate number of columns.
drmrgd
August 8th, 2012, 09:34 PM
You can also use the Output Field Separator (OFS) function of awk to put tabs in. This should work as well:
awk -F':' 'BEGIN{OFS="\t"} {print $1,$2}' test.txt
SeijiSensei
August 8th, 2012, 09:39 PM
I usually handle problems like this in an editor. I use jed (http://www.jedsoft.org/jed/) in Emacs mode and can enter the tab character as ^Q^I. So I would just do a global search and replace for ":" and replace it with tab sequence. Works for me.
I used sed for a variety of things, but it's not always as easy as using a full-screen text-mode editor.
Lars Noodén
August 8th, 2012, 09:41 PM
The first example I gave would give a problem if the colon appears elsewhere in a column. This restricts the substitution to one column. Swap $1 for the number of the column to be split:
awk '{sub(/:/,"\t",$1);printf "%s\t%s\t%s\n", $1, $2, $3}' /tmp/x.csv
steeldriver
August 8th, 2012, 09:46 PM
or you could even use 'tr' I think
if you want to extract just one column (I may be reading that wrong) then something like
cut -fn file | tr ':' '\t'where n is the index of the column containing the 1:1000 data
If I *am* reading it wrong, and you just want to split the data column, then I would use awk as well - to be absolutely safe (in case there are : characters elsewhere) you could do something like
awk 'BEGIN{OFS="\t"} { split($5,a,":"); print $1,$2,$3,$4,a[1],a[2],$6,$7,$8; }' (adjust the $5 as appropriate)
papibe
August 8th, 2012, 09:48 PM
Hi ashkot.
Another approach is to replace the delimiter ':' for 'tab'. This may work for you if the delimiter is not used in another field:
sed 's/:/\t/g' file.txt
Regards.
ashkot
August 9th, 2012, 09:23 AM
This worked the best for me, thank you all again.
ashkot
or you could even use 'tr' I think
if you want to extract just one column (I may be reading that wrong) then something like
cut -fn file | tr ':' '\t'where n is the index of the column containing the 1:1000 data
If I *am* reading it wrong, and you just want to split the data column, then I would use awk as well - to be absolutely safe (in case there are : characters elsewhere) you could do something like
awk 'BEGIN{OFS="\t"} { split($5,a,":"); print $1,$2,$3,$4,a[1],a[2],$6,$7,$8; }' (adjust the $5 as appropriate)
Powered by vBulletin® Version 4.2.2 Copyright © 2024 vBulletin Solutions, Inc. All rights reserved.