The parts of the line that are identical don't really matter. What matters is the part that needs to be different, plus some method of locating that spot that is 100% reliable on every line.
sed works on one line at a time. So does awk, which is a little more powerful.
Code:
sed -e 's/transcript_name/transcript_id/g' inputfile > output
is how you replace things. If you want to insert things, then you can match on transcript_name and write it back with the new text in front of it, something like this:
Code:
sed -e 's/transcript_name/transcript_id "transcript:AAS13770"; transcript_name/g' inputfile > output
To perform other changes when the pattern isn't 100% consistent, you can run another sed command and pass the output of the first into the second (via a pipe), add another -e 's///g' expression to the same command, or use tools that work on column positions or have better grouping capabilities, like ruby, python or perl.
For heavy-duty text processing, another scripting language like perl would be my choice. Perl's split function will chunk a line on any delimiter you like (whitespace is the default) and easily store the pieces in an array.
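Here is a minimal sketch of the first two options side by side. The sample line and the second substitution (gene_name to gene_label) are made up for illustration; only the transcript_name one comes from the examples above.

Code:
```shell
# Hypothetical sample line; only transcript_name comes from the thread.
sample='gene_id "G1"; transcript_name "T1"; gene_name "N1";'

# Option 1: two sed processes connected by a pipe.
piped=$(printf '%s\n' "$sample" \
  | sed -e 's/transcript_name/transcript_id/g' \
  | sed -e 's/gene_name/gene_label/g')

# Option 2: one sed process with two -e expressions, applied in order.
combined=$(printf '%s\n' "$sample" \
  | sed -e 's/transcript_name/transcript_id/g' -e 's/gene_name/gene_label/g')

printf '%s\n' "$piped"
printf '%s\n' "$combined"
```

Both should print the same line. The single-process form saves a fork, but the piped form is easier to build up and debug one step at a time.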
Code:
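#!/usr/bin/env perl
use strict;
use warnings;

while (<>) {
    chomp;
    # Split the line on runs of spaces; fields are numbered from 0.
    my @line = split(/ +/, $_);
    # Parenthesize join so "\n" is printed after the joined line, not joined into it.
    print join(' ', @line), "\n";
    # Append a marker to the ninth field (index 8).
    $line[8] .= "-foo";
    print "8: ", $line[8], "\n";
}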
@line is an array, so $line[8] should hold the 'transcript_name' chunk (indexes start at 0, so that is the ninth piece). If you modify just that element before printing it out, you can make it say anything you like. See above. There are 50 other ways to handle this too. Arrays and splitting is just one, and probably not even the best. If all you care about is the output line, then I wouldn't bother assigning anything inside the array. Let the 'print' handle what you need.
Run this with ./filename.pl inputfile (after chmod +x filename.pl) or with perl filename.pl inputfile. Output goes to stdout, so it can be used as a filter. If you feed another program's stdout into the perl script or sed as stdin, then it looks like this:
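Since awk was mentioned earlier, here is a rough equivalent of the same field edit as a one-liner. This is only a sketch: the sample line is invented, and awk numbers fields from 1, so $9 corresponds to perl's $line[8].

Code:
```shell
# Hypothetical input: ten space-separated fields, the ninth is the one to change.
sample='a b c d e f g h transcript_name j'

# Assigning to $9 makes awk rebuild the line with single spaces between fields.
result=$(printf '%s\n' "$sample" | awk '{ $9 = $9 "-foo"; print }')

printf '%s\n' "$result"
```

Be aware that assigning to a field makes awk rejoin the whole line with its output separator, so runs of spaces or tabs in the input come out as single spaces. If the original spacing matters, the sed approach is safer.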
Code:
cat file | ./filename.pl | sed -e "s/whatever/something new/g" | more
Filters that use stdin and write to stdout are very powerful. sed, grep, cut, join and 150 other Unix tools work that way.
Code:
du -h * | sort -hr |more
That's for lurkers.
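For anyone who wants to try the filter idea on a small scale, here is a minimal sketch. The shout function and the sample input are made up; the point is only that anything reading stdin and writing stdout can sit in the middle of a pipeline next to grep, cut and sort.

Code:
```shell
# Hypothetical helper: a filter that upper-cases whatever arrives on stdin.
shout() {
  tr '[:lower:]' '[:upper:]'
}

# Keep lines containing "gene", take the first field, upper-case it,
# then sort and drop duplicates.
out=$(printf 'gene_b x\nother y\ngene_a z\ngene_b w\n' \
  | grep 'gene' | cut -d ' ' -f 1 | shout | sort -u)

printf '%s\n' "$out"
```

Each stage does one small job, and you can swap any of them out without touching the others.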