Tonight I was working on a problem where I have pairs of similar files named something like 'file_n_something_x.xls' and 'file_n_something_x.vcf'. I was trying to extract a subset of the data from each and compare the subsets to find out whether the data was unique to one file or the other. Because the data is similar but not identical, and I had a few other things I needed to do, I wrote a quick Perl script to compare the two. The trouble is that I have around 100 of these pairs to compare side by side, and I don't know the best way to iterate over them and pass each pair into the Perl script. I ended up loading all of the xls files into one array and the vcf files into another:
Code:
$ declare -a vcf=(*.vcf)
$ declare -a xls=(*.xls)
Then I just iterated over the two arrays, passing them into the script:
Code:
$ for (( i=0; i<${#xls[@]}; i++ )); do perl vcf_comparison.pl "${xls[i]}" "${vcf[i]}"; done
I figured that was fairly safe, since both arrays should be loaded in asciibetical order and I know all the files really do come in pairs, and it did seem to work. But this doesn't feel like the right way to do it, and I'm wondering what is. I probably could have read all the files into an array inside the Perl script and matched them up explicitly before processing. Maybe that's the only way?
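One variant I considered (just a sketch, assuming bash and that every .xls really does have a like-named .vcf next to it) was to loop over one glob and derive the second filename from the first, instead of trusting two separate globs to sort identically:
Code:
```shell
# Iterate the .xls files and build each matching .vcf name from the
# basename, so the pairing never depends on two arrays lining up.
for x in *.xls; do
    v="${x%.xls}.vcf"          # swap the .xls extension for .vcf
    if [[ -f "$v" ]]; then
        perl vcf_comparison.pl "$x" "$v"
    else
        echo "no matching vcf for $x" >&2   # surface any unpaired file
    fi
done
```
That way an unpaired or oddly named file gets reported instead of silently shifting every later pair by one.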
What's the better way to do this, or is what I did the right way?