JustRandomName
February 17th, 2006, 05:01 AM
Hi,
I am trying to sort a file, but due to my lack of knowledge of sort I have had to use hacky code. I have tried the man pages and various Google search queries, but I still can't seem to fathom the sort program.
What I want to do:
I have a file which has the following data:
===
<first name> <surname> <misc data>
OR
<id> <misc data>
===
i.e. if there is no data about the name we use the ID.
I want to be able to sort the file by:
<surname> <first name> <id>
In pseudo code it might be something like this:
if (name) then
    sort by <surname> (MAJOR KEY)
    sort by <first name> (MINOR KEY)
else
    sort by <id>
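The pseudo code above could be sketched roughly as a decorate, sort, undecorate pipeline. This is only a sketch under assumptions that are not in the original data description: an <id> consists of digits only, fields are separated by blanks, and `sort_names_then_ids` is a made-up function name. It also leans on awk for the decoration step, so it is not a pure sort solution.

```shell
# Sketch: name lines first (by surname, then first name), id lines last.
# Assumption: an <id> is all digits; sort_names_then_ids is a hypothetical name.
sort_names_then_ids() {
    # Decorate: prepend "class<TAB>key1<TAB>key2<TAB>" to every line.
    #   class 0 = name line (key1 = surname, key2 = first name)
    #   class 1 = id line   (key1 = id,      key2 = empty)
    awk -v OFS='\t' '{
        if ($1 ~ /^[0-9]+$/) print 1, $1, "", $0
        else                 print 0, $2, $1, $0
    }' "$@" |
    # Sort: class first, then key1 numerically (non-numeric surnames all
    # count as 0, so this only orders the ids), then key1 and key2 as text.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1n -k2,2n -k2,2 -k3,3 |
    # Undecorate: drop the three helper fields, keep the original line.
    cut -f4-
}
```

It could be called as `sort_names_then_ids myfile > mySortedFile`, or fed from a pipe.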
I tried `sort -k 2 myfile' and this *seems* to do what I would like: order by surname and, if there are two identical surnames, order by surname, then first name.
The problem lies with the <id>: because it is numerical, it seems to take precedence
e.g. instead of:
Debbie Andrews <misc>
John Smith <misc>
Steve Smith <misc>
456787 <misc>
456788 <misc>
I am getting:
456787 <misc>
456788 <misc>
Debbie Andrews <misc>
John Smith <misc>
Steve Smith <misc>
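The unwanted order can be reproduced with the five sample lines. A minimal sketch, assuming the C locale (so plain byte order decides) and the literal text `<misc>` standing in for the real data; `sample_lines` is a hypothetical helper name.

```shell
# Reproduce the unwanted order with the sample lines (C locale assumed).
sample_lines() {
    printf '%s\n' \
        'Debbie Andrews <misc>' \
        'John Smith <misc>' \
        'Steve Smith <misc>' \
        '456787 <misc>' \
        '456788 <misc>'
}

# -k 2 means "the key runs from field 2 to the end of the line".
# On an <id> line field 2 is already the misc data, so those lines are
# compared by their misc text rather than by a surname.
sample_lines | LC_ALL=C sort -k 2
```

With this input the two <id> lines come out first, because `<` sorts before the letters in byte order. With real misc data the <id> lines may land elsewhere.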
The current solution I have is very time consuming and consists of:
awk '$1 ~ /^[0-9]+$/' myfile > tempfile1    # get everything beginning with <id> (assumes an <id> is all digits)
awk '$1 !~ /^[0-9]+$/' myfile > tempfile2   # get everything beginning with <text>
sort -k 2 tempfile2 > mySortedFile
sort tempfile1 >> mySortedFile
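The same two-pass idea can be written without temporary files. A rough sketch, assuming every <id> line starts with a digit and no name line does; `two_pass_sort` is a hypothetical name, and grep stands in for the awk filters.

```shell
# Two passes over the same file, output concatenated: names first, ids last.
# Assumption: <id> lines start with a digit, name lines do not.
two_pass_sort() {
    # Pass 1: name lines, ordered by surname (field 2), then first name (field 1).
    grep -v '^[0-9]' "$1" | LC_ALL=C sort -k 2,2 -k 1,1
    # Pass 2: id lines, ordered numerically on the leading id.
    grep '^[0-9]' "$1" | LC_ALL=C sort -n
}
```

It could be called as `two_pass_sort myfile > mySortedFile`. It still reads the file twice, so it tidies the workaround but does not remove the extra pass.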
Any advice, or links to a decent sort tutorial (other than man pages or implementations using awk), would be appreciated.