[SOLVED] nawk script tweak



onytz
October 23rd, 2008, 09:54 PM
Hello all,
I have a nawk script that gives me almost the results I want.
I call it from the command line with:


nawk -f xmlSR.awk args.xml

args.xml is:


<figure><graphic url="foo1.bar"/></figure>:<figure rend="big"><graphic url="foo1.bar"/></figure>
<figure><graphic url="foo2.bar"/></figure>:<figure rend="big"><graphic url="foo2.bar"/></figure>

xmlSR.awk is:


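BEGIN {
    FS = ":"
    n = 1
    # Read all of source.xml into line[1..n-1]
    while ((getline < "source.xml") > 0)
        line[n++] = $0
}
# Runs once per search:replace line of args.xml
{
    for (i = 1; i < n; i++) {
        temp = line[i]
        gsub($1, $2, temp)
        print temp
    }
}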
and source.xml is:


<root>
<parent><child/></parent>
<parent><figure><graphic url="foo1.bar"/></figure></parent>
<parent><figure><graphic url="foo2.bar"/></figure></parent>
<parent><figure><graphic url="foo3.bar"/></figure></parent>
<parent><figure><graphic url="foo4.bar"/></figure></parent>
</root>

When I invoke the script, I get the following output:


<root>
<parent><child/></parent>
<parent><figure rend="big"><graphic url="foo1.bar"/></figure></parent>
<parent><figure><graphic url="foo2.bar"/></figure></parent>
<parent><figure><graphic url="foo3.bar"/></figure></parent>
<parent><figure><graphic url="foo4.bar"/></figure></parent>
</root>
<root>
<parent><child/></parent>
<parent><figure><graphic url="foo1.bar"/></figure></parent>
<parent><figure rend="big"><graphic url="foo2.bar"/></figure></parent>
<parent><figure><graphic url="foo3.bar"/></figure></parent>
<parent><figure><graphic url="foo4.bar"/></figure></parent>
</root>

But all I want is a single root element that contains both modified figure elements, like so:


<root>
<parent><child/></parent>
<parent><figure rend="big"><graphic url="foo1.bar"/></figure></parent>
<parent><figure rend="big"><graphic url="foo2.bar"/></figure></parent>
<parent><figure><graphic url="foo3.bar"/></figure></parent>
<parent><figure><graphic url="foo4.bar"/></figure></parent>
</root>

In English, I want to do a single search and replace on source.xml using all the colon-separated fields in args.xml. Any suggestions? Thanks in advance!
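
One way to get the single pass described above is to swap the roles of the two files: load every pair from args.xml first, then read source.xml as the main input and apply all pairs to each line before printing it once. The following is only a sketch under some assumptions: it uses plain awk (nawk should accept the same program), the variable names pairs, from and to are made up for the example, and the demo files are written to a scratch directory.

```shell
dir=$(mktemp -d)   # scratch directory for the demo files (assumption)

cat > "$dir/args.xml" <<'EOF'
<figure><graphic url="foo1.bar"/></figure>:<figure rend="big"><graphic url="foo1.bar"/></figure>
<figure><graphic url="foo2.bar"/></figure>:<figure rend="big"><graphic url="foo2.bar"/></figure>
EOF

cat > "$dir/source.xml" <<'EOF'
<root>
<parent><child/></parent>
<parent><figure><graphic url="foo1.bar"/></figure></parent>
<parent><figure><graphic url="foo2.bar"/></figure></parent>
<parent><figure><graphic url="foo3.bar"/></figure></parent>
<parent><figure><graphic url="foo4.bar"/></figure></parent>
</root>
EOF

out=$(awk -v pairs="$dir/args.xml" '
BEGIN {
    # Load every search:replace pair once, before any source line is read.
    while ((getline line < pairs) > 0) {
        split(line, f, ":")
        from[++n] = f[1]
        to[n] = f[2]
    }
    close(pairs)
}
{
    # Apply all pairs to the current source line, then print it exactly once.
    for (i = 1; i <= n; i++)
        gsub(from[i], to[i])
    print
}' "$dir/source.xml")

printf '%s\n' "$out"
rm -rf "$dir"
```

Because the loop over the pairs now sits inside the per-line action, each line of source.xml is printed once no matter how many pairs args.xml holds. As in the original script, gsub still treats the search text as a regular expression.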

ghostdog74
October 24th, 2008, 01:53 AM
awk 'BEGIN{FS=":"}
# First file (args.xml): map each url value to its replacement figure element
FNR==NR{
    org=$2
    sub(/.*url=\"/,"",$2)
    sub(/\"\/><\/figure>/,"",$2)
    _[$2]=org
    next
}
# Second file (source.xml): look up the url value of each figure line
/url=/{
    o=$0
    sub(/.*url=\"/,"")
    sub(/\"\/><\/figure>.*$/,"")
    if ( $0 in _ ){
        # the stored replacement already starts with its own <figure ...> tag
        print "<parent>" _[$0] "</parent>"
    }else{
        print o
    }
    next
}1
' args.xml source.xml

output:


# ./test.sh
<root>
<parent><child/></parent>
<parent><figure rend="big"><graphic url="foo1.bar"/></figure></parent>
<parent><figure rend="big"><graphic url="foo2.bar"/></figure></parent>
<parent><figure><graphic url="foo3.bar"/></figure></parent>
<parent><figure><graphic url="foo4.bar"/></figure></parent>
</root>
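
The answer above keys on the url value and rebuilds the line with a fixed <parent> wrapper. If the pairs in args.xml should instead be treated as literal text (so that the "." in foo1.bar is not a regex wildcard and an "&" in a replacement is not expanded), a plain string replace built from index() and substr() is one option. This is a rough sketch, not a drop-in: replace_all, from and to are hypothetical names, and it assumes each line of args.xml contains exactly one colon.

```shell
dir=$(mktemp -d)   # scratch directory for the demo files (assumption)

cat > "$dir/args.xml" <<'EOF'
<figure><graphic url="foo1.bar"/></figure>:<figure rend="big"><graphic url="foo1.bar"/></figure>
<figure><graphic url="foo2.bar"/></figure>:<figure rend="big"><graphic url="foo2.bar"/></figure>
EOF

cat > "$dir/source.xml" <<'EOF'
<root>
<parent><child/></parent>
<parent><figure><graphic url="foo1.bar"/></figure></parent>
<parent><figure><graphic url="foo2.bar"/></figure></parent>
<parent><figure><graphic url="foo3.bar"/></figure></parent>
<parent><figure><graphic url="foo4.bar"/></figure></parent>
</root>
EOF

out=$(awk '
# Replace every literal occurrence of from in s with to (no regex involved).
function replace_all(s, from, to,    res, p) {
    if (from == "") return s
    res = ""
    while ((p = index(s, from)) > 0) {
        res = res substr(s, 1, p - 1) to
        s = substr(s, p + length(from))
    }
    return res s
}
BEGIN { FS = ":" }
# First file (args.xml): remember each search:replace pair.
FNR == NR {
    from[++n] = $1
    to[n] = $2
    next
}
# Second file (source.xml): apply every pair, then print the line once.
{
    line = $0
    for (i = 1; i <= n; i++)
        line = replace_all(line, from[i], to[i])
    print line
}' "$dir/args.xml" "$dir/source.xml")

printf '%s\n' "$out"
rm -rf "$dir"
```

Since the whole search string is replaced wherever it occurs, this version does not depend on the surrounding <parent> element.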

onytz
October 24th, 2008, 03:22 PM
Thanks!