With pattern-matching problems it's often just as important to consider what you don't want to match. Remember, you are trying to find the minimal set of attributes that uniquely defines the thing you want to extract. And what's the overall structure of the log? Are blocks separated by blank lines?
Since you have two quite distinct things, you might want to consider separating them into their own files for further processing. For example, you could do a multiline match like
Code:
pcregrep -nM '^.*\d{1,2}/\d{1,2}/\d{1,2}.*(\nstatus.*)+' yourlog
(I've used pcregrep, which is available from the repository; you may be able to use grep in PCRE mode, i.e. -P, with the -z option, but I couldn't make the -n switch work in that case.) This will give you output like
Code:
17:Process run started on 7/24/14 9:23AM
status t8 succeed
status t9 succeed
status t10 succeed
status t11 failure
status t12 succeed
32:Process run started on 7/24/14 9:23AM
status t8 succeed
status t9 succeed
status t10 succeed
status t11 failure
status t12 succeed
i.e. it extracts each of the "type II" blocks, prefixed with its starting line number in the original file. You could then process these further to find the 'failure' lines:
Code:
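# Replace each block's header line with a blank line plus its line number
pcregrep -nM '^.*\d{1,2}/\d{1,2}/\d{1,2}.*(\nstatus.*)+' mylog | sed -r 's/(^[0-9]+):.*/\n\1/' > mylogII
# Paragraph mode: $1 is the block's starting line number, $2..$NF are its status lines
awk -v RS= -F'\n' '{n=$1; for(i=2;i<=NF;i++) {if ($i ~ /failure/) print n+i-1;} }' mylogII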
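If pcregrep isn't available, the same idea can probably be done in a single pass with plain awk. This is only a rough sketch: it assumes, as in the sample output above, that a "type II" block is a line containing a d/d/d style date followed immediately by lines beginning with "status", and the function name failure_lines is just something I made up for illustration.

```shell
# Print the original line numbers of 'failure' status lines that sit
# directly under a dated header line (a "type II" block).
# Assumption: anything that is neither a dated header nor a status line ends the block.
failure_lines() {
    awk '
        # A dated header such as "Process run started on 7/24/14 9:23AM" opens a block
        /[0-9][0-9]?\/[0-9][0-9]?\/[0-9][0-9]?/ { in_block = 1; next }
        # Status lines keep the block open; report the ones that failed
        in_block && /^status/ { if (/failure/) print NR; next }
        # Any other line closes the block
        { in_block = 0 }
    ' "$1"
}
```

You would call it as failure_lines mylog. Since awk's NR is already the line number in the original file, there is no need for the intermediate mylogII file or the n+i-1 arithmetic.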
Just some ideas to get you started.