Sed: how to delete the middle text in a text



yellowpaper
April 25th, 2012, 02:59 PM
Hi everyone! I have a question about sed that I can't figure out.
I have this text:

blahblhablahPATTERN1blhablha
blhablhablhaPATTERN2blhablha

OR

blahblhablahPATTERN1blhablha
blahblahblahblahblahlahblahlah
blhablhablhaPATTERN2blhablah

and I want, using sed, this:

blahblhablah
blhablah

Is it possible?

THANKS!

SeijiSensei
April 25th, 2012, 03:13 PM
So you want to delete the PATTERN and all the following text? Use this:


sed 's/PATTERN.*//g'

Sed uses "regular expressions" to determine matches. The ".*" construct in a regular expression matches any string of characters. The "." represents a single character; the "*" is a repetition operator which matches the pattern it follows zero or more times.

See "man grep" for more details.
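
As a rough sketch of what that command does to the two-line sample from the first post (the `out` variable is only there for illustration): sed applies the substitution to each line separately, so everything from PATTERN to the end of that line is removed, and nothing carries over to the next line.

```shell
# sed processes one line at a time; ".*" runs to the end of the current line only.
out=$(printf '%s\n' \
  'blahblhablahPATTERN1blhablha' \
  'blhablhablhaPATTERN2blhablha' | sed 's/PATTERN.*//')
printf '%s\n' "$out"
# blahblhablah
# blhablhablha
```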

yellowpaper
April 25th, 2012, 03:41 PM
Thanks, SeijiSensei ("Arigato, Sensei"), but...

If I do
$ cat examp.c
Here is the text text.| This text
want to be delete without delete the line
of the pattern |End of text.

where the pattern is a pipe, and after that I do
$ cat examp.c | sed 's/|.*//g'

the output is

Here is the text text.
want to be delete without delete the line
of the pattern

and the text should be

Here is the text text.
End of text.

I'm a newbie at bash scripting :cry:

(I kept trying and thought I had it with sed 's/|*.*|//', but it does not work properly.)

yellowpaper
April 25th, 2012, 05:38 PM
OK, after playing with sed for hours and hours I found something that deletes the middle part of the text (another sed command), but I don't understand yet what the command does; I only know that it works.
But to finish this thread I have another question. The command 's/|*.*|//' works as long as the pattern is a single character, BUT if the pattern is || ('s/||*.*||//') it doesn't work, and I don't know why... (like in The Matrix: "Why, why, why do you persist, Mr. Anderson?"). Any suggestions?

Vaphell
April 25th, 2012, 05:41 PM
sed is not too cool with multiline stuff. It can be forced to do it, but personally I don't bother, as it requires a haxxorish approach, playing with internal buffers and whatnot. If it's not straightforward line-by-line, I use something else: a bash script, awk, or some perl one-liner found on Google.


$ echo "blahblhablahPATTERN1blhablha
blahblahblahblahblahlahblahlah
blhablhablhaPATTERN2blhablah" | perl -00 -pe 's/PATTERN1.*PATTERN2/\n/sg'

blahblhablah
blhablah

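This does the same as sed, with the difference that -00 changes the record separator. Strictly speaking, -00 selects paragraph mode, where records are separated by blank lines (-0 on its own means the null character, and -0777 reads the whole file at once). Since this text contains no blank lines, the whole text becomes one giant line (sed's separator is \n, which means record = line).

If the file is huge, reading everything at once may not be desirable. Check the discussion here: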
http://stackoverflow.com/questions/5862461/problem-with-perl-multiline-matching
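
For comparison, here is a minimal sed-only sketch of the "internal buffers" route mentioned above. It assumes GNU sed (for \n in the replacement) and input small enough to hold in memory; the `out` variable is only for illustration. The loop appends every line to the pattern space, so a single substitution can span the newlines.

```shell
# :a / N / $!ba keeps appending lines until the last one is reached, so the
# pattern space holds the whole input before the substitution runs.
out=$(printf '%s\n' \
  'blahblhablahPATTERN1blhablha' \
  'blahblahblahblahblahlahblahlah' \
  'blhablhablhaPATTERN2blhablah' |
  sed -e ':a' -e 'N' -e '$!ba' -e 's/PATTERN1.*PATTERN2/\n/')
printf '%s\n' "$out"
# blahblhablah
# blhablah
```

Like the perl version, this reads everything at once, so the same caveat about huge files applies.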

emiller12345
April 25th, 2012, 09:56 PM
sed does not appear to work well across newlines. If you can get away with temporarily swapping the newline characters for some other character, then this might work.

$ echo "Here is the text text.| This text
want to be delete without delete the line
of the pattern |End of text." | tr '\n' '`' | sed 's/|[^|]*|//g' | tr '`' '\n'

This produces

Here is the text text.End of text.