Find & Replace !!!

**zero2xiii** · July 5th, 2012

Originally Posted by Peterinall

hahahaha.... that's a newbi's crooked output hahaha.....
Thank you zero2xiii for your time & excellent help.
By the way, there is still a bit of confusion. I tried to apply your command to remove

Code:

[1-50] [<a href="#top">som text</a>]

Where [1-50] again stands for the same pattern, the numbers 1 to 50, but without any success: the output file is just a messed-up file with no content related to the input file. Why?

And what would a script look like that executes both commands and writes a single output?

Thanks again for ur kind help,
Have a nice time,
Regards,
Peter.

Hay,

Can you please give a few lines of example input and the desired output? +1 for Vaphell. That is why I used two separate sed statements; I forgot about the ? thing. I remember it has something to do with wildcards, but only in a specific place and only one character per ?... if I remember correctly (damn, I haven't used it in some time).

Also what do you mean with:

By the way, there is still a bit of confusion. I tried to apply your command to remove

Code:

[1-50] [<a href="#top">som text</a>]

Where [1-50] again stands for the same pattern, the numbers 1 to 50, but without any success: the output file is just a messed-up file with no content related to the input file. Why?

Can you please elaborate on precisely what you did and tried? That script works perfectly for me with generated input. Oh, maybe you are copy/pasting it into the terminal. You need to create a file (say edit.sh) and copy that code into it. Then go to the terminal, make the file executable (chmod +x ./edit.sh) and run it with ./edit.sh ... That's how one uses a script.
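
As a minimal sketch of those steps: the file name edit.sh comes from the post, while the throwaway directory and the one-line script body are only placeholders for illustration, not the real sed script.

```shell
# Work in a throwaway directory so nothing real is touched (assumption for this demo).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 1. Create the script file. The body is a placeholder, not the real sed script.
cat > edit.sh <<'EOF'
#!/bin/sh
echo "edit.sh ran"
EOF

# 2. Make it executable.
chmod +x ./edit.sh

# 3. Run it from the terminal.
script_output=$(./edit.sh)
echo "$script_output"   # prints: edit.sh ran
```

Pasting the script body straight into the terminal skips steps 1 and 2, which is why it can behave differently from running the file.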

Cherz

**Vaphell** · July 5th, 2012

I remember it has something to do with wildcards, but only in a specific place and only one character per ?... if I remember correctly (damn, I haven't used it in some time).

In regular expressions the quantifiers ? (0 or 1 times), + (1 or more times) and * (0 or more times) can be applied to groups too, not only to ordinary chars (a? = optional 'a') and chars from given sets ( [abc]? = optional 'a' or 'b' or 'c' ).

Code:

$ echo $'abcdefghi\nabc'
abcdefghi
abc
$ echo $'abcdefghi\nabc' | sed -r 's/abc(def)?/X/'  # abc with optional def
Xghi
X
$ echo $'abcdefghi\nabc' | sed -r 's/([a-z]{4})+/X/'  # multiples of 4 letters (1 or more 4-letter combos)
Xi
abc
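
In the same spirit, a small sketch of ? on a character set and * on a group. The sample strings are made up for illustration, and -E is used here as the portable spelling of the -r flag used above.

```shell
# [abc]? : an optional single character from the set a, b, c, followed by x
set_result=$(printf 'ax\nx\ndx\n' | sed -E 's/[abc]?x/X/')
echo "$set_result"     # prints X, X, dX (one per line)

# (ab)* : zero or more repetitions of the group "ab", followed by c
group_result=$(printf 'ababc\nc\n' | sed -E 's/(ab)*c/X/')
echo "$group_result"   # prints X, X (one per line)
```

In the third input line the d is not in the set, so the optional part matches nothing and only the x is replaced.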

**Peterinall** · July 5th, 2012

Originally Posted by zero2xiii

Hay,

Can you please give a few lines of example input and the desired output? +1 for Vaphell. That is why I used two separate sed statements; I forgot about the ? thing. I remember it has something to do with wildcards, but only in a specific place and only one character per ?... if I remember correctly (damn, I haven't used it in some time).

Also what do you mean with:

Can you please elaborate on precisely what you did and tried? That script works perfectly for me with generated input. Oh, maybe you are copy/pasting it into the terminal. You need to create a file (say edit.sh) and copy that code into it. Then go to the terminal, make the file executable (chmod +x ./edit.sh) and run it with ./edit.sh ... That's how one uses a script.

Cherz

Sorry for the delay in replying.

Here is a full sentence where the colored parts have to be removed. The blue ones were removed by your previous bash script, but the same command, with the pattern replaced by what I want to delete (in red below), did not work. (I am giving this example because it was the shortest one.)

Code:

 [22] [<a href="#top">crtc</a>]<br></font></b>Increased sexual desire.<br>Stitches in right testicle and spermatic cord.<br>Stitches in testicles while sitting.<br><b><font color="#0000ff">Gleet&nbsp;; burning&nbsp;; green discharge.<br>Red itching eruption on glans *****.<br></font><font color="#800000">&nbsp;<a name="23"></a>&nbsp;&nbsp;</p>

Thank you very very much,
Regards.

**Peterinall** · July 5th, 2012

Originally Posted by Vaphell

In regular expressions the quantifiers ? (0 or 1 times), + (1 or more times) and * (0 or more times) can be applied to groups too, not only to ordinary chars (a? = optional 'a') and chars from given sets ( [abc]? = optional 'a' or 'b' or 'c' ).

Code:

$ echo $'abcdefghi\nabc'
abcdefghi
abc
$ echo $'abcdefghi\nabc' | sed -r 's/abc(def)?/X/'  # abc with optional def
Xghi
X
$ echo $'abcdefghi\nabc' | sed -r 's/([a-z]{4})+/X/'  # multiples of 4 letters (1 or more 4-letter combos)
Xi
abc

I am just looking at these things the way I look at aeroplanes going over my head: they are so far above me.

I wish I could understand all these things. Linux is awesome.

Regards

**zero2xiii** · July 5th, 2012

Hay,

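The square brackets in the statement are what cause the issue, so we escape them with a backslash \. The backslash tells sed to treat the character that follows it as a literal character rather than as the start or end of a character range. The pattern is also wrapped in single quotes so that the shell passes it to sed unchanged.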

so this works again:

Code:

echo '[8] [<a href="#top">crtc</a>]' | sed s_'\[[1-9]\]\ \[<a href="#top">crtc</a>\]'_nothing_g
nothing
echo '[22] [<a href="#top">crtc</a>]' | sed s_'\[[1-9][1-9]\]\ \[<a href="#top">crtc</a>\]'_nothing_g
nothing

To use it in the script I gave you earlier, just add this after the last sed expression:

Code:

-e 's_\[[1-9]\]\ \[<a href="#top">crtc</a>\]__g' -e 's_\[[1-9][1-9]\]\ \[<a href="#top">crtc</a>\]__g'

Just look at how the other two expressions in the script are combined into a single sed command, and add these after the second expression.
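
To see the difference the backslashes make, here is a small sketch comparing an unescaped and an escaped pattern on the sample marker from this thread. The variable names are only for illustration.

```shell
marker='[8] [<a href="#top">crtc</a>]'

# Unescaped: [8] is read as a bracket expression matching the single character 8,
# so only the digit is removed and the literal brackets stay behind.
unescaped=$(printf '%s\n' "$marker" | sed 's_[8]__')
echo "$unescaped"   # prints: [] [<a href="#top">crtc</a>]

# Escaped: \[ and \] match literal brackets, [1-9] matches the digit between them.
escaped=$(printf '%s\n' "$marker" | sed 's_\[[1-9]\] \[<a href="#top">crtc</a>\]_nothing_')
echo "$escaped"     # prints: nothing
```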

Cherz

**zero2xiii** · July 5th, 2012

Just for clarity:

Code:

#!/bin/sh

for file in ./*.html
	do echo $file
	sed -e 's_<font color="#800000">&nbsp;<a name="[0-9]"></a>'__g -e s_'<font color="#800000">&nbsp;<a name="[0-9][0-9]"></a>__g' "$file" >"$file"_new
	done

exit

Becomes:

Code:

#!/bin/sh

for file in ./*.html
	do echo $file
	sed -e 's_<font color="#800000">&nbsp;<a name="[0-9]"></a>'__g -e s_'<font color="#800000">&nbsp;<a name="[0-9][0-9]"></a>__g' -e 's_\[[1-9]\]\ \[<a href="#top">crtc</a>\]__g' -e 's_\[[1-9][1-9]\]\ \[<a href="#top">crtc</a>\]__g' "$file" >"$file"_new
	done

exit

You could do it with four separate sed commands, but then you would have to create four new files, with the fourth file being the final, fully changed document. That causes unnecessary disk writes and adds time to the process (especially if you are working with a few hundred or a few thousand files).
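
A minimal sketch of why one sed call with several -e expressions is enough: every expression is applied, in order, to each line in a single pass, so only one output is produced. The sample line is shortened from the input posted earlier in the thread.

```shell
sample='[22] [<a href="#top">crtc</a>]<br>text<font color="#800000">&nbsp;<a name="23"></a>&nbsp;'

# Both expressions run in the same pass; no intermediate file is needed.
one_pass=$(printf '%s\n' "$sample" | sed \
    -e 's_<font color="#800000">&nbsp;<a name="[0-9][0-9]"></a>__g' \
    -e 's_\[[1-9][1-9]\] \[<a href="#top">crtc</a>\]__g')
echo "$one_pass"   # prints: <br>text&nbsp;
```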

Cherz

Just to explain what happens, as I see you actually tried to use some initiative:

In sed you can specify a set of characters to match using square brackets.
So [1-5] will match 1, 2, 3, 4 or 5.
However, it is NOT a numeric range: it always matches exactly one character, so [5-10] will NOT match 5, 6, 7, 8, 9, 10. It is read as the character range 5-1 followed by the character 0, and because 5-1 runs backwards, GNU sed rejects it with an "Invalid range end" error.
Typing [12345] and [1-5] is synonymous.

In the given statement there are already square brackets:
[25], so we need to tell sed to KEEP those brackets as literal characters and still give it a pattern range. We do that by escaping the outer set of square brackets:
\[[1-5]\]. This tells sed to match [1], [2], [3], [4], [5].

To match double digits, like 10, 12, 55 etc., we need to give 2 ranges (one for each digit):
So [1-2][1-2] will match: 11, 12, 21, 22.
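
These cases can be checked quickly with grep -o, which prints each match on its own line. A small sketch: the test strings are invented, and -o is a GNU/BSD grep option rather than part of POSIX.

```shell
# [1-5] matches any one of the characters 1 2 3 4 5 (0 and 7 are skipped)
single=$(printf '0 3 7 5\n' | grep -o '[1-5]' | tr '\n' ' ')
echo "$single"      # prints: 3 5

# \[[1-5]\] matches a literal [ then one digit 1-5 then a literal ]
bracketed=$(printf '[3] [9] [5]\n' | grep -o '\[[1-5]\]' | tr '\n' ' ')
echo "$bracketed"   # prints: [3] [5]

# [1-2][1-2] matches two characters, each of which is 1 or 2
double=$(printf '11 12 20 21 22 23\n' | grep -o '[1-2][1-2]' | tr '\n' ' ')
echo "$double"      # prints: 11 12 21 22
```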

To get a feel for that, play with echo and a bash brace range {1..9} in the terminal and see how the combinations work:

Code:

echo {1..9}
echo {1..9}{1..9}

You will note however you can simply say

Code:

echo {11..99}

To have the same result as the second echo in the first example. But the value range in sed does not follow the same behaviour. If it is possible to have a range in sed similar to {1..99}, then I am unaware of HOW to do it.

Hope this clears things up a bit

Cherz

**Vaphell** · July 5th, 2012

Code:

echo {1..9}{1..9}

You will note however you can simply say

Code:

echo {1..99}

To have the same result as the second echo in the first example.

not really

{1..9}{1..9} produces all combinations of [1-9][1-9] while 1..99 gives all numbers in 1-99 range.

Code:

$ echo {1..9}{1..9}
11 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 28 29 31 32 33 34 35 36 37 38 39 41 42 43 44 45 46 47 48 49 51 52 53 54 55 56 57 58 59 61 62 63 64 65 66 67 68 69 71 72 73 74 75 76 77 78 79 81 82 83 84 85 86 87 88 89 91 92 93 94 95 96 97 98 99
$ echo {1..99}
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99

if you said {0..9}{0..9} and {00..99} then yes, they would be more or less equivalent

(bash ranges support padding with 0s to produce fixed width output)

A numeric range can be achieved in a regex using alternation:
1-50 -> ([1-9]|[1-4][0-9]|50) = [1-9] or [1-4][0-9] or 50
0-50 including 0X -> ([0-4]?[0-9]|50)

**zero2xiii** · July 5th, 2012

Originally Posted by Vaphell

not really

{1..9}{1..9} produces all combinations of [1-9][1-9] while 1..99 gives all numbers in 1-99 range.

if you said {0..9}{0..9} and {00..99} then yes, they would be equivalent

(bash ranges support padding with 0s to produce fixed width output)

Haha yea sorry, had a Typo there, meant to type 11..99 not 1..99 cause 00..99 will also include 00, 01, 02, 03 and so forth. Unless you said {0..9}{0..9}, but still a typo on my side hahahaha....

Also, I got lost at the part about the range you gave:

A numeric range can be achieved in a regex using alternation:
1-50 -> ([1-9]|[1-4][0-9]|50) = [1-9] or [1-4][0-9] or 50
0-50 including 0X -> ([0-4]?[0-9]|50)

Does this work inside SED parameters? Cause I know the use of the above, but does it work in sed? That could make 4 arguments into 2 in the above script.
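
For the bracketed markers, that folding could look like the sketch below. It assumes sed's extended-regex mode (-E, or -r on GNU sed) accepts the alternation, that the markers run from 1 to 50, and that the link text is always crtc as in the posted sample. The function name strip_markers is made up for the demo.

```shell
# One expression covering [1] .. [50] in front of the "top" link.
# Unlike [1-9][1-9], the alternation also covers numbers such as 20.
strip_markers() {
    sed -E 's_\[([1-9]|[1-4][0-9]|50)\] \[<a href="#top">crtc</a>\]__g'
}

# Markers 8, 20, 22 and 50 are stripped; 51 is outside the range and stays.
for n in 8 20 22 50 51; do
    printf '[%s] [<a href="#top">crtc</a>]<br>\n' "$n"
done | strip_markers
```

The same idea would apply to the name="..." anchors, where one expression could replace the separate one-digit and two-digit versions.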

Cherz

**Vaphell** · July 5th, 2012

Haha yea sorry, had a Typo there, meant to type 11..99 not 1..99 cause 00..99 will also include 00, 01, 02, 03 and so forth. Unless you said {0..9}{0..9}, but still a typo on my side hahahaha....

again, not really

11..99 is normal number range of 11-99 and will include 20, which is not the case with {1..9}{1..9}

Does this work inside SED parameters?

Sure

Code:

$ echo \"{0..99}\" | sed -r 's/"([1-9]|[1-4][0-9]|50)"/(\1)/g'
"0" (1) (2) (3) (4) (5) (6) (7) (8) (9)
(10) (11) (12) (13) (14) (15) (16) (17) (18) (19)
(20) (21) (22) (23) (24) (25) (26) (27) (28) (29)
(30) (31) (32) (33) (34) (35) (36) (37) (38) (39)
(40) (41) (42) (43) (44) (45) (46) (47) (48) (49)
(50) "51" "52" "53" "54" "55" "56" "57" "58" "59"
"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
"90" "91" "92" "93" "94" "95" "96" "97" "98" "99"

**Peterinall** · July 6th, 2012

Originally Posted by zero2xiii

Just for clarity:

Code:

#!/bin/sh

for file in ./*.html
    do echo $file
    sed -e 's_<font color="#800000">&nbsp;<a name="[0-9]"></a>'__g -e s_'<font color="#800000">&nbsp;<a name="[0-9][0-9]"></a>__g' "$file" >"$file"_new
    done

exit

Becomes:

Code:

#!/bin/sh

for file in ./*.html
    do echo $file
    sed -e 's_<font color="#800000">&nbsp;<a name="[0-9]"></a>'__g -e s_'<font color="#800000">&nbsp;<a name="[0-9][0-9]"></a>__g' -e 's_\[[1-9]\]\ \[<a href="#top">crtc</a>\]__g' -e 's_\[[1-9][1-9]\]\ \[<a href="#top">crtc</a>\]__g' "$file" >"$file"_new
    done

exit

Hello great folks,
Sorry for such unwanted delay but work work work...

Thanks zero2xiii for your help, but the latter script does not seem to be removing the second part:

Code:

-e 's_\[[1-9]\]\ \[<a href="#top">crtc</a>\]__g' -e 's_\[[1-9][1-9]\]\ \[<a href="#top">crtc</a>\]__g' "$file" >"$file"_new

I am really glad to see that the Ubuntu community guys are so awesome.

Regards
Peter.
