Need help to filter out all non-standard characters



Noggenfogger
February 14th, 2008, 12:18 AM
Hi

Does anyone have a bash/perl script or "one liner" that removes all non-standard characters?
(I only want regular characters like a-z and 0-9 when filtering a file)

//Nogg

SledgeHammer_999
February 14th, 2008, 12:31 AM
You could go through the file one character at a time and check whether each character is in a-z or 0-9. If it is, keep it; if not, remove it. Then move on to the next character and repeat until you reach the end of the file.
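
A rough sketch of that character-by-character loop in plain sh, in case it helps to see it spelled out. The function name filter_alnum is made up for this example, it takes the text as an argument rather than reading a file, and it also keeps A-Z. For real files it will be much slower than a dedicated tool.

```shell
# filter_alnum is a hypothetical helper name, not a standard command.
# The ( ... ) body runs in a subshell, so the variables stay local to it.
filter_alnum() (
    LC_ALL=C                    # compare plain bytes, so ranges mean ASCII only
    rest=$1                     # text still to be examined
    out=                        # characters kept so far
    while [ -n "$rest" ]; do
        tail=${rest#?}          # everything after the first character
        ch=${rest%"$tail"}      # the first character itself
        case $ch in
            [a-zA-Z0-9]) out=$out$ch ;;   # a hit: keep it
            *) ;;                         # anything else: drop it
        esac
        rest=$tail              # go on to the next character
    done
    printf '%s\n' "$out"
)

filter_alnum 'Hello, World! 123'
```

To run it on a file you could call it as filter_alnum "$(cat inputfile)", keeping in mind that the newlines between lines are dropped along with everything else.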

Zwack
February 14th, 2008, 12:37 AM
tr -dc 'a-zA-Z0-9' <inputfile >outputfile

Z.

Noggenfogger
February 14th, 2008, 01:20 AM
That tr string did the trick.

Your answers are much appreciated, thanks!

Zwack
February 14th, 2008, 04:19 PM
Sorry about the brevity earlier, I was in a hurry...

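tr is intended for "translating" characters from one set into another. -d says "delete the characters that appear in the first set" and -c says "take the complement of the first set before you use it", so together they delete every character that isn't in the first set. The set isn't a regular expression, just a list of characters and ranges: a-zA-Z0-9 is three ranges. Don't wrap the set in square brackets: with GNU tr they count as literal characters, so [ and ] would be kept too, and unquoted they can be expanded by the shell as a filename pattern. Quote the set instead ('a-zA-Z0-9'). Note that newlines get deleted as well unless you add \n to the set. No second set is given here because nothing is being translated.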
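
To make the -d and -c behaviour concrete, here is a small sketch. The two function names are invented for this example, and LC_ALL=C is my own addition so that the ranges are treated as plain ASCII regardless of locale.

```shell
# Hypothetical wrapper names, used only for this illustration.

# Delete every character that is NOT an ASCII letter or digit.
# Newlines are not in the set, so they are deleted too.
strip_non_alnum() {
    LC_ALL=C tr -dc 'a-zA-Z0-9'
}

# Same idea, but \n is added to the set so the line breaks survive.
strip_non_alnum_keep_lines() {
    LC_ALL=C tr -dc 'a-zA-Z0-9\n'
}

printf 'Hello, World! 123\n' | strip_non_alnum
echo    # the newline was deleted above, so add one for readability
printf 'foo-bar\nbaz_42\n' | strip_non_alnum_keep_lines
```

The first call prints HelloWorld123 on a single line; the second prints foobar and baz42 on separate lines.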

tr can also be used easily for rot-13 by using tr 'a-zA-Z' 'n-za-mN-ZA-M'
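
A quick sketch of the rot-13 case, wrapped in a function so it is easy to try. The name rot13 is just a label chosen here, and LC_ALL=C is an addition to keep the ranges ASCII-only.

```shell
# rot13 is a hypothetical function name for this example.
# Each letter is mapped to the one 13 places further on, wrapping around,
# so applying it twice gives back the original text.
rot13() {
    LC_ALL=C tr 'a-zA-Z' 'n-za-mN-ZA-M'
}

printf 'Hello, World!\n' | rot13            # prints: Uryyb, Jbeyq!
printf 'Hello, World!\n' | rot13 | rot13    # prints: Hello, World!
```

Characters outside the two sets (the comma, space and exclamation mark here) pass through untouched, since no -d is involved.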

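Another option might have been the strings command, which prints the runs of printable characters it finds in a file (four or more in a row by default). It keeps spaces and punctuation though, so it is a looser filter than the tr line.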

Z.