Parser for Regexp
Andrea_44
February 25th, 2007, 06:56 PM
I am attempting to write a program that turns a PCRE (Perl Compatible Regular Expression) into literal text that matches it.
For example:
$ ./MyProgram '/^MDTM(?!\n)\s[^\n]{100}/smi'
Output:
MDTM 1234567890!@£$%^&*()ABCDEFGHIJaaaaaaaaaaaaaa...
Is there an existing program that can do this already?
or
Could you please give me some ideas on how I should approach the problem?
E.g. can I use "Lex"? Is it a good idea to use C or Java? What libraries should I use?
Any help would be much appreciated.
Thanks,
Andrea
nereid
February 25th, 2007, 07:19 PM
Why don't you use Perl? BTW there are some web regexp parsers around. You can give them a regexp and some text and they will highlight the matched pattern.
Edit:
A program that parses the regexps themselves is a bit tricky to write and requires some serious knowledge of them. If you've got that knowledge, you don't need the program anymore ;)
Andrea_44
February 27th, 2007, 08:28 AM
nereid wrote: "BTW there are some web regexp parsers around. You can give them a regexp and some text and they will highlight the matched pattern."
Thanks for the reply, but that is not quite what I meant.
What I meant was that you give my program a regexp and it generates some text that matches the pattern.
For example, MyProgram parses the regexp below:
$ ./MyProgram '/^MDTM(?!\n)\s[^\n]{100}/smi'
It then generates text like the following, which matches the regexp it has just parsed:
MDTM 1234567890!@£$%^&*()ABCDEFGHIJaaaaaaaaaaaaaa...
Is it even possible to do that?
Cheers~
Andrea
FYI, I am attempting to write a program that parses Snort rules (containing PCRE) and generates packets (with payloads that match the PCRE) which will trigger those Snort rules.
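The first step of that pipeline, pulling the PCRE out of a rule, could look roughly like the Python sketch below. The rule text is a made-up example (not from a real ruleset), and extract_pcre is a hypothetical helper name. It assumes the option has the form pcre:"/pattern/flags"; and does not handle negated options or escaped quotes.

```python
import re

# Hypothetical rule for illustration only; not taken from a real ruleset.
RULE = (r'alert tcp any any -> any 21 (msg:"example MDTM check"; '
        r'pcre:"/^MDTM(?!\n)\s[^\n]{100}/smi"; sid:1000001;)')

def extract_pcre(rule):
    """Return (pattern, flags) from a pcre option in the rule, or None if absent."""
    # Greedy .* backtracks to the last '/' that is followed by flags, '"' and ';'.
    m = re.search(r'pcre:\s*"/(.*)/([A-Za-z]*)"\s*;', rule)
    if m is None:
        return None
    return m.group(1), m.group(2)

print(extract_pcre(RULE))
```

The returned pattern and flags would then be handed to whatever generates the matching payload.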
nereid
February 27th, 2007, 10:22 AM
It should be possible, but it requires some serious thought, which I don't have time for at the moment.
First, parse the regexp. Check the modifiers after the closing delimiter (such as /smi) and watch out for inline modifiers inside the pattern. Second, look out for the ^ and $ anchors. Then you'll have to model the lookahead and lookbehind assertions, and so on. This requires some heavy knowledge, and in the end you'll have built a compiler from regexp to plain text, which isn't easy.
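To give a feel for it, here is a minimal Python sketch of that idea, not a full PCRE generator. It only understands literals, a few escapes (\s, \d, \w, \n, \t), character classes without ranges, {n} repetition, the ^ and $ anchors (which it skips), and a negative lookahead that contains a single atom and sits directly before another atom. The names pick, parse_atom and generate are made up for this example, and anything else in the pattern raises an error.

```python
import re
import string

ALPHABET = set(string.printable)  # characters the generator is allowed to emit
ESCAPES = {"s": set(" \t\n\r\f\v"), "d": set(string.digits),
           "w": set(string.ascii_letters + string.digits + "_"),
           "n": {"\n"}, "t": {"\t"}}
QUANT = re.compile(r"\{(\d+)\}")  # only the exact-count form {n}

def pick(chars):
    """Choose one character, preferring alphanumerics, then a plain space."""
    return min(chars, key=lambda ch: (not ch.isalnum(), ch != " ", ch))

def parse_atom(pattern, i):
    """Return (set of characters the atom at pattern[i] can match, next index)."""
    c = pattern[i]
    if c == "\\":
        e = pattern[i + 1]
        return set(ESCAPES.get(e, {e})), i + 2
    if c == "[":
        j = i + 1
        negate = pattern[j] == "^"
        if negate:
            j += 1
        members = set()
        while pattern[j] != "]":          # no ranges such as a-z in this sketch
            if pattern[j] == "\\":
                e = pattern[j + 1]
                members |= ESCAPES.get(e, {e})
                j += 2
            else:
                members.add(pattern[j])
                j += 1
        return (ALPHABET - members if negate else members), j + 1
    if c in "()*+?|.{}":
        raise ValueError(f"unsupported syntax at index {i}: {c!r}")
    return {c}, i + 1

def generate(pattern):
    """Build one string that matches the (restricted) pattern."""
    out = []
    forbidden = set()  # characters a negative lookahead rules out for the next atom
    i = 0
    while i < len(pattern):
        if pattern[i] in "^$":            # anchors consume no input
            i += 1
            continue
        if pattern.startswith("(?!", i):
            banned, j = parse_atom(pattern, i + 3)
            if pattern[j:j + 1] != ")":
                raise ValueError("only single-atom lookaheads are supported")
            forbidden |= banned
            i = j + 1
            continue
        allowed, i = parse_atom(pattern, i)
        count = 1
        m = QUANT.match(pattern, i)
        if m:
            count, i = int(m.group(1)), m.end()
        if count == 0:
            continue                      # atom emits nothing; lookahead still pending
        first = allowed - forbidden       # only the first character is constrained
        if not first:
            raise ValueError("no character satisfies the pattern")
        out.append(pick(first) + pick(allowed) * (count - 1))
        forbidden = set()
    return "".join(out)

print(repr(generate(r"^MDTM(?!\n)\s[^\n]{100}")))
```

For the pattern from this thread, the sketch should produce MDTM, a space, and one hundred 0 characters. The modifiers are not interpreted at all here; with /smi that happens to be harmless, because the generated text already matches case-sensitively and contains no newline. Alternation, groups, *, + and lookbehind are where the real work starts.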