View Full Version : [SOLVED] How to: regular expression



khatkarrohit
April 25th, 2014, 09:56 AM
I was reading about regular expressions in JavaScript. Regular expressions are just patterns used to match or find character combinations in strings. I am confused by some of them.



(x) (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#special-capturing-parentheses)
Matches 'x' and remembers the match, as the following example shows. The parentheses are called capturing parentheses.

The '(foo)' and '(bar)' in the pattern /(foo) (bar) \1 \2/ match and remember the first two words in the string "foo bar foo bar". The \1 and \2 in the pattern match the string's last two words. Note that \1, \2, \n are used in the matching part of the regex. In the replacement part of a regex the syntax $1, $2, $n must be used, e.g.: 'bar foo'.replace( /(...) (...)/, '$2 $1' ).
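
As a small illustration of the paragraph above, the following sketch runs the same pattern and the same replace call in JavaScript. The variable names (backrefPattern, backrefMatch, swapped) and the "foo bar foo baz" counterexample are made up for this example and are not part of the quoted documentation.

```javascript
// \1 and \2 inside the pattern must repeat exactly what groups 1 and 2 captured.
const backrefPattern = /(foo) (bar) \1 \2/;
const backrefMatch = backrefPattern.exec("foo bar foo bar");

// Index 0 holds the whole match; indexes 1 and 2 hold the remembered groups.
console.log(backrefMatch[0]); // "foo bar foo bar"
console.log(backrefMatch[1]); // "foo"
console.log(backrefMatch[2]); // "bar"

// Hypothetical counterexample: the last word differs from group 2, so no match.
console.log(backrefPattern.test("foo bar foo baz")); // false

// In the replacement string the groups are written as $1 and $2.
const swapped = "bar foo".replace(/(...) (...)/, "$2 $1");
console.log(swapped); // "foo bar"
```

The "memory" is simply the array returned by exec (or the \1, \2 and $1, $2 references): each pair of capturing parentheses gets its own numbered slot.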


(?:x) (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#special-non-capturing-parentheses)

Matches 'x' but does not remember the match. The parentheses are called non-capturing parentheses, and let you define subexpressions for regular expression operators to work with. Consider the sample expression /(?:foo){1,2}/. Without the non-capturing parentheses, the {1,2} characters would apply only to the last 'o' in 'foo'. With the non-capturing parentheses, the {1,2} applies to the entire word 'foo'.
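
A rough sketch of the difference described above. The ^ and $ anchors and the test strings are additions for this illustration only (they make the effect of {1,2} easy to see); they are not in the quoted text.

```javascript
// With the non-capturing group, {1,2} repeats the whole word "foo".
const groupedFoo = /^(?:foo){1,2}$/;
// Without it, {1,2} repeats only the last "o".
const ungroupedFoo = /^foo{1,2}$/;

console.log(groupedFoo.test("foofoo"));   // true
console.log(groupedFoo.test("fooo"));     // false
console.log(ungroupedFoo.test("fooo"));   // true
console.log(ungroupedFoo.test("foofoo")); // false

// A non-capturing group does not get a numbered slot:
// here "bar" is group 1, because (?:foo) is not remembered.
const mixedMatch = /(?:foo)(bar)/.exec("foobar");
console.log(mixedMatch.length); // 2 (whole match plus one group)
console.log(mixedMatch[1]);     // "bar"
```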



x(?=y) (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#special-lookahead)
Matches 'x' only if 'x' is followed by 'y'. This is called a lookahead.

For example, /Jack(?=Sprat)/ matches 'Jack' only if it is followed by 'Sprat'. /Jack(?=Sprat|Frost)/ matches 'Jack' only if it is followed by 'Sprat' or 'Frost'. However, neither 'Sprat' nor 'Frost' is part of the match results.
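
To make "not part of the match results" concrete, here is a minimal sketch. The input strings "JackHorner" and the replacement "Jill" are invented for the example.

```javascript
const jackLookahead = /Jack(?=Sprat|Frost)/;

// The lookahead is checked but not consumed, so the match is only "Jack".
console.log(jackLookahead.exec("JackSprat")[0]); // "Jack"
console.log(jackLookahead.exec("JackFrost")[0]); // "Jack"

// Hypothetical input that is followed by neither name: no match at all.
console.log(jackLookahead.exec("JackHorner")); // null

// Because "Sprat" is not part of the match, replace leaves it in place.
const replacedJack = "JackSprat".replace(jackLookahead, "Jill");
console.log(replacedJack); // "JillSprat"
```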


x(?!y) (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#special-negated-look-ahead)
Matches 'x' only if 'x' is not followed by 'y'. This is called a negated lookahead.
For example, /\d+(?!\.)/ matches a number only if it is not followed by a decimal point. The regular expression /\d+(?!\.)/.exec("3.141") matches '141' but not '3.141'.
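
The example above can be checked directly. The inputs "42" and "12.5" are extra cases added for this sketch; the second one shows a subtlety worth knowing, namely that the engine may backtrack to a shorter run of digits that is not followed by a dot.

```javascript
const notBeforeDot = /\d+(?!\.)/;

// "3" is followed by ".", so it is skipped; "141" at the end is matched.
console.log(notBeforeDot.exec("3.141")[0]); // "141"

// Added case: no decimal point at all, so the whole number matches.
console.log(notBeforeDot.exec("42")[0]); // "42"

// Added case: "12" is followed by ".", but the shorter "1" is followed
// by "2", which satisfies the negated lookahead, so "1" is the match.
console.log(notBeforeDot.exec("12.5")[0]); // "1"
```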




My confusion is why it says (x) acts as a memory device (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions); the explanation does not make sense to me. Please explain what is going on in the first regular expression and its example.

Vaphell
April 25th, 2014, 10:29 AM
When you have the pattern (X)(Y)\1\2, the first () is assigned to \1 and the second to \2, so the regex "remembers" what each group matched. That means you can reuse the matched chunks later in the pattern through these positional placeholders to match XYXY patterns.


$ echo $'abab\nabec\nabcd'
abab
abec
abcd
$ echo $'abab\nabec\nabcd' | grep -P '([ae])([bc])([ae])([bc])'
abab
abec
$ echo $'abab\nabec\nabcd' | grep -P '([ae])([bc])\1\2'
abab


The 'abec' line stopped matching the regex because once the parentheses matched 'a' and 'b', the values of \1 and \2 were locked to 'a' and 'b'. This is useful for matching exact repetitions.
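
Since the original question was about JavaScript, here is a rough equivalent of the grep session above. It assumes the three lines are held in an array (sampleLines is a name made up for this sketch) and uses Array.prototype.filter in place of grep.

```javascript
const sampleLines = ["abab", "abec", "abcd"];

// Four independent character classes: positions 3 and 4 may differ from 1 and 2.
const fourClasses = /([ae])([bc])([ae])([bc])/;
// Backreferences: positions 3 and 4 must repeat exactly what groups 1 and 2 matched.
const withBackrefs = /([ae])([bc])\1\2/;

const looseHits = sampleLines.filter((line) => fourClasses.test(line));
const strictHits = sampleLines.filter((line) => withBackrefs.test(line));

console.log(looseHits);  // [ 'abab', 'abec' ]
console.log(strictHits); // [ 'abab' ]
```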

Another example: match words that end with the same character they start with.

$ echo $'aura\ntort\nlol\nwart'
aura
tort
lol
wart
$ echo $'aura\ntort\nlol\nwart' | grep -P '^(.).+\1$'
aura
tort
lol
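
The same check could be sketched in JavaScript as follows. The array name sampleWords and the extra input "aa" are assumptions added for illustration; "aa" shows that .+ demands at least one character between the first and last one.

```javascript
const sampleWords = ["aura", "tort", "lol", "wart"];

// Group 1 remembers the first character; \1 requires the last one to be identical.
const sameEnds = /^(.).+\1$/;

const sameEndWords = sampleWords.filter((word) => sameEnds.test(word));
console.log(sameEndWords); // [ 'aura', 'tort', 'lol' ]

// Added case: a two-letter word fails because .+ needs a middle character.
console.log(sameEnds.test("aa"));      // false
// Using .* instead of .+ would accept it.
console.log(/^(.).*\1$/.test("aa"));   // true
```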

khatkarrohit
April 26th, 2014, 08:54 AM
Thank you very much... this is what I needed... now I will do some practice on regular expressions...