Help with regular expressions

**carranty** · November 14th, 2011

I'm trying to develop a better understanding of the command line by working my way through the book "The Linux Command Line" by William Shotts, available to view here:

http://linuxcommand.org/tlcl.php

I'm on the section dealing with regular expressions on page 262. He uses as an example the regular expression that will "only match lines consisting of groups of one or more alphabetic characters separated by single spaces:"

Code:

^([[:alpha:]]+ ?)+$

As an example he uses the grep command

Code:

carranty@carranty-desktop ~ $ echo "This that" | grep -E '^([[:alpha:]]+ ?)+$'
This that

The problem I'm having is that the + quantifier seems to have no effect. I get the same result if I leave it out:

Code:

carranty@carranty-desktop ~ $ echo "This that" | grep -E '^([[:alpha:]] ?)+$' 
This that

And say I want to match only lines consisting of one alphabetic character followed by a space. I would think

Code:

^([[:alpha:]]? ?)+$

would do the job (I've replaced the first + with a ?), but it doesn't:

Code:

carranty@carranty-desktop ~ $ echo "This that" | grep -E '^([[:alpha:]]? ?)+$'
This that

It still returns the line even though there is more than one character before the space.

Weirdly, the second quantifier (the ? in the original code) works, because if I pass a line with more than one space

Code:

carranty@carranty-desktop ~ $ echo "This  that" | grep -E '^([[:alpha:]]+ ?)+$' 
carranty@carranty-desktop ~ $

It doesn't output the line. Can someone please help me out and explain what I'm doing wrong?

**carranty** · November 14th, 2011

OK, the above post may be a little long, so here's a more basic example. Say I want to match any lines consisting of MORE than 1 character. Shouldn't

Code:

[[:alpha:]]*

do it? Because it isn't working for me:

Code:

carranty@carranty-desktop ~ $ echo "T" | grep -E '[[:alpha:]]*'
T

**mutley89** · November 15th, 2011

> The problem I'm having is that the + quantifier seems to have no effect. I get the same result if I leave it out

Code:

^([[:alpha:]] ?)+$

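This matches the beginning of the line, then a letter and an optional space (because of the '?'), repeated one or more times, then the end of the line. Each repetition of the group consumes a single letter, so a word like "This" is simply matched as four repetitions, which is why dropping the '+' makes no difference for your test line. To match a line containing only 1 letter followed by a space use: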

Code:

^[[:alpha:]] $

To match multiple letters each followed by a space (eg. 'a b d e ', note the trailing space) use:

Code:

^([[:alpha:]] )+$

to remove the requirement for a trailing space use:

Code:

^([[:alpha:]] )+[[:alpha:]]$
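
If you want to check these three patterns yourself, here is a rough sketch, assuming GNU grep with -E. The sample lines are made up for illustration, and grep -c just prints how many input lines matched:

```shell
# Sketch only: the sample lines below are invented for illustration.
# printf '%s\n' prints each argument on its own line.

# Exactly one letter followed by one space: only 'a ' qualifies.
printf '%s\n' 'a ' 'ab ' 'a' | grep -cE '^[[:alpha:]] $'    # prints 1

# Single letters, each followed by a space (trailing space required).
printf '%s\n' 'a b d e ' 'a b d e' 'ab d ' | grep -cE '^([[:alpha:]] )+$'    # prints 1

# Single letters separated by single spaces, no trailing space.
printf '%s\n' 'a b d e' 'a b d e ' 'a' | grep -cE '^([[:alpha:]] )+[[:alpha:]]$'    # prints 1
```

Note that the last pattern needs at least two letters, so a line holding just 'a' is rejected. Changing the '+' after the group to '*' should accept that case as well.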

> And say I want to match only lines consisting of one alphabetic character followed by a space. I would think

Code:

^([[:alpha:]]? ?)+$

> would do the job (I've replaced the + with a ?), but it doesn't

This will match the beginning of the line, an optional letter followed by an optional space, one or more times, and the end of the line. It will therefore match any combination of letters and spaces.
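
As a quick illustration of that, here is a small sketch (again assuming GNU grep with -E, with sample lines I made up). Every line built only from letters and spaces gets through, however long the words are and however many spaces sit together. Only the line containing a digit is dropped:

```shell
# Sketch only: the sample lines are invented for illustration.
# Expected to print the first three lines and drop 'abc1',
# because '1' is neither a letter nor a space.
printf '%s\n' 'This that' 'This  that' 'a b' 'abc1' | grep -E '^([[:alpha:]]? ?)+$'
```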

> Weirdly, the second quantifier (the ? in the original code) works, because if I pass a line with more than one space

Code:

carranty@carranty-desktop ~ $ echo "This  that" | grep -E '^([[:alpha:]]+ ?)+$'
carranty@carranty-desktop ~ $

> It doesn't output the line. Can someone please help me out and explain what I'm doing wrong?

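Again, the expression inside the parens matches one or more letters followed by a single optional space. After that optional space, the next repetition has to start with a letter, so two consecutive spaces can never match.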

The '?' matches the preceding expression 0 or 1 times. To match something exactly once, just use it on its own, with no quantifier.
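
To see the difference between no quantifier, '?' and '+', something like the following should do as a sketch (assuming GNU grep with -E). The three input lines are an empty line, 'a' and 'ab', all made up for illustration:

```shell
# Sketch only: the input is an empty line, 'a' and 'ab' (invented samples).

# No quantifier: exactly one letter, so only 'a' matches.
printf '%s\n' '' 'a' 'ab' | grep -cE '^[[:alpha:]]$'     # prints 1

# '?': zero or one letter, so the empty line and 'a' match.
printf '%s\n' '' 'a' 'ab' | grep -cE '^[[:alpha:]]?$'    # prints 2

# '+': one or more letters, so 'a' and 'ab' match.
printf '%s\n' '' 'a' 'ab' | grep -cE '^[[:alpha:]]+$'    # prints 2
```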

> OK, the above post may be a little long, so here's a more basic example. Say I want to match any lines consisting of MORE than 1 character. Shouldn't

Code: