
Thread: can someone explain this regular expression?

  1. #1
    Join Date
    Jan 2010
    Location
    Wheeling WV USA
    Beans
    2,023
    Distro
    Xubuntu 20.04 Focal Fossa

    can someone explain this regular expression?

    this regular expression:
    Code:
    ^(\d+)x(\d+)(?:\+(\d+))?(?:\+(\d+))?$
    works for its original purpose of matching X11 command -geometry= values. now, i want to change it to match some other strings, but i don't understand the extra layering of () or what the ? characters are doing to it. can someone who knows this explain it?
    Mask wearer, Social distancer, System Administrator, Programmer, Linux advocate, Command Line user, Ham radio operator (KA9WGN/8, tech), Photographer (hobby), occasional tweetXer

  2. #2
    Join Date
    Nov 2007
    Location
    London, England
    Beans
    7,701

    Re: can someone explain this regular expression?

    For my own education, I'll have a stab:
    Code:
    ^               At the start of the line
    (\d+)           One or more digits. Remember this as group 1
    x               The letter x
    (\d+)           One or more digits. Remember this as group 2
    (?:             Start a non-capturing group (not remembered), containing:
        \+(\d+)         A plus followed by one or more digits. Remember (I think) the digits as group 3
        )?              End the non-capturing group. The whole group may or may not exist.
    (?:\+(\d+))?    A possible second non-capturing group like the last one (digits become group 4)
    $               End of line
    I presume that this is inside a programming language that will then look at the captured groups. I think it will capture 2, 3 or 4 groups of digits if it finds a match at all. The first two groups must be separated by an 'x', and the third and fourth optional groups must be preceded with a '+'. e.g. "123x456+7+89"

    I would be grateful if someone more familiar with regexes could confirm or correct me.
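The breakdown above can be written directly into the pattern. A minimal sketch, assuming Python's re module (the thread later confirms that is what is in use): re.VERBOSE ignores whitespace in the pattern and allows # comments, so each piece carries its own note. The name GEOMETRY is made up for illustration.

```python
import re

# Same pattern as in the question, spread out with re.VERBOSE so that
# every piece can be commented. Whitespace in the pattern is ignored.
GEOMETRY = re.compile(r"""
    ^             # start of the string
    (\d+)         # group 1: one or more digits
    x             # a literal letter x
    (\d+)         # group 2: one or more digits
    (?:           # open a non-capturing group (gets no group number)
        \+(\d+)   # a literal +, then digits remembered as group 3
    )?            # close it; the whole "+digits" piece is optional
    (?:\+(\d+))?  # the same again, digits remembered as group 4
    $             # end of the string
""", re.VERBOSE)

print(GEOMETRY.match("123x456+7+89").groups())  # ('123', '456', '7', '89')
print(GEOMETRY.match("123x456+7").groups())     # ('123', '456', '7', None)
print(GEOMETRY.match("123y456"))                # None
```

There are always four entries in groups(), because the pattern has four capturing groups; the optional ones that did not take part in the match come back as None.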

  3. #3
    Join Date
    Jun 2016
    Beans
    2,831
    Distro
    Xubuntu

    Re: can someone explain this regular expression?

    The Cog is correct.

    Depending on the context of that regex, ^ and $ might only match start and end of the entire string, respectively, rather than start and end of one line.
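To illustrate the point about ^ and $, here is a small sketch assuming Python's re module. The sample text and the name PATTERN are invented for the example; it uses a shortened version of the pattern to keep the output easy to read.

```python
import re

PATTERN = r"^(\d+)x(\d+)$"
text = "junk\n600x300\nmore junk"

# Default: ^ matches only at the very start of the string, so the
# geometry on the second line is not found.
print(re.search(PATTERN, text))                          # None

# With re.MULTILINE, ^ and $ also match at the start and end of each line.
print(re.search(PATTERN, text, re.MULTILINE).groups())   # ('600', '300')

# Even without MULTILINE, $ still matches just before one trailing newline.
print(bool(re.match(PATTERN, "600x300\n")))              # True

# re.fullmatch requires the whole string to match, newline included.
print(re.fullmatch(r"(\d+)x(\d+)", "600x300\n"))         # None
```

If a trailing newline must be rejected, re.fullmatch (or \Z in place of $) is one way to do it.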
    Xubuntu 22.04, ArchLinux ♦ System76 hardware, virt-manager/KVM, VirtualBox
    When your questions are resolved to your satisfaction, please use Thread Tools > "Mark this thread as solved..."

  4. #4
    Join Date
    Jan 2010
    Location
    Wheeling WV USA
    Beans
    2,023
    Distro
    Xubuntu 20.04 Focal Fossa

    Re: can someone explain this regular expression?

    i'm more curious what the question marks are doing and if the string "600x300" (without the position values) is supposed to be able to match. i was thinking the ? after that big group makes it an optional match. i tried to change it and it didn't work at all so i must have messed it up.

    i'd like to know how to make 2 or more expressions where exactly one of them, either or any one, must match.
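On the "either one or the other" question: the usual tool is alternation with |, wrapped in a group so the anchors apply to both alternatives. The following is only a sketch under assumptions, using Python's re module; the two alternatives (a WIDTHxHEIGHT size, or a +X+Y offset) and the group names w, h, x, y are invented for illustration and are not from the original poster's program.

```python
import re

# Hypothetical example: accept EITHER "WIDTHxHEIGHT" OR "+X+Y".
# The (?: ... | ... ) group holds the two alternatives; ^ and $ sit
# outside it so they apply to whichever alternative is tried.
EITHER = re.compile(
    r"^(?:(?P<w>\d+)x(?P<h>\d+)|\+(?P<x>\d+)\+(?P<y>\d+))$"
)

for s in ("600x300", "+10+20", "600x300+10+20"):
    m = EITHER.match(s)
    print(s, "->", m.groupdict() if m else None)
```

Because the two alternatives here cannot both match the same anchored string, a successful match means exactly one of them matched. The groups belonging to the alternative that was not used come back as None, so checking which names are not None tells you which form was seen.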

  5. #5
    Join Date
    Feb 2007
    Location
    Romania
    Beans
    Hidden!

    Re: can someone explain this regular expression?

    What kind of regexen are we talking about?

  6. #6
    Join Date
    Jan 2010
    Location
    Wheeling WV USA
    Beans
    2,023
    Distro
    Xubuntu 20.04 Focal Fossa

    Re: can someone explain this regular expression?

    supposedly pcre compatible; it's the kind used by python3's module named re.

  7. #7
    Join Date
    Nov 2007
    Location
    London, England
    Beans
    7,701

    Re: can someone explain this regular expression?

    The question marks are in two places.
    The one in (?: starts a group, but the ?: says it's non-capturing, i.e. it doesn't get listed in the list of group contents.
    The one after each group says that the preceding item is optional: it might exist or might not, but either way the match is successful.
    This whole piece (?:\+(\d+))? describes a group containing a + followed by some digits. Ignore the + (it's a non-capturing group) but remember the digits (an inner group). The trailing question mark says the whole group may or may not exist, and its absence does not cause the match to fail.

    "600x300" should match. The two optional groups are not present, so their entries in the list of groups will be None. The list still has four items, so reading the third or fourth group is safe, but beware of a TypeError if you pass one of them straight to something like int() without checking for None first.

    Code:
    #!/usr/bin/python3

    import re

    # Raw string, so that \d and \+ reach the regex engine unchanged.
    pattern = r'^(\d+)x(\d+)(?:\+(\d+))?(?:\+(\d+))?$'

    match = re.match(pattern, '300x600+123+456')
    print(match.groups())
    match = re.match(pattern, '300x600')
    print(match.groups())
    produces
    Code:
    ('300', '600', '123', '456')
    ('300', '600', None, None)
    Thanks to halogen2 for confirming what I thought it meant.
    Last edited by The Cog; September 15th, 2019 at 11:19 AM.
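Following on from the None entries above, one way to turn the groups into numbers without tripping over the missing ones. This is a sketch, not part of the original program; the function name parse_geometry and the choice to return None for a non-matching string are assumptions made for the example.

```python
import re

GEOMETRY = re.compile(r"^(\d+)x(\d+)(?:\+(\d+))?(?:\+(\d+))?$")

def parse_geometry(text):
    """Return (width, height, x, y) as ints; x and y are None when absent.

    Returns None if the text is not a geometry string at all.
    """
    m = GEOMETRY.match(text)
    if m is None:
        return None
    # Optional groups that did not take part in the match are None,
    # so only convert the ones that are actually present.
    return tuple(int(g) if g is not None else None for g in m.groups())

print(parse_geometry("300x600+123+456"))  # (300, 600, 123, 456)
print(parse_geometry("300x600"))          # (300, 600, None, None)
print(parse_geometry("300 by 600"))       # None
```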

  8. #8
    Join Date
    Sep 2019
    Beans
    7

    Re: can someone explain this regular expression?

    I use this tool quite often when dealing with regex. It helps you visualise the result, and if you paste your regex in at the bottom you have a step by step explanation/breakdown: https://regexr.com/

  9. #9
    Join Date
    Jan 2010
    Location
    Wheeling WV USA
    Beans
    2,023
    Distro
    Xubuntu 20.04 Focal Fossa

    Re: can someone explain this regular expression?

    Quote: Originally posted by ryansenn
    I use this tool quite often when dealing with regex. It helps you visualise the result, and if you paste your regex in at the bottom you have a step by step explanation/breakdown: https://regexr.com/
    that is an awesome tool. thanks very much.
