
Thread: python- re

  1. #1
    Join Date
    Jan 2007
    Location
    London, UK
    Beans
    3,525
    Distro
    Ubuntu Development Release

    python- re

    Guys, I need some help forming a regular expression in Python 2.7. Below are lines from a text file. I need a regular expression that replaces the name strings with blanks: I want to replace "JACK Peter", "MIKE MR" and "PATRICK SNOW" with " ", and it shouldn't replace 101, AGH or 630:

    Code:
    000185 JACK Peter     101     15107.50          -            -       15107.50	
    000481 MIKE MR        AGH     24790.51      8909.91          -       33700.42	
    000647 PATRICK SNOW   630     10244.54          -            -       10244.54

    Thanks
    You came empty handed, that is how you shall leave. Whatever you claim as yours today, belonged to someone else yesterday, will be someone else's tomorrow.

  2. #2
    Join Date
    Feb 2009
    Beans
    789
    Distro
    Ubuntu 10.04 Lucid Lynx

    Re: python- re

    I don't know about regular expressions, but what's the regularity here? Is it always the case that the second and third items need to be replaced? Or can there also be names that consist of one word, or of three or more words?

  3. #3
    Join Date
    Sep 2006
    Beans
    2,914

    Re: python- re

    Here's a reference. It's not Python, but you can adapt the regex pattern to Python's re.

    Code:
    $ cat file
    000185 JACK Peter     101     15107.50          -            -       15107.50
    000481 MIKE MR        AGH     24790.51      8909.91          -       33700.42
    000647 PATRICK SNOW   630     10244.54          -            -       10244.54
    
    $ sed -r 's/(.[^ \t]* )(.*)([ \t]+[0-9A-Z]{3}[ \t]+.*)/\1\3/' file
    000185  101     15107.50          -            -       15107.50
    000481  AGH     24790.51      8909.91          -       33700.42
    000647  630     10244.54          -            -       10244.54
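
One way to read the pattern above is as three capture groups, of which only the first and third are kept. The sketch below spells that out with Python's re.VERBOSE flag. The variable names (PATTERN, id_part, name_part, rest) are made up for illustration, and the literal space from the sed pattern is written as [ ] because verbose mode ignores bare spaces outside character classes.

```python
import re

# The sed pattern, split into its three groups with one comment per group.
# In verbose mode a bare space is ignored, so the literal space is written as [ ].
PATTERN = re.compile(r"""
    (.[^ \t]*[ ])                # group 1: leading ID plus the one space after it
    (.*)                         # group 2: the name; greedy, then backtracks
    ([ \t]+[0-9A-Z]{3}[ \t]+.*)  # group 3: the 3-character code and the rest of the line
""", re.VERBOSE)

line = "000481 MIKE MR        AGH     24790.51      8909.91          -       33700.42"
match = PATTERN.match(line)
id_part, name_part, rest = match.groups()

print(repr(id_part))    # the ID column
print(repr(name_part))  # the name, including most of its trailing padding
print(repr(rest))       # everything from the code column onwards
```

Because group 2 is greedy, it gives back only as much as group 3 needs, so the name group also swallows most of the padding after the name.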

  4. #4
    Join Date
    Jan 2007
    Location
    London, UK
    Beans
    3,525
    Distro
    Ubuntu Development Release

    Re: python- re

    Quote Originally Posted by ghostdog74
    Here's a reference. It's not Python, but you can adapt the regex pattern to Python's re.

    Nice work, will try that.

    Thanks

  5. #5
    Join Date
    Jan 2007
    Location
    London, UK
    Beans
    3,525
    Distro
    Ubuntu Development Release

    Re: python- re

    Quote Originally Posted by simeon87
    I don't know about regular expressions, but what's the regularity here? Is it always the case that the second and third items need to be replaced? Or can there also be names that consist of one word, or of three or more words?
    The regularity is: any word (no digits) longer than 3 letters, and the name can contain spaces.
    Last edited by ukripper; July 27th, 2010 at 03:04 PM.
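
A rule like this could also be encoded directly in the pattern. The following is only a sketch under assumptions taken from the three sample lines: the ID is all digits followed by one space, name words contain letters only and are separated by single spaces, and the code is three characters from 0-9 and A-Z. It does not enforce the "longer than 3 letters" part, since "MR" in the sample has only two. NAME and blank_name are hypothetical names.

```python
import re

# ID (digits + one space), then one or more letter-only words,
# followed (lookahead only) by padding and a 3-character code.
NAME = re.compile(r"^(\d+ )[A-Za-z]+(?: [A-Za-z]+)*(?=\s+[0-9A-Z]{3}\s)")

def blank_name(line):
    """Remove the name but keep the ID and everything after the name."""
    return NAME.sub(r"\1", line)

sample = [
    "000185 JACK Peter     101     15107.50          -            -       15107.50",
    "000481 MIKE MR        AGH     24790.51      8909.91          -       33700.42",
    "000647 PATRICK SNOW   630     10244.54          -            -       10244.54",
]

for line in sample:
    print(blank_name(line))
```

A line that does not fit these assumptions is returned unchanged, which makes mismatches easy to spot.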

  6. #6
    Join Date
    Jan 2007
    Location
    London, UK
    Beans
    3,525
    Distro
    Ubuntu Development Release

    Re: python- re

    Quote Originally Posted by ghostdog74
    Here's a reference. It's not Python, but you can adapt the regex pattern to Python's re.

    Sorry, the sed regex didn't work on my test string in Python.
    Here is the test case:
    Code:
    import re
    testString = '000185 JACK Peter          630     15107.50          -            -       15107.50'
    testReg = re.sub("s/(.[^ \t]* )(.*)([ \t]+[0-9A-Z]{3}[ \t]+.*)/\1\3/", " ",testString)
    print testReg
    Last edited by ukripper; July 27th, 2010 at 03:46 PM.

  7. #7
    Join Date
    Sep 2006
    Beans
    2,914

    Re: python- re

    That's because of the syntax, not the regex. Python's re has no s/// syntax, so remove those delimiters. I will give it to you this time.

    Code:
    >>> import re
    >>> testString = '000185 JACK Peter          630     15107.50          -            -       15107.50'
    >>> re.sub("(.[^ \t]* )(.*)([ \t]+[0-9A-Z]{3}[ \t]+.*)", "\\1\\3",testString)
    '000185  630     15107.50          -            -       15107.50'
    Please read up on regex and the Python docs from now on.
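
For anyone applying this to the whole file, here is a small sketch that runs the same pattern over the three sample lines from the first post. It uses raw strings instead of doubled backslashes, which is a common way to write the same thing. The names PATTERN, lines and cleaned are made up for illustration.

```python
import re

# Same pattern as above, written as a raw string so the backslashes
# reach the regex engine untouched.
PATTERN = re.compile(r"(.[^ \t]* )(.*)([ \t]+[0-9A-Z]{3}[ \t]+.*)")

lines = [
    "000185 JACK Peter     101     15107.50          -            -       15107.50",
    "000481 MIKE MR        AGH     24790.51      8909.91          -       33700.42",
    "000647 PATRICK SNOW   630     10244.54          -            -       10244.54",
]

# Keep group 1 (ID) and group 3 (code and amounts); drop group 2 (name).
cleaned = [PATTERN.sub(r"\1\3", line) for line in lines]
for line in cleaned:
    print(line)

# Why the earlier attempt needed "\\1\\3": in a normal string literal,
# "\1" is the control character chr(1), not a backreference.
plain_is_control_char = ("\1" == chr(1))
print(plain_is_control_char)
```

The same code should also run on Python 2.7, since each print call here takes a single argument.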

  8. #8
    Join Date
    Jan 2007
    Location
    London, UK
    Beans
    3,525
    Distro
    Ubuntu Development Release

    Re: python- re

    Quote Originally Posted by ghostdog74
    That's because of the syntax, not the regex. Python's re has no s/// syntax, so remove those delimiters. I will give it to you this time.

    Thank you very much. It seems to have worked.
    I have been reading the Python docs, but I am still at the learning stage, so hopefully I will get there someday and master the re module.
