Thread: Using Bash to convert upper to lower case with accented letters

  1. #1
    Join Date
    Feb 2007
    Location
    Everywhere
    Beans
    1,529
    Distro
    Ubuntu Development Release

    Using Bash to convert upper to lower case with accented letters

    How is it done?

    Code:
    echo "Prélude À L'Après-Midi D'Un Faune" | tr [:upper:] [:lower:]
    prélude À l'après-midi d'un faune
    As you can see, the À is still capitalized because tr doesn't seem to handle accented letters.
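
A likely explanation, assuming GNU tr processes its input byte by byte rather than character by character: in UTF-8, À is stored as two bytes, and neither of them is an ASCII uppercase letter. The sketch below only illustrates that difference in Python; it does not reproduce tr itself.

```python
# Byte-wise versus character-wise lowercasing of the same string.
text = "Prélude À L'Après-Midi D'Un Faune"
raw = text.encode("utf-8")

# bytes.lower() only maps the ASCII letters A-Z, much like a byte-oriented tool.
bytewise = raw.lower().decode("utf-8")

# str.lower() works on Unicode code points, so it knows that À maps to à.
charwise = text.lower()

print("À".encode("utf-8"))  # the two bytes that make up À
print(bytewise)
print(charwise)
```

The byte-wise result keeps the capital À, which matches the tr output above, while the character-wise result lowercases it.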

    An even better idea would be to convert all accented letters to lower case without the accents at all.

    Anyone know how? To do it with tr is quite lengthy.
    "Knowledge is power. Who said that?" - Dave Lister

  2. #2
    Join Date
    Feb 2006
    Location
    Vancouver, BC, Canada
    Beans
    318

    Re: Using Bash to convert upper to lower case with accented letters

    Python 3 has very good native Unicode support.
    How about this?

    Code:
    fhanson@fhanson:/tmp$ sudo aptitude install python3
    ...
    
    fhanson@fhanson:/tmp$ echo "Prélude À L'Après-Midi D'Un Faune" | python3 -c 'import sys; print( sys.stdin.read().lower() )'
    prélude à l'après-midi d'un faune

  3. #3
    Join Date
    Feb 2007
    Location
    Everywhere
    Beans
    1,529
    Distro
    Ubuntu Development Release

    Re: Using Bash to convert upper to lower case with accented letters

    Nice one. Can it be used for removing the accents?
    "Knowledge is power. Who said that?" - Dave Lister

  4. #4
    Join Date
    Feb 2006
    Location
    Vancouver, BC, Canada
    Beans
    318

    Re: Using Bash to convert upper to lower case with accented letters

    Hmm. I'm sure it can but I don't know how.

    I got this far:

    Code:
    fhanson@fhanson:/tmp$ echo "Prélude À L'Après-Midi D'Un Faune" | python -c 'import sys, unicodedata; print unicodedata.normalize("NFKD", unicode(sys.stdin.read()).encode("ASCII","replace") )'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)
    Perhaps some of the local python gurus can fix it. I'm out of time.
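
For reference, the traceback comes from Python 2's unicode() decoding the input bytes as ASCII by default. A minimal Python 3 sketch of the same idea follows; it assumes the text is already a str (as it is when read from sys.stdin in a UTF-8 locale), and strip_accents is a name made up for this example.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining marks (accents) removed."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Prélude À L'Après-Midi D'Un Faune").lower())
```

Two caveats: NFKD also rewrites compatibility characters such as ligatures, and letters that have no decomposition (for example ø) pass through unchanged.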

  5. #5
    Join Date
    Mar 2008
    Beans
    4,714
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: Using Bash to convert upper to lower case with accented letters

    Code:
    apt-cache search accent
    yields
    Code:
    unaccent - Replace accented letters by their unaccented equivalent
    Code:
    apt-cache show unaccent
    yields
    Code:
    Description: Replace accented letters by their unaccented equivalent
     read data from stdin, replace accented letters by their unaccented
     equivalent and write the result on stdout.
    Code:
    apt-file list unaccent
    yields
    Code:
    unaccent: /usr/bin/unaccent

  6. #6
    Join Date
    Feb 2007
    Location
    Everywhere
    Beans
    1,529
    Distro
    Ubuntu Development Release

    Re: Using Bash to convert upper to lower case with accented letters

    Who'd have thought the package would be called unaccent?

    Anyway, so:
    Code:
    echo "Prélude À L'Après-Midi D'Un Faune" | unaccent UTF-8 | tr [:upper:] [:lower:]
    prelude a l'apres-midi d'un faune
    ...pretty much does the job.

    Thanks all.
    "Knowledge is power. Who said that?" - Dave Lister

  7. #7
    Join Date
    Mar 2012
    Beans
    1

    Re: Using Bash to convert upper to lower case with accented letters

    Quote: Originally posted by phenest
    To do it with tr is quite lengthy.
    Not that lengthy:

    Code:
    tr 'A-Z' 'a-z'

  8. #8
    Join Date
    May 2006
    Beans
    1,790

    Re: Using Bash to convert upper to lower case with accented letters

    Quote: Originally posted by PierreJourlin
    Not that lengthy:

    Code:
    tr 'A-Z' 'a-z'
    You forgot the accented letters.

    And the original poster may not read this thread anymore.
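
For comparison, here is roughly what the "lengthy" explicit-list approach looks like when written out, sketched in Python rather than tr. The character lists are an illustrative assumption covering only some common French letters, and fold is a name made up for this example.

```python
# Explicit mapping from accented lowercase letters to their plain equivalents.
# Both strings must have the same length; each character maps by position.
ACCENTED = "àâäéèêëîïôöùûüç"
PLAIN = "aaaeeeeiioouuuc"
TABLE = str.maketrans(ACCENTED, PLAIN)


def fold(text: str) -> str:
    """Lowercase text, then replace the listed accented letters."""
    # Lowercasing first means only the lowercase forms need to be listed.
    return text.lower().translate(TABLE)


print(fold("Prélude À L'Après-Midi D'Un Faune"))
```

Any letter missing from the list is left as it is, which is the main drawback of this approach.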
