Thread: simple character count script

  1. #1
    Join Date
    Mar 2008
    Location
    Copenhagen Denmark
    Beans
    722
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    simple character count script

    I want to make a Faroese Dvorak keyboard layout, but I can't find any list that tells me which characters are used most. Is there a way to write a bash script that takes a large amount of text, counts the instances of each letter, and maybe sorts them by frequency (not case sensitive)? Bear in mind that Faroese contains special letters like á, ð, í, ó, ú, ý, æ and ø.
    Last edited by jakupl; March 30th, 2009 at 11:24 PM.
    Ubuntu 10.10 Maverick | ASUS A6Rp | Intel(R) Celeron(R) M CPU 420 @ 1.60GHz | 4 GB ram |
    Graphic Card: ATI Technologies inc RC410 [Radeon Xpress 200M]

  2. #2
    Join Date
    Mar 2008
    Location
    Copenhagen Denmark
    Beans
    722
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: simple character count script

    bump

  3. #3
    Join Date
    May 2006
    Beans
    1,790

    Re: simple character count script

    Quote Originally Posted by jakupl View Post
    I want to make a Faroese Dvorak keyboard layout, but I can't find any list that tells me which characters are used most. Is there a way to write a bash script that takes a large amount of text, counts the instances of each letter, and maybe sorts them by frequency (not case sensitive)? Bear in mind that Faroese contains special letters like á, ð, í, ó, ú, ý, æ and ø.
    I would do it in C or Perl. Does it have to be a shell script?

    Maybe this old thread will help: http://ubuntuforums.org/showthread.php?t=957610

  4. #4
    Join Date
    Sep 2006
    Beans
    2,914

    Re: simple character count script

    There are many such scripts lying around on the internet; you just have to search for them.
    Code:
    # FS="" makes each character its own field (an awk extension); i<=NF includes the last one
    awk 'BEGIN{FS=""}{for(i=1;i<=NF;i++)a[$i]++}END{for(o in a) {print a[o],o}}' file
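The one-liner above depends on an empty FS splitting each line into one field per character, which is a common awk extension rather than guaranteed POSIX behaviour. A minimal sketch of a more portable variant follows; it walks each line with substr(), folds case with tolower(), and sorts by count. The function name count_chars and reading from stdin are assumptions made for illustration, not part of the original post.

```shell
# count_chars (hypothetical name): read text on stdin and print
# "count character" pairs, most frequent first.
count_chars() {
    awk '{
        line = tolower($0)                 # fold case so "A" and "a" share a counter
        for (i = 1; i <= length(line); i++)
            count[substr(line, i, 1)]++    # one counter per character
    }
    END {
        for (c in count) print count[c], c
    }' | sort -rn                          # numeric sort, highest count first
}
```

Usage would be something like `count_chars < text.txt`. Spaces and tabs are still counted here, just as in the one-liner.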

  5. #5
    Join Date
    Mar 2008
    Location
    Copenhagen Denmark
    Beans
    722
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: simple character count script

    Quote Originally Posted by ghostdog74 View Post
    There are many such scripts lying around on the internet; you just have to search for them.
    Code:
    # FS="" makes each character its own field (an awk extension); i<=NF includes the last one
    awk 'BEGIN{FS=""}{for(i=1;i<=NF;i++)a[$i]++}END{for(o in a) {print a[o],o}}' file

    I tried to search, but I didn't find anything. Thanks, I will try this when I get home

  6. #6
    Join Date
    Mar 2008
    Location
    Copenhagen Denmark
    Beans
    722
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: simple character count script

    This works great. Now I am going to figure out how to get percentages and make it count the Faroese letters (á, ð, í, ó, ú, ý, æ and ø). I would also like the output sorted by frequency, most used first, and to exclude spaces, line breaks and tabs, as well as all the strange letters that are shown as a question mark.
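The follow-up wishes (percentages, sorting by frequency, skipping whitespace) can be sketched roughly as below. This is only one possible approach: the function name char_freq and the "count percent character" output layout are assumptions for illustration. The characters shown as question marks may be single bytes of multi-byte UTF-8 letters, which happens when the awk in use works on bytes; gawk in a UTF-8 locale treats them as whole characters, so the Faroese letters would then be counted like any other.

```shell
# char_freq (hypothetical name): read text on stdin and print
# "count percent character", most frequent first.
# Spaces, tabs and carriage returns are skipped; line breaks never
# reach the loop because awk strips them from each record.
char_freq() {
    awk '{
        line = tolower($0)                       # case-insensitive counting
        for (i = 1; i <= length(line); i++) {
            c = substr(line, i, 1)
            if (c == " " || c == "\t" || c == "\r")
                continue                         # ignore whitespace
            count[c]++
            total++                              # only counted characters
        }
    }
    END {
        for (c in count)
            printf "%d %.2f%% %s\n", count[c], 100 * count[c] / total, c
    }' | sort -rn                                # highest count first
}
```

It could be run as `char_freq < text.txt`. The percentages are relative to the characters that were actually counted, not to the whole file.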
