Thread: URGENT: UNIX command line to solve the median problem.

  1. #11
    Join Date
    Feb 2009
    Beans
    6

    Re: URGENT: UNIX command line to solve the median problem.

    Quote: Originally posted by gunksta
    R is easy to script and I often use it for small jobs simply because it's easy to use. I prefer to re-use tested code rather than over-think easy problems. I prefer:

    Code:
    median(k)
    over
    Code:
    set n=`wc input_file` && sort -n input_file | awk '{if(NR==(N+1)/2) print  }' N=$n[1]
    There is also the added advantage that I know R correctly handles lists with an even number of elements. With ad-hoc solutions (solutions not included in a distributed program or library) I would first have to read / understand the code to be sure that it operated correctly.

    If R is simply too much effort, I would also recommend googling for NumPy, which can also calculate the median. I didn't think of it last night when I wrote my original reply. Since it is a Python library, it might appeal more to people who believe R is too difficult to script (although I don't understand this at all). There is probably something similar for Perl, but I don't know Perl very well.

    EDIT: Regarding the thread referred to by ahmatti, I find it interesting that most of the responses are examples of how to use R to do the same thing. But I want to thank ahmatti for posting this thread here, because I have now learned about the "little r" program, which is designed to help you use R in a bash script. The number of uses for this little trick is incredible.
    gunksta,
    What is the R program? This is actually the first time I have heard of it. I hope you can point me to some sources or tutorials so I can understand it better. Thanks a lot for your help.

  2. #12
    Join Date
    Oct 2005
    Location
    Albany, NY
    Beans
    842
    Distro
    Ubuntu

    Re: URGENT: UNIX command line to solve the median problem.

    R is a terrific tool for data analysis, and it has a couple of options for scripting. The littler option mentioned previously is harder to use, because you have to install additional software on the system. I would not recommend it for right now. Instead, I would look at Rscript. To learn more, enter:

    Code:
    man Rscript
    after you have installed R. The man page is useful. If you have already started R, then you can use:

    Code:
    ?Rscript
    Your other option is to write an executable script that uses R, in the same way a Python script uses Python. Your script will need to start with something that looks like:

    Code:
    #! /usr/bin/Rscript
    args <- commandArgs(TRUE)
    
    # ... your R code goes here ...
    
    #q(status=<exit status code>)
    If you want to learn R, I would recommend two resources. First, _the_ PDF on R is:
    http://cran.r-project.org/doc/manuals/R-intro.pdf

    Secondly, there is the site Quick-R. This second resource is especially useful to anyone with a background in SPSS / PSPP.

    Since this _is_ a homework assignment, I don't want to give the whole shebang away, and I have technically told you how to write 90% of the script already. I'll let you read R-intro. It's boring but informative. The stuff in the very back discusses scripting R from the command line.

    If you have any specific questions, feel free to post them, but do be aware that the Ubuntu Forums do have a policy against answering homework questions directly/completely.
    Please Insert Funny Statement Here.

  3. #13
    Join Date
    Mar 2007
    Location
    Finland
    Beans
    256
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: URGENT: UNIX command line to solve the median problem.

    Patrick,
    Follow this link: http://sourceforge.net/projects/average/ (it's from the thread I pointed to).

    Gunksta,
    I don't think R is too difficult to script, and I use it daily, but for some problems command line tools are a lot faster to use. Especially when I just want to find simple things in large data files, I don't always want to wait for R to start and read in a whole 200 MB file just to find the average, etc.
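
    As an illustration of that point, here is a minimal sketch of a streaming average in awk. It assumes one number per line in the first column; the function name mean_file is made up for this example and is not the "average" tool linked above. awk only keeps a running total, so the file is never held in memory as a whole.

```shell
#!/bin/sh
# mean_file: hypothetical helper (name invented for this sketch).
# Assumes the first field of every line is a number.
mean_file() {
    awk '
        { sum += $1 }            # running total; one line in memory at a time
        END {
            if (NR > 0)          # avoid dividing by zero on empty input
                print sum / NR
        }' "$1"
}
```

    Usage would be something like: mean_file input_file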

  4. #14
    Join Date
    Mar 2007
    Beans
    275

    Re: URGENT: UNIX command line to solve the median problem.

    Patrick Chia,

    What do you mean, the command line stuff I suggested can't find the median, max or sum? It does on my computer, so I guess it will on yours. As I said, the complex line was in csh, not bash, and that was just used to get the number of lines into the variable $n. (Maybe you have to type "csh" first.)
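
    For anyone who would rather not switch to csh, the following is a rough sketch of a median in plain sh. It is not the line posted earlier in the thread: it counts the lines inside awk instead of using wc and $n, and it averages the two middle values when the count is even (the case gunksta raised). It assumes one number per line; median_file is a made-up name.

```shell
#!/bin/sh
# median_file: hypothetical helper (name invented for this sketch).
# Assumes one number per line; the sorted values are held in memory by awk.
median_file() {
    sort -n "$1" | awk '
        { v[NR] = $1 }                        # remember each sorted value
        END {
            if (NR == 0) exit 1               # empty input has no median
            if (NR % 2)                       # odd count: the middle value
                print v[(NR + 1) / 2]
            else                              # even count: mean of the two middle values
                print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}
```

    Usage would be something like: median_file input_file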

    The simple awk lines (for sum or max) will work in any shell.
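
    The original one-liners are on the previous page of the thread and are not repeated here. Purely as a sketch of what such awk lines can look like (assuming one number per line; sum_file and max_file are invented names), something along these lines should run in any Bourne-style shell:

```shell
#!/bin/sh
# sum_file and max_file: hypothetical helpers (names invented for this sketch).
# Both assume the first field of every line is a number.

# Sum of the first column; "+ 0" makes empty input print 0.
sum_file() {
    awk '{ s += $1 } END { print s + 0 }' "$1"
}

# Largest value in the first column; "+ 0" forces a numeric comparison.
max_file() {
    awk 'NR == 1 || $1 + 0 > m { m = $1 + 0 }
         END { if (NR > 0) print m }' "$1"
}
```

    Usage would be something like: sum_file input_file or max_file input_file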
