Thread: Comparison of Frequency Distributions for 2 populations (unpaired data)

  1. #11
    Join Date
    Sep 2007
    Beans
    207

    Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

    Quote: Originally posted by ugm6hr
    1. Can I get a jpg / png instead of pdf?
    Easy. To see what types you can do, check out the help:
    Code:
    ?png
    That help page covers the bmp(), jpeg(), png(), and tiff() graphics devices.
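    As a rough sketch of what the help page describes: you open a bitmap device, draw the plot, and close the device to write the file. The data and the output path below are made up for illustration (tempfile() is used only so the sketch runs anywhere); substitute your own.

```r
# Illustrative data only; replace with your own vector
set.seed(1)
x <- rnorm(200, mean = 400, sd = 30)

# Hypothetical output location; use a real path such as "myplot.png" in practice
out_file <- tempfile(fileext = ".png")

# Open a PNG device, draw the plot, then close the device to write the file
png(filename = out_file, width = 800, height = 600)
hist(x, main = "", xlab = "Value")
dev.off()
```

    jpeg(), bmp(), and tiff() are called the same way, so switching format should only mean changing the function name and the file extension.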
    Quote: Originally posted by ugm6hr
    2. Will it plot a normal distribution on top of each frequency-density plot using the calculated mean/sd?
    Check out this paper on fitting distributions. In particular, look at page 16 (the code for that plot is on page 15).
    Quote: Originally posted by ugm6hr
    3. How can I control the width of the histogram bars (I would like them at 10, as my original - R seems to have used 20)?
    Check out the help for the histogram function:
    Code:
    ?hist
    I believe the parameter you are interested in is "breaks," and some examples are given. Another helpful page is here.
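    A minimal sketch of the "breaks" idea, assuming data in the 300 to 500 range as in your question (the data here are simulated, not yours). A single number for breaks is only treated as a suggestion, whereas a vector of cut points is used exactly, so that is the way to force bars of width 10.

```r
# Simulated stand-in data; replace with your own column
set.seed(1)
x <- rnorm(500, mean = 400, sd = 30)
x <- x[x >= 300 & x <= 500]   # hist() needs the breaks to span all values

# Explicit cut points every 10 units give bars exactly 10 wide
h <- hist(x, breaks = seq(300, 500, by = 10), plot = FALSE)

# Inspect the resulting bin widths
diff(h$breaks)
```

    Drop plot = FALSE (or call plot(h)) to draw the histogram; it is only switched off here so the bin widths can be checked without opening a graphics device.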

  2. #12
    Join Date
    Apr 2006
    Location
    UK
    Beans
    6,646
    Distro
    Ubuntu 12.04 Precise Pangolin

    Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

    Thanks all... I think I clearly need to start using R more! Apologies in advance for any future help I ask for.

    My final script
    Code:
    # Import data in columns (header = top row)
    b  <- read.table('/media/sdhome/black.txt',  header=TRUE, fill=TRUE)
    w  <- read.table('/media/sdhome/white.txt',  header=TRUE, fill=TRUE)
    bf <- read.table('/media/sdhome/blackf.txt', header=TRUE, fill=TRUE)
    wf <- read.table('/media/sdhome/whitef.txt', header=TRUE, fill=TRUE)
    
    # Divide into 4 plot areas vertically
    par(mfcol=c(4,1))
    
    # Draw a histogram of x and overlay a normal curve built from the sample
    # mean and sd. dnorm() returns a density, so it is multiplied by the bin
    # width and the number of observations to put it on the count scale.
    hist_with_normal <- function(x, xlab, ymax, col, line_col) {
      h <- hist(x, xlim=c(300,500), ylim=c(0,ymax), xlab=xlab, main=' ', breaks=15, col=col)
      xfit <- seq(300, 500, length=40)
      yfit <- dnorm(xfit, mean=mean(x), sd=sd(x)) * diff(h$mids[1:2]) * length(x)
      lines(xfit, yfit, col=line_col, lty=1, lwd=2)
      invisible(h)
    }
    
    # One panel per group: data, x label, y-axis maximum, bar colour, curve colour
    hb  <- hist_with_normal(b$b,  'BM', 160, "beige", "red")
    hw  <- hist_with_normal(w$w,  'WM', 160, "beige", "blue")
    hbf <- hist_with_normal(bf$b, 'BF', 20,  "grey",  "brown")
    hwf <- hist_with_normal(wf$w, 'WF', 50,  "grey",  "purple")
    I then exported the plot as SVG (RKWard lets you do this in a few clicks) and edited the appearance in Inkscape to add the final labels (these are not in the attached picture).

    Thanks again!
    Attached Images
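    For anyone reusing the script above, the step that multiplies the dnorm() values by the bin width and the number of observations is what puts the curve on the same scale as a count histogram. A small sketch of that scaling on simulated data (not the thread's data), assuming equal-width bins:

```r
# Simulated stand-in data
set.seed(42)
x <- rnorm(1000, mean = 400, sd = 30)

# Compute the histogram without drawing it yet
h <- hist(x, breaks = 15, plot = FALSE)
bin_width <- diff(h$breaks[1:2])   # assumes all bins have the same width

# dnorm() is a density (area 1); expected count per bin is density * bin width * n
xfit <- seq(min(h$breaks), max(h$breaks), length.out = 200)
yfit <- dnorm(xfit, mean = mean(x), sd = sd(x)) * bin_width * length(x)

# Draw the count histogram and overlay the scaled curve
plot(h, main = "", xlab = "x")
lines(xfit, yfit, col = "red", lwd = 2)
```

    An alternative that avoids the scaling altogether is to draw the histogram with freq = FALSE, so the bars are densities and the unscaled dnorm() curve can be overlaid directly.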

  3. #13
    Join Date
    Feb 2009
    Location
    Slovenija
    Beans
    Hidden!

    Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

    I need help with almost the same thing. I would like to plot two histograms with probability density functions (PDFs) drawn on top, but I get the picture shown in the attachment. So, what's wrong with the PDFs? If I draw them on a separate graph, everything works. Thanks in advance for the help!

    Here is the code:

    Code:
    par(mfcol=c(2,1))
    
    h1 <- hist(Data$V1,scale="frequency", col="darkgray", xlab="Time per km")
    .x <- seq(min(0), max(5), length=100)
    g1 <- length(Data$V1)*dgamma(.x, shape=1.312491, scale=0.795368)
    remove(.x)
    lines(g1,col="red")
    
    h2 <- hist(Data$V1,scale="frequency", col="darkgray", xlab="Time per km")
    numSummary(Podatki[,"V1"], statistics=c("mean", "sd", "quantiles"), 
      quantiles=c(0,1))
    .x <- seq(min(0), max(5), length=100)
    g2 <- length(Data$V1)*dgamma(.x, shape=2.632, scale=1.59542)
    remove(.x)
    lines(g2,col="purple")

    P.S.: attaching PNGs doesn't seem to work, so here is the picture:

    Attached Images
    Last edited by R33D3M33R; February 18th, 2011 at 09:26 AM. Reason: reattached picture

  4. #14
    Join Date
    Mar 2007
    Beans
    763

    Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

    I don't know what's wrong, but if you can't figure it out, you could try ggplot2, which makes graphs in R a lot easier.

  5. #15
    Join Date
    Feb 2009
    Location
    Slovenija
    Beans
    Hidden!

    Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

    I solved this by changing:
    Code:
    remove(.x)
    lines(g1,col="red")
    to:

    Code:
    lines(.x,g1,col="red")
    remove(.x)
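    Why this works, as far as I can tell: when lines() gets a single numeric vector, it plots the values against their index (1, 2, ..., 100) rather than against the intended x values (0 to 5), so the curve lands in the wrong place on the histogram's axis. The sketch below uses xy.coords(), the helper that lines() relies on to work out coordinates, to show the difference. The shape and scale are the ones from the post above; the variable names are just for illustration.

```r
# Grid of x values and the gamma density evaluated on it
x  <- seq(0, 5, length.out = 100)
g1 <- dgamma(x, shape = 1.312491, scale = 0.795368)

# One argument: the values become y, and x is taken to be the index 1..100
idx_coords <- xy.coords(g1)
range(idx_coords$x)

# Two arguments: the values are placed at the intended x positions
xy_coords <- xy.coords(x, g1)
range(xy_coords$x)
```

    Note also that the curve in the post is scaled by the number of observations only; on a count histogram it would need to be multiplied by the bin width as well, unless the bins happen to be 1 wide.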
