Comparison of Frequency Distributions for 2 populations (unpaired data)

**hubie** · May 28th, 2010

Originally Posted by ugm6hr

1. Can I get a jpg / png instead of pdf?

Easy. To see what types you can do, check out the help:

Code:

?png

That help page covers the bmp, jpeg, png, and tiff devices, which all work the same way.
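As a rough sketch of what that looks like in practice: open the device, draw, then close it. The file name and the simulated data below are placeholders, not anything from the original question.

```r
# Write a plot to a PNG file instead of a PDF.
# out_file and vals are made-up placeholders for illustration.
out_file <- file.path(tempdir(), "histogram.png")
set.seed(1)
vals <- rnorm(200, mean = 400, sd = 30)

png(out_file, width = 800, height = 600)  # open the PNG device
hist(vals, main = "", xlab = "Value")
dev.off()                                 # close the device so the file is written
```

Swapping png() for jpeg(), bmp(), or tiff() should follow the same pattern.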

Originally Posted by ugm6hr

2. Will it plot a normal distribution on top of each frequency-density plot using the calculated mean/sd?

Check out this paper on fitting distributions. In particular, look at page 16 (the code for that plot is on page 15).
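Separately from the paper, one common way to do this (a sketch with simulated data, not necessarily the method the paper uses) is to draw the histogram on the density scale and add the normal curve with the sample mean and sd:

```r
set.seed(1)
vals <- rnorm(200, mean = 400, sd = 30)   # placeholder data

# Compute the parameters first: inside curve(), `x` is the plotting grid,
# not the data.
m <- mean(vals)
s <- sd(vals)

# freq = FALSE puts the y-axis on the density scale, so dnorm() needs no rescaling
h <- hist(vals, freq = FALSE, main = "", xlab = "Value")
curve(dnorm(x, mean = m, sd = s), add = TRUE, col = "red", lwd = 2)
```

If you want counts on the y-axis instead, the curve has to be multiplied by the bin width and the number of observations.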

Originally Posted by ugm6hr

3. How can I control the width of the histogram bars (I would like them at 10, as my original - R seems to have used 20)?

Check out the help for the histogram function:

Code:

?hist

I believe the parameter you are interested in is "breaks," and some examples are given. Another helpful page is here.
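For a fixed bar width, a minimal sketch (simulated data again) is to pass breaks as a vector of cut points. A single number is only treated as a suggestion, which is probably why R picked 20.

```r
set.seed(1)
vals <- rnorm(200, mean = 400, sd = 30)   # placeholder data

# Round the data range outwards to multiples of 10 so every value falls
# inside the breaks (hist() stops with an error otherwise).
lo <- floor(min(vals) / 10) * 10
hi <- ceiling(max(vals) / 10) * 10

h <- hist(vals, breaks = seq(lo, hi, by = 10), main = "", xlab = "Value")
```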

**ugm6hr** · May 29th, 2010

Thanks all... I think I clearly need to start using R more! Apologies in advance for any future help I ask for.

My final script

Code:

# Import data in columns (header = top row)
b<-read.table('/media/sdhome/black.txt',header=TRUE,fill=TRUE)
w<-read.table('/media/sdhome/white.txt',header=TRUE,fill=TRUE)
bf<-read.table('/media/sdhome/blackf.txt',header=TRUE,fill=TRUE)
wf<-read.table('/media/sdhome/whitef.txt',header=TRUE,fill=TRUE)

# Divide the device into 4 plot areas stacked vertically
par(mfcol=c(4,1))

# Draw histogram b
hb<-hist(b$b,xlim=c(300,500),ylim=c(0,160),xlab='BM',main=' ',breaks=15,col="beige")

# Normal curve for b, scaled from density to counts (bin width * n)
bxfit<-seq(300,500,length.out=40)
byfit<-dnorm(bxfit,mean=mean(b$b),sd=sd(b$b))
byfit<-byfit*diff(hb$mids[1:2])*length(b$b)

# Normal curve for w on b's count scale (only used by the commented-out overlay)
wxfit<-seq(300,500,length.out=40)
wyfit<-dnorm(wxfit,mean=mean(w$w),sd=sd(w$w))
wyfit<-wyfit*diff(hb$mids[1:2])*length(b$b)

lines(bxfit, byfit, col="red", lty=1, lwd=2)
# lines(wxfit, wyfit, col="blue", lty=2, lwd=2)

# Draw 2nd histogram w
hw<-hist(w$w,xlim=c(300,500),ylim=c(0,160),xlab='WM',main=' ',breaks=15,col="beige")

# Normal curve for b on w's count scale (only used by the commented-out overlay)
bxfit<-seq(300,500,length.out=40)
byfit<-dnorm(bxfit,mean=mean(b$b),sd=sd(b$b))
byfit<-byfit*diff(hw$mids[1:2])*length(w$w)

# Normal curve for w, scaled from density to counts (bin width * n)
wxfit<-seq(300,500,length.out=40)
wyfit<-dnorm(wxfit,mean=mean(w$w),sd=sd(w$w))
wyfit<-wyfit*diff(hw$mids[1:2])*length(w$w)

# Draw N distributions
lines(wxfit, wyfit, col="blue", lty=1, lwd=2)
# lines(bxfit, byfit, col="red", lty=2, lwd=2)


# Draw histogram bf
hbf<-hist(bf$b,xlim=c(300,500),ylim=c(0,20),xlab='BF',main=' ',breaks=15,col="grey")

# Normal curve for bf, scaled from density to counts (bin width * n)
bfxfit<-seq(300,500,length.out=40)
bfyfit<-dnorm(bfxfit,mean=mean(bf$b),sd=sd(bf$b))
bfyfit<-bfyfit*diff(hbf$mids[1:2])*length(bf$b)

# Normal curve for wf on bf's count scale (only used by the commented-out overlay)
wfxfit<-seq(300,500,length.out=40)
wfyfit<-dnorm(wfxfit,mean=mean(wf$w),sd=sd(wf$w))
wfyfit<-wfyfit*diff(hbf$mids[1:2])*length(bf$b)

lines(bfxfit, bfyfit, col="brown", lty=1, lwd=2)
# lines(wfxfit, wfyfit, col="purple", lty=2, lwd=2)

# Draw histogram wf
hwf<-hist(wf$w,xlim=c(300,500),ylim=c(0,50),xlab='WF',main=' ',breaks=15,col="grey")

# Normal curve for bf on wf's count scale (only used by the commented-out overlay)
bfxfit<-seq(300,500,length.out=40)
bfyfit<-dnorm(bfxfit,mean=mean(bf$b),sd=sd(bf$b))
bfyfit<-bfyfit*diff(hwf$mids[1:2])*length(wf$w)

# Normal curve for wf, scaled from density to counts (bin width * n)
wfxfit<-seq(300,500,length.out=40)
wfyfit<-dnorm(wfxfit,mean=mean(wf$w),sd=sd(wf$w))
wfyfit<-wfyfit*diff(hwf$mids[1:2])*length(wf$w)

# Draw N distributions
lines(wfxfit, wfyfit, col="purple", lty=1, lwd=2)
# lines(bfxfit, bfyfit, col="brown", lty=2, lwd=2)

I then exported the result as SVG (RKWard allows this to be done with a few clicks) and edited the appearance in Inkscape for the final labels (excluded from the attached picture).
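For anyone adapting this, the four near-identical blocks could probably be folded into one helper. This is only a sketch: the function name hist_with_normal and the simulated data are made up for illustration and stand in for the columns read from the files above.

```r
# Hypothetical helper: histogram of counts with a normal curve scaled to match.
hist_with_normal <- function(vals, label, ymax, bar_col, line_col) {
  h <- hist(vals, xlim = c(300, 500), ylim = c(0, ymax),
            xlab = label, main = "", breaks = 15, col = bar_col)
  xfit <- seq(300, 500, length.out = 40)
  # density * bin width * n converts the curve to the count scale
  yfit <- dnorm(xfit, mean = mean(vals), sd = sd(vals)) *
    diff(h$mids[1:2]) * length(vals)
  lines(xfit, yfit, col = line_col, lty = 1, lwd = 2)
  invisible(h)   # return the histogram object without printing it
}

set.seed(1)
sim <- rnorm(500, mean = 400, sd = 25)   # simulated stand-in for b$b
hb <- hist_with_normal(sim, "BM", 160, "beige", "red")
```

Each of the four panels would then be a single call with its own data, label, y-limit, and colours.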

Thanks again!

**R33D3M33R** · February 18th, 2011

I need help with almost the same thing. I would like to plot two histograms with density curves (PDFs) on top of them, but I get the picture shown in the attachment. So, what's wrong with the PDFs? If I draw them on a separate graph, everything works. Thanks in advance for the help!

Here is the code:

Code:

par(mfcol=c(2,1))

h1 <- hist(Data$V1,scale="frequency", col="darkgray", xlab="Time per km")
.x <- seq(min(0), max(5), length=100)
g1 <- length(Data$V1)*dgamma(.x, shape=1.312491, scale=0.795368)
remove(.x)
lines(g1,col="red")

h2 <- hist(Data$V1,scale="frequency", col="darkgray", xlab="Time per km")
numSummary(Podatki[,"V1"], statistics=c("mean", "sd", "quantiles"), 
  quantiles=c(0,1))
.x <- seq(min(0), max(5), length=100)
g2 <- length(Data$V1)*dgamma(.x, shape=2.632, scale=1.59542)
remove(.x)
lines(g2,col="purple")

P.S. Attaching PNGs doesn't seem to work, so here is the picture:
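Judging only from the posted code, one likely cause is that lines(g1, ...) is called without the x values, so the curve is drawn against the index 1 to 100 rather than against .x. A count-scale curve also needs to be multiplied by the bin width, not just by the number of observations. Note too that base hist() has no scale argument (it appears to come from the Rcmdr Hist() wrapper). A hedged sketch with simulated data standing in for Data$V1:

```r
set.seed(1)
# Simulated stand-in for Data$V1, using the shape and scale from the post
v1 <- rgamma(300, shape = 1.312491, scale = 0.795368)

h1 <- hist(v1, col = "darkgray", xlab = "Time per km", main = "")
xs <- seq(0, max(h1$breaks), length.out = 100)

# count scale = density * number of observations * bin width
g1 <- dgamma(xs, shape = 1.312491, scale = 0.795368) *
  length(v1) * diff(h1$breaks[1:2])

lines(xs, g1, col = "red")   # pass the x values explicitly
```

The same change would apply to the second panel with its own shape and scale values.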