# Thread: Comparison of Frequency Distributions for 2 populations (unpaired data)

1. ## Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

Originally Posted by ugm6hr
1. Can I get a jpg / png instead of pdf?
Easy. To see what types you can do, check out the help:
Code:
`?png`
The help page covers writing BMP, JPEG, PNG, and TIFF files.
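As a minimal sketch of what that help page describes: open a PNG device, draw the plot, then close the device. The simulated data and the file name `histogram.png` are placeholders for illustration, not anything from the original question.

```r
# Write a histogram to a PNG file instead of a PDF.
# The data and the output file name are made up for illustration.
set.seed(1)
x <- rnorm(200, mean = 400, sd = 30)

out_file <- file.path(tempdir(), "histogram.png")
png(out_file, width = 800, height = 600)  # open the PNG graphics device
hist(x, main = "", xlab = "x")            # everything drawn now goes to the file
dev.off()                                 # close the device so the file is written
```

Swapping `png()` for `jpeg()`, `bmp()`, or `tiff()` should work the same way, since they share the same help page.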
Originally Posted by ugm6hr
2. Will it plot a normal distribution on top of each frequency-density plot using the calculated mean/sd?
Check out this paper on fitting distributions. In particular, look at page 16 (the code for that plot is on page 15).
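The paper's own code is not reproduced here. One common way to get the same kind of plot, sketched below on simulated data, is to draw the histogram on the density scale and add `dnorm()` evaluated at the sample mean and standard deviation.

```r
# Overlay a normal curve on a density-scaled histogram.
# x is simulated; replace it with your own numeric vector.
set.seed(42)
x <- rnorm(500, mean = 400, sd = 30)

h <- hist(x, freq = FALSE, main = "", xlab = "x")  # freq = FALSE gives densities
xfit <- seq(min(x), max(x), length.out = 200)
yfit <- dnorm(xfit, mean = mean(x), sd = sd(x))    # curve from the calculated mean/sd
lines(xfit, yfit, col = "red", lwd = 2)
```

If you want the y axis to show counts instead, the curve has to be multiplied by the bin width and the number of observations.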
Originally Posted by ugm6hr
3. How can I control the width of the histogram bars (I would like them at 10, as my original - R seems to have used 20)?
Check out the help for the histogram function:
Code:
`?hist`
I believe the argument you are interested in is `breaks`; the help page gives some examples. Another helpful page is here.
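A short sketch of fixing the bar width at 10, using simulated data: when `breaks` is a single number, R treats it only as a suggestion, so passing an explicit vector of break points is the more reliable way to force a width.

```r
# Force histogram bars of width 10 by giving explicit break points.
# x is simulated; the 300-500 range just mirrors the plots in this thread.
set.seed(7)
x <- runif(300, min = 300, max = 500)

# Break points at multiples of 10 that cover the whole data range
brks <- seq(floor(min(x) / 10) * 10, ceiling(max(x) / 10) * 10, by = 10)
h <- hist(x, breaks = brks, main = "", xlab = "x")
```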

2. ## Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

Thanks all... I think I clearly need to start using R more! Apologies in advance for any future help I ask for.

My final script
Code:
```
# Import data in columns (header = top row)
# (import step not shown; the code below expects data frames b, w, bf and wf)

# Divide the plot area into 4 panels stacked vertically
par(mfcol=c(4,1))

# Draw histogram b
hb <- hist(b$b, xlim=c(300,500), ylim=c(0,160), xlab='BM', main=' ', breaks=15, col="beige")

# Normal curve for b, scaled to counts: density * bin width * n
bxfit <- seq(300, 500, length.out=40)
byfit <- dnorm(bxfit, mean=mean(b$b), sd=sd(b$b))
byfit <- byfit*diff(hb$mids[1:2])*length(b$b)

# Normal curve for w, scaled to histogram b (for the optional overlay)
wxfit <- seq(300, 500, length.out=40)
wyfit <- dnorm(wxfit, mean=mean(w$w), sd=sd(w$w))
wyfit <- wyfit*diff(hb$mids[1:2])*length(b$b)

# Draw N distributions
lines(bxfit, byfit, col="red", lty=1, lwd=2)
# lines(wxfit, wyfit, col="blue", lty=2, lwd=2)

# Draw 2nd histogram w
hw <- hist(w$w, xlim=c(300,500), ylim=c(0,160), xlab='WM', main=' ', breaks=15, col="beige")

# Normal curve for b, scaled to histogram w (for the optional overlay)
bxfit <- seq(300, 500, length.out=40)
byfit <- dnorm(bxfit, mean=mean(b$b), sd=sd(b$b))
byfit <- byfit*diff(hw$mids[1:2])*length(w$w)

# Normal curve for w, scaled to counts
wxfit <- seq(300, 500, length.out=40)
wyfit <- dnorm(wxfit, mean=mean(w$w), sd=sd(w$w))
wyfit <- wyfit*diff(hw$mids[1:2])*length(w$w)

# Draw N distributions
lines(wxfit, wyfit, col="blue", lty=1, lwd=2)
# lines(bxfit, byfit, col="red", lty=2, lwd=2)

# Draw histogram bf
hbf <- hist(bf$b, xlim=c(300,500), ylim=c(0,20), xlab='BF', main=' ', breaks=15, col="grey")

# Normal curve for bf, scaled to counts
bfxfit <- seq(300, 500, length.out=40)
bfyfit <- dnorm(bfxfit, mean=mean(bf$b), sd=sd(bf$b))
bfyfit <- bfyfit*diff(hbf$mids[1:2])*length(bf$b)

# Normal curve for wf, scaled to histogram bf (for the optional overlay)
wfxfit <- seq(300, 500, length.out=40)
wfyfit <- dnorm(wfxfit, mean=mean(wf$w), sd=sd(wf$w))
wfyfit <- wfyfit*diff(hbf$mids[1:2])*length(bf$b)

# Draw N distributions
lines(bfxfit, bfyfit, col="brown", lty=1, lwd=2)
# lines(wfxfit, wfyfit, col="purple", lty=2, lwd=2)

# Draw 2nd histogram wf
hwf <- hist(wf$w, xlim=c(300,500), ylim=c(0,50), xlab='WF', main=' ', breaks=15, col="grey")

# Normal curve for bf, scaled to histogram wf (for the optional overlay)
bfxfit <- seq(300, 500, length.out=40)
bfyfit <- dnorm(bfxfit, mean=mean(bf$b), sd=sd(bf$b))
bfyfit <- bfyfit*diff(hwf$mids[1:2])*length(wf$w)

# Normal curve for wf, scaled to counts
wfxfit <- seq(300, 500, length.out=40)
wfyfit <- dnorm(wfxfit, mean=mean(wf$w), sd=sd(wf$w))
wfyfit <- wfyfit*diff(hwf$mids[1:2])*length(wf$w)

# Draw N distributions
lines(wfxfit, wfyfit, col="purple", lty=1, lwd=2)
# lines(bfxfit, bfyfit, col="brown", lty=2, lwd=2)
```
I then exported the plot as SVG (RKWard allows this to be done with a few clicks), and edited the appearance with Inkscape for the final labels (excluded from the attached picture).
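The histogram-plus-curve steps in the script are repeated once per panel, so they could also be wrapped in a small function. The sketch below is only one possible way to do that: `hist_with_normal` is a hypothetical helper name, and the two vectors are simulated stand-ins for the real columns.

```r
# Hypothetical helper: draw a count-scale histogram and add a normal curve
# built from the sample mean and sd of the same data.
hist_with_normal <- function(x, xlim, ylim, xlab, hist_col, curve_col, breaks = 15) {
  h <- hist(x, xlim = xlim, ylim = ylim, xlab = xlab, main = " ",
            breaks = breaks, col = hist_col)
  xfit <- seq(xlim[1], xlim[2], length.out = 40)
  # density * bin width * n converts the density into expected counts per bin
  yfit <- dnorm(xfit, mean = mean(x), sd = sd(x)) * diff(h$mids[1:2]) * length(x)
  lines(xfit, yfit, col = curve_col, lty = 1, lwd = 2)
  invisible(list(hist = h, x = xfit, y = yfit))
}

# Simulated stand-ins for two of the data columns
set.seed(3)
bm <- rnorm(1000, mean = 400, sd = 30)
wm <- rnorm(1000, mean = 410, sd = 25)

par(mfcol = c(2, 1))
res_b <- hist_with_normal(bm, c(300, 500), c(0, 160), "BM", "beige", "red")
res_w <- hist_with_normal(wm, c(300, 500), c(0, 160), "WM", "beige", "blue")
```

Each call draws one panel, so four calls under `par(mfcol=c(4,1))` would give the same layout as the full script.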

Thanks again!

3. ## Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

I need help with almost the same thing. I would like to plot 2 histograms with their PDFs (probability density functions) drawn on top, but I get the picture shown in the attachment. So, what's wrong with the PDFs? If I draw them on a separate graph, everything works. Thanks in advance for the help!

Here is the code:

Code:
```
par(mfcol=c(2,1))

h1 <- hist(Data$V1,scale="frequency", col="darkgray", xlab="Time per km")
.x <- seq(min(0), max(5), length=100)
g1 <- length(Data$V1)*dgamma(.x, shape=1.312491, scale=0.795368)
remove(.x)
lines(g1,col="red")

h2 <- hist(Data$V1,scale="frequency", col="darkgray", xlab="Time per km")
numSummary(Podatki[,"V1"], statistics=c("mean", "sd", "quantiles"),
quantiles=c(0,1))
.x <- seq(min(0), max(5), length=100)
g2 <- length(Data$V1)*dgamma(.x, shape=2.632, scale=1.59542)
remove(.x)
lines(g2,col="purple")
```

P.S. Attaching PNGs doesn't seem to work, so here is the picture:

Last edited by R33D3M33R; February 18th, 2011 at 09:26 AM. Reason: reattached picture

4. ## Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

I don't know what's wrong, but if you can't figure it out, you could try ggplot2, which makes graphs in R a lot easier.

5. ## Re: Comparison of Frequency Distributions for 2 populations (unpaired data)

I solved this by passing the x values to `lines()` before removing `.x`. I changed:
Code:
```
remove(.x)
lines(g1,col="red")
```
to:

Code:
```
lines(.x,g1,col="red")
remove(.x)
```