SNYP40A1

November 6th, 2010, 09:14 PM

I have a large collection of floating-point numbers, tens of thousands of values, that describe a distribution. I want to build a model of this distribution so that when I see a new number, I can determine its percentile relative to the distribution. The Apache Math library has a distribution class that does this, but it involves storing every value of the distribution in memory, sorting them, and then doing a binary search to find the new number's position in the array; the exact percentile is then inferred from that index and the array size. So the question I am trying to answer is:

Given a set of floating-point numbers, at what percentile does a given number rank in that set? I don't need to know that the given number is exactly at the 56.9883 percentile; I just need to know that it's somewhere around the 50th-60th percentile. Is there a Java library that can do this? If I were going to do this myself, I would simply put all the values in an array, sort the array, and then record the value at, say, every 5th percentile:

for (int i = 0; i < 20; i++)
{
    // distribution is the sorted list; integer math keeps the index an int
    record[i] = distribution.get(i * distribution.size() / 20);
}
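
To make that concrete, here is a rough sketch of the roll-your-own version I have in mind, including the lookup step. The class name PercentileSketch and the method percentileBand are names I made up for illustration, not part of any library, and I'm assuming it's fine to sort a copy of the data once up front and keep only the breakpoints afterwards.

```java
import java.util.Arrays;

// Hypothetical helper (not from any library): keeps only the values at
// evenly spaced percentiles and answers "which band does x fall in?".
final class PercentileSketch {
    private final double[] breakpoints; // values at 0%, step%, 2*step%, ...
    private final double step;          // band width in percent, e.g. 5.0

    PercentileSketch(double[] values, int bands) {
        if (values.length == 0 || bands <= 0) {
            throw new IllegalArgumentException("need data and at least one band");
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        breakpoints = new double[bands];
        step = 100.0 / bands;
        for (int i = 0; i < bands; i++) {
            // integer arithmetic avoids rounding surprises in the index
            int index = (int) ((long) i * sorted.length / bands);
            breakpoints[i] = sorted[index];
        }
    }

    // Returns the lower edge of the band containing x,
    // e.g. 55.0 means "somewhere between the 55th and 60th percentile".
    double percentileBand(double x) {
        int lo = 0;
        int hi = breakpoints.length;
        // binary search for the first breakpoint strictly greater than x
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (breakpoints[mid] <= x) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // values below the minimum are clamped into the lowest band
        return Math.max(lo - 1, 0) * step;
    }
}
```

With 20 bands, new PercentileSketch(values, 20).percentileBand(x) only has to hold 20 doubles after construction, which is the whole point: the full data set is needed once for the sort and can be discarded afterwards.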

Kinda hoping not to reinvent the wheel.