View Single Post
  #5   Report Post  
Tushar Mehta
 
Posts: n/a
Default

My best guess (and it is only a guess) is that the problem occurs
because of floating point limitations and is exacerbated by something
in the FREQUENCY function. I did the following tests:

Generated 101 numbers in A1:A101 between 0 and 10. Entered 0 in A1,
=A1+0.1 in A2 and copied A2 down as needed.

In C1 entered the formula =A1+1. C2 contained =C1+1 and copied C2 to
C3:C10.

D1 contained the array formula =FREQUENCY(A1:A101,C1:C10).

E1 contained the formula =COUNTIF($A$1:$A$101,"<="&C1)-COUNTIF($A$1:$A
$101,"<="&(C1-1)). E1 was copied down to E2:E10.

When A1 contains a zero, the COUNTIF formulas returned the correct
results, whereas FREQUENCY returned the error you found.

Changing A1 to 60 caused the FREQUENCY function to report 10 in the 61
bin and 11 in the 65 bin. But, now COUNTIF function reported 9 in the
64 bin and 11 in the 65 bin.

Setting the value in A1 to different starting values results in a
shifting error pattern in the results of the 2 functions.

However, change the A column so that the numbers are generated with the
formula =$A$1+0.1*(ROW()-1) and all the results from both FREQUENCY and
the COUNTIF are always correct.

--
Regards,

Tushar Mehta
www.tushar-mehta.com
Excel, PowerPoint, and VBA add-ins, tutorials
Custom MS Office productivity solutions

In article ,
says...
Perhaps I was not clear in my original posting, but the data is the following
set of 101 data elements: {0.0, 0.1, 0.2, 0.3, ..., 0.9, 1.0, 1.1, ... 1.9,
2.0, 2.1, ...,2.9, 3.0, 3.1, ...3.9, 4.0, 4.1, ... 4.9, 5.0, 5.1,...5.9, 6.0,
6.1, ...6.9, 7.0, 7.1,...7.9, 8.0, 8.1,...8.9, 9.0, 9.1,... 9.9, 10.0} and
the bins are {1,2,3,4,5,6,7,8,9,10}. Midpoints is not relevant to the
problem. The problem is that theoretically, the calculated bin frequencies
(beginning with bin 1) should be {11,10,10,10,10,10,10,10,10,10}; instead,
these calculated bin frequencies are {11,9,10,10, 11, 10,10,10,10,10}

"Tushar Mehta" wrote:

Since you don't share the raw data, it is hard to know what the
frequency function is doing wrong. However, it, by *definition*
considers the bin value as the upper bound for the bin and not the mid-
point. So, if that is the problem you allude to, it is an XL design
feature. If it is something else, maybe you can share the data for 1-
to-5 so that it becomes easier to understand how the FREQUENCY function
is getting it wrong.

--
Regards,

Tushar Mehta
www.tushar-mehta.com
Excel, PowerPoint, and VBA add-ins, tutorials
Custom MS Office productivity solutions

In article ,
says...
I am using Excel 2002. There is a problem with the frequency function, which
is used for histograms. The frequency counts for some (not all) bins is
incorrect. The case that I learned of in another posting involves 101 data
values: 0, 10, and values in between that are created by using an increment
of 0.1. The frequency value for the first bin is (correctly) 11, and most of
the other bins have the correct frequency (10). However, the bin
representing the interval from 1 to 2 has a frequency of 9, which the bin
representing the interval from 4 to 5 has a frequency of 11! I have also
used the "Better Histrogram" downloadable file for this same problem, but
apparently the "Better Histogram" procedure starts with the Frequency
function, as it ends up with a histogram with the same results. Any
suggestions for correctly calculating all bin frequencies?