ExcelBanter - View Single Post

jimbo

As an example, here are seven patients. On different months their
phosphorus dropped, some significantly, some less so. With some
patients, e.g. p1 and p2, the drop was significant, and the subsequent
month, the phosphorus value went back to its previous level. In one
patient, p3, the drop wqas sustained.

jul sep oct nov dec
p1 5.9 6.0 6.1 1.5 6.2
p2 6.1 0.9 6.9 6.8 6.9
p3 4.1 3.9 4.1 2.8 3.2
p4 4.9 4.9 4.0 3.7 4.3
p5 3.9 3.9 2.7 3.9 3.8
p7 5.6 3.6 5.6 5.6 5.0
p8 5.3 6.2 3.5 5.6 5.9

So, I try and run some manner of statistics on the data looking for 1)
the mean of the five months, MIN, SD, and calculate the Z score
((mean-MIN)/SD), and the difference between the previous and subsequent
value.

mean min sd z delta delta%
p1 5.1 1.5 2.0 1.8 0.10 2%
p2 5.5 0.9 2.6 1.8 0.80 13%
p3 3.6 2.8 0.6 1.4 -0.90 -28%
p4 4.4 3.7 0.5 1.2 0.30 8%
p5 3.6 2.7 0.5 1.8 0.00 0%
p7 5.1 3.6 0.9 1.7 0.00 0%
p8 5.3 3.5 1.1 1.7 -0.60 -10%

The problem is that I have 12 months of cumulative data on some 50,000
patients, and each patient's MIN occurs anywhere in that 12 month range
(I could only show five months because of the limitation of the email).
What I currently try and do is to sort each month and look for the
lowest values for each month. But this is extremly time consuming and
anytime something takes that long in Excel, I know that there is a
differrent way to approach the data.

So, I am looking for some formula that I could add to the above,
indicating the MIN and the previous and subsequent values to the MIN
value.

Thank yiou for your response. I could not get your suggestion to work,
but assume that it is because of my limited understanding of Excel, or
it was inadequate explanation of my problem.

Thanks for your time.

Niek Otten wrote:
Use

=MATCH(MIN(A1:A50000),A1:A50000,0) to locate the row where the MIN is. You can then use INDEX to retrieve cells from that row.
The MATCH() may be time-consuming, so do it only once, store the result in a cell and use that cell in several INDEX() formulas;
INDEX() is very fast.
--
Kind regards,

Niek Otten
Microsoft MVP - Excel

"jimbo" wrote in message ps.com...
|I have 50,000 rows of 12 values in each column. I am looking for low
| outliers, so I use MIN across each row and get the lowest value in each
| group of 12 values. From there I calculate the mean, then the SD, and
| get a Z score for each row.
|
| My problem is that if cell d4 is the MIN of the 12 values in that row,
| I need to know what the values were in C4 and E4, because if C4 is
| approximately equal to E4, and the Z score is high, then D4 is an
| outlier. In a clinical setting though, it's important that the
| previous and subsequent values are approximately equal.
|
| Right now I do this manually. I find the MIN in the group of 12 data
| points and copy the three values into seperate rows so That I can apply
| formulas to the data. But I know that anytime I have to do that much
| manual minipulation of data, I'm doing something wrong.
|
| Any assistance would be appreciated.
|