![]() |
Identifying Outliers
I have a population that I belive it will be represent a Gaussian (normal)
distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","")
You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
Thanks Dave,
This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
Well, FALSE is an improvement because it indicates the formula is calculating.
IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
It appears to be right. I don't know what is wrong. I am comparing:
=IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
It's hard to say what the issue is. What are the actual values in G3027,
etc. Are they formatted as numbers, as opposed to text? OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332) resolves to FALSE because neither calculation is true. Therefore your formula says: "If [calculation A] OR [calculation B] is correct, or both are correct, then consider it an outlier, else return an empty string." Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: It appears to be right. I don't know what is wrong. I am comparing: =IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
The following are the values and respective format I have on the cells:
G3027 value is 0.257 and Format Cell as Number using 3 decimal places. G3035 value is 0.528 and Format Cell as Number using 3 decimal places. G3040 value is 0.253 and Format Cell as Number using 3 decimal places. G3037 value is 0.781 and Format Cell as Number using 3 decimal places. Eugenio "Dave F" wrote: It's hard to say what the issue is. What are the actual values in G3027, etc. Are they formatted as numbers, as opposed to text? OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332) resolves to FALSE because neither calculation is true. Therefore your formula says: "If [calculation A] OR [calculation B] is correct, or both are correct, then consider it an outlier, else return an empty string." Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: It appears to be right. I don't know what is wrong. I am comparing: =IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
Using the number values instead of the cell references,
=IF(OR(0.257<(0.528-(3*0.253)),0.257(0.781+(3*0.253))),"outlier?","") resolves to a blank. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: The following are the values and respective format I have on the cells: G3027 value is 0.257 and Format Cell as Number using 3 decimal places. G3035 value is 0.528 and Format Cell as Number using 3 decimal places. G3040 value is 0.253 and Format Cell as Number using 3 decimal places. G3037 value is 0.781 and Format Cell as Number using 3 decimal places. Eugenio "Dave F" wrote: It's hard to say what the issue is. What are the actual values in G3027, etc. Are they formatted as numbers, as opposed to text? OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332) resolves to FALSE because neither calculation is true. Therefore your formula says: "If [calculation A] OR [calculation B] is correct, or both are correct, then consider it an outlier, else return an empty string." Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: It appears to be right. I don't know what is wrong. I am comparing: =IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
OK. What Excel version are you using? I still get the FALSE value as a
result. This is very strange. I am now thinking my Excel version is not reading correctly my formula. Is that possible? I am using the trial version of Excel 2007. Eugenio "Dave F" wrote: Using the number values instead of the cell references, =IF(OR(0.257<(0.528-(3*0.253)),0.257(0.781+(3*0.253))),"outlier?","") resolves to a blank. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: The following are the values and respective format I have on the cells: G3027 value is 0.257 and Format Cell as Number using 3 decimal places. G3035 value is 0.528 and Format Cell as Number using 3 decimal places. G3040 value is 0.253 and Format Cell as Number using 3 decimal places. G3037 value is 0.781 and Format Cell as Number using 3 decimal places. Eugenio "Dave F" wrote: It's hard to say what the issue is. What are the actual values in G3027, etc. Are they formatted as numbers, as opposed to text? OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332) resolves to FALSE because neither calculation is true. Therefore your formula says: "If [calculation A] OR [calculation B] is correct, or both are correct, then consider it an outlier, else return an empty string." Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: It appears to be right. I don't know what is wrong. I am comparing: =IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
I'm using XL 03.
Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: OK. What Excel version are you using? I still get the FALSE value as a result. This is very strange. I am now thinking my Excel version is not reading correctly my formula. Is that possible? I am using the trial version of Excel 2007. Eugenio "Dave F" wrote: Using the number values instead of the cell references, =IF(OR(0.257<(0.528-(3*0.253)),0.257(0.781+(3*0.253))),"outlier?","") resolves to a blank. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: The following are the values and respective format I have on the cells: G3027 value is 0.257 and Format Cell as Number using 3 decimal places. G3035 value is 0.528 and Format Cell as Number using 3 decimal places. G3040 value is 0.253 and Format Cell as Number using 3 decimal places. G3037 value is 0.781 and Format Cell as Number using 3 decimal places. Eugenio "Dave F" wrote: It's hard to say what the issue is. What are the actual values in G3027, etc. Are they formatted as numbers, as opposed to text? OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332) resolves to FALSE because neither calculation is true. Therefore your formula says: "If [calculation A] OR [calculation B] is correct, or both are correct, then consider it an outlier, else return an empty string." Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: It appears to be right. I don't know what is wrong. I am comparing: =IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
Do not retype Dave's formula; copy from his post and paste it into the
formula bar. The structure of Dave's formula is =IF(<condition,"outlier?","") the only possible return values (in any version of Excel) are therefore "outlier?" or "". For you to get FALSE, you must have mistyped the formula. Jerry "Quco" wrote: OK. What Excel version are you using? I still get the FALSE value as a result. This is very strange. I am now thinking my Excel version is not reading correctly my formula. Is that possible? I am using the trial version of Excel 2007. Eugenio "Dave F" wrote: Using the number values instead of the cell references, =IF(OR(0.257<(0.528-(3*0.253)),0.257(0.781+(3*0.253))),"outlier?","") resolves to a blank. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: The following are the values and respective format I have on the cells: G3027 value is 0.257 and Format Cell as Number using 3 decimal places. G3035 value is 0.528 and Format Cell as Number using 3 decimal places. G3040 value is 0.253 and Format Cell as Number using 3 decimal places. G3037 value is 0.781 and Format Cell as Number using 3 decimal places. Eugenio "Dave F" wrote: It's hard to say what the issue is. What are the actual values in G3027, etc. Are they formatted as numbers, as opposed to text? OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332) resolves to FALSE because neither calculation is true. Therefore your formula says: "If [calculation A] OR [calculation B] is correct, or both are correct, then consider it an outlier, else return an empty string." Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: It appears to be right. I don't know what is wrong. I am comparing: =IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
You shouldn't be getting FALSE from that formula, whatever the inputs,
because both outcomes of the IF are defined as strings. Did you literally copy and paste from the formula bar to the newsgroup posting, or might something have been mistyped? Try copying back from your posting below to the spreadsheet, or try copying =IF(OR(G3027<G3035-3*G3040,G3027G3037+3*G3040),"outlier?","") into your spreadsheet to check. -- David Biddulph "Quco" wrote in message ... It appears to be right. I don't know what is wrong. I am comparing: =IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
I HAVE FOUND THE CAUSE!!!
I did type the formulas, and then I did copied them from this discussion, but it was still not working. I then opened the same spreadsheet using another computer with Excel 2003. Excel 2003 does not show the formula as Excel 2007 does. Here's what it was causing the problem: Excel 2007 shows by default a one-line portion of the formula bar affecting the cell. There is an icon at the far right side of the formula type area. If you place the cursor there, a legend that says "Expand Formula Bar (Ctrl + Shift + U)" appears. Excel 2003 showed me the whole content of the formula bar when I opened the same spreadsheet. It shows the whole content of the formula bar area! Excel 2007 did not showed me the remaining portions of an older formula within that cell. That's why it was acting that way. Can you beleive this? Thanks all for your help! "Jerry W. Lewis" wrote: Do not retype Dave's formula; copy from his post and paste it into the formula bar. The structure of Dave's formula is =IF(<condition,"outlier?","") the only possible return values (in any version of Excel) are therefore "outlier?" or "". For you to get FALSE, you must have mistyped the formula. Jerry "Quco" wrote: OK. What Excel version are you using? I still get the FALSE value as a result. This is very strange. I am now thinking my Excel version is not reading correctly my formula. Is that possible? I am using the trial version of Excel 2007. Eugenio "Dave F" wrote: Using the number values instead of the cell references, =IF(OR(0.257<(0.528-(3*0.253)),0.257(0.781+(3*0.253))),"outlier?","") resolves to a blank. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: The following are the values and respective format I have on the cells: G3027 value is 0.257 and Format Cell as Number using 3 decimal places. G3035 value is 0.528 and Format Cell as Number using 3 decimal places. G3040 value is 0.253 and Format Cell as Number using 3 decimal places. G3037 value is 0.781 and Format Cell as Number using 3 decimal places. Eugenio "Dave F" wrote: It's hard to say what the issue is. What are the actual values in G3027, etc. Are they formatted as numbers, as opposed to text? OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332) resolves to FALSE because neither calculation is true. Therefore your formula says: "If [calculation A] OR [calculation B] is correct, or both are correct, then consider it an outlier, else return an empty string." Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: It appears to be right. I don't know what is wrong. I am comparing: =IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
Identifying Outliers
That's a weird one. I say you sue Microsoft.
-- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I HAVE FOUND THE CAUSE!!! I did type the formulas, and then I did copied them from this discussion, but it was still not working. I then opened the same spreadsheet using another computer with Excel 2003. Excel 2003 does not show the formula as Excel 2007 does. Here's what it was causing the problem: Excel 2007 shows by default a one-line portion of the formula bar affecting the cell. There is an icon at the far right side of the formula type area. If you place the cursor there, a legend that says "Expand Formula Bar (Ctrl + Shift + U)" appears. Excel 2003 showed me the whole content of the formula bar when I opened the same spreadsheet. It shows the whole content of the formula bar area! Excel 2007 did not showed me the remaining portions of an older formula within that cell. That's why it was acting that way. Can you beleive this? Thanks all for your help! "Jerry W. Lewis" wrote: Do not retype Dave's formula; copy from his post and paste it into the formula bar. The structure of Dave's formula is =IF(<condition,"outlier?","") the only possible return values (in any version of Excel) are therefore "outlier?" or "". For you to get FALSE, you must have mistyped the formula. Jerry "Quco" wrote: OK. What Excel version are you using? I still get the FALSE value as a result. This is very strange. I am now thinking my Excel version is not reading correctly my formula. Is that possible? I am using the trial version of Excel 2007. Eugenio "Dave F" wrote: Using the number values instead of the cell references, =IF(OR(0.257<(0.528-(3*0.253)),0.257(0.781+(3*0.253))),"outlier?","") resolves to a blank. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: The following are the values and respective format I have on the cells: G3027 value is 0.257 and Format Cell as Number using 3 decimal places. G3035 value is 0.528 and Format Cell as Number using 3 decimal places. G3040 value is 0.253 and Format Cell as Number using 3 decimal places. G3037 value is 0.781 and Format Cell as Number using 3 decimal places. Eugenio "Dave F" wrote: It's hard to say what the issue is. What are the actual values in G3027, etc. Are they formatted as numbers, as opposed to text? OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332) resolves to FALSE because neither calculation is true. Therefore your formula says: "If [calculation A] OR [calculation B] is correct, or both are correct, then consider it an outlier, else return an empty string." Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: It appears to be right. I don't know what is wrong. I am comparing: =IF(OR(G3027<(G3035-(3*G3040)),G3027(G3037+(3*G3040))),"outlier?","") =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") My formula vs. your notation for the formula I am using. The cells are all values resulting from other formulas on my spreadsheet; all are numbers. I was reading more about the OR function, it appears that the OR function always return a True or FALSE value. But how do I make understand Excel that it needs to treat the content of the OR formula as valid (in this case) so that the result is printed as "outlier?" ??? Maybe there is something I am missing... do you think is because my cells are not number-values but numbers resulting from other formulas? everything else seems to be working properly on my Excel spreadsheet. "Dave F" wrote: Well, FALSE is an improvement because it indicates the formula is calculating. IF statements work with the following logic: IF A, Then B, Else C. Stated another way: If A, value if A is true, value if A is false. Absent a value in the false argument, FALSE is returned. So: check your formula and the values in the cells that the formula refers to. Are the values in the cells numbers? Are they formatted as numbers? Do you have all your parentheses in order? Etc. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: Thanks Dave, This is is very interesting, now I am not getting the error message #VALUE!; now I am getting the word FALSE in the cell as a result instead getting the text "outlier?". "Dave F" wrote: This works for me: =IF(OR(D1<(A1-(3*B1)),D1(A2+(3*B1))),"outlier?","") You're missing many parentheses in your formula below; also the spaces make it hard to read what you're trying to do. See if my proposal works. Dave -- A hint to posters: Specific, detailed questions are more likely to be answered than questions that provide no detail about your problem. "Quco" wrote: I have a population that I belive it will be represent a Gaussian (normal) distribution if I am able to identify and remove the outliers properly. I now understand I cannot use the Grubb's Test as outlined in a couple of websites I found (see http://www.graphpad.com/articles/outlier.htm and http://www.itl.nist.gov/div898/softw...r/grubtest.htm) because I have many outliers and the distribution is not normal. Based on a discussion I found on this forums (subject: "how do I identify statistical outliers with excel", dated: 1/7/2007), I decided to use the the Interquartile Range method (IQR); but the following formula is giving me an annoying #VALUE! error: =if(or(D1 < A1 - 3*B1, D1 A2 + 3*B1), "outlier?", "") I modified that formula so that D1 is the value I am testing, A1 is the value of the 1st quartile, B1 is the value between the 3rd quartile minus the 1st quartile, and A2 is the value of the 3rd quartile. But I got the same error message. Then I removed the spaces in the formula, but I still get the same error. It is very frustating, and I don't know how to fix this. Any suggestions? |
All times are GMT +1. The time now is 04:00 AM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com