Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 119
Default Removing statistical outliers

I have a large range of data (thousands of rows) and I was performing the
functions average and max on cells that had time format 37:30:55. I realized
in the data there were a few records that had an abnormally large value in
them and from discussion they were due to unavoidable circumstances. There
are a few records in each data set I am working with which are skewing my
numbers (very large values). Is there a way to tell these functions (or use a
different function that returns the same results) to exclude outliers or any
outlier larger/smaller then what I specify?

I know you can use the average function, select a range, skip over the
outlier and select another range after the value, and repeat, but now that
the data has so many recrds it is getting to be a tiresome task with the new
data (thousands of additional records) every month that I put on a new tab.
It takes a lot of time to go through thousands of rows to find the outlier
and not select them in the average/max function.

  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 2,722
Default Removing statistical outliers

One way to discard outlier is to use an array* function similar to this:

=AVERAGE(IF(A2:A2000<100,A2:A2000))

This formula is designed to take the average of all number sin the range
A2:A2000 that are less than 100.

*Array functions are confirmed using Ctrl+Shift+Enter, not just Enter.
--
Best Regards,

Luke M
*Remember to click "yes" if this post helped you!*


"William" wrote:

I have a large range of data (thousands of rows) and I was performing the
functions average and max on cells that had time format 37:30:55. I realized
in the data there were a few records that had an abnormally large value in
them and from discussion they were due to unavoidable circumstances. There
are a few records in each data set I am working with which are skewing my
numbers (very large values). Is there a way to tell these functions (or use a
different function that returns the same results) to exclude outliers or any
outlier larger/smaller then what I specify?

I know you can use the average function, select a range, skip over the
outlier and select another range after the value, and repeat, but now that
the data has so many recrds it is getting to be a tiresome task with the new
data (thousands of additional records) every month that I put on a new tab.
It takes a lot of time to go through thousands of rows to find the outlier
and not select them in the average/max function.

  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 119
Default Removing statistical outliers

Awesome, thanks. Will this work for the MAX function too?

"Luke M" wrote:

One way to discard outlier is to use an array* function similar to this:

=AVERAGE(IF(A2:A2000<100,A2:A2000))

This formula is designed to take the average of all number sin the range
A2:A2000 that are less than 100.

*Array functions are confirmed using Ctrl+Shift+Enter, not just Enter.
--
Best Regards,

Luke M
*Remember to click "yes" if this post helped you!*


"William" wrote:

I have a large range of data (thousands of rows) and I was performing the
functions average and max on cells that had time format 37:30:55. I realized
in the data there were a few records that had an abnormally large value in
them and from discussion they were due to unavoidable circumstances. There
are a few records in each data set I am working with which are skewing my
numbers (very large values). Is there a way to tell these functions (or use a
different function that returns the same results) to exclude outliers or any
outlier larger/smaller then what I specify?

I know you can use the average function, select a range, skip over the
outlier and select another range after the value, and repeat, but now that
the data has so many recrds it is getting to be a tiresome task with the new
data (thousands of additional records) every month that I put on a new tab.
It takes a lot of time to go through thousands of rows to find the outlier
and not select them in the average/max function.

  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 119
Default Removing statistical outliers

Luke, how do I use time in that array? My time is format hh:mm:ss do I use
quotes, single quotes??

"Luke M" wrote:

One way to discard outlier is to use an array* function similar to this:

=AVERAGE(IF(A2:A2000<100,A2:A2000))

This formula is designed to take the average of all number sin the range
A2:A2000 that are less than 100.

*Array functions are confirmed using Ctrl+Shift+Enter, not just Enter.
--
Best Regards,

Luke M
*Remember to click "yes" if this post helped you!*


"William" wrote:

I have a large range of data (thousands of rows) and I was performing the
functions average and max on cells that had time format 37:30:55. I realized
in the data there were a few records that had an abnormally large value in
them and from discussion they were due to unavoidable circumstances. There
are a few records in each data set I am working with which are skewing my
numbers (very large values). Is there a way to tell these functions (or use a
different function that returns the same results) to exclude outliers or any
outlier larger/smaller then what I specify?

I know you can use the average function, select a range, skip over the
outlier and select another range after the value, and repeat, but now that
the data has so many recrds it is getting to be a tiresome task with the new
data (thousands of additional records) every month that I put on a new tab.
It takes a lot of time to go through thousands of rows to find the outlier
and not select them in the average/max function.

  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 8,520
Default Removing statistical outliers

Just to add one; check out the below link by Chip to know how time values are
stored in excel..

http://www.cpearson.com/excel/datetime.htm

If this post helps click Yes
---------------
Jacob Skaria


"William" wrote:

Luke, how do I use time in that array? My time is format hh:mm:ss do I use
quotes, single quotes??

"Luke M" wrote:

One way to discard outlier is to use an array* function similar to this:

=AVERAGE(IF(A2:A2000<100,A2:A2000))

This formula is designed to take the average of all number sin the range
A2:A2000 that are less than 100.

*Array functions are confirmed using Ctrl+Shift+Enter, not just Enter.
--
Best Regards,

Luke M
*Remember to click "yes" if this post helped you!*


"William" wrote:

I have a large range of data (thousands of rows) and I was performing the
functions average and max on cells that had time format 37:30:55. I realized
in the data there were a few records that had an abnormally large value in
them and from discussion they were due to unavoidable circumstances. There
are a few records in each data set I am working with which are skewing my
numbers (very large values). Is there a way to tell these functions (or use a
different function that returns the same results) to exclude outliers or any
outlier larger/smaller then what I specify?

I know you can use the average function, select a range, skip over the
outlier and select another range after the value, and repeat, but now that
the data has so many recrds it is getting to be a tiresome task with the new
data (thousands of additional records) every month that I put on a new tab.
It takes a lot of time to go through thousands of rows to find the outlier
and not select them in the average/max function.



  #6   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 8,520
Default Removing statistical outliers

24:00 equates to 1. So in the below formula you are filtering entries which
are less than 100 which means you are filtering down to hours which are less
than 2400:00:00 which should an ideal case to go with...

=MAX(IF(A2:A2000<100,A2:A2000))
for finding the max

If this post helps click Yes
---------------
Jacob Skaria


"William" wrote:

Luke, how do I use time in that array? My time is format hh:mm:ss do I use
quotes, single quotes??

"Luke M" wrote:

One way to discard outlier is to use an array* function similar to this:

=AVERAGE(IF(A2:A2000<100,A2:A2000))

This formula is designed to take the average of all number sin the range
A2:A2000 that are less than 100.

*Array functions are confirmed using Ctrl+Shift+Enter, not just Enter.
--
Best Regards,

Luke M
*Remember to click "yes" if this post helped you!*


"William" wrote:

I have a large range of data (thousands of rows) and I was performing the
functions average and max on cells that had time format 37:30:55. I realized
in the data there were a few records that had an abnormally large value in
them and from discussion they were due to unavoidable circumstances. There
are a few records in each data set I am working with which are skewing my
numbers (very large values). Is there a way to tell these functions (or use a
different function that returns the same results) to exclude outliers or any
outlier larger/smaller then what I specify?

I know you can use the average function, select a range, skip over the
outlier and select another range after the value, and repeat, but now that
the data has so many recrds it is getting to be a tiresome task with the new
data (thousands of additional records) every month that I put on a new tab.
It takes a lot of time to go through thousands of rows to find the outlier
and not select them in the average/max function.

  #7   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 119
Default Removing statistical outliers

That kind of sucks. I have dozens and dozens of data sets, each with a
different outlier some are wierd like < 434 hours or < 820 hours. It will be
a nightmare for me to try and caclulate what 434 equals if 24 is 1. No easier
way, like just using the real time value????

"Jacob Skaria" wrote:

24:00 equates to 1. So in the below formula you are filtering entries which
are less than 100 which means you are filtering down to hours which are less
than 2400:00:00 which should an ideal case to go with...

=MAX(IF(A2:A2000<100,A2:A2000))
for finding the max

If this post helps click Yes
---------------
Jacob Skaria


"William" wrote:

Luke, how do I use time in that array? My time is format hh:mm:ss do I use
quotes, single quotes??

"Luke M" wrote:

One way to discard outlier is to use an array* function similar to this:

=AVERAGE(IF(A2:A2000<100,A2:A2000))

This formula is designed to take the average of all number sin the range
A2:A2000 that are less than 100.

*Array functions are confirmed using Ctrl+Shift+Enter, not just Enter.
--
Best Regards,

Luke M
*Remember to click "yes" if this post helped you!*


"William" wrote:

I have a large range of data (thousands of rows) and I was performing the
functions average and max on cells that had time format 37:30:55. I realized
in the data there were a few records that had an abnormally large value in
them and from discussion they were due to unavoidable circumstances. There
are a few records in each data set I am working with which are skewing my
numbers (very large values). Is there a way to tell these functions (or use a
different function that returns the same results) to exclude outliers or any
outlier larger/smaller then what I specify?

I know you can use the average function, select a range, skip over the
outlier and select another range after the value, and repeat, but now that
the data has so many recrds it is getting to be a tiresome task with the new
data (thousands of additional records) every month that I put on a new tab.
It takes a lot of time to go through thousands of rows to find the outlier
and not select them in the average/max function.

  #8   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 8,520
Default Removing statistical outliers

I understand you are looking to find Average and max from large data sets...

Have you tried the formula Luke has given on average and the below max
formulas (array formulas)..

=MAX(IF(A2:A2000<100,A2:A2000))

It would be easy for us to suggest a solution if you can give more
information on how your data is arrranged....

If this post helps click Yes
---------------
Jacob Skaria


"William" wrote:

That kind of sucks. I have dozens and dozens of data sets, each with a
different outlier some are wierd like < 434 hours or < 820 hours. It will be
a nightmare for me to try and caclulate what 434 equals if 24 is 1. No easier
way, like just using the real time value????

"Jacob Skaria" wrote:

24:00 equates to 1. So in the below formula you are filtering entries which
are less than 100 which means you are filtering down to hours which are less
than 2400:00:00 which should an ideal case to go with...

=MAX(IF(A2:A2000<100,A2:A2000))
for finding the max

If this post helps click Yes
---------------
Jacob Skaria


"William" wrote:

Luke, how do I use time in that array? My time is format hh:mm:ss do I use
quotes, single quotes??

"Luke M" wrote:

One way to discard outlier is to use an array* function similar to this:

=AVERAGE(IF(A2:A2000<100,A2:A2000))

This formula is designed to take the average of all number sin the range
A2:A2000 that are less than 100.

*Array functions are confirmed using Ctrl+Shift+Enter, not just Enter.
--
Best Regards,

Luke M
*Remember to click "yes" if this post helped you!*


"William" wrote:

I have a large range of data (thousands of rows) and I was performing the
functions average and max on cells that had time format 37:30:55. I realized
in the data there were a few records that had an abnormally large value in
them and from discussion they were due to unavoidable circumstances. There
are a few records in each data set I am working with which are skewing my
numbers (very large values). Is there a way to tell these functions (or use a
different function that returns the same results) to exclude outliers or any
outlier larger/smaller then what I specify?

I know you can use the average function, select a range, skip over the
outlier and select another range after the value, and repeat, but now that
the data has so many recrds it is getting to be a tiresome task with the new
data (thousands of additional records) every month that I put on a new tab.
It takes a lot of time to go through thousands of rows to find the outlier
and not select them in the average/max function.

  #9   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 897
Default Removing statistical outliers

I have an article here that might be useful to you:

http://www.codeforexcelandoutlook.co...lude-extremes/

--JP

On Oct 6, 1:59*pm, William wrote:
I have a large range of data (thousands of rows) and I was performing the
functions average and max on cells that had time format 37:30:55. I realized
in the data there were a few records that had an abnormally large value in
them and from discussion they were due to unavoidable circumstances. There
are a few records in each data set I am working with which are skewing my
numbers (very large values). Is there a way to tell these functions (or use a
different function that returns the same results) to exclude outliers or any
outlier larger/smaller then what I specify?

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
how do I identify statistical outliers with excel? [email protected] Excel Discussion (Misc queries) 8 April 23rd 23 07:42 PM
how to calculate outliers ismhs Charts and Charting in Excel 1 June 1st 08 02:49 PM
how do I identify statistical outliers with excel Jeff Excel Discussion (Misc queries) 0 January 7th 07 07:53 AM
Boxplots with outliers Confuzzled. Charts and Charting in Excel 1 March 17th 06 07:36 PM
outliers/histograms Julie Excel Discussion (Misc queries) 1 January 14th 06 07:28 AM


All times are GMT +1. The time now is 08:20 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"