ExcelBanter

Steve Wylie

Simple statistical analysis
 
I have a workbook going across several dozen sheet tabs, containing
demographic information on about 10,000 people (I work for a local authority
- I'm not some direct marketer or spammer or anything!)

One of the items of data is their date of birth. I need to use this in
Excel somehow to extract percentages & counts of how many people fit into
certain age groups (18-30, 31-40, 41-50 etc).

Is this possible without moving the data out of Excel, using formulas? We
have a survey/data analysis program but I am loath to transfer all that data
unnecessarily when it could just be done in Excel...

Thank you for any help anyone can give.

Steve Wylie
Canterbury
United Kingdom


Tom Ogilvy

=COUNTIF(Sheet1!A:A,">=01/01/1985")-COUNTIF(Sheet1!A:A,">31/12/1995")
would give you the count of people born between 1985 and 1995 inclusive and
listed on Sheet1, assuming birthdate is in column A (as an example).

From this, you should be able to figure out how to address other sheets.
Perhaps use a separate sheet where you gather the counts from each individual
sheet and consolidate them, or combine the formulas.
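
For anyone who wants to check the logic outside Excel, here is a minimal Python sketch of the same idea: count the dates on or after the start, subtract the dates after the end, and add up the per-sheet results. The sheet names and birth dates are made up for illustration, and the dictionary merely stands in for the workbook.

```python
from datetime import date

def count_between(dates, start, end):
    """Count dates in [start, end] as (count >= start) - (count > end)."""
    at_or_after_start = sum(1 for d in dates if d >= start)
    after_end = sum(1 for d in dates if d > end)
    return at_or_after_start - after_end

# Hypothetical workbook: sheet name -> birth dates found in column A.
sheets = {
    "Sheet1": [date(1984, 12, 31), date(1985, 1, 1), date(1990, 6, 15)],
    "Sheet2": [date(1995, 12, 31), date(1996, 1, 1)],
}

start, end = date(1985, 1, 1), date(1995, 12, 31)

# One count per sheet, then a consolidated total across all sheets.
per_sheet = {name: count_between(col, start, end) for name, col in sheets.items()}
total = sum(per_sheet.values())
print(per_sheet, total)
```

Both boundary dates (1 Jan 1985 and 31 Dec 1995) are counted, which is what "inclusive" means here.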

--
Regards,
Tom Ogilvy




Jerry W. Lewis

=SUMPRODUCT((ROUNDDOWN((TODAY()-dates)/365.25,0)<=40)*(ROUNDDOWN((TODAY()-dates)/365.25,0)>30))
will count the number on a single sheet that are in the 31-40 age group.
Unfortunately, it will not work with 3-D ranges.
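
As a rough cross-check of what the formula computes, the sketch below applies the same approximation (days since birth divided by 365.25, rounded down) in Python and reports a count and percentage per age band. The function names, the sample birth dates, and the fixed "today" are all assumptions for illustration. Like the formula, the 365.25 approximation can be a day off right around a birthday.

```python
from datetime import date

def approx_age(dob, today):
    """Whole years, mirroring ROUNDDOWN((TODAY()-dob)/365.25, 0)."""
    return int((today - dob).days / 365.25)

def age_band_summary(dobs, bands, today):
    """Return {label: (count, percent of all listed people)}."""
    ages = [approx_age(d, today) for d in dobs]
    summary = {}
    for low, high in bands:
        n = sum(1 for a in ages if low <= a <= high)
        summary[f"{low}-{high}"] = (n, 100.0 * n / len(ages))
    return summary

# Made-up sample data; "today" is pinned so the result is reproducible.
today = date(2005, 5, 9)
dobs = [date(1980, 1, 1), date(1970, 3, 10), date(1969, 7, 1), date(1960, 11, 20)]
bands = [(18, 30), (31, 40), (41, 50)]

summary = age_band_summary(dobs, bands, today)
for label, (count, pct) in summary.items():
    print(f"{label}: {count} people ({pct:.1f}%)")
```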

Jerry




Steve Wylie

Unfortunately, I cannot get this formula to work on the sheet I am
using - it just says #VALUE! in the cell.

On reflection, I think the sheet I am using is too messed-about with to
use a formula. I'll paste the dates into my analysis program.

Thanks anyway

Steve


Jerry W. Lewis

If you are not trying to use 3-D references, then the only way to get
#VALUE! is if there is a #VALUE! error in your dates range, or if at
least one of the cells in the dates range contains text that cannot be
coerced into a date.
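
One way to track down the offending cells is to test each entry on its own. The Python sketch below is only an illustration of that idea: it assumes the entries have been exported as text, assumes dd.mm.yy is the intended format, and uses a hypothetical helper name. It reports the row number and content of every entry that will not parse.

```python
from datetime import datetime

def find_unparseable(entries, fmt="%d.%m.%y"):
    """Return (row, text) for every entry that does not match fmt."""
    bad = []
    for row, text in enumerate(entries, start=1):
        try:
            datetime.strptime(text.strip(), fmt)
        except ValueError:
            bad.append((row, text))
    return bad

# Made-up column contents; row numbers start at 1, as in a worksheet.
entries = ["16.12.62", "16 Dec", "age 42", "03.07.75"]
bad = find_unparseable(entries)
for row, text in bad:
    print(f"row {row}: cannot read {text!r} as a date")
```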

Jerry



Steve Wylie

Yeah, that's the trouble. The dates have not been inputted consistently.
There are many false entries where people have put "16 Dec" and no year (it
should all be dd.mm.yy) or just "age 42" or rubbish like that. The analysis
program I use just ignores all that, whereas Excel throws up an error.

And I suspect your formula doesn't like the years in two-digit format
either...

Thanks, but I did it using the analysis program in the end. Shame tho.

Steve


Steve Wylie

I just did a quick "example" run of your formula on some dummy data in
uniform format, and needless to say it worked. I shall make a note of the
formula for future use if I ever get any decent data sent to me that allows
me to use it!

Thanks again
Steve


Jerry W. Lewis

Regardless of how they are formatted, Excel dates are stored as the
number of days since the beginning of 1900. Provided that the entry is
an Excel date or can be coerced into an Excel date, the formula should work.
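
To make the serial-number idea concrete, here is a small Python sketch that converts between serial numbers and calendar dates in the default 1900 date system. The function names are made up. The 30 Dec 1899 offset is used because Excel treats 1900 as a leap year, so the sketch only claims to be right from 1 March 1900 (serial 61) onwards.

```python
from datetime import date, timedelta

# Offset that reproduces Excel's 1900-system serials from 1 March 1900 on.
EXCEL_EPOCH = date(1899, 12, 30)

def serial_to_date(serial):
    """Convert an Excel serial number (1900 date system) to a date."""
    if serial < 61:
        raise ValueError("serials before 1 March 1900 are not handled here")
    return EXCEL_EPOCH + timedelta(days=int(serial))

def date_to_serial(d):
    """Convert a date on or after 1 March 1900 to its Excel serial number."""
    return (d - EXCEL_EPOCH).days

print(serial_to_date(36526))             # 2000-01-01
print(date_to_serial(date(2005, 5, 9)))  # 38481
```

However a cell is formatted (09.05.05, 9 May 2005, and so on), the stored value is the same number, which is why the formula does not care about the display format.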

Data QC is often the biggest portion of data analysis.

Jerry



Myrna Larson

Hi, Jerry:

I was surprised by the OP's statement that "the analysis program ... just
ignores all that". If it just throws out the data, the results will be
worthless. If it in fact interprets those entries by calculating a date, the
user should be aware of that.

But the bottom line is that there should be data validation in place that
disallows entries that aren't dd.mm.yy; and after all of the fuss about Y2K, 2
digit years should have been disallowed too.
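
To illustrate the point about two-digit years, the Python sketch below shows how a parser has to guess the century, and what a stricter entry rule might look like. The dd.mm.yyyy rule and the function name are assumptions for illustration, not something the workbook in question enforces.

```python
import re
from datetime import datetime

# Two digits, dot, two digits, dot, exactly four digits for the year.
FOUR_DIGIT_YEAR = re.compile(r"\d{2}\.\d{2}\.\d{4}")

def is_valid_dob(text):
    """Accept only real calendar dates written as dd.mm.yyyy."""
    if not FOUR_DIGIT_YEAR.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%d.%m.%Y")
    except ValueError:
        return False  # right shape, impossible date (e.g. 31.02.1970)
    return True

# With a two-digit year the parser must pick a century. Python's %y maps
# 00-68 to 2000-2068, so a 1962 birth date silently becomes 2062.
guessed_year = datetime.strptime("16.12.62", "%d.%m.%y").year
print(guessed_year)

for entry in ["16.12.1962", "16.12.62", "16 Dec", "age 42", "31.02.1970"]:
    print(entry, is_valid_dob(entry))
```

Other tools use other cut-off years, which is exactly why a two-digit year in a date of birth is risky.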



On Mon, 09 May 2005 14:46:11 -0400, "Jerry W. Lewis" wrote:

Regardless of how they are formatted, Excel dates are stored as the
number of days since the beginning of 1900. Provided that the entry is
an Excel date or can be coerced into an Excel date, the formula should work.

Data QC is often the biggest portion of data analysis.

Jerry



Jerry W. Lewis

Ignoring inappropriate values is not unreasonable, provided that it
calls your attention to what it has done. The accuracy of (pre-2003)
LINEST is comparable to PROC GLM in SAS. Excel gets slammed and SAS
doesn't because SAS warns the user when results are not numerically
trustworthy.
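
A minimal Python sketch of "ignore, but say so": unparseable entries are skipped, and a warning reports how many were dropped so the user can judge whether the result is still trustworthy. The function name and the dd.mm.yyyy format are assumptions for illustration.

```python
import warnings
from datetime import datetime

def parse_dobs_with_report(entries, fmt="%d.%m.%Y"):
    """Parse what can be parsed; warn about how many entries were skipped."""
    parsed, skipped = [], 0
    for text in entries:
        try:
            parsed.append(datetime.strptime(text, fmt).date())
        except ValueError:
            skipped += 1
    if skipped:
        warnings.warn(
            f"{skipped} of {len(entries)} entries skipped as unparseable"
        )
    return parsed, skipped

parsed, skipped = parse_dobs_with_report(["16.12.1962", "16 Dec", "age 42"])
print(len(parsed), "usable,", skipped, "skipped")
```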

Jerry


