Posted to microsoft.public.excel.worksheet.functions
Mike Middleton
How do I run a regression on data that is not numerical?

NG -

For a categorical variable with four levels, e.g., A, B, C, or D, use three
indicator variables. Select one level as the base case, e.g., A, and create
one indicator variable for each of the other levels (B, C, D). Each indicator
is 1 if the observation has that level and 0 otherwise, so it shows whether an
observation is B or not B, C or not C, etc. For an observation with level A,
the value of all three indicator variables is zero. Each regression
coefficient measures how much that level (B, C, or D) differs from the base
case A, on average.
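
To make the coding concrete, here is a minimal sketch in Python rather than Excel. The data, the names (BASE, OTHERS, indicator_row, solve, ols), and the hand-rolled least-squares solver are all my own assumptions for illustration; in a worksheet you would build the three columns with something like =IF(A2="B",1,0) and hand them to the regression tool. With the category as the only predictor, the intercept comes out as the mean of the base case A, and each coefficient is that level's mean minus A's mean.

```python
# Hypothetical data: a four-level category and a numeric response.
levels = ["A", "B", "C", "D", "A", "B", "C", "D"]
y = [10.0, 13.0, 15.0, 20.0, 12.0, 15.0, 17.0, 22.0]

BASE = "A"                  # base case: all three indicators are 0
OTHERS = ["B", "C", "D"]    # one indicator column per non-base level


def indicator_row(level):
    """Return [1 (intercept), is_B, is_C, is_D] for one observation."""
    return [1.0] + [1.0 if level == name else 0.0 for name in OTHERS]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, response):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, response)) for i in range(k)]
    return solve(xtx, xty)


X = [indicator_row(level) for level in levels]
intercept, b_B, b_C, b_D = ols(X, y)

print("intercept (mean of base case A):", round(intercept, 6))
print("B vs A:", round(b_B, 6), " C vs A:", round(b_C, 6), " D vs A:", round(b_D, 6))
```

Note that only three indicator columns are used for four levels. A fourth column for A would be redundant with the intercept, which is why one level is set aside as the base case.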

I use the same approach for gender, e.g., 0 for male and 1 for female, in
which case the regression coefficient for the gender indicator shows how
females differ from males, on average.
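
A small sketch of the two-level case, again in Python with made-up numbers (the variable names and data are assumptions, not anything from the original question). With the 0/1 gender indicator as the only predictor, the fitted slope equals the female mean minus the male mean, and the intercept equals the male mean.

```python
from statistics import mean

# Hypothetical data: 0 = male, 1 = female, with a numeric response.
female = [0, 0, 0, 1, 1, 1]
y = [50.0, 54.0, 52.0, 60.0, 58.0, 62.0]

# Simple linear regression of y on the indicator.
x_bar, y_bar = mean(female), mean(y)
sxy = sum((x - x_bar) * (v - y_bar) for x, v in zip(female, y))
sxx = sum((x - x_bar) ** 2 for x in female)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Group means, for comparison with the fitted coefficients.
male_mean = mean(v for x, v in zip(female, y) if x == 0)
female_mean = mean(v for x, v in zip(female, y) if x == 1)

print("slope:", slope, " female mean - male mean:", female_mean - male_mean)
print("intercept:", intercept, " male mean:", male_mean)
```

Coding the two groups as 1 and 2 instead of 0 and 1 gives the same slope, because the codes are still one unit apart; only the intercept changes, and it no longer corresponds to either group's mean.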

- Mike
http://www.mikemiddleton.com

"NG" wrote in message
...
Can you explain the Excel 2003 or later indicator variables a little more?
I have four non-numerical values for race.

Thanks for the information on gender. I was using 1 for men and 2 for
females.

"Jerry W. Lewis" wrote:

Where only two values are possible (as with gender) then you use a single
variable with +1 for one gender and -1 for the other. Extending to more
than two values is possible, but non-trivial.

Alternately, if you have Excel 2003 or later, you can create an indicator
variable (0 or 1) for each possible non-numeric value. This approach
directly permits more than 2 possible values.

Jerry

"NG" wrote:

I am using Microsoft Excel 2003. I have been running regressions on
numerical data and am curious to know how to run one if part of my data
is non-numerical such as gender or race.