Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 8
Default Splitting Names Down Columns

This one's a toughie... be prepared!

I have a spreadsheet with several thousand entries that should have been a
database instead. It holds a lot of user information, such as names,
numbers, addresses, etc.

I need to the entries under the Name column into multiple colums.
Specifically, Title1, First1, Middle1, Last1, Suffix1, Title2, First2,
Middle2, Last2, Suffix2. Of course, all the entries under the Name column
are not standard:

Craig and Dolores Craig
Jeffrey and Cynthia Beegle
Jason Piel
Mr Kaske
Debra Cork-Wahl and Charles Cork
Micro Systems
Matthew G. Kovar and Karina Rand-Kovar
Terrance and Mary Mathews

The most common format is "First1 and First2 Last1/2", but that is not the
case with a good chunk of the entries. No matter how you look at it, someone
is going to have to go through it manually to make sure there are no
mistakes, but I was hoping that there would be some way to script a
semi-solution to cut down on some of the work.

Any takers? :-)

Thanks!

Nick
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 71
Default Splitting Names Down Columns

i think you're right, it will take some manual work, but if the majority is
in one format and the data is separated the same each time (eg, by a comma
or by a fixed width) I'd use the 'Text to columns' option from the Data menu
and follow the prompts. I'm sure someone will be along shortly with a much
more eloquent solution.

Tim


"Tatakau" wrote in message
...
This one's a toughie... be prepared!

I have a spreadsheet with several thousand entries that should have been a
database instead. It holds a lot of user information, such as names,
numbers, addresses, etc.

I need to the entries under the Name column into multiple colums.
Specifically, Title1, First1, Middle1, Last1, Suffix1, Title2, First2,
Middle2, Last2, Suffix2. Of course, all the entries under the Name column
are not standard:

Craig and Dolores Craig
Jeffrey and Cynthia Beegle
Jason Piel
Mr Kaske
Debra Cork-Wahl and Charles Cork
Micro Systems
Matthew G. Kovar and Karina Rand-Kovar
Terrance and Mary Mathews

The most common format is "First1 and First2 Last1/2", but that is not the
case with a good chunk of the entries. No matter how you look at it,

someone
is going to have to go through it manually to make sure there are no
mistakes, but I was hoping that there would be some way to script a
semi-solution to cut down on some of the work.

Any takers? :-)

Thanks!

Nick



  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 158
Default Splitting Names Down Columns

I've written a few data clean up procedures in the past. As you wrote in
your OP, you just have to do the best job and then tackle it manually.

Two initial approaches:

(a) Write code to analyse and split it according to rules. This is
lengthy but feasible.

(b) If option (a) is unsuitable because it's the rules overlap so much,
but they're consistent then you could go through the spreadsheet
inserting a code as to how the name should be cleaned up. e.g. 1,2,1,1,5
etc. in an adjacent cell. This would make coding simpler, the final
process more accurate albeit requiring a little manual work - but it's
only a few thousand so teh time saved coding might be greater. (You
could leave the most common format blank...)

Then... for (a) and (b) I would do something like (unchecked):

(set Option Compare Text at the top of your module)

For each c in myrange.cells

myStr = c.value

'Check for suffices

'first one
Select Case trim(Left$(myStr ,3))
case "MR": Suffix1 = "Mr"
case "MRS": Suffix1 = "Mrs"
'etc. for Dr, Ms, Sir
'should be ok, no names begin with these
'dr is ok cos "drew" would return "dre"
End Select

'remove suffix from our string
If len(Suffix1)0 then myStr = trim(mid$(myStr,LEN(Suffix1)+1))


'now check for a second suffix
iPos = instr(mystr," MRS ")
If iPos 0 then
Suffix2 = "Mrs"
myStr = Replace(myStr," MRS ", " ",,vbTextCompare)

End if

iPos = instr(mystr," MS ")
If iPos 0 then
Suffix2 = "Ms"
myStr = Replace(myStr," Ms ", " ",,vbTextCompare)
End if
'etc.

'that's the suffices take care of

'split up on the spaces
'(you may like to remove doublespaces first
mySplit = Split(c.value," ")

'see if second word is 'and'
'and we only have 4 words
if mySplit(1) = "AND" and Ubound(mySplit) = 3 then
First1 = mySplit(0)
First2 = mySplit(2)
Last1 = mySplit(3)
Last2 = mySplit(3)

elseif mySplit(2) = "AND" and Ubound(mySplit) = 4 then

First1 = mySplit(0)
Last2 = mySplit(1)
First2 = mySplit(3)
Last1 = mySplit(4)
'obviously if it was a middle name rather than
'a surname you have no way of knowing.
'That's why I would suggest using option (b)
end if

'continue ad nauseum.

next c

HTH,
Gareth


Tatakau wrote:
This one's a toughie... be prepared!

I have a spreadsheet with several thousand entries that should have been a
database instead. It holds a lot of user information, such as names,
numbers, addresses, etc.

I need to the entries under the Name column into multiple colums.
Specifically, Title1, First1, Middle1, Last1, Suffix1, Title2, First2,
Middle2, Last2, Suffix2. Of course, all the entries under the Name column
are not standard:

Craig and Dolores Craig
Jeffrey and Cynthia Beegle
Jason Piel
Mr Kaske
Debra Cork-Wahl and Charles Cork
Micro Systems
Matthew G. Kovar and Karina Rand-Kovar
Terrance and Mary Mathews

The most common format is "First1 and First2 Last1/2", but that is not the
case with a good chunk of the entries. No matter how you look at it, someone
is going to have to go through it manually to make sure there are no
mistakes, but I was hoping that there would be some way to script a
semi-solution to cut down on some of the work.

Any takers? :-)

Thanks!

Nick

  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 8
Default Splitting Names Down Columns

I used the wizard to split to columns by spaces. After that, I was able to
manually fix about 200 records in five minutes.

Thanks Tim!

Nick

"Tim" wrote:

i think you're right, it will take some manual work, but if the majority is
in one format and the data is separated the same each time (eg, by a comma
or by a fixed width) I'd use the 'Text to columns' option from the Data menu
and follow the prompts. I'm sure someone will be along shortly with a much
more eloquent solution.

Tim


"Tatakau" wrote in message
...
This one's a toughie... be prepared!

I have a spreadsheet with several thousand entries that should have been a
database instead. It holds a lot of user information, such as names,
numbers, addresses, etc.

I need to the entries under the Name column into multiple colums.
Specifically, Title1, First1, Middle1, Last1, Suffix1, Title2, First2,
Middle2, Last2, Suffix2. Of course, all the entries under the Name column
are not standard:

Craig and Dolores Craig
Jeffrey and Cynthia Beegle
Jason Piel
Mr Kaske
Debra Cork-Wahl and Charles Cork
Micro Systems
Matthew G. Kovar and Karina Rand-Kovar
Terrance and Mary Mathews

The most common format is "First1 and First2 Last1/2", but that is not the
case with a good chunk of the entries. No matter how you look at it,

someone
is going to have to go through it manually to make sure there are no
mistakes, but I was hoping that there would be some way to script a
semi-solution to cut down on some of the work.

Any takers? :-)

Thanks!

Nick

  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 8
Default Splitting Names Down Columns

Gareth, I remember writing delimiting programs in college... no fun at all.
Your solution is of course much more elegant, but I don't think I'm willing
to risk any more brain cells (due to banging my head on a desk). :-)

Thanks again!

Nick

"Gareth" wrote:

I've written a few data clean up procedures in the past. As you wrote in
your OP, you just have to do the best job and then tackle it manually.

Two initial approaches:

(a) Write code to analyse and split it according to rules. This is
lengthy but feasible.

(b) If option (a) is unsuitable because it's the rules overlap so much,
but they're consistent then you could go through the spreadsheet
inserting a code as to how the name should be cleaned up. e.g. 1,2,1,1,5
etc. in an adjacent cell. This would make coding simpler, the final
process more accurate albeit requiring a little manual work - but it's
only a few thousand so teh time saved coding might be greater. (You
could leave the most common format blank...)

Then... for (a) and (b) I would do something like (unchecked):

(set Option Compare Text at the top of your module)

For each c in myrange.cells

myStr = c.value

'Check for suffices

'first one
Select Case trim(Left$(myStr ,3))
case "MR": Suffix1 = "Mr"
case "MRS": Suffix1 = "Mrs"
'etc. for Dr, Ms, Sir
'should be ok, no names begin with these
'dr is ok cos "drew" would return "dre"
End Select

'remove suffix from our string
If len(Suffix1)0 then myStr = trim(mid$(myStr,LEN(Suffix1)+1))


'now check for a second suffix
iPos = instr(mystr," MRS ")
If iPos 0 then
Suffix2 = "Mrs"
myStr = Replace(myStr," MRS ", " ",,vbTextCompare)

End if

iPos = instr(mystr," MS ")
If iPos 0 then
Suffix2 = "Ms"
myStr = Replace(myStr," Ms ", " ",,vbTextCompare)
End if
'etc.

'that's the suffices take care of

'split up on the spaces
'(you may like to remove doublespaces first
mySplit = Split(c.value," ")

'see if second word is 'and'
'and we only have 4 words
if mySplit(1) = "AND" and Ubound(mySplit) = 3 then
First1 = mySplit(0)
First2 = mySplit(2)
Last1 = mySplit(3)
Last2 = mySplit(3)

elseif mySplit(2) = "AND" and Ubound(mySplit) = 4 then

First1 = mySplit(0)
Last2 = mySplit(1)
First2 = mySplit(3)
Last1 = mySplit(4)
'obviously if it was a middle name rather than
'a surname you have no way of knowing.
'That's why I would suggest using option (b)
end if

'continue ad nauseum.

next c

HTH,
Gareth


Tatakau wrote:
This one's a toughie... be prepared!

I have a spreadsheet with several thousand entries that should have been a
database instead. It holds a lot of user information, such as names,
numbers, addresses, etc.

I need to the entries under the Name column into multiple colums.
Specifically, Title1, First1, Middle1, Last1, Suffix1, Title2, First2,
Middle2, Last2, Suffix2. Of course, all the entries under the Name column
are not standard:

Craig and Dolores Craig
Jeffrey and Cynthia Beegle
Jason Piel
Mr Kaske
Debra Cork-Wahl and Charles Cork
Micro Systems
Matthew G. Kovar and Karina Rand-Kovar
Terrance and Mary Mathews

The most common format is "First1 and First2 Last1/2", but that is not the
case with a good chunk of the entries. No matter how you look at it, someone
is going to have to go through it manually to make sure there are no
mistakes, but I was hoping that there would be some way to script a
semi-solution to cut down on some of the work.

Any takers? :-)

Thanks!

Nick


Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Splitting names [email protected] Excel Discussion (Misc queries) 4 January 12th 07 08:44 PM
Question for Bob Phillips re Splitting Names from Cells Paul Sheppard Excel Discussion (Misc queries) 8 August 3rd 05 09:00 AM
Splitting names from cells GoesLikeStink Excel Discussion (Misc queries) 2 July 30th 05 07:16 AM
splitting names & email add. Cheker Excel Programming 5 October 6th 04 06:15 AM
Siple but what was the formula for splitting names? Martyn Excel Programming 13 August 25th 04 04:07 PM


All times are GMT +1. The time now is 03:15 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"