Home |
Search |
Today's Posts |
|
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
I just had some really great help with some Excel files from this group where
the formatting was done by various people over the last 15-20 years. These are names for genealogy, and I am loading many Excel files into Access so people can search by name and soundex. I don't know if the other problem I am dealing with can be handled programatically, but some of the people in the past entered extra spaces between the first and middle names. I am going thru the files manually now and removing the extra space. I have more files to do and thought it was worth asking if a macro could do this. Here are some examples: John William (I change to John William by using delete key) Doris Doe Smith (I change to Doris Doe Smith) John W. ( change to John W.) I found many posts on this site that talks about Trim, but could not find anything that specifically addressed just removing extra space within text. Thanks in advance if there is any help for me, Sherry |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
Given that you indicated you used Ron Rosenfeld's routine in your earlier
thread, I modified that to add the additional functionality... Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" txt = re.Replace(c.Text, "$1.") re.Pattern = "\s{2,}" c.Value = re.Replace(txt, " ") Next c End Sub Now a word of caution... this is my first regular expression construction in some 15 years, so I am rusty. What I posted works, but I can't guarantee it is the most efficient construction. So, check back here to see if Ron posts a better constructed regular expression solution than this one. Rick "SherryScrapDog" wrote in message ... I just had some really great help with some Excel files from this group where the formatting was done by various people over the last 15-20 years. These are names for genealogy, and I am loading many Excel files into Access so people can search by name and soundex. I don't know if the other problem I am dealing with can be handled programatically, but some of the people in the past entered extra spaces between the first and middle names. I am going thru the files manually now and removing the extra space. I have more files to do and thought it was worth asking if a macro could do this. Here are some examples: John William (I change to John William by using delete key) Doris Doe Smith (I change to Doris Doe Smith) John W. ( change to John W.) I found many posts on this site that talks about Trim, but could not find anything that specifically addressed just removing extra space within text. Thanks in advance if there is any help for me, Sherry |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
This might actually be better... it also removes all leading and trailing
spaces, if any... Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" txt = re.Replace(c.Text, "$1.") re.Pattern = "\s{2,}" c.Value = re.Replace(Trim(txt), " ") Next c End Sub Rick "Rick Rothstein (MVP - VB)" wrote in message ... Given that you indicated you used Ron Rosenfeld's routine in your earlier thread, I modified that to add the additional functionality... Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" txt = re.Replace(c.Text, "$1.") re.Pattern = "\s{2,}" c.Value = re.Replace(txt, " ") Next c End Sub Now a word of caution... this is my first regular expression construction in some 15 years, so I am rusty. What I posted works, but I can't guarantee it is the most efficient construction. So, check back here to see if Ron posts a better constructed regular expression solution than this one. Rick "SherryScrapDog" wrote in message ... I just had some really great help with some Excel files from this group where the formatting was done by various people over the last 15-20 years. These are names for genealogy, and I am loading many Excel files into Access so people can search by name and soundex. I don't know if the other problem I am dealing with can be handled programatically, but some of the people in the past entered extra spaces between the first and middle names. I am going thru the files manually now and removing the extra space. I have more files to do and thought it was worth asking if a macro could do this. Here are some examples: John William (I change to John William by using delete key) Doris Doe Smith (I change to Doris Doe Smith) John W. ( change to John W.) I found many posts on this site that talks about Trim, but could not find anything that specifically addressed just removing extra space within text. Thanks in advance if there is any help for me, Sherry |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
On Sun, 23 Dec 2007 22:31:49 -0500, "Rick Rothstein \(MVP - VB\)"
wrote: This might actually be better... it also removes all leading and trailing spaces, if any... Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" txt = re.Replace(c.Text, "$1.") re.Pattern = "\s{2,}" c.Value = re.Replace(Trim(txt), " ") Next c End Sub A bit shorter: Option Explicit Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" c.Value = Application.WorksheetFunction.Trim _ (re.Replace(c.Text, "$1.")) Next c End Sub --ron |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
Another idea might be to move the Pattern outside the loop since it's a
constant. Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("VbScript.RegExp") re.IgnoreCase = True re.Global = True re.Pattern = "\b([A-Z])\b(?!\.)" With WorksheetFunction For Each c In Selection.Cells c.Value = .Trim(re.Replace(c.Text, "$1.")) Next c End With End Sub -- Dana DeLouis "Ron Rosenfeld" wrote in message ... On Sun, 23 Dec 2007 22:31:49 -0500, "Rick Rothstein \(MVP - VB\)" wrote: This might actually be better... it also removes all leading and trailing spaces, if any... Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" txt = re.Replace(c.Text, "$1.") re.Pattern = "\s{2,}" c.Value = re.Replace(Trim(txt), " ") Next c End Sub A bit shorter: Option Explicit Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" c.Value = Application.WorksheetFunction.Trim _ (re.Replace(c.Text, "$1.")) Next c End Sub --ron |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
A bit shorter:
Option Explicit Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" c.Value = Application.WorksheetFunction.Trim _ (re.Replace(c.Text, "$1.")) Next c End Sub I thought about that solution, but I was not sure what was more efficient in the end... continually running out to the spreadsheet level to get one of its functions within the loop or simply resetting the Pattern each time. Can you, or anyone reading this thread, give me an idea of the "penalty" incurred, if any, by a macro when it reaches out to the spreadsheet world in order to execute one of its functions via the Application.WorksheetFunction property? Rick |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
Can you, or anyone reading this thread, give me an idea of the "penalty"
incurred, if any, by a macro when it reaches out to the spreadsheet world in order to execute one of its functions via the Application.WorksheetFunction property? Rick Worksheet functions, whilst extremely efficient in cells, are slow when called in VBA. It's often quicker to recreate the worksheet function. Although in another part of this thread I suggested Application.Worksheetfunction.Trim( personally I wouldn't use it in in anything time sensitive, say as a UDF in a large number of cells or as part of a long loop. Other ways to replace multiple spaces with singles besides RegExp. If using multiple worksheet functions in the same loop use 'wf' as in the following Sub foo() Dim wf As WorksheetFunction Set wf = Application.WorksheetFunction s = wf.Trim(" a b c ") MsgBox s End Sub Even setting a ref to WorksheetFunction object only speeds up a little. Also I wouldn't use RegExp in a UDF and probably not in a function that was called in a much larger loop. Although RegExp is extremely efficient, creating the object is (relatively) slow. IOW, great in macro or function that receives an array to process, but don't send each element to a separate function that does CreateObject("vbscript.regexp"). I suppose could store the object at module level. Regards, Peter T |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
On Mon, 24 Dec 2007 12:52:47 -0500, "Rick Rothstein \(MVP - VB\)"
wrote: A bit shorter: Option Explicit Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" c.Value = Application.WorksheetFunction.Trim _ (re.Replace(c.Text, "$1.")) Next c End Sub I thought about that solution, but I was not sure what was more efficient in the end... continually running out to the spreadsheet level to get one of its functions within the loop or simply resetting the Pattern each time. Can you, or anyone reading this thread, give me an idea of the "penalty" incurred, if any, by a macro when it reaches out to the spreadsheet world in order to execute one of its functions via the Application.WorksheetFunction property? Rick I'm not sure which is more efficient. I agree with Dana about moving pattern outside the loop if you only are using a single pattern. I was still thinking about perhaps using multiple patterns. I suppose you could do something like: ===================================== Sub AddDot() Dim c As Range Dim re As Object Dim wsf As WorksheetFunction Set wsf = Application.WorksheetFunction Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True re.Pattern = "\b([A-Z])\b(?!\.)" For Each c In Selection c.Value = wsf.Proper(wsf.Trim(re.Replace(c.Text, "$1."))) Next c End Sub =========================================== to speed it up a bit, or even set a reference (tools/references) to Microsoft VBSCript 5.5 and then use: ============================================= Option Explicit Sub AddDot() 'requires reference set to Microsoft _ ' VBScript Regular Expressions 5.5 Dim c As Range Dim re As RegExp Dim wsf As WorksheetFunction Set wsf = Application.WorksheetFunction Set re = New RegExp re.IgnoreCase = True re.Global = True re.Pattern = "\b([A-Z])\b(?!\.)" For Each c In Selection c.Value = wsf.Proper(wsf.Trim(re.Replace(c.Text, "$1."))) Next c End Sub ========================================== And I don't know whether, with this structure, it makes sense to explicitly release the objects at the end of the Sub. ================================================== = Sub AddDot() 'requires reference set to Microsoft _ ' VBScript Regular Expressions 5.5 Dim c As Range Dim re As RegExp Dim wsf As WorksheetFunction Set wsf = Application.WorksheetFunction Set re = New RegExp re.IgnoreCase = True re.Global = True re.Pattern = "\b([A-Z])\b(?!\.)" For Each c In Selection c.Value = wsf.Proper(wsf.Trim(re.Replace(c.Text, "$1."))) Next c Set re = Nothing Set wsf = Nothing End Sub ================================================== = --ron |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
Rick,
Thanks for this suggestion too. This is a problem I have in these files too, and I have a query in the Access database where I load the files to take the spaces out. I might as well do it in the Excel files though and eliminate one of the queries I do to load the files! Many thanks! Sherry "Rick Rothstein (MVP - VB)" wrote: This might actually be better... it also removes all leading and trailing spaces, if any... Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" txt = re.Replace(c.Text, "$1.") re.Pattern = "\s{2,}" c.Value = re.Replace(Trim(txt), " ") Next c End Sub Rick |
#10
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
Rick, Thanks tons! It works so well I wish I had asked this before. You
have saved me hours and hours of work. This is a volunteer project for me and I just love working on it, but as you can probably imagine, it gets boring going thru these files cell by cell and deleting spaces and adding periods. My goal is to give the Society a database with data as clean as possible. Thanks, thanks and thanks again, Sherry (I will check Ron's post if he does respond.) "Rick Rothstein (MVP - VB)" wrote: Given that you indicated you used Ron Rosenfeld's routine in your earlier thread, I modified that to add the additional functionality... Sub AddDot() Dim c As Range Dim txt As String Dim re As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.Global = True For Each c In Selection re.Pattern = "\b([A-Z])\b(?!\.)" txt = re.Replace(c.Text, "$1.") re.Pattern = "\s{2,}" c.Value = re.Replace(txt, " ") Next c End Sub Now a word of caution... this is my first regular expression construction in some 15 years, so I am rusty. What I posted works, but I can't guarantee it is the most efficient construction. So, check back here to see if Ron posts a better constructed regular expression solution than this one. Rick "SherryScrapDog" wrote in message ... I just had some really great help with some Excel files from this group where the formatting was done by various people over the last 15-20 years. These are names for genealogy, and I am loading many Excel files into Access so people can search by name and soundex. I don't know if the other problem I am dealing with can be handled programatically, but some of the people in the past entered extra spaces between the first and middle names. I am going thru the files manually now and removing the extra space. I have more files to do and thought it was worth asking if a macro could do this. Here are some examples: John William (I change to John William by using delete key) Doris Doe Smith (I change to Doris Doe Smith) John W. ( change to John W.) I found many posts on this site that talks about Trim, but could not find anything that specifically addressed just removing extra space within text. Thanks in advance if there is any help for me, Sherry |
#11
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
On Sun, 23 Dec 2007 22:20:59 -0500, "Rick Rothstein \(MVP - VB\)"
wrote: Now a word of caution... this is my first regular expression construction in some 15 years, so I am rusty. What I posted works, but I can't guarantee it is the most efficient construction. So, check back here to see if Ron posts a better constructed regular expression solution than this one. It works fine. And it is certainly possible to devise one regular expression that will also remove leading and trailing spaces. But I think it is simpler to just use the TRIM "worksheet" function (not the VBA function) to remove leading, trailing and doubled spaces in one swoop. But, you could run a third regex with pattern: ^\s+|\s+$ to remove leading and trailing white space characters. Or, if you only wanted to remove <space and <tab characters: "^[ \t]+|[ \t]+$" (note the <space at the start of the character class.) --ron |
#12
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
Worksheet & VB/VBA Trim functions are slightly different, inasmuch the
Worksheet function replaces any multiple spaces in the middle of text with a single space. Your sample text as posted threw me at first (when pasted directly into a cell), but try this sTmp = Replace(sOrig, Chr(160), " ") sOut = Application.WorksheetFunction.Trim(sTmp) where sOrig contains your text. As a cell formula - =TRIM(SUBSTITUTE(A1,CHAR(160)," ")) Regards, Peter T "SherryScrapDog" wrote in message ... I just had some really great help with some Excel files from this group where the formatting was done by various people over the last 15-20 years. These are names for genealogy, and I am loading many Excel files into Access so people can search by name and soundex. I don't know if the other problem I am dealing with can be handled programatically, but some of the people in the past entered extra spaces between the first and middle names. I am going thru the files manually now and removing the extra space. I have more files to do and thought it was worth asking if a macro could do this. Here are some examples: John William (I change to John William by using delete key) Doris Doe Smith (I change to Doris Doe Smith) John W. ( change to John W.) I found many posts on this site that talks about Trim, but could not find anything that specifically addressed just removing extra space within text. Thanks in advance if there is any help for me, Sherry |
#13
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
Hi Peter,
Thanks for this response! I can't believe how helpful everyone is in this group and I appreciate it so much. I'm not exactly sure what you mean when you say replace sOrig with my text and wonder if this would be the cell column. However, the previous posts macro is working wonderful, and I'm not sure what this would do different than what I have now. I can see why you said my examples threw you because as I look at them now, they do not look like they have the 2 spaces in them as I had typed them. Please let me know if there is something this does that the other macro is not doing. And, I assume this is a macro that you are giving me; let me know if it is something else. This is my first attempt at doing things programmatically in Excel, so I'm very ignorant and appreciate all the help I have received. Again, Thanks! Sherry "Peter T" wrote: Worksheet & VB/VBA Trim functions are slightly different, inasmuch the Worksheet function replaces any multiple spaces in the middle of text with a single space. Your sample text as posted threw me at first (when pasted directly into a cell), but try this sTmp = Replace(sOrig, Chr(160), " ") sOut = Application.WorksheetFunction.Trim(sTmp) where sOrig contains your text. As a cell formula - =TRIM(SUBSTITUTE(A1,CHAR(160)," ")) Regards, Peter T "SherryScrapDog" wrote in message ... I just had some really great help with some Excel files from this group where the formatting was done by various people over the last 15-20 years. These are names for genealogy, and I am loading many Excel files into Access so people can search by name and soundex. I don't know if the other problem I am dealing with can be handled programatically, but some of the people in the past entered extra spaces between the first and middle names. I am going thru the files manually now and removing the extra space. I have more files to do and thought it was worth asking if a macro could do this. Here are some examples: John William (I change to John William by using delete key) Doris Doe Smith (I change to Doris Doe Smith) John W. ( change to John W.) I found many posts on this site that talks about Trim, but could not find anything that specifically addressed just removing extra space within text. Thanks in advance if there is any help for me, Sherry |
#14
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
Hi Sherry,
I can't believe how helpful everyone is in this group Just normal service ! I'm not exactly sure what you mean when you say replace sOrig with my text and wonder if this would be the cell column. For use in the example as posted I meant first assign the text you want to process to the variable sOrig. The variable would have been declared like this Dim sOrig as String or it might be an argument in a function purely for testing copy "Doris Doe Smith" without the quotes into a cell, select the cell and run this macro Sub test1() Dim s As String, sReturn As String s = ActiveCell sReturn = TrimSpaces(s) MsgBox s & vbNewLine & sReturn End Sub Function TrimSpaces(sOrig As String) As String Dim sTmp As String Dim sOut As String sTmp = Replace(sOrig, Chr(160), " ") sOut = Application.WorksheetFunction.Trim(sTmp) TrimSpaces = sOut End Function The above may not be efficient and not necessary to use the additional sTmp variable. It depends on where your data (text to be processed) is coming from and what you want to do with it. You might, for example, pass an array to a function and loop each element. If it's a one off type of thing, copy and paste all data to cells in a column, copy down the cell formula as posted previously. If you are sure the pseudo space chr(160) only crept into the examples in your post, and never exists in your data, you'd only need the worksheet Trim function. If you want to filter or make further changes RegExp may will provide much greater flexibility, otherwise go with the simplest and/or fastest method you are sure will work for your needs. Regards, Peter T "SherryScrapDog" wrote in message ... Hi Peter, Thanks for this response! I can't believe how helpful everyone is in this group and I appreciate it so much. I'm not exactly sure what you mean when you say replace sOrig with my text and wonder if this would be the cell column. However, the previous posts macro is working wonderful, and I'm not sure what this would do different than what I have now. I can see why you said my examples threw you because as I look at them now, they do not look like they have the 2 spaces in them as I had typed them. Please let me know if there is something this does that the other macro is not doing. And, I assume this is a macro that you are giving me; let me know if it is something else. This is my first attempt at doing things programmatically in Excel, so I'm very ignorant and appreciate all the help I have received. Again, Thanks! Sherry "Peter T" wrote: Worksheet & VB/VBA Trim functions are slightly different, inasmuch the Worksheet function replaces any multiple spaces in the middle of text with a single space. Your sample text as posted threw me at first (when pasted directly into a cell), but try this sTmp = Replace(sOrig, Chr(160), " ") sOut = Application.WorksheetFunction.Trim(sTmp) where sOrig contains your text. As a cell formula - =TRIM(SUBSTITUTE(A1,CHAR(160)," ")) Regards, Peter T "SherryScrapDog" wrote in message ... I just had some really great help with some Excel files from this group where the formatting was done by various people over the last 15-20 years. These are names for genealogy, and I am loading many Excel files into Access so people can search by name and soundex. I don't know if the other problem I am dealing with can be handled programatically, but some of the people in the past entered extra spaces between the first and middle names. I am going thru the files manually now and removing the extra space. I have more files to do and thought it was worth asking if a macro could do this. Here are some examples: John William (I change to John William by using delete key) Doris Doe Smith (I change to Doris Doe Smith) John W. ( change to John W.) I found many posts on this site that talks about Trim, but could not find anything that specifically addressed just removing extra space within text. Thanks in advance if there is any help for me, Sherry |
#15
Posted to microsoft.public.excel.programming
|
|||
|
|||
Remove extra space within text
Thanks much Peter for clarification, and for teaching me more. I very much
enjoy leaning all I can! Sherry "Peter T" wrote: Hi Sherry, I can't believe how helpful everyone is in this group Just normal service ! I'm not exactly sure what you mean when you say replace sOrig with my text and wonder if this would be the cell column. For use in the example as posted I meant first assign the text you want to process to the variable sOrig. The variable would have been declared like this Dim sOrig as String or it might be an argument in a function purely for testing copy "Doris Doe Smith" without the quotes into a cell, select the cell and run this macro Sub test1() Dim s As String, sReturn As String s = ActiveCell sReturn = TrimSpaces(s) MsgBox s & vbNewLine & sReturn End Sub Function TrimSpaces(sOrig As String) As String Dim sTmp As String Dim sOut As String sTmp = Replace(sOrig, Chr(160), " ") sOut = Application.WorksheetFunction.Trim(sTmp) TrimSpaces = sOut End Function The above may not be efficient and not necessary to use the additional sTmp variable. It depends on where your data (text to be processed) is coming from and what you want to do with it. You might, for example, pass an array to a function and loop each element. If it's a one off type of thing, copy and paste all data to cells in a column, copy down the cell formula as posted previously. If you are sure the pseudo space chr(160) only crept into the examples in your post, and never exists in your data, you'd only need the worksheet Trim function. If you want to filter or make further changes RegExp may will provide much greater flexibility, otherwise go with the simplest and/or fastest method you are sure will work for your needs. Regards, Peter T |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
How do I remove extra space when text wrapping? | Excel Worksheet Functions | |||
How can I remove extra space when using autofit in Excel 2003? | Excel Discussion (Misc queries) | |||
How do I remove extra space when text wrapping? | Excel Discussion (Misc queries) | |||
Remove an extra space after a number. | Excel Worksheet Functions | |||
Remove Space in Text | Excel Discussion (Misc queries) |