![]() |
Extracting numbers from a single sell
Hi, there!
The situation goes like that : I have a messy cell that contains both adresses and a couple of phone numbers per client. The problem is that i need only the phone numbers. Is there a way that i can exctract each one of them in a different cell - i mean if this client has 3 numbers specified in the messy cell, they should be distributed in 3 separated cells. Also as you can guess the adresses includes numbers, that are not needed, and basically this can be worked around ( i guess ) by specifying that we need numbers with more that 5 digits. P.S. I`m not very advanced users, so I`ll apreciate detailed ( as for dummies ) explanation. Thanks in Advance, Georgi |
Extracting numbers from a single sell
Hi
You can use "TextToColumns", if address and each phone number has a seperator like a comma. Select the cells, goto Data menu TextToColumns, select Delemited Next select your seperator and follow instructions. Hopes it helps. Regards, Per "Georgi" skrev i meddelelsen ... Hi, there! The situation goes like that : I have a messy cell that contains both adresses and a couple of phone numbers per client. The problem is that i need only the phone numbers. Is there a way that i can exctract each one of them in a different cell - i mean if this client has 3 numbers specified in the messy cell, they should be distributed in 3 separated cells. Also as you can guess the adresses includes numbers, that are not needed, and basically this can be worked around ( i guess ) by specifying that we need numbers with more that 5 digits. P.S. I`m not very advanced users, so I`ll apreciate detailed ( as for dummies ) explanation. Thanks in Advance, Georgi |
Extracting numbers from a single sell
What do the phone numbers look like??
123-456-7890 -- Gary''s Student - gsnu2007k "Georgi" wrote: Hi, there! The situation goes like that : I have a messy cell that contains both adresses and a couple of phone numbers per client. The problem is that i need only the phone numbers. Is there a way that i can exctract each one of them in a different cell - i mean if this client has 3 numbers specified in the messy cell, they should be distributed in 3 separated cells. Also as you can guess the adresses includes numbers, that are not needed, and basically this can be worked around ( i guess ) by specifying that we need numbers with more that 5 digits. P.S. I`m not very advanced users, so I`ll apreciate detailed ( as for dummies ) explanation. Thanks in Advance, Georgi |
Extracting numbers from a single sell
On Tue, 23 Dec 2008 06:16:08 -0800 (PST), Georgi wrote:
Hi, there! The situation goes like that : I have a messy cell that contains both adresses and a couple of phone numbers per client. The problem is that i need only the phone numbers. Is there a way that i can exctract each one of them in a different cell - i mean if this client has 3 numbers specified in the messy cell, they should be distributed in 3 separated cells. Also as you can guess the adresses includes numbers, that are not needed, and basically this can be worked around ( i guess ) by specifying that we need numbers with more that 5 digits. P.S. I`m not very advanced users, so I`ll apreciate detailed ( as for dummies ) explanation. Thanks in Advance, Georgi Yes, it can be done. Please post some examples of the cell contents that contains the variations in telephone number formats for your region. --ron |
Extracting numbers from a single sell
First of all I would like to say Merry Christmas to all of you that
are celebrating it and thanks for the input ! @ Per : the only separators are spaces so TextToColumns, will give me 10 cells with different information, so again I need a way to select only the ones that I need ( the cells with phone number ) @ Gary, @ Ron : here is the content of a exemplary cell : adr:Sofia, P.K. 1715 Druzhba district 4 flat 407 entrance 6 floor 6 apartment 38 9754543 0898892158 9754543 000 The numbers that we need are 9754543 ; 0898892158 ; 9754543, everything else is needless, i.e. the phone numbers contains between 5 and 10 ( maybe in some case more ) digits without spaces or dashes in it. Also the number of phone numbers provided can vary, i.e. the cell can contain 1,2,3,4,5 different numbers that we should extract. I`m really confused and I`ll be really grateful if you can help me. Thank you in advance ! Georgi |
Extracting numbers from a single sell
On Sun, 28 Dec 2008 10:43:56 -0800 (PST), Georgi wrote:
First of all I would like to say Merry Christmas to all of you that are celebrating it and thanks for the input ! @ Per : the only separators are spaces so TextToColumns, will give me 10 cells with different information, so again I need a way to select only the ones that I need ( the cells with phone number ) @ Gary, @ Ron : here is the content of a exemplary cell : adr:Sofia, P.K. 1715 Druzhba district 4 flat 407 entrance 6 floor 6 apartment 38 9754543 0898892158 9754543 000 The numbers that we need are 9754543 ; 0898892158 ; 9754543, everything else is needless, i.e. the phone numbers contains between 5 and 10 ( maybe in some case more ) digits without spaces or dashes in it. Also the number of phone numbers provided can vary, i.e. the cell can contain 1,2,3,4,5 different numbers that we should extract. I`m really confused and I`ll be really grateful if you can help me. Thank you in advance ! Georgi Using a UDF (User Defined Function) making use of Regular Expressions (see below for details): With your data in, for example, A1, to extract all instances of 5 or more consecutive digits: B1: =RegexMid($A1,"\d{5,}",COLUMNS($A:A)) Then fill right far enough to be sure to get all of the returns To be able to use this UDF: <alt-F11 opens the VBEditor. Ensure your project is highlighted in the Project Explorer window, then Insert/Module and paste the code below into the window that opens: =========================================== Option Explicit Function RegexMid(Str As String, Pattern As String, _ Optional Index As Variant = 1, _ Optional CaseSensitive As Boolean = True, _ Optional MultiLin As Boolean = False) _ As Variant 'Variant as value may be string or array 'Index -- negative values return groups counting from end of string Dim objRegExp As Object Dim objMatch As Object Dim colMatches As Object Dim i As Long 'counter Dim T() As String 'container for array results ' Create a regular expression object. Set objRegExp = CreateObject("vbscript.regexp") 'Set the pattern by using the Pattern property. objRegExp.Pattern = Pattern ' Set Case Insensitivity. objRegExp.IgnoreCase = Not CaseSensitive 'Set global applicability. objRegExp.Global = True 'Set multiline objRegExp.MultiLine = MultiLin 'Test whether the String can be compared. If (objRegExp.Test(Str) = True) Then 'Get the matches. Set colMatches = objRegExp.Execute(Str) ' Execute search. On Error Resume Next 'return null string if a colmatch index is non-existent If IsArray(Index) Then ReDim T(1 To UBound(Index)) For i = 1 To UBound(Index) T(i) = colMatches(IIf(Index(i) 0, Index(i) - 1, Index(i) + colMatches.Count)) Next i RegexMid = T() Else RegexMid = CStr(colMatches(IIf(Index 0, Index - 1, Index + colMatches.Count))) If IsEmpty(RegexMid) Then RegexMid = "" End If On Error GoTo 0 'reset error handler Else RegexMid = "" End If End Function ===================================== Regular Expressions http://www.regular-expressions.info/reference.html http://support.microsoft.com/default...02&Product=vbb http://msdn2.microsoft.com/en-us/library/6wzad2b2.aspx http://msdn2.microsoft.com/en-us/library/ms974619.aspx http://www.regex-guru.info/ --ron |
Extracting numbers from a single sell
On Sun, 28 Dec 2008 15:47:28 -0500, Ron Rosenfeld
wrote: On Sun, 28 Dec 2008 10:43:56 -0800 (PST), Georgi wrote: First of all I would like to say Merry Christmas to all of you that are celebrating it and thanks for the input ! @ Per : the only separators are spaces so TextToColumns, will give me 10 cells with different information, so again I need a way to select only the ones that I need ( the cells with phone number ) @ Gary, @ Ron : here is the content of a exemplary cell : adr:Sofia, P.K. 1715 Druzhba district 4 flat 407 entrance 6 floor 6 apartment 38 9754543 0898892158 9754543 000 The numbers that we need are 9754543 ; 0898892158 ; 9754543, everything else is needless, i.e. the phone numbers contains between 5 and 10 ( maybe in some case more ) digits without spaces or dashes in it. Also the number of phone numbers provided can vary, i.e. the cell can contain 1,2,3,4,5 different numbers that we should extract. I`m really confused and I`ll be really grateful if you can help me. Thank you in advance ! Georgi Using a UDF (User Defined Function) making use of Regular Expressions (see below for details): With your data in, for example, A1, to extract all instances of 5 or more consecutive digits: B1: =RegexMid($A1,"\d{5,}",COLUMNS($A:A)) Then fill right far enough to be sure to get all of the returns To be able to use this UDF: <alt-F11 opens the VBEditor. Ensure your project is highlighted in the Project Explorer window, then Insert/Module and paste the code below into the window that opens: =========================================== Option Explicit Function RegexMid(Str As String, Pattern As String, _ Optional Index As Variant = 1, _ Optional CaseSensitive As Boolean = True, _ Optional MultiLin As Boolean = False) _ As Variant 'Variant as value may be string or array 'Index -- negative values return groups counting from end of string Dim objRegExp As Object Dim objMatch As Object Dim colMatches As Object Dim i As Long 'counter Dim T() As String 'container for array results ' Create a regular expression object. Set objRegExp = CreateObject("vbscript.regexp") 'Set the pattern by using the Pattern property. objRegExp.Pattern = Pattern ' Set Case Insensitivity. objRegExp.IgnoreCase = Not CaseSensitive 'Set global applicability. objRegExp.Global = True 'Set multiline objRegExp.MultiLine = MultiLin 'Test whether the String can be compared. If (objRegExp.Test(Str) = True) Then 'Get the matches. Set colMatches = objRegExp.Execute(Str) ' Execute search. On Error Resume Next 'return null string if a colmatch index is non-existent If IsArray(Index) Then ReDim T(1 To UBound(Index)) For i = 1 To UBound(Index) T(i) = colMatches(IIf(Index(i) 0, Index(i) - 1, Index(i) + colMatches.Count)) Next i RegexMid = T() Else RegexMid = CStr(colMatches(IIf(Index 0, Index - 1, Index + colMatches.Count))) If IsEmpty(RegexMid) Then RegexMid = "" End If On Error GoTo 0 'reset error handler Else RegexMid = "" End If End Function ===================================== Regular Expressions http://www.regular-expressions.info/reference.html http://support.microsoft.com/default...02&Product=vbb http://msdn2.microsoft.com/en-us/library/6wzad2b2.aspx http://msdn2.microsoft.com/en-us/library/ms974619.aspx http://www.regex-guru.info/ --ron By the way, should you have, within your data, strings of digits that are greater than 5, but are not phone numbers (e.g. ... 123456abc) where you do not want to return the 123456, then we will need to modify the regular expression pattern: e.g. "\b\d{5,}\b" So, the formula would be: B1: =RegexMid($A1,"\b\d{5,}\b",COLUMNS($A:A)) Also, with the information you have supplied, there is no way to differentiate a 5 digit telephone number from a 5 digit address. --ron |
Extracting numbers from a single sell
On Sun, 28 Dec 2008 15:47:28 -0500, Ron Rosenfeld
wrote: I noticed an inappropriate line wrap in the UDF. Below should correct it: ================================ Option Explicit Function RegexMid(Str As String, Pattern As String, _ Optional Index As Variant = 1, _ Optional CaseSensitive As Boolean = True, _ Optional MultiLin As Boolean = False) _ As Variant 'Variant as value may be string or array 'Index -- negative values return groups counting from end of string Dim objRegExp As Object Dim objMatch As Object Dim colMatches As Object Dim i As Long 'counter Dim T() As String 'container for array results ' Create a regular expression object. Set objRegExp = CreateObject("vbscript.regexp") 'Set the pattern by using the Pattern property. objRegExp.Pattern = Pattern ' Set Case Insensitivity. objRegExp.IgnoreCase = Not CaseSensitive 'Set global applicability. objRegExp.Global = True 'Set multiline objRegExp.MultiLine = MultiLin 'Test whether the String can be compared. If (objRegExp.Test(Str) = True) Then 'Get the matches. Set colMatches = objRegExp.Execute(Str) ' Execute search. On Error Resume Next 'return null string if a colmatch index is non-existent If IsArray(Index) Then ReDim T(1 To UBound(Index)) For i = 1 To UBound(Index) T(i) = colMatches(IIf(Index(i) 0, Index(i) - 1, Index(i) + _ colMatches.Count)) Next i RegexMid = T() Else RegexMid = CStr(colMatches(IIf(Index 0, Index - 1, Index + _ colMatches.Count))) If IsEmpty(RegexMid) Then RegexMid = "" End If On Error GoTo 0 'reset error handler Else RegexMid = "" End If End Function ===================================== --ron |
Extracting numbers from a single sell
On Dec 29, 3:51*am, Ron Rosenfeld wrote:
On Sun, 28 Dec 2008 15:47:28 -0500, Ron Rosenfeld wrote: I noticed an inappropriate line wrap in the UDF. *Below should correct it: ================================ Option Explicit Function RegexMid(Str As String, Pattern As String, _ * * Optional Index As Variant = 1, _ * * Optional CaseSensitive As Boolean = True, _ * * Optional MultiLin As Boolean = False) _ * * As Variant 'Variant as value may be string or array 'Index -- negative values return groups counting from end of string Dim objRegExp As Object Dim objMatch As Object Dim colMatches * As Object Dim i As Long 'counter Dim T() As String 'container for array results ' Create a regular expression object. * *Set objRegExp = CreateObject("vbscript.regexp") * *'Set the pattern by using the Pattern property. * *objRegExp.Pattern = Pattern * *' Set Case Insensitivity. * *objRegExp.IgnoreCase = Not CaseSensitive * *'Set global applicability. * *objRegExp.Global = True * *'Set multiline * *objRegExp.MultiLine = MultiLin * *'Test whether the String can be compared. * *If (objRegExp.Test(Str) = True) Then * *'Get the matches. * * Set colMatches = objRegExp.Execute(Str) * ' Execute search. On Error Resume Next 'return null string if a colmatch index is non-existent If IsArray(Index) Then * * ReDim T(1 To UBound(Index)) * * For i = 1 To UBound(Index) * * * * T(i) = colMatches(IIf(Index(i) 0, Index(i) - 1, Index(i) + _ * * * * * * * * colMatches.Count)) * * Next i * * RegexMid = T() Else * * RegexMid = CStr(colMatches(IIf(Index 0, Index - 1, Index + _ * * * * colMatches.Count))) * * If IsEmpty(RegexMid) Then RegexMid = "" End If On Error GoTo 0 * * 'reset error handler *Else * * * * RegexMid = "" * * End If End Function ===================================== --ron Hi Ron, Thanks for your help, but I do have a problem with this. First of all I forgot to tell you that I`m using Excel 2007. Here is what I did : I`ve opened Visual Basic ( by pressing Alt+F11 ), then I`ve copied the lines you provided and saved the Module. Now I have Module1 in Modules in the VBA project of the excel document that I wanna exctract phone numbers from ( I do think that I did that part OK ). After that I`ve copied the formula that you`ve provided in a cell V2 ( the cell that I wanna exctract information from is U2 ), so cell V2 looks like that : " =RegexMid($U2,"\d{5,}",COLUMNS($U:U)) ". This formula don`t works, so I`ve changed the commas with semicolons, and V2 looks like that " =RegexMid($U2;"\d{5,}";COLUMNS($U;U)) ". The problem is that now the formula works, but it returns only the first phone number of the cell, but not the other ones. I`m sorry for disturbing you again, but I`m trying to get around this, but I cannot find what`s wrong (since I am a dummy as I said ). P.S. I do have a exemplary worksheet with just 5 examples, but as far as I know in my country there is no case that a adress contains number with more than 4 digits. Even if we find some, it`s not going to be that crucial since it`s going to be one in a thousand so we can correct it manually. Thank you very much, again, Georgi |
Extracting numbers from a single sell
On Mon, 29 Dec 2008 00:10:40 -0800 (PST), Georgi wrote:
On Dec 29, 3:51*am, Ron Rosenfeld wrote: On Sun, 28 Dec 2008 15:47:28 -0500, Ron Rosenfeld wrote: I noticed an inappropriate line wrap in the UDF. *Below should correct it: ================================ Option Explicit Function RegexMid(Str As String, Pattern As String, _ * * Optional Index As Variant = 1, _ * * Optional CaseSensitive As Boolean = True, _ * * Optional MultiLin As Boolean = False) _ * * As Variant 'Variant as value may be string or array 'Index -- negative values return groups counting from end of string Dim objRegExp As Object Dim objMatch As Object Dim colMatches * As Object Dim i As Long 'counter Dim T() As String 'container for array results ' Create a regular expression object. * *Set objRegExp = CreateObject("vbscript.regexp") * *'Set the pattern by using the Pattern property. * *objRegExp.Pattern = Pattern * *' Set Case Insensitivity. * *objRegExp.IgnoreCase = Not CaseSensitive * *'Set global applicability. * *objRegExp.Global = True * *'Set multiline * *objRegExp.MultiLine = MultiLin * *'Test whether the String can be compared. * *If (objRegExp.Test(Str) = True) Then * *'Get the matches. * * Set colMatches = objRegExp.Execute(Str) * ' Execute search. On Error Resume Next 'return null string if a colmatch index is non-existent If IsArray(Index) Then * * ReDim T(1 To UBound(Index)) * * For i = 1 To UBound(Index) * * * * T(i) = colMatches(IIf(Index(i) 0, Index(i) - 1, Index(i) + _ * * * * * * * * colMatches.Count)) * * Next i * * RegexMid = T() Else * * RegexMid = CStr(colMatches(IIf(Index 0, Index - 1, Index + _ * * * * colMatches.Count))) * * If IsEmpty(RegexMid) Then RegexMid = "" End If On Error GoTo 0 * * 'reset error handler *Else * * * * RegexMid = "" * * End If End Function ===================================== --ron Hi Ron, Thanks for your help, but I do have a problem with this. First of all I forgot to tell you that I`m using Excel 2007. Here is what I did : I`ve opened Visual Basic ( by pressing Alt+F11 ), then I`ve copied the lines you provided and saved the Module. Now I have Module1 in Modules in the VBA project of the excel document that I wanna exctract phone numbers from ( I do think that I did that part OK ). After that I`ve copied the formula that you`ve provided in a cell V2 ( the cell that I wanna exctract information from is U2 ), so cell V2 looks like that : " =RegexMid($U2,"\d{5,}",COLUMNS($U:U)) ". This formula don`t works, so I`ve changed the commas with semicolons, and V2 looks like that " =RegexMid($U2;"\d{5,}";COLUMNS($U;U)) ". The problem is that now the formula works, but it returns only the first phone number of the cell, but not the other ones. I`m sorry for disturbing you again, but I`m trying to get around this, but I cannot find what`s wrong (since I am a dummy as I said ). P.S. I do have a exemplary worksheet with just 5 examples, but as far as I know in my country there is no case that a adress contains number with more than 4 digits. Even if we find some, it`s not going to be that crucial since it`s going to be one in a thousand so we can correct it manually. Thank you very much, again, Georgi Georgi, I have Excel 2007 also. But this routine should work on most versions. You are correct in that the commas should be replaced by whatever your country separators are -- semicolon is common. You initially wrote that you wanted the numbers "distributed in 3 separated cells." So each cell with the formula will only return one number. In my instructions, I had written to "Then fill right far enough to be sure to get all of the returns". Did you do that? What does the formula look like when you filled it into W2, X2, Y2 etc? --ron |
Extracting numbers from a single sell
On 29 δΕΛ, 14:07, Ron Rosenfeld wrote:
On Mon, 29 Dec 2008 00:10:40 -0800 (PST), Georgi wrote: On Dec 29, 3:51am, Ron Rosenfeld wrote: On Sun, 28 Dec 2008 15:47:28 -0500, Ron Rosenfeld wrote: I noticed an inappropriate line wrap in the UDF. Below should correct it: ================================ Option Explicit Function RegexMid(Str As String, Pattern As String, _ Optional Index As Variant = 1, _ Optional CaseSensitive As Boolean = True, _ Optional MultiLin As Boolean = False) _ As Variant 'Variant as value may be string or array 'Index -- negative values return groups counting from end of string Dim objRegExp As Object Dim objMatch As Object Dim colMatches As Object Dim i As Long 'counter Dim T() As String 'container for array results ' Create a regular expression object. Set objRegExp = CreateObject("vbscript.regexp") 'Set the pattern by using the Pattern property. objRegExp.Pattern = Pattern ' Set Case Insensitivity. objRegExp.IgnoreCase = Not CaseSensitive 'Set global applicability. objRegExp.Global = True 'Set multiline objRegExp.MultiLine = MultiLin 'Test whether the String can be compared. If (objRegExp.Test(Str) = True) Then 'Get the matches. Set colMatches = objRegExp.Execute(Str) ' Execute search.. On Error Resume Next 'return null string if a colmatch index is non-existent If IsArray(Index) Then ReDim T(1 To UBound(Index)) For i = 1 To UBound(Index) T(i) = colMatches(IIf(Index(i) 0, Index(i) - 1, Index(i) + _ colMatches.Count)) Next i RegexMid = T() Else RegexMid = CStr(colMatches(IIf(Index 0, Index - 1, Index + _ colMatches.Count))) If IsEmpty(RegexMid) Then RegexMid = "" End If On Error GoTo 0 'reset error handler Else RegexMid = "" End If End Function ===================================== --ron Hi Ron, Thanks for your help, but I do have a problem with this. First of all I forgot to tell you that I`m using Excel 2007. Here is what I did : I`ve opened Visual Basic ( by pressing Alt+F11 ), then I`ve copied the lines you provided and saved the Module. Now I have Module1 in Modules in the VBA project of the excel document that I wanna exctract phone numbersfrom ( I do think that I did that part OK ). After that I`ve copied the formula that you`ve provided in a cell V2 ( the cell that I wanna exctract information from is U2 ), so cell V2 looks like that : " =RegexMid($U2,"\d{5,}",COLUMNS($U:U)) ". This formula don`t works, so I`ve changed the commas with semicolons, and V2 looks like that " =RegexMid($U2;"\d{5,}";COLUMNS($U;U)) ". The problem is that now the formula works, but it returns only the first phone number of the cell, but not the other ones. I`m sorry for disturbing you again, but I`m trying to get around this, but I cannot find what`s wrong (since I am a dummy as I said ). P.S. I do have a exemplary worksheet with just 5 examples, but as far as I know in my country there is no case that a adress contains number with more than 4 digits. Even if we find some, it`s not going to be that crucial since it`s going to be one in a thousand so we can correct it manually. Thank you very much, again, Georgi Georgi, I have Excel 2007 also. But this routine should work on most versions. You are correct in that the commas should be replaced by whatever your country separators are -- semicolon is common. You initially wrote that you wanted thenumbers"distributed in 3 separated cells." So each cell with the formula will only return one number. In my instructions, I had written to "Then fill right far enough to be sure to get all of the returns". Did you do that? What does the formula look like when you filled it into W2, X2, Y2 etc? --ron Hi Ron, I didn`t saw this part of your post. I did it and it is working briliantly! Sorry for the waste of your time ! P.S. Now after I`ve asked they`ve sent me a real table with around 1000 cells, so I can see if there is any variantions in the mobile numbers.... and I found that the personal that entered the phone numbers, used all kind of ways to enter them. In some cases they used 0877/55/44/33 ; 0877-55-44-33 - I`ve used Find&Replace on them and after that your formula worked just perfect. But there is another case - 0877 554433 or 0877 55 44 33 - in this case I don`t think that there is something that we can do, despite some handwork :). Now I`m trying to find a way to just find the cells that contain numbers with 3 to 6 digits, so I can manualy check what is the problem with them, but I cannot find the right way to do it - I`ve tried some strings in Find&Replace but they don`t seem to work. Thanks for your help, I really appreciate that ! Georgi |
Extracting numbers from a single sell
On Mon, 29 Dec 2008 07:18:18 -0800 (PST), Georgi wrote:
Hi Ron, I didn`t saw this part of your post. I did it and it is working briliantly! Sorry for the waste of your time ! P.S. Now after I`ve asked they`ve sent me a real table with around 1000 cells, so I can see if there is any variantions in the mobile numbers.... and I found that the personal that entered the phone numbers, used all kind of ways to enter them. In some cases they used 0877/55/44/33 ; 0877-55-44-33 - I`ve used Find&Replace on them and after that your formula worked just perfect. But there is another case - 0877 554433 or 0877 55 44 33 - in this case I don`t think that there is something that we can do, despite some handwork :). Now I`m trying to find a way to just find the cells that contain numbers with 3 to 6 digits, so I can manualy check what is the problem with them, but I cannot find the right way to do it - I`ve tried some strings in Find&Replace but they don`t seem to work. Instead of using the Find/Replace, try using the following Regex instead: "\b\d[-\d\s\/]{3,}\d\b" So your formula might be: =RegexMid($U2,"\b\d[-\d\s\/]{3,}\d\b",COLUMNS($A:A)) It may be that this may return data that is not valid, but that may be easier to deal with. In particular, this regex is looking for a string that 1. Start with a digit 2. The next three or more characters can be a hyphen, digit, space or backslash. 3. End with a digit. So it should pick up all five digit string, even if they include the extra characters. But it could also pick up something like 1///2. That's probably not a problem for your database. (If it is, we could deal with it by making the regex more complex, but I'm lazy :-)) --ron |
Extracting numbers from a single sell
On Dec 29, 10:04*pm, Ron Rosenfeld wrote:
On Mon, 29 Dec 2008 07:18:18 -0800 (PST), Georgi wrote: Hi Ron, I didn`t saw this part of your post. I did it and it is working briliantly! Sorry for the waste of your time ! P.S. Now after I`ve asked they`ve sent me a real table with around 1000 cells, so I can see if there is any variantions in the mobile numbers.... and I found that the personal that entered the phone numbers, used all kind of ways to enter them. In some cases they used 0877/55/44/33 ; 0877-55-44-33 - I`ve used Find&Replace on them and after that your formula worked just perfect. But there is another case - 0877 554433 or 0877 55 44 33 - in this case I don`t think that there is something that we can do, despite some handwork :). Now I`m trying to find a way to just find the cells that containnumberswith 3 to 6 digits, so I can manualy check what is the problem with them, but I cannot find the right way to do it - I`ve tried some strings in Find&Replace but they don`t seem to work. Instead of using the Find/Replace, try using the following Regex instead: "\b\d[-\d\s\/]{3,}\d\b" So your formula might be: =RegexMid($U2,"\b\d[-\d\s\/]{3,}\d\b",COLUMNS($A:A)) It may be that this may return data that is not valid, but that may be easier to deal with. *In particular, this regex is looking for a string that 1. *Start with a digit 2. *The next three or more characters can be a hyphen, digit, space or backslash. 3. *End with a digit. So it should pick up all five digit string, even if they include the extra characters. *But it could also pick up something like *1///2. *That's probably not a problem for your database. *(If it is, we could deal with it by making the regex more complex, but I'm lazy :-)) --ron Hi there Ron, The formula you provided is not working very well. First of all when I try to enter it, Excel tells me that it`s wrong, and suggest to change the formula. The funny thing is that as far as I can see the new formula is exactly the same as the one you provided by you and this must be some kind of bug :). After that this formula gets all of the phone numbers from the big cell into the first destination cell + a couple of other numbers :). What I saw now is that the big cell contains two spaces ( the only two consecutive space between something in the cell ) between the address part and the numbers part, so it looks like that : adr:Sofia, P.K. 1715 Druzhba district 4 flat 407 entrance 6 floor 6 apartment 38 (two spaces here ) 9754543 0898892158 9754543 000 So maybe I can get rid of the adress part using this.... Also if we want to select the number that consist 2 or 3 parts, there should be something like "If there is a number with less that 6 number, next to another number with less than 6 numbers and possibly next to another number with less than 6 symbols, then consolidate them". This would work in most of the cases, except when there is two or three different phone numbers entered with spaces ( most of them are entered in different branches in different parts of the country, so if somebody is using spaces for delimation for one of the number, he`ll probably use this pattern again with the next number in the cell - it`s part of his style :-D ) in this case we cannot understand where one number starts and the other ends. If I have to be honest I cannot see how it can be done and I`m starting to think that it`s impossible - they`re too many variantions and too many possibilities that can occur. I`ll try to find some way around by myself, since I`ve already taken enough advantage of you kindness, but in the end I do think that some of the messes should be untwisted manually. Thank you very much for your exceptional help, and I`m wishing you to get all the best from the upcoming 2009 year :) |
Extracting numbers from a single sell
On 29 Dic, 23:58, Georgi wrote:
On Dec 29, 10:04*pm, Ron Rosenfeld wrote: On Mon, 29 Dec 2008 07:18:18 -0800 (PST), Georgi wrote: Hi Ron, I didn`t saw this part of your post. I did it and it is working briliantly! Sorry for the waste of your time ! P.S. Now after I`ve asked they`ve sent me a real table with around 1000 cells, so I can see if there is any variantions in the mobile numbers.... and I found that the personal that entered the phone numbers, used all kind of ways to enter them. In some cases they used 0877/55/44/33 ; 0877-55-44-33 - I`ve used Find&Replace on them and after that your formula worked just perfect. But there is another case - 0877 554433 or 0877 55 44 33 - in this case I don`t think that there is something that we can do, despite some handwork :). Now I`m trying to find a way to just find the cells that containnumberswith 3 to 6 digits, so I can manualy check what is the problem with them, but I cannot find the right way to do it - I`ve tried some strings in Find&Replace but they don`t seem to work. Instead of using the Find/Replace, try using the following Regex instead: "\b\d[-\d\s\/]{3,}\d\b" So your formula might be: =RegexMid($U2,"\b\d[-\d\s\/]{3,}\d\b",COLUMNS($A:A)) It may be that this may return data that is not valid, but that may be easier to deal with. *In particular, this regex is looking for a string that 1. *Start with a digit 2. *The next three or more characters can be a hyphen, digit, space or backslash. 3. *End with a digit. So it should pick up all five digit string, even if they include the extra characters. *But it could also pick up something like *1///2. *That's probably not a problem for your database. *(If it is, we could deal with it by making the regex more complex, but I'm lazy :-)) --ron Hi there Ron, The formula you provided is not working very well. First of all when I try to enter it, Excel tells me that it`s wrong, and suggest to change the formula. The funny thing is that as far as I can see the new formula is exactly the same as the one you provided by you and this must be some kind of bug :). After that this formula gets all of the phone numbers from the big cell into the first destination cell + a couple of other numbers :). What I saw now is that the big cell contains two spaces ( the only two consecutive space between something in the cell ) between the address part and the numbers part, so it looks like that : adr:Sofia, P.K. 1715 Druzhba district 4 flat 407 entrance 6 floor 6 apartment 38 (two spaces here ) 9754543 0898892158 9754543 000 So maybe I can get rid of the adress part using this.... Also if we want to select the number that consist 2 or 3 parts, there should be something like "If there is a number with less that 6 number, next to another number with less than 6 numbers and possibly next to another number with less than 6 symbols, then consolidate them". This would work in most of the cases, except when there is two or three different phone numbers entered with spaces ( most of them are entered in different branches in different parts of the country, so if somebody is using spaces for delimation for one of the number, he`ll probably use this pattern again with the next number in the cell - it`s part of his style :-D ) in this case we cannot understand where one number starts and the other ends. If I have to be honest I cannot see how it can be done and I`m starting to think that it`s impossible - they`re too many variantions and too many possibilities that can occur. I`ll try to find some way around by myself, since I`ve already taken enough advantage of you kindness, but in the end I do think that some of the messes should be untwisted manually. Thank you very much for your exceptional help, and I`m wishing you to get all the best from the upcoming 2009 year :)- Nascondi testo citato - Mostra testo citato - Hi Georgi. Try. Public Sub trovapho() 'Trova il telefono (numerico 5 pos) Dim CL As Range Dim RNG As Range Dim B() As String Dim I As Long Dim Pho As Long Dim P As Long Dim test As String Dim X As Long Sheets("Foglio2").Select Range("B1:Z100") = "" ' <----- da variare Pho = 5 Set RNG = Range("a1:a100") ' <----- da variare For Each CL In RNG X = 0 B = Split(CL, " ") For I = LBound(B) To UBound(B) If B(I) = "" Then Exit Sub test = B(I) If I < 1 Then If IsNumeric(test) Then If Len(test) Pho Then X = X + 1 Cells(CL.Row, CL.Column + X) = B(I) End If End If End If Next Next End Sub Have an Happy New Year. Eliano |
Extracting numbers from a single sell
On Dec 30 2008, 5:07*am, Ron Rosenfeld
wrote: On Mon, 29 Dec 2008 14:58:32 -0800 (PST), Georgi wrote: On Dec 29, 10:04*pm, Ron Rosenfeld wrote: On Mon, 29 Dec 2008 07:18:18 -0800 (PST), Georgi wrote: Hi Ron, I didn`t saw this part of your post. I did it and it is working briliantly! Sorry for the waste of your time ! P.S. Now after I`ve asked they`ve sent me a real table with around 1000 cells, so I can see if there is any variantions in the mobile numbers.... and I found that the personal that entered the phone numbers, used all kind of ways to enter them. In some cases they used 0877/55/44/33 ; 0877-55-44-33 - I`ve used Find&Replace on them and after that your formula worked just perfect. But there is another case - 0877 554433 or 0877 55 44 33 - in this case I don`t think that there is something that we can do, despite some handwork :). Now I`m trying to find a way to just find the cells that containnumberswith 3 to 6 digits, so I can manualy check what is the problem with them, but I cannot find the right way to do it - I`ve tried some strings in Find&Replace but they don`t seem to work. Instead of using the Find/Replace, try using the following Regex instead: "\b\d[-\d\s\/]{3,}\d\b" So your formula might be: =RegexMid($U2,"\b\d[-\d\s\/]{3,}\d\b",COLUMNS($A:A)) It may be that this may return data that is not valid, but that may be easier to deal with. *In particular, this regex is looking for a string that 1. *Start with a digit 2. *The next three or more characters can be a hyphen, digit, space or backslash. 3. *End with a digit. So it should pick up all five digit string, even if they include the extra characters. *But it could also pick up something like *1///2. *That's probably not a problem for your database. *(If it is, we could deal with it by making the regex more complex, but I'm lazy :-)) --ron Hi there Ron, The formula you provided is not working very well. First of all when I try to enter it, Excel tells me that it`s wrong, and suggest to change the formula. The funny thing is that as far as I can see the new formula is exactly the same as the one you provided by you and this must be some kind of bug :). After that this formula gets all of the phonenumbersfrom the big cell into the first destination cell + a couple of othernumbers:). What I saw now is that the big cell contains two spaces ( the only two consecutive space between something in the cell ) between the address part and thenumberspart, so it looks like that : adr:Sofia, P.K. 1715 Druzhba district 4 flat 407 entrance 6 floor 6 apartment 38 (two spaces here ) 9754543 0898892158 9754543 000 So maybe I can get rid of the adress part using this.... Also if we want to select the number that consist 2 or 3 parts, there should be something like "If there is a number with less that 6 number, next to another number with less than 6numbersand possibly next to another number with less than 6 symbols, then consolidate them". This would work in most of the cases, except when there is two or three different phonenumbersentered with spaces ( most of them are entered in different branches in different parts of the country, so if somebody is using spaces for delimation for one of the number, he`ll probably use this pattern again with the next number in the cell - it`s part of his style :-D ) in this case we cannot understand where one number starts and the other ends. If I have to be honest I cannot see how it can be done and I`m starting to think that it`s impossible - they`re too many variantions and too many possibilities that can occur. I`ll try to find some way around by myself, since I`ve already taken enough advantage of you kindness, but in the end I do think that some of the messes should be untwisted manually. Thank you very much for your exceptional help, and I`m wishing you to get all the best from the upcoming 2009 year :) Here is one that seems to work with all of the examples you have provided, including the space separatednumbers: =RegexMid($U2,"\b((\d[-\d\/]{4,}\d)|((\d{2,5}\s?){3,4}))\b",COLUMNS($A:A)) Changed to semicolons: =RegexMid($U2;"\b((\d[-\d\/]{4,}\d)|((\d{2,5}\s?){3,4}))\b";COLUMNS($A:A)) There will be an issue, however, if you have a space delimiter and one of the included digit strings is six digits (or more). *Since you have stated that you could have a phone number consisting of a minimum of six digits, there would be no way of telling for example the 123456 123456 is two 6 digitnumbersor asingle12 digit number. *This also requires that space delimited entries have at least two digits in each segment. *(Otherwise non phonenumberswould be detected) --ron Hi there Ron, First of all I wanna all the best in the new year to you, and all of the other helpers here. After that I would like to thank you for your help until now - I think that this formula is as good as we can go. Despite that now I`ve got a official confirmation from the Telecoms in our region that : A mobile phone can contain 10 numbers ( or 13 if our national code is included ) A home phone can contain 8 or 9 numbers, including the area code ( or 11 or 12 if our national code is also included ). This means that the minimum lenght of a valid number is 8 digits, and the maximum is 13. Even then there is some cases that a number is with less than 8 numbers - in these cases the area code is not provided - so the number is not OK for our use, but is as good as we can go. Whatever we do, there still be a need for someone to take a look at the table, before we can use it, and with your previous formula the time needed is as small as we can go :). Thank you again, Georgi P.S. I wrote a post, before new year, but it dissapeared somewhere :) |
Extracting numbers from a single sell
On Mon, 5 Jan 2009 01:23:08 -0800 (PST), Georgi wrote:
Even then there is some cases that a number is with less than 8 numbers - in these cases the area code is not provided - so the number is not OK for our use, but is as good as we can go. Whatever we do, there still be a need for someone to take a look at the table, before we can use it, and with your previous formula the time needed is as small as we can go :). Thank you again, Georgi Well, sometimes we cannot achieve "perfection". Glad to help where I could. --ron |
Extracting numbers from a single sell
On Jan 5, 3:23*pm, Ron Rosenfeld wrote:
On Mon, 5 Jan 2009 01:23:08 -0800 (PST), Georgi wrote: Even then there is some cases that a number is with less than 8 numbers- in these cases the area code is not provided - so the number is not OK for our use, but is as good as we can go. Whatever we do, there still be a need for someone to take a look at the table, before we can use it, and with your previous formula the time needed is as small as we can go :). Thank you again, Georgi Well, sometimes we cannot achieve "perfection". * Glad to help where I could. --ron Hi there Ron, My troubles with this sheet continues. I don`t know if I need to create a new tread or ..... After we`ve created the new cells, a prefix should be attached. I`ve tried with Concatenate, but I`ll need something a little bit more complex. What I mean : - when the phone number starts with zero, another zero should be prefixed ( ex. 0885475810 , becomes 00885475810 ) - if the number starts with a digit other than zero, then two zeros (00) should be prefixed ( ex 885475810 , becomes 00885475810 ; ex. 684115452 becomes 0068115454) - if the number starts with 359 then this prefix (359) should be substituted with (00) ( ex. 359885475810, becomes 00885475810 ; 3592458965 becomes 002458965) - if the number starts with two zeros, no action should be taken ( this is a case that we can exclude because it`s too rare ). I can hardly find a word to express my gratitude for what you`ve done for me so far. Regards, |
Extracting numbers from a single sell
On Mon, 5 Jan 2009 08:10:32 -0800 (PST), Georgi wrote:
What I mean : - when the phone number starts with zero, another zero should be prefixed ( ex. 0885475810 , becomes 00885475810 ) - if the number starts with a digit other than zero, then two zeros (00) should be prefixed ( ex 885475810 , becomes 00885475810 ; ex. 684115452 becomes 0068115454) - if the number starts with 359 then this prefix (359) should be substituted with (00) ( ex. 359885475810, becomes 00885475810 ; 3592458965 becomes 002458965) - if the number starts with two zeros, no action should be taken ( this is a case that we can exclude because it`s too rare ). This can be done with worksheet functions, but given the numbers of times you've revised your specifications, and anticipating that you might be revising again in the future, I offer this UDF: ================== Function Prefix(Pnum) If Left(Pnum, 3) = 359 Then Prefix = Replace(Pnum, "359", "00", 1, 1) Exit Function End If Select Case Left(Pnum, 1) Case Is = 0 Prefix = "0" & Pnum Case 1 To 9 Prefix = "00" & Pnum End Select End Function ======================= And you would use it by embedding the RegexMid function within it; like: =prefix(regexmid(...)) --ron |
All times are GMT +1. The time now is 11:18 AM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com