![]() |
Finding email addresses in cells
In one of the columns I have text with email addresses I need seperated.
Amongst the text (i.e. "please note that is no longer employeed" - or - email is undeliverable") Is there a formula that I can create to single out just the email address? There's no set # of characters before or after the "@" - but I'm assuming I could go by a SPACE to seperate the name@domain from the rest of the text. Any suggestions? The trim function only works in one direction (i.e. after the @) and I need it to go both ways. Thanks, Joey |
Finding email addresses in cells
You could do Data TextToColumns using SPACE as a delimiter and it should
break the entire email address out into it's own separate column as it has a space both before and after with none in between..... Vaya con Dios, Chuck, CABGx3 "Joey" wrote: In one of the columns I have text with email addresses I need seperated. Amongst the text (i.e. "please note that is no longer employeed" - or - email is undeliverable") Is there a formula that I can create to single out just the email address? There's no set # of characters before or after the "@" - but I'm assuming I could go by a SPACE to seperate the name@domain from the rest of the text. Any suggestions? The trim function only works in one direction (i.e. after the @) and I need it to go both ways. Thanks, Joey |
Finding email addresses in cells
But that doesn't put the email address in one column, especially if there's
25 words in the text of the message. Text-to-columns using SPACE could just spread out the message along 25 more columns, with NO idea where the email address is. I even tried text-to-columns with "@" as the seperator, but then I don't get the name or the domain name. Joey "CLR" wrote: You could do Data TextToColumns using SPACE as a delimiter and it should break the entire email address out into it's own separate column as it has a space both before and after with none in between..... Vaya con Dios, Chuck, CABGx3 "Joey" wrote: In one of the columns I have text with email addresses I need seperated. Amongst the text (i.e. "please note that is no longer employeed" - or - email is undeliverable") Is there a formula that I can create to single out just the email address? There's no set # of characters before or after the "@" - but I'm assuming I could go by a SPACE to seperate the name@domain from the rest of the text. Any suggestions? The trim function only works in one direction (i.e. after the @) and I need it to go both ways. Thanks, Joey |
Finding email addresses in cells
With your data in A1 try this formula =TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(" ",A1,FIND("@",A1))-1)," ",REPT(" ",LEN(A1))),LEN(A1))) -- daddylonglegs ------------------------------------------------------------------------ daddylonglegs's Profile: http://www.excelforum.com/member.php...o&userid=30486 View this thread: http://www.excelforum.com/showthread...hreadid=523235 |
Finding email addresses in cells
On Thu, 16 Mar 2006 14:03:01 -0600, daddylonglegs
wrote: With your data in A1 try this formula =TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(" ",A1,FIND("@",A1))-1)," ",REPT(" ",LEN(A1))),LEN(A1))) Doesn't seem to work if the email address is at the very end of the string. --ron |
Finding email addresses in cells
This seems to work:
=TRIM(RIGHT(SUBSTITUTE(LEFT(A1&" ",FIND(" ",A1&" ",FIND("@",A1&" "))-1)," ",REPT(" ",LEN(A1))),LEN(A1))) Biff "Biff" wrote in message ... Crashes on entries like: text text text text Biff "daddylonglegs" wrote in message news:daddylonglegs.24s3qp_1142539806.708@excelforu m-nospam.com... With your data in A1 try this formula =TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(" ",A1,FIND("@",A1))-1)," ",REPT(" ",LEN(A1))),LEN(A1))) -- daddylonglegs ------------------------------------------------------------------------ daddylonglegs's Profile: http://www.excelforum.com/member.php...o&userid=30486 View this thread: http://www.excelforum.com/showthread...hreadid=523235 |
Finding email addresses in cells
Joey, This works, too, but you need morefunc.dll (google search)
=TRIM(TEXTREVERSE(LEFT(TEXTREVERSE(LEFT(A1,(SEARCH (" ",A1,(SEARCH("@",A1,1)))-1))),(SEARCH(" ",TEXTREVERSE(LEFT(A1,(SEARCH(" ",A1,(SEARCH("@",A1,1)))-1))),1))))) Beege "Joey" wrote in message ... In one of the columns I have text with email addresses I need seperated. Amongst the text (i.e. "please note that is no longer employeed" - or - email is undeliverable") Is there a formula that I can create to single out just the email address? There's no set # of characters before or after the "@" - but I'm assuming I could go by a SPACE to seperate the name@domain from the rest of the text. Any suggestions? The trim function only works in one direction (i.e. after the @) and I need it to go both ways. Thanks, Joey |
Finding email addresses in cells
On Thu, 16 Mar 2006 16:39:40 -0500, "Biff" wrote:
This seems to work: =TRIM(RIGHT(SUBSTITUTE(LEFT(A1&" ",FIND(" ",A1&" ",FIND("@",A1&" "))-1)," ",REPT(" ",LEN(A1))),LEN(A1))) Biff It will fail if the string happens to start with a dot <.. It also gives an erroneous result with: "It's too bad that John Jones is no longer with us" and various other legal email configurations. --ron |
Finding email addresses in cells
Oh well!
Where's your REGEX solution? I know I've seen it at least once. Biff "Ron Rosenfeld" wrote in message ... On Thu, 16 Mar 2006 16:39:40 -0500, "Biff" wrote: This seems to work: =TRIM(RIGHT(SUBSTITUTE(LEFT(A1&" ",FIND(" ",A1&" ",FIND("@",A1&" "))-1)," ",REPT(" ",LEN(A1))),LEN(A1))) Biff It will fail if the string happens to start with a dot <.. It also gives an erroneous result with: "It's too bad that John Jones is no longer with us" and various other legal email configurations. --ron |
Finding email addresses in cells
Ron Rosenfeld Wrote: Doesn't seem to work if the email address is at the very end of the string. --ron Hi ron, quite right. I believe that can be fixed with a small amendment =TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(" ",A1&" ",FIND("@",A1))-1)," ",REPT(" ",LEN(A1))),LEN(A1))) Biff, thanks also. I didn't consider multiple email addresses, I'll leave that to somebody else ;) -- daddylonglegs ------------------------------------------------------------------------ daddylonglegs's Profile: http://www.excelforum.com/member.php...o&userid=30486 View this thread: http://www.excelforum.com/showthread...hreadid=523235 |
Finding email addresses in cells
On Thu, 16 Mar 2006 18:13:53 -0500, "Biff" wrote:
Oh well! Where's your REGEX solution? I know I've seen it at least once. Biff Given all of the possible variations in legal email addresses, even a REGEX solution would be tough. Here's a partial one: =REGEX.MID(A1,"\b(\w|[._%-])+@(\w|[.-])+\.[A-Za-z]{2,4}\b") but it's not completely tested and it will not work if there are <space's in the address. Also it assumes that any terminal 2 to 4 letter code will be a valid top-level domain. In addition to that not always being the case, there are some top level domains that are more than four characters. One could substitute a pipe separated list of valid top level domains for the last portion of that regex. It should cover a number of common variations, however. For the OP, to use this formula you must first download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr --ron |
Finding email addresses in cells
Ron Rosenfeld wrote...
.... Given all of the possible variations in legal email addresses, even a REGEX solution would be tough. .... Impossible. Like it or not, internet e-mail descends from Unix e-mail, which means any valid home directory name (precisely, the first level subdirecories of /home) on a Unix-like system would be a valid recipient ID. Now expand that to encompass filesystems based on UNICODE text. As a practical, non-UNICODE matter, @, / and ASCII NUL can't be characters in e-mail address names, and *, ?, \ and the other control characters (ASCII decimal codes 1 through 31 plus 127) would be very unlikely. Anything else is fair game. Here's a partial one: =REGEX.MID(A1,"\b(\w|[._%-])+@(\w|[.-])+\.[A-Za-z]{2,4}\b") .... If you mean all 'word' characters plus periods, underscores, percent signs and dashes, better to use "\b[-A-Za-z0-9_.%]+\@([-A-Za-z0-9_]+\.)*[A-Za-z]+\b" which allows for valid degenerate e-mail addresses like foo@localhost . Don't use the alternation operator, |, unless absolutely necessary. It's a real performance drag in regexps. MUCH MORE EFFICIENT to use single (if longer) character classes. |
Finding email addresses in cells
On 16 Mar 2006 21:49:47 -0800, "Harlan Grove" wrote:
Ron Rosenfeld wrote... ... Given all of the possible variations in legal email addresses, even a REGEX solution would be tough. ... Impossible. Like it or not, internet e-mail descends from Unix e-mail, which means any valid home directory name (precisely, the first level subdirecories of /home) on a Unix-like system would be a valid recipient ID. Now expand that to encompass filesystems based on UNICODE text. Constructing regex's to match all legal email addresses seems to be a challenge taken up by some. I've seen one regex that is said to be 6000+ characters in length. As a practical, non-UNICODE matter, @, / and ASCII NUL can't be characters in e-mail address names, and *, ?, \ and the other control characters (ASCII decimal codes 1 through 31 plus 127) would be very unlikely. Anything else is fair game. Here's a partial one: =REGEX.MID(A1,"\b(\w|[._%-])+@(\w|[.-])+\.[A-Za-z]{2,4}\b") ... If you mean all 'word' characters plus periods, underscores, percent signs and dashes, better to use "\b[-A-Za-z0-9_.%]+\@([-A-Za-z0-9_]+\.)*[A-Za-z]+\b" which allows for valid degenerate e-mail addresses like foo@localhost . Don't use the alternation operator, |, unless absolutely necessary. It's a real performance drag in regexps. MUCH MORE EFFICIENT to use single (if longer) character classes. Thanks for the pointer about the pipe. In your regex, what is the significance of the "\" preceding the "@"? --ron |
Finding email addresses in cells
Ron Rosenfeld wrote...
.... In your regex, what is the significance of the "\" preceding the "@"? Probably unnecessary. I saw a Perl regexp from a web search for e-mail address syntax that showed \@, so I figured it might be needed in some regexp flavors. It may not be necessary, but it won't cause any harm. |
Finding email addresses in cells
On 17 Mar 2006 08:10:25 -0800, "Harlan Grove" wrote:
Ron Rosenfeld wrote... ... In your regex, what is the significance of the "\" preceding the "@"? Probably unnecessary. I saw a Perl regexp from a web search for e-mail address syntax that showed \@, so I figured it might be needed in some regexp flavors. It may not be necessary, but it won't cause any harm. OK, that's basically what I figured -- it wouldn't do any harm. I just hadn't run across it with "@" in the past. --ron |
Finding email addresses in cells
Some of these regex syntax's are the most cryptic things I've ever seen!
Biff "Ron Rosenfeld" wrote in message ... On 17 Mar 2006 08:10:25 -0800, "Harlan Grove" wrote: Ron Rosenfeld wrote... ... In your regex, what is the significance of the "\" preceding the "@"? Probably unnecessary. I saw a Perl regexp from a web search for e-mail address syntax that showed \@, so I figured it might be needed in some regexp flavors. It may not be necessary, but it won't cause any harm. OK, that's basically what I figured -- it wouldn't do any harm. I just hadn't run across it with "@" in the past. --ron |
Finding email addresses in cells
Biff wrote...
Some of these regex syntax's are the most cryptic things I've ever seen! .... You should try Einstein's tensor notation in his work on General Relativity. Or a few APL one-liners. Seriously, they may be a pain to learn, but they're far & away the most powerful means of dealing with text. You'll never fully appreciate them until you compare the mountains of procedural code you'd need to match certain common text patterns. For example, try finding US social security numbers in general strings: ###-##-#### with anything other than decimal numerals on either side. Regexp: "(.*\D)?(\d{3}-\d{2}-\d{4})(\D.*)?", where $2 matches the ssn. |
Finding email addresses in cells
On Fri, 17 Mar 2006 14:23:37 -0500, "Biff" wrote:
Some of these regex syntax's are the most cryptic things I've ever seen! Biff At least as implemented in Excel and VBA, once you get used to thinking in that language, they become relatively simple to follow. There are some flavors that are orders of magnitude more cryptic. --ron |
Finding email addresses in cells
Ron Rosenfeld wrote...
.... There are some flavors that are orders of magnitude more cryptic. Which? At least for Unix-originated and Unix-like software, there are basic REs (ed, plain grep), extended REs (awk, egrep), vim/sed REs, and Perl REs. In addition to DFA vs NFA engines, the differences between them boil down to whether certain metacharacters begin with backslashes and whether they provide noncapturing word boundaries, parenthesized subexpression backreferencing, nongreedy matching qualifiers and interval qualifiers. Only the Perl flavor provides general noncapturing assertions. If you know basic REs, these other flavors aren't all that mysterious. VBScript REs are almost Perl REs, complete with general noncapturing assertions. If you mean Word's text patterns, then I agree, but they're difficult to figure out because they're so nonstandard compared to the Unix-ish flavors. |
All times are GMT +1. The time now is 05:53 AM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com