ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Worksheet Functions (https://www.excelbanter.com/excel-worksheet-functions/)
-   -   Finding email addresses in cells (https://www.excelbanter.com/excel-worksheet-functions/77880-finding-email-addresses-cells.html)

Joey

Finding email addresses in cells
 
In one of the columns I have text with email addresses I need seperated.

Amongst the text (i.e. "please note that is no longer
employeed" - or - email is undeliverable")

Is there a formula that I can create to single out just the email address?
There's no set # of characters before or after the "@" - but I'm assuming I
could go by a SPACE to seperate the name@domain from the rest of the text.

Any suggestions?

The trim function only works in one direction (i.e. after the @) and I need
it to go both ways.

Thanks,
Joey

CLR

Finding email addresses in cells
 
You could do Data TextToColumns using SPACE as a delimiter and it should
break the entire email address out into it's own separate column as it has a
space both before and after with none in between.....

Vaya con Dios,
Chuck, CABGx3



"Joey" wrote:

In one of the columns I have text with email addresses I need seperated.

Amongst the text (i.e. "please note that is no longer
employeed" - or - email is undeliverable")

Is there a formula that I can create to single out just the email address?
There's no set # of characters before or after the "@" - but I'm assuming I
could go by a SPACE to seperate the name@domain from the rest of the text.

Any suggestions?

The trim function only works in one direction (i.e. after the @) and I need
it to go both ways.

Thanks,
Joey


Joey

Finding email addresses in cells
 
But that doesn't put the email address in one column, especially if there's
25 words in the text of the message. Text-to-columns using SPACE could just
spread out the message along 25 more columns, with NO idea where the email
address is.

I even tried text-to-columns with "@" as the seperator, but then I don't get
the name or the domain name.

Joey

"CLR" wrote:

You could do Data TextToColumns using SPACE as a delimiter and it should
break the entire email address out into it's own separate column as it has a
space both before and after with none in between.....

Vaya con Dios,
Chuck, CABGx3



"Joey" wrote:

In one of the columns I have text with email addresses I need seperated.

Amongst the text (i.e. "please note that is no longer
employeed" - or - email is undeliverable")

Is there a formula that I can create to single out just the email address?
There's no set # of characters before or after the "@" - but I'm assuming I
could go by a SPACE to seperate the name@domain from the rest of the text.

Any suggestions?

The trim function only works in one direction (i.e. after the @) and I need
it to go both ways.

Thanks,
Joey


daddylonglegs

Finding email addresses in cells
 

With your data in A1 try this formula

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(" ",A1,FIND("@",A1))-1)," ",REPT("
",LEN(A1))),LEN(A1)))


--
daddylonglegs
------------------------------------------------------------------------
daddylonglegs's Profile: http://www.excelforum.com/member.php...o&userid=30486
View this thread: http://www.excelforum.com/showthread...hreadid=523235


Ron Rosenfeld

Finding email addresses in cells
 
On Thu, 16 Mar 2006 14:03:01 -0600, daddylonglegs
wrote:


With your data in A1 try this formula

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(" ",A1,FIND("@",A1))-1)," ",REPT("
",LEN(A1))),LEN(A1)))


Doesn't seem to work if the email address is at the very end of the string.
--ron

Biff

Finding email addresses in cells
 
Crashes on entries like:

text text text text


Biff

"daddylonglegs"
wrote in message
news:daddylonglegs.24s3qp_1142539806.708@excelforu m-nospam.com...

With your data in A1 try this formula

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(" ",A1,FIND("@",A1))-1)," ",REPT("
",LEN(A1))),LEN(A1)))


--
daddylonglegs
------------------------------------------------------------------------
daddylonglegs's Profile:
http://www.excelforum.com/member.php...o&userid=30486
View this thread: http://www.excelforum.com/showthread...hreadid=523235




Biff

Finding email addresses in cells
 
This seems to work:

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1&" ",FIND(" ",A1&" ",FIND("@",A1&" "))-1),"
",REPT(" ",LEN(A1))),LEN(A1)))

Biff

"Biff" wrote in message
...
Crashes on entries like:

text text text text


Biff

"daddylonglegs"
wrote in
message news:daddylonglegs.24s3qp_1142539806.708@excelforu m-nospam.com...

With your data in A1 try this formula

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(" ",A1,FIND("@",A1))-1)," ",REPT("
",LEN(A1))),LEN(A1)))


--
daddylonglegs
------------------------------------------------------------------------
daddylonglegs's Profile:
http://www.excelforum.com/member.php...o&userid=30486
View this thread:
http://www.excelforum.com/showthread...hreadid=523235






Beege

Finding email addresses in cells
 
Joey, This works, too, but you need morefunc.dll (google search)

=TRIM(TEXTREVERSE(LEFT(TEXTREVERSE(LEFT(A1,(SEARCH ("
",A1,(SEARCH("@",A1,1)))-1))),(SEARCH(" ",TEXTREVERSE(LEFT(A1,(SEARCH("
",A1,(SEARCH("@",A1,1)))-1))),1)))))

Beege

"Joey" wrote in message
...
In one of the columns I have text with email addresses I need seperated.

Amongst the text (i.e. "please note that is no longer
employeed" - or - email is undeliverable")

Is there a formula that I can create to single out just the email address?
There's no set # of characters before or after the "@" - but I'm assuming
I
could go by a SPACE to seperate the name@domain from the rest of the text.

Any suggestions?

The trim function only works in one direction (i.e. after the @) and I
need
it to go both ways.

Thanks,
Joey




Ron Rosenfeld

Finding email addresses in cells
 
On Thu, 16 Mar 2006 16:39:40 -0500, "Biff" wrote:

This seems to work:

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1&" ",FIND(" ",A1&" ",FIND("@",A1&" "))-1),"
",REPT(" ",LEN(A1))),LEN(A1)))

Biff


It will fail if the string happens to start with a dot <..

It also gives an erroneous result with:

"It's too bad that John Jones is no longer with us"

and various other legal email configurations.






--ron

Biff

Finding email addresses in cells
 
Oh well!

Where's your REGEX solution? I know I've seen it at least once.

Biff

"Ron Rosenfeld" wrote in message
...
On Thu, 16 Mar 2006 16:39:40 -0500, "Biff" wrote:

This seems to work:

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1&" ",FIND(" ",A1&" ",FIND("@",A1&" "))-1),"
",REPT(" ",LEN(A1))),LEN(A1)))

Biff


It will fail if the string happens to start with a dot <..

It also gives an erroneous result with:

"It's too bad that John Jones is no longer with us"

and various other legal email configurations.






--ron




daddylonglegs

Finding email addresses in cells
 

Ron Rosenfeld Wrote:

Doesn't seem to work if the email address is at the very end of the
string.
--ron


Hi ron, quite right. I believe that can be fixed with a small
amendment

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(" ",A1&" ",FIND("@",A1))-1),"
",REPT(" ",LEN(A1))),LEN(A1)))

Biff, thanks also. I didn't consider multiple email addresses, I'll
leave that to somebody else ;)


--
daddylonglegs
------------------------------------------------------------------------
daddylonglegs's Profile: http://www.excelforum.com/member.php...o&userid=30486
View this thread: http://www.excelforum.com/showthread...hreadid=523235


Ron Rosenfeld

Finding email addresses in cells
 
On Thu, 16 Mar 2006 18:13:53 -0500, "Biff" wrote:

Oh well!

Where's your REGEX solution? I know I've seen it at least once.

Biff


Given all of the possible variations in legal email addresses, even a REGEX
solution would be tough.

Here's a partial one:

=REGEX.MID(A1,"\b(\w|[._%-])+@(\w|[.-])+\.[A-Za-z]{2,4}\b")

but it's not completely tested and it will not work if there are <space's in
the address. Also it assumes that any terminal 2 to 4 letter code will be a
valid top-level domain. In addition to that not always being the case, there
are some top level domains that are more than four characters. One could
substitute a pipe separated list of valid top level domains for the last
portion of that regex.

It should cover a number of common variations, however.

For the OP, to use this formula you must first download and install Longre's
free morefunc.xll add-in from http://xcell05.free.fr


--ron

Harlan Grove

Finding email addresses in cells
 
Ron Rosenfeld wrote...
....
Given all of the possible variations in legal email addresses, even a REGEX
solution would be tough.

....

Impossible. Like it or not, internet e-mail descends from Unix e-mail,
which means any valid home directory name (precisely, the first level
subdirecories of /home) on a Unix-like system would be a valid
recipient ID. Now expand that to encompass filesystems based on UNICODE
text.

As a practical, non-UNICODE matter, @, / and ASCII NUL can't be
characters in e-mail address names, and *, ?, \ and the other control
characters (ASCII decimal codes 1 through 31 plus 127) would be very
unlikely. Anything else is fair game.

Here's a partial one:

=REGEX.MID(A1,"\b(\w|[._%-])+@(\w|[.-])+\.[A-Za-z]{2,4}\b")

....

If you mean all 'word' characters plus periods, underscores, percent
signs and dashes, better to use

"\b[-A-Za-z0-9_.%]+\@([-A-Za-z0-9_]+\.)*[A-Za-z]+\b"

which allows for valid degenerate e-mail addresses like foo@localhost .
Don't use the alternation operator, |, unless absolutely necessary.
It's a real performance drag in regexps. MUCH MORE EFFICIENT to use
single (if longer) character classes.


Ron Rosenfeld

Finding email addresses in cells
 
On 16 Mar 2006 21:49:47 -0800, "Harlan Grove" wrote:

Ron Rosenfeld wrote...
...
Given all of the possible variations in legal email addresses, even a REGEX
solution would be tough.

...

Impossible. Like it or not, internet e-mail descends from Unix e-mail,
which means any valid home directory name (precisely, the first level
subdirecories of /home) on a Unix-like system would be a valid
recipient ID. Now expand that to encompass filesystems based on UNICODE
text.


Constructing regex's to match all legal email addresses seems to be a challenge
taken up by some. I've seen one regex that is said to be 6000+ characters in
length.


As a practical, non-UNICODE matter, @, / and ASCII NUL can't be
characters in e-mail address names, and *, ?, \ and the other control
characters (ASCII decimal codes 1 through 31 plus 127) would be very
unlikely. Anything else is fair game.

Here's a partial one:

=REGEX.MID(A1,"\b(\w|[._%-])+@(\w|[.-])+\.[A-Za-z]{2,4}\b")

...

If you mean all 'word' characters plus periods, underscores, percent
signs and dashes, better to use

"\b[-A-Za-z0-9_.%]+\@([-A-Za-z0-9_]+\.)*[A-Za-z]+\b"

which allows for valid degenerate e-mail addresses like foo@localhost .
Don't use the alternation operator, |, unless absolutely necessary.
It's a real performance drag in regexps. MUCH MORE EFFICIENT to use
single (if longer) character classes.


Thanks for the pointer about the pipe.

In your regex, what is the significance of the "\" preceding the "@"?


--ron

Harlan Grove

Finding email addresses in cells
 
Ron Rosenfeld wrote...
....
In your regex, what is the significance of the "\" preceding the "@"?


Probably unnecessary. I saw a Perl regexp from a web search for e-mail
address syntax that showed \@, so I figured it might be needed in some
regexp flavors. It may not be necessary, but it won't cause any harm.


Ron Rosenfeld

Finding email addresses in cells
 
On 17 Mar 2006 08:10:25 -0800, "Harlan Grove" wrote:

Ron Rosenfeld wrote...
...
In your regex, what is the significance of the "\" preceding the "@"?


Probably unnecessary. I saw a Perl regexp from a web search for e-mail
address syntax that showed \@, so I figured it might be needed in some
regexp flavors. It may not be necessary, but it won't cause any harm.


OK, that's basically what I figured -- it wouldn't do any harm. I just hadn't
run across it with "@" in the past.


--ron

Biff

Finding email addresses in cells
 
Some of these regex syntax's are the most cryptic things I've ever seen!

Biff

"Ron Rosenfeld" wrote in message
...
On 17 Mar 2006 08:10:25 -0800, "Harlan Grove" wrote:

Ron Rosenfeld wrote...
...
In your regex, what is the significance of the "\" preceding the "@"?


Probably unnecessary. I saw a Perl regexp from a web search for e-mail
address syntax that showed \@, so I figured it might be needed in some
regexp flavors. It may not be necessary, but it won't cause any harm.


OK, that's basically what I figured -- it wouldn't do any harm. I just
hadn't
run across it with "@" in the past.


--ron




Harlan Grove

Finding email addresses in cells
 
Biff wrote...
Some of these regex syntax's are the most cryptic things I've ever seen!

....

You should try Einstein's tensor notation in his work on General
Relativity.

Or a few APL one-liners.

Seriously, they may be a pain to learn, but they're far & away the most
powerful means of dealing with text. You'll never fully appreciate them
until you compare the mountains of procedural code you'd need to match
certain common text patterns. For example, try finding US social
security numbers in general strings: ###-##-#### with anything other
than decimal numerals on either side. Regexp:
"(.*\D)?(\d{3}-\d{2}-\d{4})(\D.*)?", where $2 matches the ssn.


Ron Rosenfeld

Finding email addresses in cells
 
On Fri, 17 Mar 2006 14:23:37 -0500, "Biff" wrote:

Some of these regex syntax's are the most cryptic things I've ever seen!

Biff



At least as implemented in Excel and VBA, once you get used to thinking in that
language, they become relatively simple to follow.

There are some flavors that are orders of magnitude more cryptic.
--ron

Harlan Grove

Finding email addresses in cells
 
Ron Rosenfeld wrote...
....
There are some flavors that are orders of magnitude more cryptic.


Which?

At least for Unix-originated and Unix-like software, there are basic
REs (ed, plain grep), extended REs (awk, egrep), vim/sed REs, and Perl
REs. In addition to DFA vs NFA engines, the differences between them
boil down to whether certain metacharacters begin with backslashes and
whether they provide noncapturing word boundaries, parenthesized
subexpression backreferencing, nongreedy matching qualifiers and
interval qualifiers. Only the Perl flavor provides general noncapturing
assertions. If you know basic REs, these other flavors aren't all that
mysterious. VBScript REs are almost Perl REs, complete with general
noncapturing assertions.

If you mean Word's text patterns, then I agree, but they're difficult
to figure out because they're so nonstandard compared to the Unix-ish
flavors.



All times are GMT +1. The time now is 05:53 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com