ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Discussion (Misc queries) (https://www.excelbanter.com/excel-discussion-misc-queries/)
-   -   Extracting text (https://www.excelbanter.com/excel-discussion-misc-queries/137018-extracting-text.html)

Guy Lydig

Extracting text
 
I am copying a number of columns from a PDF file and pasting in Excel. It
pastes with spaces between words and between columns. The columns look like
this:
# TEXT (between 2 and 6 words) # # # # # (where each # means there is a
number in that column.)

i.e.
103 Deer Park CDP 23 476 9,435 334 234
103 Harbor on the Hill Village 34 543 12,345 332 555

Converting text to columns is a problem since there are differences in the
number of words. What I would like to know is:

Is there a way to first extract all the text from each line and put it into
another column and then I can do convert text to columns on the rest of it.

T. Valko

Extracting text
 
Try this:

A1 = 103 Deer Park CDP 23 476 9,435 334 234

B1 =

=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1,"
",""))-4))-1)

C1 =

=SUBSTITUTE(A1,B1&" ","")

You can convert the formula in C1 to a constant by EditCopy then EditPaste
SpecialValues then do Text to ColumnsDelimitedSpace

Biff

"Guy Lydig" wrote in message
...
I am copying a number of columns from a PDF file and pasting in Excel. It
pastes with spaces between words and between columns. The columns look
like
this:
# TEXT (between 2 and 6 words) # # # # # (where each # means there is a
number in that column.)

i.e.
103 Deer Park CDP 23 476 9,435 334 234
103 Harbor on the Hill Village 34 543 12,345 332 555

Converting text to columns is a problem since there are differences in the
number of words. What I would like to know is:

Is there a way to first extract all the text from each line and put it
into
another column and then I can do convert text to columns on the rest of
it.




T. Valko

Extracting text
 
The formula for B1 may have been affected by line wrap depending on what
application you use to view these forums. Here it is again in chunks so that
line wrap won't mess it up:

=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)
-LEN(SUBSTITUTE(A1," ",""))-4))-1)

Line wrap always breaks at a space character and there are intentional
spaces in the formula.

Biff

"T. Valko" wrote in message
...
Try this:

A1 = 103 Deer Park CDP 23 476 9,435 334 234

B1 =

=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1,"
",""))-4))-1)

C1 =

=SUBSTITUTE(A1,B1&" ","")

You can convert the formula in C1 to a constant by EditCopy then
EditPaste SpecialValues then do Text to ColumnsDelimitedSpace

Biff

"Guy Lydig" wrote in message
...
I am copying a number of columns from a PDF file and pasting in Excel. It
pastes with spaces between words and between columns. The columns look
like
this:
# TEXT (between 2 and 6 words) # # # # # (where each # means there is a
number in that column.)

i.e.
103 Deer Park CDP 23 476 9,435 334 234
103 Harbor on the Hill Village 34 543 12,345 332 555

Converting text to columns is a problem since there are differences in
the
number of words. What I would like to know is:

Is there a way to first extract all the text from each line and put it
into
another column and then I can do convert text to columns on the rest of
it.






Guy Lydig

Extracting text
 
The function (although I have no clue what it's doing) works insofar as it
pulls out the number in the first column together with the text.

Two problems:
1) Sometimes I get a #VALUE error message.

2) The major problem is that I want the function to eliminate that text from
the first cell.

Ideas?

"T. Valko" wrote:

The formula for B1 may have been affected by line wrap depending on what
application you use to view these forums. Here it is again in chunks so that
line wrap won't mess it up:

=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)
-LEN(SUBSTITUTE(A1," ",""))-4))-1)

Line wrap always breaks at a space character and there are intentional
spaces in the formula.

Biff

"T. Valko" wrote in message
...
Try this:

A1 = 103 Deer Park CDP 23 476 9,435 334 234

B1 =

=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1,"
",""))-4))-1)

C1 =

=SUBSTITUTE(A1,B1&" ","")

You can convert the formula in C1 to a constant by EditCopy then
EditPaste SpecialValues then do Text to ColumnsDelimitedSpace

Biff

"Guy Lydig" wrote in message
...
I am copying a number of columns from a PDF file and pasting in Excel. It
pastes with spaces between words and between columns. The columns look
like
this:
# TEXT (between 2 and 6 words) # # # # # (where each # means there is a
number in that column.)

i.e.
103 Deer Park CDP 23 476 9,435 334 234
103 Harbor on the Hill Village 34 543 12,345 332 555

Converting text to columns is a problem since there are differences in
the
number of words. What I would like to know is:

Is there a way to first extract all the text from each line and put it
into
another column and then I can do convert text to columns on the rest of
it.







Guy Lydig

Extracting text
 
Instead of eliminating the text from the first cell, is there a way to put
the remainder of the cell's contents (without the text) into a separate cell?

"Guy Lydig" wrote:

The function (although I have no clue what it's doing) works insofar as it
pulls out the number in the first column together with the text.

Two problems:
1) Sometimes I get a #VALUE error message.

2) The major problem is that I want the function to eliminate that text from
the first cell.

Ideas?

"T. Valko" wrote:

The formula for B1 may have been affected by line wrap depending on what
application you use to view these forums. Here it is again in chunks so that
line wrap won't mess it up:

=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)
-LEN(SUBSTITUTE(A1," ",""))-4))-1)

Line wrap always breaks at a space character and there are intentional
spaces in the formula.

Biff

"T. Valko" wrote in message
...
Try this:

A1 = 103 Deer Park CDP 23 476 9,435 334 234

B1 =

=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1,"
",""))-4))-1)

C1 =

=SUBSTITUTE(A1,B1&" ","")

You can convert the formula in C1 to a constant by EditCopy then
EditPaste SpecialValues then do Text to ColumnsDelimitedSpace

Biff

"Guy Lydig" wrote in message
...
I am copying a number of columns from a PDF file and pasting in Excel. It
pastes with spaces between words and between columns. The columns look
like
this:
# TEXT (between 2 and 6 words) # # # # # (where each # means there is a
number in that column.)

i.e.
103 Deer Park CDP 23 476 9,435 334 234
103 Harbor on the Hill Village 34 543 12,345 332 555

Converting text to columns is a problem since there are differences in
the
number of words. What I would like to know is:

Is there a way to first extract all the text from each line and put it
into
another column and then I can do convert text to columns on the rest of
it.






Lori

Extracting text
 
One way - replace each of 0-9 followed/preceded by space with 0-9
followed/preceded by semicolon...

103;Deer Park CDP;23;476;9,435;334;234;
103;Harbor on the Hill Village;34;543;12,345;332;555;

Then do Text-To-Columns with semicolon.

To automate the first part try this script (Alt+F11), paste into
immediate window and select last line then press Enter to execute..

for i = 0 to 9: _
selection.replace i & " ",i & ";": _
selection.replace " " & i,";" & i: _
next i


On 29 Mar, 16:16, Guy Lydig
wrote:
Instead of eliminating the text from the first cell, is there a way to put
the remainder of the cell's contents (without the text) into a separate cell?



"Guy Lydig" wrote:
The function (although I have no clue what it's doing) works insofar as it
pulls out the number in the first column together with the text.


Two problems:
1) Sometimes I get a #VALUE error message.


2) The major problem is that I want the function to eliminate that text from
the first cell.


Ideas?


"T. Valko" wrote:


The formula for B1 may have been affected by line wrap depending on what
application you use to view these forums. Here it is again in chunks so that
line wrap won't mess it up:


=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)
-LEN(SUBSTITUTE(A1," ",""))-4))-1)


Line wrap always breaks at a space character and there are intentional
spaces in the formula.


Biff


"T. Valko" wrote in message
...
Try this:


A1 = 103 Deer Park CDP 23 476 9,435 334 234


B1 =


=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1,"
",""))-4))-1)


C1 =


=SUBSTITUTE(A1,B1&" ","")


You can convert the formula in C1 to a constant by EditCopy then
EditPaste SpecialValues then do Text to ColumnsDelimitedSpace


Biff


"Guy Lydig" wrote in message
...
I am copying a number of columns from a PDF file and pasting in Excel. It
pastes with spaces between words and between columns. The columns look
like
this:
# TEXT (between 2 and 6 words) # # # # # (where each # means there is a
number in that column.)


i.e.
103 Deer Park CDP 23 476 9,435 334 234
103 Harbor on the Hill Village 34 543 12,345 332 555


Converting text to columns is a problem since there are differences in
the
number of words. What I would like to know is:


Is there a way to first extract all the text from each line and put it
into
another column and then I can do convert text to columns on the rest of
it.- Hide quoted text -


- Show quoted text -




T. Valko

Extracting text
 
Sometimes I get a #VALUE error message.

The formulas work on the *2* samples you posted. If they doesn't work on
some entries then those entries must have a different format than the *2*
posted samples.

Biff

"Guy Lydig" wrote in message
...
The function (although I have no clue what it's doing) works insofar as it
pulls out the number in the first column together with the text.

Two problems:
1) Sometimes I get a #VALUE error message.

2) The major problem is that I want the function to eliminate that text
from
the first cell.

Ideas?

"T. Valko" wrote:

The formula for B1 may have been affected by line wrap depending on what
application you use to view these forums. Here it is again in chunks so
that
line wrap won't mess it up:

=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)
-LEN(SUBSTITUTE(A1," ",""))-4))-1)

Line wrap always breaks at a space character and there are intentional
spaces in the formula.

Biff

"T. Valko" wrote in message
...
Try this:

A1 = 103 Deer Park CDP 23 476 9,435 334 234

B1 =

=LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1,"
",""))-4))-1)

C1 =

=SUBSTITUTE(A1,B1&" ","")

You can convert the formula in C1 to a constant by EditCopy then
EditPaste SpecialValues then do Text to ColumnsDelimitedSpace

Biff

"Guy Lydig" wrote in message
...
I am copying a number of columns from a PDF file and pasting in Excel.
It
pastes with spaces between words and between columns. The columns look
like
this:
# TEXT (between 2 and 6 words) # # # # # (where each # means there is
a
number in that column.)

i.e.
103 Deer Park CDP 23 476 9,435 334 234
103 Harbor on the Hill Village 34 543 12,345 332 555

Converting text to columns is a problem since there are differences in
the
number of words. What I would like to know is:

Is there a way to first extract all the text from each line and put it
into
another column and then I can do convert text to columns on the rest
of
it.








Ron Rosenfeld

Extracting text
 
On Wed, 28 Mar 2007 22:14:04 -0700, Guy Lydig
wrote:

103 Deer Park CDP 23 476 9,435 334 234
103 Harbor on the Hill Village 34 543 12,345 332 555


To do this with formulas, you can download and install Longre's free and easily
distributable morefunc.xll add-in from http://xcell05.free.fr/

Then, with your data in A1:

Text:
B1: =TRIM(REGEX.MID($A1,"\D+"))

Numbers
C1: =REGEX.MID($A1,"\b[^a-z\s]+\b",COLUMNS($A:A),FALSE)

and copy/drag across to H1.


--ron


All times are GMT +1. The time now is 06:52 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
ExcelBanter.com