Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.misc
|
|||
|
|||
Extracting text
I am copying a number of columns from a PDF file and pasting in Excel. It
pastes with spaces between words and between columns. The columns look like this: # TEXT (between 2 and 6 words) # # # # # (where each # means there is a number in that column.) i.e. 103 Deer Park CDP 23 476 9,435 334 234 103 Harbor on the Hill Village 34 543 12,345 332 555 Converting text to columns is a problem since there are differences in the number of words. What I would like to know is: Is there a way to first extract all the text from each line and put it into another column and then I can do convert text to columns on the rest of it. |
#2
Posted to microsoft.public.excel.misc
|
|||
|
|||
Extracting text
Try this:
A1 = 103 Deer Park CDP 23 476 9,435 334 234 B1 = =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))-4))-1) C1 = =SUBSTITUTE(A1,B1&" ","") You can convert the formula in C1 to a constant by EditCopy then EditPaste SpecialValues then do Text to ColumnsDelimitedSpace Biff "Guy Lydig" wrote in message ... I am copying a number of columns from a PDF file and pasting in Excel. It pastes with spaces between words and between columns. The columns look like this: # TEXT (between 2 and 6 words) # # # # # (where each # means there is a number in that column.) i.e. 103 Deer Park CDP 23 476 9,435 334 234 103 Harbor on the Hill Village 34 543 12,345 332 555 Converting text to columns is a problem since there are differences in the number of words. What I would like to know is: Is there a way to first extract all the text from each line and put it into another column and then I can do convert text to columns on the rest of it. |
#3
Posted to microsoft.public.excel.misc
|
|||
|
|||
Extracting text
The formula for B1 may have been affected by line wrap depending on what
application you use to view these forums. Here it is again in chunks so that line wrap won't mess it up: =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1) -LEN(SUBSTITUTE(A1," ",""))-4))-1) Line wrap always breaks at a space character and there are intentional spaces in the formula. Biff "T. Valko" wrote in message ... Try this: A1 = 103 Deer Park CDP 23 476 9,435 334 234 B1 = =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))-4))-1) C1 = =SUBSTITUTE(A1,B1&" ","") You can convert the formula in C1 to a constant by EditCopy then EditPaste SpecialValues then do Text to ColumnsDelimitedSpace Biff "Guy Lydig" wrote in message ... I am copying a number of columns from a PDF file and pasting in Excel. It pastes with spaces between words and between columns. The columns look like this: # TEXT (between 2 and 6 words) # # # # # (where each # means there is a number in that column.) i.e. 103 Deer Park CDP 23 476 9,435 334 234 103 Harbor on the Hill Village 34 543 12,345 332 555 Converting text to columns is a problem since there are differences in the number of words. What I would like to know is: Is there a way to first extract all the text from each line and put it into another column and then I can do convert text to columns on the rest of it. |
#4
Posted to microsoft.public.excel.misc
|
|||
|
|||
Extracting text
The function (although I have no clue what it's doing) works insofar as it
pulls out the number in the first column together with the text. Two problems: 1) Sometimes I get a #VALUE error message. 2) The major problem is that I want the function to eliminate that text from the first cell. Ideas? "T. Valko" wrote: The formula for B1 may have been affected by line wrap depending on what application you use to view these forums. Here it is again in chunks so that line wrap won't mess it up: =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1) -LEN(SUBSTITUTE(A1," ",""))-4))-1) Line wrap always breaks at a space character and there are intentional spaces in the formula. Biff "T. Valko" wrote in message ... Try this: A1 = 103 Deer Park CDP 23 476 9,435 334 234 B1 = =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))-4))-1) C1 = =SUBSTITUTE(A1,B1&" ","") You can convert the formula in C1 to a constant by EditCopy then EditPaste SpecialValues then do Text to ColumnsDelimitedSpace Biff "Guy Lydig" wrote in message ... I am copying a number of columns from a PDF file and pasting in Excel. It pastes with spaces between words and between columns. The columns look like this: # TEXT (between 2 and 6 words) # # # # # (where each # means there is a number in that column.) i.e. 103 Deer Park CDP 23 476 9,435 334 234 103 Harbor on the Hill Village 34 543 12,345 332 555 Converting text to columns is a problem since there are differences in the number of words. What I would like to know is: Is there a way to first extract all the text from each line and put it into another column and then I can do convert text to columns on the rest of it. |
#5
Posted to microsoft.public.excel.misc
|
|||
|
|||
Extracting text
Instead of eliminating the text from the first cell, is there a way to put
the remainder of the cell's contents (without the text) into a separate cell? "Guy Lydig" wrote: The function (although I have no clue what it's doing) works insofar as it pulls out the number in the first column together with the text. Two problems: 1) Sometimes I get a #VALUE error message. 2) The major problem is that I want the function to eliminate that text from the first cell. Ideas? "T. Valko" wrote: The formula for B1 may have been affected by line wrap depending on what application you use to view these forums. Here it is again in chunks so that line wrap won't mess it up: =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1) -LEN(SUBSTITUTE(A1," ",""))-4))-1) Line wrap always breaks at a space character and there are intentional spaces in the formula. Biff "T. Valko" wrote in message ... Try this: A1 = 103 Deer Park CDP 23 476 9,435 334 234 B1 = =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))-4))-1) C1 = =SUBSTITUTE(A1,B1&" ","") You can convert the formula in C1 to a constant by EditCopy then EditPaste SpecialValues then do Text to ColumnsDelimitedSpace Biff "Guy Lydig" wrote in message ... I am copying a number of columns from a PDF file and pasting in Excel. It pastes with spaces between words and between columns. The columns look like this: # TEXT (between 2 and 6 words) # # # # # (where each # means there is a number in that column.) i.e. 103 Deer Park CDP 23 476 9,435 334 234 103 Harbor on the Hill Village 34 543 12,345 332 555 Converting text to columns is a problem since there are differences in the number of words. What I would like to know is: Is there a way to first extract all the text from each line and put it into another column and then I can do convert text to columns on the rest of it. |
#6
Posted to microsoft.public.excel.misc
|
|||
|
|||
Extracting text
One way - replace each of 0-9 followed/preceded by space with 0-9
followed/preceded by semicolon... 103;Deer Park CDP;23;476;9,435;334;234; 103;Harbor on the Hill Village;34;543;12,345;332;555; Then do Text-To-Columns with semicolon. To automate the first part try this script (Alt+F11), paste into immediate window and select last line then press Enter to execute.. for i = 0 to 9: _ selection.replace i & " ",i & ";": _ selection.replace " " & i,";" & i: _ next i On 29 Mar, 16:16, Guy Lydig wrote: Instead of eliminating the text from the first cell, is there a way to put the remainder of the cell's contents (without the text) into a separate cell? "Guy Lydig" wrote: The function (although I have no clue what it's doing) works insofar as it pulls out the number in the first column together with the text. Two problems: 1) Sometimes I get a #VALUE error message. 2) The major problem is that I want the function to eliminate that text from the first cell. Ideas? "T. Valko" wrote: The formula for B1 may have been affected by line wrap depending on what application you use to view these forums. Here it is again in chunks so that line wrap won't mess it up: =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1) -LEN(SUBSTITUTE(A1," ",""))-4))-1) Line wrap always breaks at a space character and there are intentional spaces in the formula. Biff "T. Valko" wrote in message ... Try this: A1 = 103 Deer Park CDP 23 476 9,435 334 234 B1 = =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))-4))-1) C1 = =SUBSTITUTE(A1,B1&" ","") You can convert the formula in C1 to a constant by EditCopy then EditPaste SpecialValues then do Text to ColumnsDelimitedSpace Biff "Guy Lydig" wrote in message ... I am copying a number of columns from a PDF file and pasting in Excel. It pastes with spaces between words and between columns. The columns look like this: # TEXT (between 2 and 6 words) # # # # # (where each # means there is a number in that column.) i.e. 103 Deer Park CDP 23 476 9,435 334 234 103 Harbor on the Hill Village 34 543 12,345 332 555 Converting text to columns is a problem since there are differences in the number of words. What I would like to know is: Is there a way to first extract all the text from each line and put it into another column and then I can do convert text to columns on the rest of it.- Hide quoted text - - Show quoted text - |
#7
Posted to microsoft.public.excel.misc
|
|||
|
|||
Extracting text
Sometimes I get a #VALUE error message.
The formulas work on the *2* samples you posted. If they doesn't work on some entries then those entries must have a different format than the *2* posted samples. Biff "Guy Lydig" wrote in message ... The function (although I have no clue what it's doing) works insofar as it pulls out the number in the first column together with the text. Two problems: 1) Sometimes I get a #VALUE error message. 2) The major problem is that I want the function to eliminate that text from the first cell. Ideas? "T. Valko" wrote: The formula for B1 may have been affected by line wrap depending on what application you use to view these forums. Here it is again in chunks so that line wrap won't mess it up: =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1) -LEN(SUBSTITUTE(A1," ",""))-4))-1) Line wrap always breaks at a space character and there are intentional spaces in the formula. Biff "T. Valko" wrote in message ... Try this: A1 = 103 Deer Park CDP 23 476 9,435 334 234 B1 = =LEFT(A1,FIND("~",SUBSTITUTE(A1," ","~",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))-4))-1) C1 = =SUBSTITUTE(A1,B1&" ","") You can convert the formula in C1 to a constant by EditCopy then EditPaste SpecialValues then do Text to ColumnsDelimitedSpace Biff "Guy Lydig" wrote in message ... I am copying a number of columns from a PDF file and pasting in Excel. It pastes with spaces between words and between columns. The columns look like this: # TEXT (between 2 and 6 words) # # # # # (where each # means there is a number in that column.) i.e. 103 Deer Park CDP 23 476 9,435 334 234 103 Harbor on the Hill Village 34 543 12,345 332 555 Converting text to columns is a problem since there are differences in the number of words. What I would like to know is: Is there a way to first extract all the text from each line and put it into another column and then I can do convert text to columns on the rest of it. |
#8
Posted to microsoft.public.excel.misc
|
|||
|
|||
Extracting text
On Wed, 28 Mar 2007 22:14:04 -0700, Guy Lydig
wrote: 103 Deer Park CDP 23 476 9,435 334 234 103 Harbor on the Hill Village 34 543 12,345 332 555 To do this with formulas, you can download and install Longre's free and easily distributable morefunc.xll add-in from http://xcell05.free.fr/ Then, with your data in A1: Text: B1: =TRIM(REGEX.MID($A1,"\D+")) Numbers C1: =REGEX.MID($A1,"\b[^a-z\s]+\b",COLUMNS($A:A),FALSE) and copy/drag across to H1. --ron |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Extracting Text | Excel Worksheet Functions | |||
Extracting Text only | Excel Worksheet Functions | |||
Extracting Text from the right | Excel Worksheet Functions | |||
extracting text only | Excel Worksheet Functions | |||
EXTRACTING TEXT | Excel Discussion (Misc queries) |