#1
Posted to microsoft.public.excel.worksheet.functions
Remove html markup
How do I remove HTML markup from a cell?

i.e.

<DIV class=productdesc> <H3>SCALLYWAGS CHANGING MAT. </H3><TABLE border=0> <TBODY> <TR> <TD> <UL> <LI>Foam-filled <LI>Wipe-clean surface <LI>Fits most dressers <LI>dimensions: 75 x 46cm </LI></UL></TD></TR> <TR></TR></TBODY></TABLE> <UL> <LI>When using a changing mat on a dresser or other raised surface, never leave your baby unattended even for a moment </LI></UL></P></DIV>

I want it to read:

SCALLYWAGS CHANGING MAT. Foam-filled Wipe-clean surface Fits most dressers dimensions: 75 x 46cm When using a changing mat on a dresser or other raised surface, never leave your baby unattended even for a moment

I am trying to change our website csv file into a Froogle csv file, but without the HTML.

Thanks in advance
Stuart
#2
I think this will get you most of the way there.

Edit>Replace:
Find What: <*>
Replace with: (this should be blank)

Click the [Replace] button until you're sure it's doing what you want. Then click the [Replace All] button.

That will leave some extraneous spaces, so do another Find/Replace for double spaces and replace them with a single space.

Does that help?

***********
Regards,
Ron
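For anyone who would rather script it, the Find/Replace idea above (drop everything from a `<` to the next `>`, then tidy up the spaces) can be sketched in Python roughly as follows. This is only an illustration: the function name `strip_tags` and the shortened sample string are made up for the example, and a plain pattern like this will not cope with every kind of HTML.

```python
import re

# Same idea as the "<*>" wildcard: a "<", anything that is not ">", then ">".
TAG_RE = re.compile(r"<[^>]*>")
# Two or more spaces in a row.
SPACES_RE = re.compile(r" {2,}")


def strip_tags(text):
    """Replace each tag with a space, then collapse runs of spaces."""
    without_tags = TAG_RE.sub(" ", text)
    return SPACES_RE.sub(" ", without_tags).strip()


# Shortened, hypothetical version of the product description in the question.
sample = (
    "<DIV class=productdesc> <H3>SCALLYWAGS CHANGING MAT. </H3>"
    "<UL> <LI>Foam-filled <LI>Wipe-clean surface </LI></UL></DIV>"
)
print(strip_tags(sample))
```

Replacing each tag with a space rather than with nothing keeps words apart when two pieces of text were separated only by tags.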
#3
Thanks for the quick reply.

I have over 2500 products to do, so that would take a lot of finding and replacing. Is there no formula or script that I can use to do this automatically?

Regards
Stuart
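One way a script could handle all the rows in one pass, outside Excel, is sketched below in Python using only the standard library (`csv` and `html.parser`). It is a rough sketch under assumptions: the column names `sku` and `description` and the helper names are invented for the demo, since the real layout of the feed is not shown in this thread.

```python
import csv
import io
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(markup):
    parser = TextOnly()
    parser.feed(markup)
    parser.close()  # flush any text the parser is still holding
    # split()/join collapses spaces and line breaks so the text stays in one cell
    return " ".join(" ".join(parser.parts).split())


def clean_csv(src, dst, column):
    """Copy CSV rows from src to dst, stripping markup from one column."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row[column] = html_to_text(row[column])
        writer.writerow(row)


# In-memory demo; "sku" and "description" are made-up column names.
feed = io.StringIO(
    "sku,description\n"
    '1,"<H3>CHANGING MAT</H3><UL><LI>Foam-filled</LI></UL>"\n'
)
out = io.StringIO()
clean_csv(feed, out, "description")
print(out.getvalue())
```

With real files, the two `StringIO` objects would be replaced by files opened with `newline=""`, as the `csv` module documentation recommends. Because each description is handled as an ordinary string, its length should not matter here.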
#4
I have just tried the Find/Replace and it gives me the error message "formula too long".
#5
OK...more information is good....but...

How about:

Q: Are all 2500 products in one sheet?
If yes, you don't have to Edit/Replace one at a time; you can click the [Replace All] button. Did you do that?

Q: Are they all in one workbook?
If yes, group the sheets together (select the first sheet, hold down the [Shift] key, then select the last sheet). Then do the same Edit/Replace as above.

Q: Multiple workbooks? How many workbooks?
If only a few, then use the above technique on each. If many...well...let's see if we need to cross that bridge.

Does that help?

***********
Regards,
Ron
#6
All 2500 products are on one sheet. If you want an example I can give you one, but it would take up a lot of room in here, so I can put it on the server as a download so you can see what I'm talking about.

Regards
Stuart
#7
Regarding the "formula too long" error: how many characters are in the cell? Is there something about the cell that makes Excel think it's a formula (plus sign, equal sign, etc.)?

(Anybody else have thoughts about this?)

Does that help?

***********
Regards,
Ron
#8
www.underonelink.com/help.xls
#9
Yikes! I just checked the file.

Each of those cells has many thousands of characters...evidently too many for Find/Replace to handle.

I'm going to hope that somebody has already programmed a solution, because so far I haven't come up with any tricks that have worked.

***********
Regards,
Ron
#10
Hi Ron

Thanks for your help so far. Much appreciated. I just hope someone has already done this, like you said. Fingers crossed.

Thanks again
Regards
Stuart
#11
Since you're fixing a .csv file, maybe Excel isn't the best choice.

If you visit shareware.com, you can find lots of programs to clean up HTML files.

--
Dave Peterson
#12
Not sure if I'm on the right track here with this suggestion...

If the HTML is being copied from the web, copy it to Notepad first, then copy from Notepad into Excel. Notepad will clean it up and get rid of the HTML code.

Kaye

On Tue, 22 Nov 2005 17:05:05 -0800, Stuart wrote: How do I remove HTML markup from a cell? [...]
#13
The csv file is a database feed from our e-commerce site, so adding the products one by one to Notepad to remove the HTML would take a long time.

Regards
Stuart
#14
Hi Stuart,

I would go to Steve Miller's http://www.stevemiller.net/apps/ and install Pure Text. Then you can copy the HTML from an HTML page, click on the PT system taskbar button, and paste into Notepad.

If you start from the Excel sheet: copy the column, paste into Notepad, save as HTML (junk.htm) from Notepad, open the junk.htm file with your browser, copy everything on the page (Ctrl+A, Ctrl+C), then paste into Notepad and you will have your text.

After you have installed and used Pure Text, the process should take about 3 minutes to convert in that manner.

---
HTH,
David McRitchie, Microsoft MVP - Excel [site changed Nov. 2001]
My Excel Pages: http://www.mvps.org/dmcritchie/excel/excel.htm
Search Page: http://www.mvps.org/dmcritchie/excel/search.htm
#15
Hi David.

Thanks for that. The only problem is that it leaves lots of blank cells when pasted back into Excel. I've even tried saving the spreadsheet as a csv and as a text file to see if that would help when pasted back into Excel. No luck there.

Any more ideas gratefully accepted.

Regards
Stuart
#16
Hi, Stuart

See if this works for you....

Copy the list of HTML cells from Excel to MS Word.

In MS Word:
Edit>Replace
Find what: \<*\>
Replace with: (leave this blank)
Click the [More] button to see search options and select: Use Wildcards
Click the [Replace All] button

Then...Edit>Replace
Find what: (enter 2 spaces)
Replace with: (enter 1 space)
Click the [Replace All] button
-That will eliminate strings of spaces

Then...Edit>Replace
Find what: &nbsp;
Replace with: (enter 1 space)
Click the [Replace All] button
-That will replace non-breaking space codes with 1 space

Finally:
Select the whole Word document
Edit>Copy
Select the first HTML cell in Excel
Edit>Paste

It might not be perfect, but is it close enough?

***********
Regards,
Ron
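The non-breaking-space step above has a rough scripted equivalent as well. The sketch below assumes Python and its standard `html` module; the function name `decode_entities` and the sample text are invented for illustration. It decodes `&nbsp;` and similar codes, then collapses whatever whitespace is left.

```python
import html


def decode_entities(text):
    """Turn codes such as &nbsp; and &amp; into characters, then tidy spaces."""
    # html.unescape maps "&nbsp;" to the non-breaking space character U+00A0.
    decoded = html.unescape(text)
    # split() with no arguments treats U+00A0 as whitespace, so joining the
    # pieces with one ordinary space also collapses longer runs.
    return " ".join(decoded.split())


print(decode_entities("dimensions:&nbsp;75&nbsp;x&nbsp;46cm  &amp; wipe-clean"))
```

This goes a little further than the Word steps, because it also converts other codes such as `&amp;` instead of only replacing `&nbsp;`.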
#17
Hi Stuart,

Each of your original cells was an entire formatted web page, less pictures. If you convert the text in one cell to plain text and bring it back into Excel, you are going to generate a lot of empty lines from the paragraph breaks that make the text readable for humans, and each cell, when brought back into Excel, is going to take many cells (vertically).

You aren't going to be able to read whole web pages in single cells no matter what you do, so why not use links to the original web pages, or to your archived copy of those web pages? If you are trying to do some search on keywords, you could use Google Desktop on your machine, or Google to search the site you want to search (limited to 32 words per search).

If you want the material back in its original cells, you will have to use a macro, like your first suggestion, to remove the < and > and everything in between (and replace it with a space). It would be pretty much unreadable, but it would match your initial post in this thread.

Maybe if you stated the actual purpose, you might get suggestions for other applications, because I don't see how storing this in Excel is going to be efficient -- check the size of your Excel file.

---
HTH,
David McRitchie, Microsoft MVP - Excel [site changed Nov. 2001]
My Excel Pages: http://www.mvps.org/dmcritchie/excel/excel.htm
Search Page: http://www.mvps.org/dmcritchie/excel/search.htm
#18
Hi David.

Our website is database driven, so the product pages are what I'm trying to change in the csv file. The reason I want to change them to plain text is so I can use the full description of the product in shopping comparison sites like Google, PriceGrabber, PriceRunner and Ciao. These only accept product descriptions in plain text, so all the HTML has to be removed.

Regards
Stuart
#19
Ron's Word method works fine. I also added:

Edit>Replace
Find what: \{*\}
Replace with: (enter 1 space)
Click the [More] button to see search options and select: Use Wildcards
Click the [Replace All] button
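The extra brace step could be scripted along the same lines. The sketch below is a guess at the intent, since the thread does not show what the `{...}` blocks in the feed contain; the function name `strip_braces` and the sample text are hypothetical, and nested braces are not handled.

```python
import re

# Like the Word pattern \{*\}: a "{", anything that is not a brace, then "}".
BRACE_RE = re.compile(r"\{[^{}]*\}")


def strip_braces(text):
    """Replace each {...} block with a space, then collapse whitespace."""
    return " ".join(BRACE_RE.sub(" ", text).split())


print(strip_braces("Soft cover {color: red} fits most dressers"))
```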
#20
Thanks for letting us know. I'm glad you finally got something you could work with.

***********
Regards,
Ron