#11
November 23rd 05, 03:22 AM, posted to microsoft.public.excel.worksheet.functions
Dave Peterson
Remove html markup

Since you're fixing a .csv file, maybe Excel isn't the best choice.

If you visit shareware.com, you can find lots of programs to clean up HTML
files.



Stuart wrote:

How do I remove HTML markup from a cell?

e.g.
<DIV class=productdesc> <H3>SCALLYWAGS CHANGING MAT. </H3><TABLE border=0>
<TBODY> <TR> <TD> <UL> <LI>Foam-filled <LI>Wipe-clean surface <LI>Fits most
dressers <LI>dimensions: 75 x 46cm </LI></UL></TD></TR>
<TR></TR></TBODY></TABLE> <UL> <LI>When using a changing mat on a dresser or
other raised surface, never leave your baby unattended even for a moment
</LI></UL></P></DIV>

I want it to read:-

SCALLYWAGS CHANGING MAT. Foam-filled Wipe-clean surface Fits most dressers
dimensions: 75 x 46cm When using a changing mat on a dresser or other
raised surface, never leave your baby unattended even for a moment

I am trying to change our website CSV file into a Froogle CSV file, but
without the HTML.

Thanks in advance

Stuart


--

Dave Peterson

#12
November 23rd 05, 03:56 AM, posted to microsoft.public.excel.worksheet.functions
KC
Remove html markup


Not sure if I'm on the right track here with this suggestion ...

If the HTML is being copied from the web, copy it to Notepad first,
then copy from Notepad into Excel. Notepad will clean it up and get
rid of the HTML code.

Kaye





#13
November 23rd 05, 09:48 AM, posted to microsoft.public.excel.worksheet.functions
Stuart
Remove html markup

The CSV file is a database feed from our e-commerce site, so pasting the
descriptions into Notepad one by one to remove the HTML would take a long time.

Regards

Stuart
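
For a whole feed, the clean-up can be scripted rather than done cell by cell. What follows is only a minimal Python sketch, not something posted in this thread. The column name `description`, the helper names `strip_tags` and `clean_feed`, and the sample row are all assumptions made for illustration, and the pattern is a plain "anything between < and >" match rather than a real HTML parser.

```python
import csv
import io
import re

# Assumption: a tag is anything from "<" up to the next ">".
TAG = re.compile(r"<[^>]*>")

def strip_tags(value):
    """Replace each tag with a space, then collapse runs of whitespace."""
    return " ".join(TAG.sub(" ", value).split())

def clean_feed(src, dst, column):
    """Copy a CSV feed from src to dst, stripping tags from one column."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row[column] = strip_tags(row[column])
        writer.writerow(row)

# Tiny in-memory feed standing in for the real export file (hypothetical data).
feed = io.StringIO(
    "sku,description\n"
    '1001,"<DIV><H3>SCALLYWAGS CHANGING MAT. </H3><UL><LI>Foam-filled '
    '<LI>Wipe-clean surface</UL></DIV>"\n'
)
out = io.StringIO()
clean_feed(feed, out, "description")
print(out.getvalue())
```

With real files, the same function could be given two handles opened with `newline=""`, as the `csv` module documentation recommends, for example `open("feed.csv", newline="")` for reading and `open("clean.csv", "w", newline="")` for writing; both file names are placeholders.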




#14
November 25th 05, 03:27 AM, posted to microsoft.public.excel.worksheet.functions
David McRitchie
Remove html markup

Hi Stuart,
I would go to Steve Miller's http://www.stevemiller.net/apps/
and install Pure Text. Then you can copy the HTML from an HTML page,
click on the PT system taskbar button, and paste into Notepad.

If you start from the Excel sheet, copy from the column, paste into Notepad,
save as HTML (junk.htm) from Notepad, open the junk.htm file with your
browser, copy everything on the page (Ctrl+A, Ctrl+C), then paste
into Notepad and you will have your text. After you have installed and
used Pure Text, the process should take about 3 minutes to convert in that
manner.
---
HTH,
David McRitchie, Microsoft MVP - Excel [site changed Nov. 2001]
My Excel Pages: http://www.mvps.org/dmcritchie/excel/excel.htm
Search Page: http://www.mvps.org/dmcritchie/excel/search.htm
#15
November 25th 05, 10:00 AM, posted to microsoft.public.excel.worksheet.functions
Stuart
Remove html markup

Hi David.

Thanks for that. The only problem is that it leaves lots of blank cells
when pasted back into Excel. I've even tried saving the spreadsheet as CSV
and as a text file to see if that would help when pasted back into Excel.
No luck there.

Any more ideas gratefully accepted.

Regards

Stuart


#16
November 25th 05, 02:01 PM, posted to microsoft.public.excel.worksheet.functions
Ron Coderre
Remove html markup

Hi, Stuart

See if this works for you....

Copy the list of HTML cells from Excel to MS Word.

In MS Word:
Edit>Replace
Find what: \<*\>
Replace with: (leave this blank)
Click the [More] button to see search options and select: Use Wildcards
Click the [Replace All] button

Then...Edit>Replace
Find what: (enter 2 spaces)
Replace with: (enter 1 space)
Click the [Replace All] button
-That will eliminate strings of spaces

Then...Edit>Replace
Find what: &nbsp
Replace with: (enter 1 space)
Click the [Replace All] button
-That will replace non-breaking space codes with 1 space

Finally:
Select the whole Word document
Edit>Copy
Select the first HTML cell in Excel
Edit>Paste

It might not be perfect, but is it close enough?


***********
Regards,
Ron
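
The same three replacements can be sketched outside Word. This is a hedged Python illustration, not part of Ron's post: the function name `html_to_text` and the sample string are made up, the regular expression `<[^>]*>` is assumed to be a fair stand-in for the Word wildcard pattern for "everything between < and >", and the space clean-up is deliberately done last so that spaces introduced by the `&nbsp` replacement are collapsed too.

```python
import re

def html_to_text(cell):
    """Apply the three replacements; strings of spaces are collapsed last."""
    text = re.sub(r"<[^>]*>", "", cell)    # drop each <...> tag (replace with nothing)
    text = re.sub(r"&nbsp;?", " ", text)   # non-breaking space code -> 1 space
    text = re.sub(r" {2,}", " ", text)     # 2 or more spaces -> 1 space
    return text.strip()

# Hypothetical sample in the style of the product descriptions in this thread.
sample = "<UL> <LI>Foam-filled <LI>Wipe-clean&nbsp;surface </LI></UL>"
print(html_to_text(sample))  # prints: Foam-filled Wipe-clean surface
```

Because tags are replaced with nothing here, words separated only by a tag (no space on either side) would run together; replacing each tag with a space instead avoids that at the cost of more spaces to collapse.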


#17
November 25th 05, 02:17 PM, posted to microsoft.public.excel.worksheet.functions
David McRitchie
Remove html markup

Hi Stuart,
Each of your original cells was an entire formatted webpage, less pictures.
If you convert the text in one cell to plain text and bring it back into
Excel you are going to generate a lot of empty lines due to paragraphs,
which are necessary for humans to be able to read, and each
cell when brought back into Excel is going to take many cells (vertically).

You aren't going to be able to read whole web pages in single cells no
matter what you do, so why not use links to the original web pages,
or to your archived copy of those web pages?

If you are trying to do some search on keywords on your machine
you could use Google Desktop, or Google to search
the site you want to search (limited to 32 words per search).

If you want the material back in their original cells, you will have to
use a macro like your first suggestion to remove the < and > and
everything in between (and replace with a space). It would be pretty
much unreadable, but it would match your initial post in this thread.

Maybe if you stated the actual purpose, you might get suggestions
for other applications, because I don't see how storing this in Excel
is going to be efficient. Check the size of your Excel file.

---
HTH,
David McRitchie, Microsoft MVP - Excel [site changed Nov. 2001]
My Excel Pages: http://www.mvps.org/dmcritchie/excel/excel.htm
Search Page: http://www.mvps.org/dmcritchie/excel/search.htm
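
The "remove the < and > and everything in between, and replace with a space" idea can be shown as a simple character scan. The sketch below is in Python rather than an Excel macro, and it is only an illustration under assumptions: the function name `replace_tags_with_space` is hypothetical, and the extra step of collapsing repeated whitespace at the end is an addition not mentioned in the post.

```python
def replace_tags_with_space(text):
    """Swap every <...> run for one space, then squeeze whitespace."""
    pieces = []
    in_tag = False
    for ch in text:
        if ch == "<":
            in_tag = True
            pieces.append(" ")   # the tag itself becomes a single space
        elif ch == ">" and in_tag:
            in_tag = False       # end of the tag; resume copying characters
        elif not in_tag:
            pieces.append(ch)
    # Assumption: runs of whitespace left behind should shrink to one space.
    return " ".join("".join(pieces).split())

cell = "<H3>SCALLYWAGS CHANGING MAT. </H3><UL><LI>Foam-filled<LI>Wipe-clean surface</LI></UL>"
print(replace_tags_with_space(cell))
```

Replacing each tag with a space, rather than with nothing, keeps "Foam-filled" and "Wipe-clean" apart even though only a tag separates them in the source.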




#18
November 25th 05, 02:46 PM, posted to microsoft.public.excel.worksheet.functions
Stuart
Remove html markup

Hi David.

Our website is database driven, so the product pages are what I'm trying to
change in the CSV file. The reason I want to change them to plain text is so I
can use the full description of the product in shopping comparison sites like
Google, PriceGrabber, PriceRunner and Ciao. These only accept product
descriptions in plain text, so all the HTML has to be removed.

Regards

Stuart





#19
November 25th 05, 03:01 PM, posted to microsoft.public.excel.worksheet.functions
Stuart
Remove html markup

That works fine. I also added:

Edit>Replace
Find what: \{*\}
Replace with: (enter 1 space)
Click the [More] button to see search options and select: Use Wildcards
Click the [Replace All] button







#20
November 25th 05, 03:47 PM, posted to microsoft.public.excel.worksheet.functions
Ron Coderre
Remove html markup

Thanks for letting us know. I'm glad you finally got something you could
work with.

***********
Regards,
Ron






