#1
Posted to microsoft.public.excel.programming
taking "strip html" to the next level
I'm really glad I found this - and glad you took the time to write the code. Not being a 'coder', I was wondering whether my request could be added to the script. I have various spreadsheets totaling tens of thousands of cells that contain HTML tags, and I need to remove the HTML from them. This script definitely does that, and it helps me quite a bit.

My request is the following: when I double-click a cell with HTML tags, the script executes just as it does now, but instead of my having to manually copy the text from the user form, click the 'command' button, and CTRL-V back into the original cell, those steps would be automated. In other words, the user double-clicks the cell and "bingo", the HTML-markup cell contents are replaced with the non-HTML content.

If you could select an entire column, put all that code in a "For...Next" loop (I'm sure For...Next isn't correct), and have the script execute all the way down the column with one double-click, that would *really* be cool. But if somebody could show me how to do the simplest automation, I'd greatly appreciate it. Making my way through about 15,000 rows and 3 columns is going to be a lot of double-click, CTRL-C, click, CTRL-V, Arrow Down, repeat.

Don't get me wrong, though: I'm very appreciative of what you've already provided!
#2
Sorry, I meant to post this as a reply to the thread below.
The original thread I meant to reply to is here: http://groups.google.com/group/micro...cd8f0a1a 71de
#3
In reply to Steve127's post of Fri, 13 Jun 2008 07:33:03 -0700 (PDT):

Steve,

Since this is for readability, is it correct to assume that you'd want the document "collapsed" after having been processed? Or do you just want to leave blank lines?
For example,

==============================
Option Explicit

Sub StripHTML()
    Dim c As Range
    Dim re As Object

    Set re = CreateObject("vbscript.regexp")
    re.IgnoreCase = True
    re.Global = True
    ' Match one opening or closing tag, up to and including its closing ">"
    re.Pattern = "</?[a-z][a-z0-9]*[^<>]*>"

    'not sure how you want to set the range to act on
    ' but this is quick and easy
    For Each c In Selection
        c.Value = re.Replace(c.Value, "")
    Next c
End Sub
=================================

removes all the HTML tags in Selection, except for Comments and the Document Type tags.

If you also want to remove the blank lines, then perhaps:

=======================================
Option Explicit

Sub StripHTML()
    Dim c As Range
    Dim i As Long, lFirstRow As Long, lLastRow As Long, lColumn As Long
    Dim re As Object

    Set re = CreateObject("vbscript.regexp")
    re.IgnoreCase = True
    re.Global = True
    ' Match one opening or closing tag, up to and including its closing ">"
    re.Pattern = "</?[a-z][a-z0-9]*[^<>]*>"

    Application.ScreenUpdating = False

    'not sure how you want to set the range to act on
    ' but using Selection is quick and easy
    For Each c In Selection
        c.Value = re.Replace(c.Value, "")
    Next c

    lFirstRow = Selection.Row
    lLastRow = Selection.Rows.Count + lFirstRow - 1
    lColumn = Selection.Column

    ' Work from the bottom up so deleting a cell does not shift rows still to be checked.
    ' Only the first column of Selection is tested and collapsed.
    For i = lLastRow To lFirstRow Step -1
        If Application.WorksheetFunction.Trim(Cells(i, lColumn).Value) = "" Then
            Cells(i, lColumn).Delete shift:=xlUp
        End If
    Next i

    Application.ScreenUpdating = True
End Sub
=====================================
--ron
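For the double-click behaviour asked about in the first post, one possible sketch is below. It is not taken from the original userform script. It assumes the code is pasted into the worksheet's own code module (not a standard module), that the sheet has no other Worksheet_BeforeDoubleClick handler (the userform script presumably has one, and a module can hold only one procedure of that name), and that a tag pattern which stops at each tag's closing ">" is what is wanted.

```vba
Option Explicit

' Sketch only, under the assumptions stated above.
' Double-clicking a cell replaces its text with the tag-free version in place.
Private Sub Worksheet_BeforeDoubleClick(ByVal Target As Range, Cancel As Boolean)
    Dim re As Object
    Dim c As Range

    Set re = CreateObject("vbscript.regexp")
    re.IgnoreCase = True
    re.Global = True
    re.Pattern = "</?[a-z][a-z0-9]*[^<>]*>"

    For Each c In Target.Cells
        ' Skip formulas, numbers, blanks and error values; only plain text is rewritten
        If Not c.HasFormula And VarType(c.Value) = vbString Then
            c.Value = re.Replace(c.Value, "")
        End If
    Next c

    ' Stop Excel from entering in-cell edit mode after the double-click
    Cancel = True
End Sub
```

With this in place there is no form, no copy and no paste: the cell that was double-clicked is overwritten directly, so it is worth trying on a copy of the workbook first.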
#4
Thank you, Ron - I'll give both a try and let you know.

To clarify: I'm working on an export from a MySQL table. The database is part of a shopping cart system. I inherited the database from person(s) who input the product data with a lot of deprecated and non-validating HTML. I am trying to remove all those tags.

As an example: suppose the column D cells contain the 'product_desc' data, which are the cells that have the bad HTML. Using the script from the original poster, you double-click the cell (say D3). In the popup text box you see the text that is in D3, except the HTML tags are gone. What I do then is CTRL-A, then CTRL-C, click the command button, and paste back into D3. That gives me what I'm looking for - the same product description without HTML tags, with database/table integrity intact.

One table alone has over 15,000 rows and 3 fields (or columns) with bad HTML, so you can imagine the routine will take me a very long time to finish. There might be a way to do this same thing inside MySQL, but I'm less proficient at it than I am at Excel! :) I can write basic data queries, but writing something to remove HTML tags would be way over my head. Anyway, I hope that gives some insight into my problem (roadblock, really).

BTW...I messed around with the original script and managed to get it to auto-paste the 'good' text into the cell after clicking the command button, but I still have to do CTRL-A & CTRL-C. I gave both of those a shot but kept getting into runtime errors and so forth, and it quickly got past my skill level.

Thank you
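Given the clarification above (every row must stay in place so the table can go back into MySQL), a whole-column pass without any row deletion may be closer to what is needed. The sketch below is only an illustration: the column letter "D" comes from the example in this post, while the header in row 1, the use of the active sheet, and the names StripHTMLInColumn, COL and FIRST_ROW are assumptions made up for this example.

```vba
Option Explicit

' Sketch only: strips tags from every text cell in one column of the active sheet.
' No cells are deleted, so each description stays on its original row.
Sub StripHTMLInColumn()
    Const COL As String = "D"      ' assumption: column holding product_desc
    Const FIRST_ROW As Long = 2    ' assumption: row 1 is a header row

    Dim re As Object
    Dim rng As Range, c As Range
    Dim lastRow As Long

    Set re = CreateObject("vbscript.regexp")
    re.IgnoreCase = True
    re.Global = True
    re.Pattern = "</?[a-z][a-z0-9]*[^<>]*>"

    With ActiveSheet
        ' Last used row in the column, found by searching up from the bottom of the sheet
        lastRow = .Cells(.Rows.Count, COL).End(xlUp).Row
        If lastRow < FIRST_ROW Then Exit Sub   ' nothing below the header
        Set rng = .Range(.Cells(FIRST_ROW, COL), .Cells(lastRow, COL))
    End With

    Application.ScreenUpdating = False
    For Each c In rng.Cells
        ' Only plain text is rewritten; formulas, numbers, blanks and errors are left alone
        If Not c.HasFormula And VarType(c.Value) = vbString Then
            c.Value = re.Replace(c.Value, "")
        End If
    Next c
    Application.ScreenUpdating = True
End Sub
```

Run it once per affected column, changing COL each time. Working on a copy of the export first is a sensible precaution, since the macro overwrites the cells and cannot be undone with CTRL-Z.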