Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
|
|||
|
|||
![]()
This is posted to both the Word and Excel VBA NG because so far that's
the only way I can figure out to do this. But wait - I throw in FSO to make it even messier! Using Word and Excel 2007. We have a proprietary program that saves its working files as XML. When a user needs to add a list of data, he's presented with either a grid format that's a pain to work with, or a single-form-per-item page that's easier to use but wearying after 35+ items. What I want to do is get the part of the XML to be revised into an Excel worksheet so the user can work easier. Because this program is proprietary, the schema is not available to me, and since I'm not familiar with XML except as text separated by tags, I wouldn't know how to use it. So my bright idea so far is to open the file for reading in FSO, then paste that into a new Word doc. This gives me a text file with XML tags, but without Word's interpretation of the XML. Direct import into Excel causes Excel to interpret the tags and buries the information I want to get. Parsing the text in Word, I can set ranges using the different tags and work my way down to the info. In my current plan, I'm thinking of driving this from Excel. The data is set up like so: <MAIN_CATEGORY <DATA_PAGE num="0001" -- page info -- <GRID_LIST <LINE num="0001" -- line info -- <LINE num="0002" -- line info -- etc. The data I want is between the <LINE tags. I could get all the data I need by looping through and reading each data point from <LINE to </LINE into an array, then writing that back into Excel. Or I could simply capture each <GRID_LIST into a range, save it out as a separate temp XML file, then import that into Excel. Is this really the best way to do this?? Or is there a much easier way? Any recommendations? Ed |
#2
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() I have written plenty of programs to parse data files to extract the data I need. I don't think it is very complicated to extract what you need and yo don't even need to put it into an array. You can write it directly into excel The code below I wrote in under 5 minutes. It reads the line immediately after the line that contains the start tag starting in column 1. I can modify the code if you gave me a better example of the data. Your sample doesn't contain the end tag and I don't know if the tags start in column 1. Sub ParseXML() Const ReadFile = "c:\temp\event.txt" Const ForReading = 1, ForWriting = 2, _ ForAppending = 3 Set fs = CreateObject("Scripting.FileSystemObject") Set fin = fs.OpenTextFile(ReadFile, _ ForReading, TristateFalse) RowCount = 1 ReadNextLine = False Do While fin.AtEndOfStream < True ReadData = fin.readline If ReadData < Val(ReadData) = 100 Then Select Case ReadNextLine Case True Range("A" & RowCount) = ReadData ReadNextLine = False RowCount = RowCount + 1 Case False If Left(ReadDate, 9) = "<LINE num" Then ReadNextLine = True End If End Select End If Loop fin.Close End Sub -- joel ------------------------------------------------------------------------ joel's Profile: 229 View this thread: http://www.thecodecage.com/forumz/sh...d.php?t=173301 Microsoft Office Help |
#3
![]()
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
|
|||
|
|||
![]()
I would probably try a different approach to this, although I cannot
provide you with useful detail or code: a. use Excel VBA and MSXML to extract the nodes and data from the XML. You don't need a schema to do that. That should save you having to do your own parsing, and you won't need to involve Word. b. optionally, if it makes sense to do so, construct an XSLT transform to transform your XML structure into something simpler to deal with, and use MSXML to perform the transform. The impression I get is that your XML is simple enough that you would not need that, which would be handy because XSLT can be hard work if you are starting from scratch. Then if I couldn't make that work I'd probably fall back to an approach that was either similar to yours, except that right now I don't see why it would be easier to parse the XML in a Word document rather than doing it directly with Excel VBA. Peter Jamieson http://tips.pjmsn.me.uk On 25/01/2010 22:21, Ed from AZ wrote: This is posted to both the Word and Excel VBA NG because so far that's the only way I can figure out to do this. But wait - I throw in FSO to make it even messier! Using Word and Excel 2007. We have a proprietary program that saves its working files as XML. When a user needs to add a list of data, he's presented with either a grid format that's a pain to work with, or a single-form-per-item page that's easier to use but wearying after 35+ items. What I want to do is get the part of the XML to be revised into an Excel worksheet so the user can work easier. Because this program is proprietary, the schema is not available to me, and since I'm not familiar with XML except as text separated by tags, I wouldn't know how to use it. So my bright idea so far is to open the file for reading in FSO, then paste that into a new Word doc. This gives me a text file with XML tags, but without Word's interpretation of the XML. Direct import into Excel causes Excel to interpret the tags and buries the information I want to get. Parsing the text in Word, I can set ranges using the different tags and work my way down to the info. In my current plan, I'm thinking of driving this from Excel. The data is set up like so: <MAIN_CATEGORY <DATA_PAGE num="0001" -- page info -- <GRID_LIST <LINE num="0001" -- line info -- <LINE num="0002" -- line info -- etc. The data I want is between the<LINE tags. I could get all the data I need by looping through and reading each data point from<LINE to</LINE into an array, then writing that back into Excel. Or I could simply capture each<GRID_LIST into a range, save it out as a separate temp XML file, then import that into Excel. Is this really the best way to do this?? Or is there a much easier way? Any recommendations? Ed |
#4
![]()
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
|
|||
|
|||
![]()
You can read the XML file directly using the MSXML library. In VBA,
go to the Tools menu, choose References, and scroll down to "Microsoft XML, v6.0" and check that item. The code uses some relatively simple XPath to select the nodes, so I assume that you (1) already know XPath, or (2) can figure it out, or (3) can ask for assistance with XPath. Without know the actual layout of your XML, I just guessed and made up the following XML file: <?xml version="1.0" encoding="UTF-8"? <MAIN_CATEGORY <DATA_PAGE num="001" <GRID_LIST <LINE num="001" <nameChip Pearson</name <cityKansas City</city </LINE <LINE num="002" <nameJen Filson</name <cityMission Hills</city </LINE </GRID_LIST </DATA_PAGE <DATA_PAGE num="002" <GRID_LIST <LINE num="003" <nameJanice Downey</name <cityWestwood Hills</city </LINE <LINE num="04" <nameMike Stitt</name <cityCharlotte</city </LINE </GRID_LIST </DATA_PAGE </MAIN_CATEGORY Then, you can use code like Sub AAA() ' Required reference: ' Name: MSXML2 ' Description: Microsoft XML, v6.0 ' Typical location: C:\Windows\System32\msxml6.dll ' GUID: {F5078F18-C551-11D3-89B9-0000F81FE221} ' Major: 6 Minor: 0 Dim DOM As MSXML2.DOMDocument60 Dim FName As Variant Dim DataPageList As MSXML2.IXMLDOMNodeList Dim DataPage As MSXML2.IXMLDOMNodeList Dim GridList As MSXML2.IXMLDOMNodeList Dim Grid As MSXML2.IXMLDOMNodeList Dim LinesList As MSXML2.IXMLDOMNodeList Dim LineX As MSXML2.IXMLDOMNode Dim Arr() As String Dim N As Long Dim M As Long Dim P As Long Dim K As Long Dim R As Range Dim J As Long Set R = Range("C3") '<<<<< OUTPUT ON WORKSHEET FName = Application.GetOpenFilename("XML Files (*.xml),*.xml") If FName = False Then Exit Sub End If Set DOM = New MSXML2.DOMDocument60 DOM.Load CStr(FName) DOM.setProperty "SelectionLanguage", "XPath" ' get all data pages Set DataPageList = DOM.SelectNodes("/MAIN_CATEGORY/DATA_PAGE") For N = 0 To DataPageList.Length - 1 Set GridList = DataPageList.Item(N).SelectNodes("./GRID_LIST") For M = 0 To GridList.Length - 1 Set LinesList = GridList.Item(M).SelectNodes("./LINE") ReDim Arr(0 To LinesList.Length - 1) K = 0 For P = 0 To LinesList.Length - 1 Set LineX = LinesList.Item(P) Arr(K) = LineX.ChildNodes(0).Text & vbCrLf & _ LineX.ChildNodes(1).Text K = K + 1 Next P '<< write to worksheet For J = 0 To UBound(Arr) R.Offset(0, J).Value = Arr(J) Next J Set R = R.Offset(1, 0) Next M Next N End Sub This lists the contents between the <LINE and </LINE tags out to the worksheet. You'll have to modfy the code a bit to conform with your actual data structure, but the code will be very muich like that shown above. Cordially, Chip Pearson Microsoft MVP 1998 - 2010 Pearson Software Consulting, LLC www.cpearson.com [email on web site] On Mon, 25 Jan 2010 14:21:20 -0800 (PST), Ed from AZ wrote: This is posted to both the Word and Excel VBA NG because so far that's the only way I can figure out to do this. But wait - I throw in FSO to make it even messier! Using Word and Excel 2007. We have a proprietary program that saves its working files as XML. When a user needs to add a list of data, he's presented with either a grid format that's a pain to work with, or a single-form-per-item page that's easier to use but wearying after 35+ items. What I want to do is get the part of the XML to be revised into an Excel worksheet so the user can work easier. Because this program is proprietary, the schema is not available to me, and since I'm not familiar with XML except as text separated by tags, I wouldn't know how to use it. So my bright idea so far is to open the file for reading in FSO, then paste that into a new Word doc. This gives me a text file with XML tags, but without Word's interpretation of the XML. Direct import into Excel causes Excel to interpret the tags and buries the information I want to get. Parsing the text in Word, I can set ranges using the different tags and work my way down to the info. In my current plan, I'm thinking of driving this from Excel. The data is set up like so: <MAIN_CATEGORY <DATA_PAGE num="0001" -- page info -- <GRID_LIST <LINE num="0001" -- line info -- <LINE num="0002" -- line info -- etc. The data I want is between the <LINE tags. I could get all the data I need by looping through and reading each data point from <LINE to </LINE into an array, then writing that back into Excel. Or I could simply capture each <GRID_LIST into a range, save it out as a separate temp XML file, then import that into Excel. Is this really the best way to do this?? Or is there a much easier way? Any recommendations? Ed |
#5
![]()
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
|
|||
|
|||
![]()
Wow!! I didn't even know this existed! Thank you!!
I'll holler back if things start smoking! Ed On Jan 28, 12:26*pm, Chip Pearson wrote: You can read the XML file directly using the MSXML library. *In VBA, go to the Tools menu, choose References, and scroll down to "Microsoft XML, v6.0" and check that item. *The code uses some relatively simple XPath to select the nodes, so I assume that you (1) already know XPath, or (2) can figure it out, or (3) can ask for assistance with XPath. Without know the actual layout of your XML, I just guessed and made up the following XML file: <?xml version="1.0" encoding="UTF-8"? <MAIN_CATEGORY * * * * <DATA_PAGE num="001" * * * * * * * * <GRID_LIST * * * * * * * * * * * * <LINE num="001" * * * * * * * * * * * * * * * * <nameChip Pearson</name * * * * * * * * * * * * * * * * <cityKansas City</city * * * * * * * * * * * * </LINE * * * * * * * * * * * * <LINE num="002" * * * * * * * * * * * * * * * * <nameJen Filson</name * * * * * * * * * * * * * * * * <cityMission Hills</city * * * * * * * * * * * * </LINE * * * * * * * * </GRID_LIST * * * * </DATA_PAGE * * * * <DATA_PAGE num="002" * * * * * * * * <GRID_LIST * * * * * * * * * * * * <LINE num="003" * * * * * * * * * * * * * * * * <nameJanice Downey</name * * * * * * * * * * * * * * * * <cityWestwood Hills</city * * * * * * * * * * * * </LINE * * * * * * * * * * * * <LINE num="04" * * * * * * * * * * * * * * * * <nameMike Stitt</name * * * * * * * * * * * * * * * * <cityCharlotte</city * * * * * * * * * * * * </LINE * * * * * * * * </GRID_LIST * * * * </DATA_PAGE </MAIN_CATEGORY Then, you can use code like Sub AAA() ' Required reference: ' * Name: MSXML2 ' * Description: Microsoft XML, v6.0 ' * Typical location: C:\Windows\System32\msxml6.dll ' * GUID: {F5078F18-C551-11D3-89B9-0000F81FE221} ' * Major: 6 * *Minor: 0 Dim DOM As MSXML2.DOMDocument60 Dim FName As Variant Dim DataPageList As MSXML2.IXMLDOMNodeList Dim DataPage As MSXML2.IXMLDOMNodeList Dim GridList As MSXML2.IXMLDOMNodeList Dim Grid As MSXML2.IXMLDOMNodeList Dim LinesList As MSXML2.IXMLDOMNodeList Dim LineX As MSXML2.IXMLDOMNode Dim Arr() As String Dim N As Long Dim M As Long Dim P As Long Dim K As Long Dim R As Range Dim J As Long Set R = Range("C3") '<<<<< OUTPUT ON WORKSHEET FName = Application.GetOpenFilename("XML Files (*.xml),*.xml") If FName = False Then * * Exit Sub End If Set DOM = New MSXML2.DOMDocument60 DOM.Load CStr(FName) DOM.setProperty "SelectionLanguage", "XPath" ' get all data pages Set DataPageList = DOM.SelectNodes("/MAIN_CATEGORY/DATA_PAGE") For N = 0 To DataPageList.Length - 1 * * Set GridList = DataPageList.Item(N).SelectNodes("./GRID_LIST") * * For M = 0 To GridList.Length - 1 * * * * Set LinesList = GridList.Item(M).SelectNodes("./LINE") * * * * ReDim Arr(0 To LinesList.Length - 1) * * * * K = 0 * * * * For P = 0 To LinesList.Length - 1 * * * * * * Set LineX = LinesList.Item(P) * * * * * * Arr(K) = LineX.ChildNodes(0).Text & vbCrLf & _ * * * * * * * * LineX.ChildNodes(1).Text * * * * * * K = K + 1 * * * * Next P * * * * '<< write to worksheet * * * * For J = 0 To UBound(Arr) * * * * * * R.Offset(0, J).Value = Arr(J) * * * * Next J * * * * Set R = R.Offset(1, 0) * * Next M Next N End Sub This lists the contents between the <LINE and </LINE tags out to the worksheet. *You'll have to modfy the code a bit to conform with your actual data structure, but the code will be very muich like that shown above. Cordially, Chip Pearson Microsoft MVP 1998 - 2010 Pearson Software Consulting, LLCwww.cpearson.com [email on web site] On Mon, 25 Jan 2010 14:21:20 -0800 (PST), Ed from AZ wrote: This is posted to both the Word and Excel VBA NG because so far that's the only way I can figure out to do this. *But wait - I throw in FSO to make it even messier! *Using Word and Excel 2007. We have a proprietary program that saves its working files as XML. When a user needs to add a list of data, he's presented with either a grid format that's a pain to work with, or a single-form-per-item page that's easier to use but wearying after 35+ items. *What I want to do is get the part of the XML to be revised into an Excel worksheet so the user can work easier. *Because this program is proprietary, the schema is not available to me, and since I'm not familiar with XML except as text separated by tags, I wouldn't know how to use it. So my bright idea so far is to open the file for reading in FSO, then paste that into a new Word doc. *This gives me a text file with XML tags, but without Word's interpretation of the XML. *Direct import into Excel causes Excel to interpret the tags and buries the information I want to get. *Parsing the text in Word, I can set ranges using the different tags and work my way down to the info. *In my current plan, I'm thinking of driving this from Excel. The data is set up like so: <MAIN_CATEGORY <DATA_PAGE num="0001" -- page info -- <GRID_LIST <LINE num="0001" -- line info -- <LINE num="0002" -- line info -- etc. The data I want is between the <LINE tags. I could get all the data I need by looping through and reading each data point from <LINE to </LINE into an array, then writing that back into Excel. *Or I could simply capture each <GRID_LIST into a range, save it out as a separate temp XML file, then import that into Excel. Is this really the best way to do this?? *Or is there a much easier way? *Any recommendations? Ed- Hide quoted text - - Show quoted text - |
#6
![]()
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
|
|||
|
|||
![]()
I just realized one of the major reasons I included Word in this mix.
I am pulling the data into an Excel sheet so I can add to it. I then need to write the modified data back into the XML file. Not being familiar with XML or XPath, my thought train was: -- Read XML file via FSO and paste into Word -- Find the major groups and sub-groups and set bookmarks -- Pull the data into Excel by parsing the tags and text -- Add new data and modify existing data as required -- Create an array of each sub-group and group, wrapping the data in the proper tags -- Replace the contents of the associated bookmark range with the new data -- Write the whole kaboodle back into the XML file So the hiccup in my thinking was how to get the data from Excel back into the XML file. All I could think of was using the bookmarks and ranges in Word. Is there a better way? Ed On Jan 28, 12:26*pm, Chip Pearson wrote: You can read the XML file directly using the MSXML library. *In VBA, go to the Tools menu, choose References, and scroll down to "Microsoft XML, v6.0" and check that item. *The code uses some relatively simple XPath to select the nodes, so I assume that you (1) already know XPath, or (2) can figure it out, or (3) can ask for assistance with XPath. Without know the actual layout of your XML, I just guessed and made up the following XML file: <?xml version="1.0" encoding="UTF-8"? <MAIN_CATEGORY * * * * <DATA_PAGE num="001" * * * * * * * * <GRID_LIST * * * * * * * * * * * * <LINE num="001" * * * * * * * * * * * * * * * * <nameChip Pearson</name * * * * * * * * * * * * * * * * <cityKansas City</city * * * * * * * * * * * * </LINE * * * * * * * * * * * * <LINE num="002" * * * * * * * * * * * * * * * * <nameJen Filson</name * * * * * * * * * * * * * * * * <cityMission Hills</city * * * * * * * * * * * * </LINE * * * * * * * * </GRID_LIST * * * * </DATA_PAGE * * * * <DATA_PAGE num="002" * * * * * * * * <GRID_LIST * * * * * * * * * * * * <LINE num="003" * * * * * * * * * * * * * * * * <nameJanice Downey</name * * * * * * * * * * * * * * * * <cityWestwood Hills</city * * * * * * * * * * * * </LINE * * * * * * * * * * * * <LINE num="04" * * * * * * * * * * * * * * * * <nameMike Stitt</name * * * * * * * * * * * * * * * * <cityCharlotte</city * * * * * * * * * * * * </LINE * * * * * * * * </GRID_LIST * * * * </DATA_PAGE </MAIN_CATEGORY Then, you can use code like Sub AAA() ' Required reference: ' * Name: MSXML2 ' * Description: Microsoft XML, v6.0 ' * Typical location: C:\Windows\System32\msxml6.dll ' * GUID: {F5078F18-C551-11D3-89B9-0000F81FE221} ' * Major: 6 * *Minor: 0 Dim DOM As MSXML2.DOMDocument60 Dim FName As Variant Dim DataPageList As MSXML2.IXMLDOMNodeList Dim DataPage As MSXML2.IXMLDOMNodeList Dim GridList As MSXML2.IXMLDOMNodeList Dim Grid As MSXML2.IXMLDOMNodeList Dim LinesList As MSXML2.IXMLDOMNodeList Dim LineX As MSXML2.IXMLDOMNode Dim Arr() As String Dim N As Long Dim M As Long Dim P As Long Dim K As Long Dim R As Range Dim J As Long Set R = Range("C3") '<<<<< OUTPUT ON WORKSHEET FName = Application.GetOpenFilename("XML Files (*.xml),*.xml") If FName = False Then * * Exit Sub End If Set DOM = New MSXML2.DOMDocument60 DOM.Load CStr(FName) DOM.setProperty "SelectionLanguage", "XPath" ' get all data pages Set DataPageList = DOM.SelectNodes("/MAIN_CATEGORY/DATA_PAGE") For N = 0 To DataPageList.Length - 1 * * Set GridList = DataPageList.Item(N).SelectNodes("./GRID_LIST") * * For M = 0 To GridList.Length - 1 * * * * Set LinesList = GridList.Item(M).SelectNodes("./LINE") * * * * ReDim Arr(0 To LinesList.Length - 1) * * * * K = 0 * * * * For P = 0 To LinesList.Length - 1 * * * * * * Set LineX = LinesList.Item(P) * * * * * * Arr(K) = LineX.ChildNodes(0).Text & vbCrLf & _ * * * * * * * * LineX.ChildNodes(1).Text * * * * * * K = K + 1 * * * * Next P * * * * '<< write to worksheet * * * * For J = 0 To UBound(Arr) * * * * * * R.Offset(0, J).Value = Arr(J) * * * * Next J * * * * Set R = R.Offset(1, 0) * * Next M Next N End Sub This lists the contents between the <LINE and </LINE tags out to the worksheet. *You'll have to modfy the code a bit to conform with your actual data structure, but the code will be very muich like that shown above. Cordially, Chip Pearson Microsoft MVP 1998 - 2010 Pearson Software Consulting, LLCwww.cpearson.com [email on web site] On Mon, 25 Jan 2010 14:21:20 -0800 (PST), Ed from AZ wrote: This is posted to both the Word and Excel VBA NG because so far that's the only way I can figure out to do this. *But wait - I throw in FSO to make it even messier! *Using Word and Excel 2007. We have a proprietary program that saves its working files as XML. When a user needs to add a list of data, he's presented with either a grid format that's a pain to work with, or a single-form-per-item page that's easier to use but wearying after 35+ items. *What I want to do is get the part of the XML to be revised into an Excel worksheet so the user can work easier. *Because this program is proprietary, the schema is not available to me, and since I'm not familiar with XML except as text separated by tags, I wouldn't know how to use it. So my bright idea so far is to open the file for reading in FSO, then paste that into a new Word doc. *This gives me a text file with XML tags, but without Word's interpretation of the XML. *Direct import into Excel causes Excel to interpret the tags and buries the information I want to get. *Parsing the text in Word, I can set ranges using the different tags and work my way down to the info. *In my current plan, I'm thinking of driving this from Excel. The data is set up like so: <MAIN_CATEGORY <DATA_PAGE num="0001" -- page info -- <GRID_LIST <LINE num="0001" -- line info -- <LINE num="0002" -- line info -- etc. The data I want is between the <LINE tags. I could get all the data I need by looping through and reading each data point from <LINE to </LINE into an array, then writing that back into Excel. *Or I could simply capture each <GRID_LIST into a range, save it out as a separate temp XML file, then import that into Excel. Is this really the best way to do this?? *Or is there a much easier way? *Any recommendations? Ed- |
#7
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() You can do a saveas in excel using XML format. You may loose some of the non row/column data from the original xml file. A XML file "MAY" contains other properties beside the Row/column data that you see in excel. It depends on the datta contained in the original XML file. -- joel ------------------------------------------------------------------------ joel's Profile: 229 View this thread: http://www.thecodecage.com/forumz/sh...d.php?t=173301 Microsoft Office Help |
#8
![]()
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
|
|||
|
|||
![]()
Well, now I'm thinking I can do it like this:
-- Begin reading the XML node by node, saving everything into a string until it hits the node that begins the data I want to pull into Excel. -- Continue reading, pulling into Excel, until it hits the node at the end of the data. -- Finish reading the file, saving everything into a second string. When I finish in Excel, I can wrap all that in the proper tags, then assemble strStart & strData & strEnd and write that back into the XML file. Leaves Word out of the loop (although I may leave it in for debugging so I write out what I capture - it's probably too much for the Debug.Print window!) and makes things a lot smoother. If it starts smoking, I'll yell for help! Ed On Feb 3, 7:10*am, Ed from AZ wrote: I just realized one of the major reasons I included Word in this mix. I am pulling the data into an Excel sheet so I can add to it. *I then need to write the modified data back into the XML file. Not being familiar with XML or XPath, my thought train was: -- Read XML file via FSO and paste into Word -- Find the major groups and sub-groups and set bookmarks -- Pull the data into Excel by parsing the tags and text -- Add new data and modify existing data as required -- Create an array of each sub-group and group, wrapping the data in the proper tags -- Replace the contents of the associated bookmark range with the new data -- Write the whole kaboodle back into the XML file So the hiccup in my thinking was how to get the data from Excel back into the XML file. *All I could think of was using the bookmarks and ranges in Word. Is there a better way? Ed On Jan 28, 12:26*pm, Chip Pearson wrote: You can read the XML file directly using the MSXML library. *In VBA, go to the Tools menu, choose References, and scroll down to "Microsoft XML, v6.0" and check that item. *The code uses some relatively simple XPath to select the nodes, so I assume that you (1) already know XPath, or (2) can figure it out, or (3) can ask for assistance with XPath. Without know the actual layout of your XML, I just guessed and made up the following XML file: <?xml version="1.0" encoding="UTF-8"? <MAIN_CATEGORY * * * * <DATA_PAGE num="001" * * * * * * * * <GRID_LIST * * * * * * * * * * * * <LINE num="001" * * * * * * * * * * * * * * * * <nameChip Pearson</name * * * * * * * * * * * * * * * * <cityKansas City</city * * * * * * * * * * * * </LINE * * * * * * * * * * * * <LINE num="002" * * * * * * * * * * * * * * * * <nameJen Filson</name * * * * * * * * * * * * * * * * <cityMission Hills</city * * * * * * * * * * * * </LINE * * * * * * * * </GRID_LIST * * * * </DATA_PAGE * * * * <DATA_PAGE num="002" * * * * * * * * <GRID_LIST * * * * * * * * * * * * <LINE num="003" * * * * * * * * * * * * * * * * <nameJanice Downey</name * * * * * * * * * * * * * * * * <cityWestwood Hills</city * * * * * * * * * * * * </LINE * * * * * * * * * * * * <LINE num="04" * * * * * * * * * * * * * * * * <nameMike Stitt</name * * * * * * * * * * * * * * * * <cityCharlotte</city * * * * * * * * * * * * </LINE * * * * * * * * </GRID_LIST * * * * </DATA_PAGE </MAIN_CATEGORY Then, you can use code like Sub AAA() ' Required reference: ' * Name: MSXML2 ' * Description: Microsoft XML, v6.0 ' * Typical location: C:\Windows\System32\msxml6.dll ' * GUID: {F5078F18-C551-11D3-89B9-0000F81FE221} ' * Major: 6 * *Minor: 0 Dim DOM As MSXML2.DOMDocument60 Dim FName As Variant Dim DataPageList As MSXML2.IXMLDOMNodeList Dim DataPage As MSXML2.IXMLDOMNodeList Dim GridList As MSXML2.IXMLDOMNodeList Dim Grid As MSXML2.IXMLDOMNodeList Dim LinesList As MSXML2.IXMLDOMNodeList Dim LineX As MSXML2.IXMLDOMNode Dim Arr() As String Dim N As Long Dim M As Long Dim P As Long Dim K As Long Dim R As Range Dim J As Long Set R = Range("C3") '<<<<< OUTPUT ON WORKSHEET FName = Application.GetOpenFilename("XML Files (*.xml),*.xml") If FName = False Then * * Exit Sub End If Set DOM = New MSXML2.DOMDocument60 DOM.Load CStr(FName) DOM.setProperty "SelectionLanguage", "XPath" ' get all data pages Set DataPageList = DOM.SelectNodes("/MAIN_CATEGORY/DATA_PAGE") For N = 0 To DataPageList.Length - 1 * * Set GridList = DataPageList.Item(N).SelectNodes("./GRID_LIST") * * For M = 0 To GridList.Length - 1 * * * * Set LinesList = GridList.Item(M).SelectNodes("./LINE") * * * * ReDim Arr(0 To LinesList.Length - 1) * * * * K = 0 * * * * For P = 0 To LinesList.Length - 1 * * * * * * Set LineX = LinesList.Item(P) * * * * * * Arr(K) = LineX.ChildNodes(0).Text & vbCrLf & _ * * * * * * * * LineX.ChildNodes(1).Text * * * * * * K = K + 1 * * * * Next P * * * * '<< write to worksheet * * * * For J = 0 To UBound(Arr) * * * * * * R.Offset(0, J).Value = Arr(J) * * * * Next J * * * * Set R = R.Offset(1, 0) * * Next M Next N End Sub This lists the contents between the <LINE and </LINE tags out to the worksheet. *You'll have to modfy the code a bit to conform with your actual data structure, but the code will be very muich like that shown above. Cordially, Chip Pearson Microsoft MVP 1998 - 2010 Pearson Software Consulting, LLCwww.cpearson.com [email on web site] On Mon, 25 Jan 2010 14:21:20 -0800 (PST), Ed from AZ wrote: This is posted to both the Word and Excel VBA NG because so far that's the only way I can figure out to do this. *But wait - I throw in FSO to make it even messier! *Using Word and Excel 2007. We have a proprietary program that saves its working files as XML. When a user needs to add a list of data, he's presented with either a grid format that's a pain to work with, or a single-form-per-item page that's easier to use but wearying after 35+ items. *What I want to do is get the part of the XML to be revised into an Excel worksheet so the user can work easier. *Because this program is proprietary, the schema is not available to me, and since I'm not familiar with XML except as text separated by tags, I wouldn't know how to use it. So my bright idea so far is to open the file for reading in FSO, then paste that into a new Word doc. *This gives me a text file with XML tags, but without Word's interpretation of the XML. *Direct import into Excel causes Excel to interpret the tags and buries the information I want to get. *Parsing the text in Word, I can set ranges using the different tags and work my way down to the info. *In my current plan, I'm thinking of driving this from Excel. The data is set up like so: <MAIN_CATEGORY <DATA_PAGE num="0001" -- page info -- <GRID_LIST <LINE num="0001" -- line info -- <LINE num="0002" -- line info -- etc. The data I want is between the <LINE tags. I could get all the data I need by looping through and reading each data point from <LINE to </LINE into an array, then writing that back into Excel. *Or I could simply capture each <GRID_LIST into a range, save it out as a separate temp XML file, then import that into Excel. Is this really the best way to do this?? *Or is there a much easier way? *Any recommendations? Ed-- |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
How can I restore part of a file? | Charts and Charting in Excel | |||
get part of a file name from a cell value for use in a macro | Excel Programming | |||
Reference a cell as part of a file name | Excel Programming | |||
import part of a text file | Excel Programming |