Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
external usenet poster
 
Posts: 120
Default Trying to get just part of an XML file?

This is posted to both the Word and Excel VBA NG because so far that's
the only way I can figure out to do this. But wait - I throw in FSO
to make it even messier! Using Word and Excel 2007.

We have a proprietary program that saves its working files as XML.
When a user needs to add a list of data, he's presented with either a
grid format that's a pain to work with, or a single-form-per-item page
that's easier to use but wearying after 35+ items. What I want to do
is get the part of the XML to be revised into an Excel worksheet so
the user can work easier. Because this program is proprietary, the
schema is not available to me, and since I'm not familiar with XML
except as text separated by tags, I wouldn't know how to use it.

So my bright idea so far is to open the file for reading in FSO, then
paste that into a new Word doc. This gives me a text file with XML
tags, but without Word's interpretation of the XML. Direct import
into Excel causes Excel to interpret the tags and buries the
information I want to get. Parsing the text in Word, I can set ranges
using the different tags and work my way down to the info. In my
current plan, I'm thinking of driving this from Excel.

The data is set up like so:
<MAIN_CATEGORY
<DATA_PAGE num="0001"
-- page info --
<GRID_LIST
<LINE num="0001"
-- line info --
<LINE num="0002"
-- line info --
etc.

The data I want is between the <LINE tags.

I could get all the data I need by looping through and reading each
data point from <LINE to </LINE into an array, then writing that
back into Excel. Or I could simply capture each <GRID_LIST into a
range, save it out as a separate temp XML file, then import that into
Excel.

Is this really the best way to do this?? Or is there a much easier
way? Any recommendations?
Ed
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default Trying to get just part of an XML file?


I have written plenty of programs to parse data files to extract the
data I need. I don't think it is very complicated to extract what you
need and yo don't even need to put it into an array. You can write it
directly into excel

The code below I wrote in under 5 minutes. It reads the line
immediately after the line that contains the start tag starting in
column 1. I can modify the code if you gave me a better example of the
data. Your sample doesn't contain the end tag and I don't know if the
tags start in column 1.


Sub ParseXML()
Const ReadFile = "c:\temp\event.txt"

Const ForReading = 1, ForWriting = 2, _
ForAppending = 3

Set fs = CreateObject("Scripting.FileSystemObject")
Set fin = fs.OpenTextFile(ReadFile, _
ForReading, TristateFalse)

RowCount = 1
ReadNextLine = False
Do While fin.AtEndOfStream < True

ReadData = fin.readline
If ReadData < Val(ReadData) = 100 Then

Select Case ReadNextLine

Case True
Range("A" & RowCount) = ReadData
ReadNextLine = False
RowCount = RowCount + 1
Case False
If Left(ReadDate, 9) = "<LINE num" Then
ReadNextLine = True
End If

End Select
End If
Loop
fin.Close
End Sub


--
joel
------------------------------------------------------------------------
joel's Profile: 229
View this thread: http://www.thecodecage.com/forumz/sh...d.php?t=173301

Microsoft Office Help

  #3   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
external usenet poster
 
Posts: 4
Default Trying to get just part of an XML file?

I would probably try a different approach to this, although I cannot
provide you with useful detail or code:
a. use Excel VBA and MSXML to extract the nodes and data from the XML.
You don't need a schema to do that. That should save you having to do
your own parsing, and you won't need to involve Word.
b. optionally, if it makes sense to do so, construct an XSLT transform
to transform your XML structure into something simpler to deal with, and
use MSXML to perform the transform. The impression I get is that your
XML is simple enough that you would not need that, which would be handy
because XSLT can be hard work if you are starting from scratch.

Then if I couldn't make that work I'd probably fall back to an approach
that was either similar to yours, except that right now I don't see why
it would be easier to parse the XML in a Word document rather than doing
it directly with Excel VBA.


Peter Jamieson

http://tips.pjmsn.me.uk

On 25/01/2010 22:21, Ed from AZ wrote:
This is posted to both the Word and Excel VBA NG because so far that's
the only way I can figure out to do this. But wait - I throw in FSO
to make it even messier! Using Word and Excel 2007.

We have a proprietary program that saves its working files as XML.
When a user needs to add a list of data, he's presented with either a
grid format that's a pain to work with, or a single-form-per-item page
that's easier to use but wearying after 35+ items. What I want to do
is get the part of the XML to be revised into an Excel worksheet so
the user can work easier. Because this program is proprietary, the
schema is not available to me, and since I'm not familiar with XML
except as text separated by tags, I wouldn't know how to use it.

So my bright idea so far is to open the file for reading in FSO, then
paste that into a new Word doc. This gives me a text file with XML
tags, but without Word's interpretation of the XML. Direct import
into Excel causes Excel to interpret the tags and buries the
information I want to get. Parsing the text in Word, I can set ranges
using the different tags and work my way down to the info. In my
current plan, I'm thinking of driving this from Excel.

The data is set up like so:
<MAIN_CATEGORY
<DATA_PAGE num="0001"
-- page info --
<GRID_LIST
<LINE num="0001"
-- line info --
<LINE num="0002"
-- line info --
etc.

The data I want is between the<LINE tags.

I could get all the data I need by looping through and reading each
data point from<LINE to</LINE into an array, then writing that
back into Excel. Or I could simply capture each<GRID_LIST into a
range, save it out as a separate temp XML file, then import that into
Excel.

Is this really the best way to do this?? Or is there a much easier
way? Any recommendations?
Ed

  #4   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
external usenet poster
 
Posts: 7,247
Default Trying to get just part of an XML file?

You can read the XML file directly using the MSXML library. In VBA,
go to the Tools menu, choose References, and scroll down to "Microsoft
XML, v6.0" and check that item. The code uses some relatively simple
XPath to select the nodes, so I assume that you (1) already know
XPath, or (2) can figure it out, or (3) can ask for assistance with
XPath.

Without know the actual layout of your XML, I just guessed and made up
the following XML file:

<?xml version="1.0" encoding="UTF-8"?
<MAIN_CATEGORY
<DATA_PAGE num="001"
<GRID_LIST
<LINE num="001"
<nameChip Pearson</name
<cityKansas City</city
</LINE
<LINE num="002"
<nameJen Filson</name
<cityMission Hills</city
</LINE
</GRID_LIST
</DATA_PAGE
<DATA_PAGE num="002"
<GRID_LIST
<LINE num="003"
<nameJanice Downey</name
<cityWestwood Hills</city
</LINE
<LINE num="04"
<nameMike Stitt</name
<cityCharlotte</city
</LINE
</GRID_LIST
</DATA_PAGE
</MAIN_CATEGORY

Then, you can use code like


Sub AAA()
' Required reference:
' Name: MSXML2
' Description: Microsoft XML, v6.0
' Typical location: C:\Windows\System32\msxml6.dll
' GUID: {F5078F18-C551-11D3-89B9-0000F81FE221}
' Major: 6 Minor: 0

Dim DOM As MSXML2.DOMDocument60
Dim FName As Variant
Dim DataPageList As MSXML2.IXMLDOMNodeList
Dim DataPage As MSXML2.IXMLDOMNodeList
Dim GridList As MSXML2.IXMLDOMNodeList
Dim Grid As MSXML2.IXMLDOMNodeList
Dim LinesList As MSXML2.IXMLDOMNodeList
Dim LineX As MSXML2.IXMLDOMNode
Dim Arr() As String
Dim N As Long
Dim M As Long
Dim P As Long
Dim K As Long
Dim R As Range
Dim J As Long

Set R = Range("C3") '<<<<< OUTPUT ON WORKSHEET
FName = Application.GetOpenFilename("XML Files (*.xml),*.xml")
If FName = False Then
Exit Sub
End If
Set DOM = New MSXML2.DOMDocument60
DOM.Load CStr(FName)
DOM.setProperty "SelectionLanguage", "XPath"

' get all data pages
Set DataPageList = DOM.SelectNodes("/MAIN_CATEGORY/DATA_PAGE")
For N = 0 To DataPageList.Length - 1
Set GridList = DataPageList.Item(N).SelectNodes("./GRID_LIST")
For M = 0 To GridList.Length - 1
Set LinesList = GridList.Item(M).SelectNodes("./LINE")
ReDim Arr(0 To LinesList.Length - 1)
K = 0
For P = 0 To LinesList.Length - 1
Set LineX = LinesList.Item(P)
Arr(K) = LineX.ChildNodes(0).Text & vbCrLf & _
LineX.ChildNodes(1).Text
K = K + 1
Next P
'<< write to worksheet
For J = 0 To UBound(Arr)
R.Offset(0, J).Value = Arr(J)
Next J
Set R = R.Offset(1, 0)
Next M
Next N
End Sub

This lists the contents between the <LINE and </LINE tags out to the
worksheet. You'll have to modfy the code a bit to conform with your
actual data structure, but the code will be very muich like that shown
above.

Cordially,
Chip Pearson
Microsoft MVP 1998 - 2010
Pearson Software Consulting, LLC
www.cpearson.com
[email on web site]







On Mon, 25 Jan 2010 14:21:20 -0800 (PST), Ed from AZ
wrote:

This is posted to both the Word and Excel VBA NG because so far that's
the only way I can figure out to do this. But wait - I throw in FSO
to make it even messier! Using Word and Excel 2007.

We have a proprietary program that saves its working files as XML.
When a user needs to add a list of data, he's presented with either a
grid format that's a pain to work with, or a single-form-per-item page
that's easier to use but wearying after 35+ items. What I want to do
is get the part of the XML to be revised into an Excel worksheet so
the user can work easier. Because this program is proprietary, the
schema is not available to me, and since I'm not familiar with XML
except as text separated by tags, I wouldn't know how to use it.

So my bright idea so far is to open the file for reading in FSO, then
paste that into a new Word doc. This gives me a text file with XML
tags, but without Word's interpretation of the XML. Direct import
into Excel causes Excel to interpret the tags and buries the
information I want to get. Parsing the text in Word, I can set ranges
using the different tags and work my way down to the info. In my
current plan, I'm thinking of driving this from Excel.

The data is set up like so:
<MAIN_CATEGORY
<DATA_PAGE num="0001"
-- page info --
<GRID_LIST
<LINE num="0001"
-- line info --
<LINE num="0002"
-- line info --
etc.

The data I want is between the <LINE tags.

I could get all the data I need by looping through and reading each
data point from <LINE to </LINE into an array, then writing that
back into Excel. Or I could simply capture each <GRID_LIST into a
range, save it out as a separate temp XML file, then import that into
Excel.

Is this really the best way to do this?? Or is there a much easier
way? Any recommendations?
Ed

  #5   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
external usenet poster
 
Posts: 120
Default Trying to get just part of an XML file?

Wow!! I didn't even know this existed! Thank you!!

I'll holler back if things start smoking!
Ed


On Jan 28, 12:26*pm, Chip Pearson wrote:
You can read the XML file directly using the MSXML library. *In VBA,
go to the Tools menu, choose References, and scroll down to "Microsoft
XML, v6.0" and check that item. *The code uses some relatively simple
XPath to select the nodes, so I assume that you (1) already know
XPath, or (2) can figure it out, or (3) can ask for assistance with
XPath.

Without know the actual layout of your XML, I just guessed and made up
the following XML file:

<?xml version="1.0" encoding="UTF-8"?
<MAIN_CATEGORY
* * * * <DATA_PAGE num="001"
* * * * * * * * <GRID_LIST
* * * * * * * * * * * * <LINE num="001"
* * * * * * * * * * * * * * * * <nameChip Pearson</name
* * * * * * * * * * * * * * * * <cityKansas City</city
* * * * * * * * * * * * </LINE
* * * * * * * * * * * * <LINE num="002"
* * * * * * * * * * * * * * * * <nameJen Filson</name
* * * * * * * * * * * * * * * * <cityMission Hills</city
* * * * * * * * * * * * </LINE
* * * * * * * * </GRID_LIST
* * * * </DATA_PAGE
* * * * <DATA_PAGE num="002"
* * * * * * * * <GRID_LIST
* * * * * * * * * * * * <LINE num="003"
* * * * * * * * * * * * * * * * <nameJanice Downey</name
* * * * * * * * * * * * * * * * <cityWestwood Hills</city
* * * * * * * * * * * * </LINE
* * * * * * * * * * * * <LINE num="04"
* * * * * * * * * * * * * * * * <nameMike Stitt</name
* * * * * * * * * * * * * * * * <cityCharlotte</city
* * * * * * * * * * * * </LINE
* * * * * * * * </GRID_LIST
* * * * </DATA_PAGE
</MAIN_CATEGORY

Then, you can use code like

Sub AAA()
' Required reference:
' * Name: MSXML2
' * Description: Microsoft XML, v6.0
' * Typical location: C:\Windows\System32\msxml6.dll
' * GUID: {F5078F18-C551-11D3-89B9-0000F81FE221}
' * Major: 6 * *Minor: 0

Dim DOM As MSXML2.DOMDocument60
Dim FName As Variant
Dim DataPageList As MSXML2.IXMLDOMNodeList
Dim DataPage As MSXML2.IXMLDOMNodeList
Dim GridList As MSXML2.IXMLDOMNodeList
Dim Grid As MSXML2.IXMLDOMNodeList
Dim LinesList As MSXML2.IXMLDOMNodeList
Dim LineX As MSXML2.IXMLDOMNode
Dim Arr() As String
Dim N As Long
Dim M As Long
Dim P As Long
Dim K As Long
Dim R As Range
Dim J As Long

Set R = Range("C3") '<<<<< OUTPUT ON WORKSHEET
FName = Application.GetOpenFilename("XML Files (*.xml),*.xml")
If FName = False Then
* * Exit Sub
End If
Set DOM = New MSXML2.DOMDocument60
DOM.Load CStr(FName)
DOM.setProperty "SelectionLanguage", "XPath"

' get all data pages
Set DataPageList = DOM.SelectNodes("/MAIN_CATEGORY/DATA_PAGE")
For N = 0 To DataPageList.Length - 1
* * Set GridList = DataPageList.Item(N).SelectNodes("./GRID_LIST")
* * For M = 0 To GridList.Length - 1
* * * * Set LinesList = GridList.Item(M).SelectNodes("./LINE")
* * * * ReDim Arr(0 To LinesList.Length - 1)
* * * * K = 0
* * * * For P = 0 To LinesList.Length - 1
* * * * * * Set LineX = LinesList.Item(P)
* * * * * * Arr(K) = LineX.ChildNodes(0).Text & vbCrLf & _
* * * * * * * * LineX.ChildNodes(1).Text
* * * * * * K = K + 1
* * * * Next P
* * * * '<< write to worksheet
* * * * For J = 0 To UBound(Arr)
* * * * * * R.Offset(0, J).Value = Arr(J)
* * * * Next J
* * * * Set R = R.Offset(1, 0)
* * Next M
Next N
End Sub

This lists the contents between the <LINE and </LINE tags out to the
worksheet. *You'll have to modfy the code a bit to conform with your
actual data structure, but the code will be very muich like that shown
above.

Cordially,
Chip Pearson
Microsoft MVP 1998 - 2010
Pearson Software Consulting, LLCwww.cpearson.com
[email on web site]

On Mon, 25 Jan 2010 14:21:20 -0800 (PST), Ed from AZ



wrote:
This is posted to both the Word and Excel VBA NG because so far that's
the only way I can figure out to do this. *But wait - I throw in FSO
to make it even messier! *Using Word and Excel 2007.


We have a proprietary program that saves its working files as XML.
When a user needs to add a list of data, he's presented with either a
grid format that's a pain to work with, or a single-form-per-item page
that's easier to use but wearying after 35+ items. *What I want to do
is get the part of the XML to be revised into an Excel worksheet so
the user can work easier. *Because this program is proprietary, the
schema is not available to me, and since I'm not familiar with XML
except as text separated by tags, I wouldn't know how to use it.


So my bright idea so far is to open the file for reading in FSO, then
paste that into a new Word doc. *This gives me a text file with XML
tags, but without Word's interpretation of the XML. *Direct import
into Excel causes Excel to interpret the tags and buries the
information I want to get. *Parsing the text in Word, I can set ranges
using the different tags and work my way down to the info. *In my
current plan, I'm thinking of driving this from Excel.


The data is set up like so:
<MAIN_CATEGORY
<DATA_PAGE num="0001"
-- page info --
<GRID_LIST
<LINE num="0001"
-- line info --
<LINE num="0002"
-- line info --
etc.


The data I want is between the <LINE tags.


I could get all the data I need by looping through and reading each
data point from <LINE to </LINE into an array, then writing that
back into Excel. *Or I could simply capture each <GRID_LIST into a
range, save it out as a separate temp XML file, then import that into
Excel.


Is this really the best way to do this?? *Or is there a much easier
way? *Any recommendations?
Ed- Hide quoted text -


- Show quoted text -




  #6   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
external usenet poster
 
Posts: 120
Default Trying to get just part of an XML file?

I just realized one of the major reasons I included Word in this mix.
I am pulling the data into an Excel sheet so I can add to it. I then
need to write the modified data back into the XML file.

Not being familiar with XML or XPath, my thought train was:
-- Read XML file via FSO and paste into Word
-- Find the major groups and sub-groups and set bookmarks
-- Pull the data into Excel by parsing the tags and text
-- Add new data and modify existing data as required
-- Create an array of each sub-group and group, wrapping the data in
the proper tags
-- Replace the contents of the associated bookmark range with the new
data
-- Write the whole kaboodle back into the XML file

So the hiccup in my thinking was how to get the data from Excel back
into the XML file. All I could think of was using the bookmarks and
ranges in Word.

Is there a better way?
Ed


On Jan 28, 12:26*pm, Chip Pearson wrote:
You can read the XML file directly using the MSXML library. *In VBA,
go to the Tools menu, choose References, and scroll down to "Microsoft
XML, v6.0" and check that item. *The code uses some relatively simple
XPath to select the nodes, so I assume that you (1) already know
XPath, or (2) can figure it out, or (3) can ask for assistance with
XPath.

Without know the actual layout of your XML, I just guessed and made up
the following XML file:

<?xml version="1.0" encoding="UTF-8"?
<MAIN_CATEGORY
* * * * <DATA_PAGE num="001"
* * * * * * * * <GRID_LIST
* * * * * * * * * * * * <LINE num="001"
* * * * * * * * * * * * * * * * <nameChip Pearson</name
* * * * * * * * * * * * * * * * <cityKansas City</city
* * * * * * * * * * * * </LINE
* * * * * * * * * * * * <LINE num="002"
* * * * * * * * * * * * * * * * <nameJen Filson</name
* * * * * * * * * * * * * * * * <cityMission Hills</city
* * * * * * * * * * * * </LINE
* * * * * * * * </GRID_LIST
* * * * </DATA_PAGE
* * * * <DATA_PAGE num="002"
* * * * * * * * <GRID_LIST
* * * * * * * * * * * * <LINE num="003"
* * * * * * * * * * * * * * * * <nameJanice Downey</name
* * * * * * * * * * * * * * * * <cityWestwood Hills</city
* * * * * * * * * * * * </LINE
* * * * * * * * * * * * <LINE num="04"
* * * * * * * * * * * * * * * * <nameMike Stitt</name
* * * * * * * * * * * * * * * * <cityCharlotte</city
* * * * * * * * * * * * </LINE
* * * * * * * * </GRID_LIST
* * * * </DATA_PAGE
</MAIN_CATEGORY

Then, you can use code like

Sub AAA()
' Required reference:
' * Name: MSXML2
' * Description: Microsoft XML, v6.0
' * Typical location: C:\Windows\System32\msxml6.dll
' * GUID: {F5078F18-C551-11D3-89B9-0000F81FE221}
' * Major: 6 * *Minor: 0

Dim DOM As MSXML2.DOMDocument60
Dim FName As Variant
Dim DataPageList As MSXML2.IXMLDOMNodeList
Dim DataPage As MSXML2.IXMLDOMNodeList
Dim GridList As MSXML2.IXMLDOMNodeList
Dim Grid As MSXML2.IXMLDOMNodeList
Dim LinesList As MSXML2.IXMLDOMNodeList
Dim LineX As MSXML2.IXMLDOMNode
Dim Arr() As String
Dim N As Long
Dim M As Long
Dim P As Long
Dim K As Long
Dim R As Range
Dim J As Long

Set R = Range("C3") '<<<<< OUTPUT ON WORKSHEET
FName = Application.GetOpenFilename("XML Files (*.xml),*.xml")
If FName = False Then
* * Exit Sub
End If
Set DOM = New MSXML2.DOMDocument60
DOM.Load CStr(FName)
DOM.setProperty "SelectionLanguage", "XPath"

' get all data pages
Set DataPageList = DOM.SelectNodes("/MAIN_CATEGORY/DATA_PAGE")
For N = 0 To DataPageList.Length - 1
* * Set GridList = DataPageList.Item(N).SelectNodes("./GRID_LIST")
* * For M = 0 To GridList.Length - 1
* * * * Set LinesList = GridList.Item(M).SelectNodes("./LINE")
* * * * ReDim Arr(0 To LinesList.Length - 1)
* * * * K = 0
* * * * For P = 0 To LinesList.Length - 1
* * * * * * Set LineX = LinesList.Item(P)
* * * * * * Arr(K) = LineX.ChildNodes(0).Text & vbCrLf & _
* * * * * * * * LineX.ChildNodes(1).Text
* * * * * * K = K + 1
* * * * Next P
* * * * '<< write to worksheet
* * * * For J = 0 To UBound(Arr)
* * * * * * R.Offset(0, J).Value = Arr(J)
* * * * Next J
* * * * Set R = R.Offset(1, 0)
* * Next M
Next N
End Sub

This lists the contents between the <LINE and </LINE tags out to the
worksheet. *You'll have to modfy the code a bit to conform with your
actual data structure, but the code will be very muich like that shown
above.

Cordially,
Chip Pearson
Microsoft MVP 1998 - 2010
Pearson Software Consulting, LLCwww.cpearson.com
[email on web site]

On Mon, 25 Jan 2010 14:21:20 -0800 (PST), Ed from AZ



wrote:
This is posted to both the Word and Excel VBA NG because so far that's
the only way I can figure out to do this. *But wait - I throw in FSO
to make it even messier! *Using Word and Excel 2007.


We have a proprietary program that saves its working files as XML.
When a user needs to add a list of data, he's presented with either a
grid format that's a pain to work with, or a single-form-per-item page
that's easier to use but wearying after 35+ items. *What I want to do
is get the part of the XML to be revised into an Excel worksheet so
the user can work easier. *Because this program is proprietary, the
schema is not available to me, and since I'm not familiar with XML
except as text separated by tags, I wouldn't know how to use it.


So my bright idea so far is to open the file for reading in FSO, then
paste that into a new Word doc. *This gives me a text file with XML
tags, but without Word's interpretation of the XML. *Direct import
into Excel causes Excel to interpret the tags and buries the
information I want to get. *Parsing the text in Word, I can set ranges
using the different tags and work my way down to the info. *In my
current plan, I'm thinking of driving this from Excel.


The data is set up like so:
<MAIN_CATEGORY
<DATA_PAGE num="0001"
-- page info --
<GRID_LIST
<LINE num="0001"
-- line info --
<LINE num="0002"
-- line info --
etc.


The data I want is between the <LINE tags.


I could get all the data I need by looping through and reading each
data point from <LINE to </LINE into an array, then writing that
back into Excel. *Or I could simply capture each <GRID_LIST into a
range, save it out as a separate temp XML file, then import that into
Excel.


Is this really the best way to do this?? *Or is there a much easier
way? *Any recommendations?
Ed-

  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default Trying to get just part of an XML file?


You can do a saveas in excel using XML format. You may loose some of
the non row/column data from the original xml file. A XML file "MAY"
contains other properties beside the Row/column data that you see in
excel. It depends on the datta contained in the original XML file.


--
joel
------------------------------------------------------------------------
joel's Profile: 229
View this thread: http://www.thecodecage.com/forumz/sh...d.php?t=173301

Microsoft Office Help

  #8   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.general
external usenet poster
 
Posts: 120
Default Trying to get just part of an XML file?

Well, now I'm thinking I can do it like this:
-- Begin reading the XML node by node, saving everything into a string
until it hits the node that begins the data I want to pull into Excel.
-- Continue reading, pulling into Excel, until it hits the node at the
end of the data.
-- Finish reading the file, saving everything into a second string.

When I finish in Excel, I can wrap all that in the proper tags, then
assemble strStart & strData & strEnd and write that back into the XML
file.

Leaves Word out of the loop (although I may leave it in for debugging
so I write out what I capture - it's probably too much for the
Debug.Print window!) and makes things a lot smoother.

If it starts smoking, I'll yell for help!
Ed


On Feb 3, 7:10*am, Ed from AZ wrote:
I just realized one of the major reasons I included Word in this mix.
I am pulling the data into an Excel sheet so I can add to it. *I then
need to write the modified data back into the XML file.

Not being familiar with XML or XPath, my thought train was:
-- Read XML file via FSO and paste into Word
-- Find the major groups and sub-groups and set bookmarks
-- Pull the data into Excel by parsing the tags and text
-- Add new data and modify existing data as required
-- Create an array of each sub-group and group, wrapping the data in
the proper tags
-- Replace the contents of the associated bookmark range with the new
data
-- Write the whole kaboodle back into the XML file

So the hiccup in my thinking was how to get the data from Excel back
into the XML file. *All I could think of was using the bookmarks and
ranges in Word.

Is there a better way?
Ed

On Jan 28, 12:26*pm, Chip Pearson wrote:



You can read the XML file directly using the MSXML library. *In VBA,
go to the Tools menu, choose References, and scroll down to "Microsoft
XML, v6.0" and check that item. *The code uses some relatively simple
XPath to select the nodes, so I assume that you (1) already know
XPath, or (2) can figure it out, or (3) can ask for assistance with
XPath.


Without know the actual layout of your XML, I just guessed and made up
the following XML file:


<?xml version="1.0" encoding="UTF-8"?
<MAIN_CATEGORY
* * * * <DATA_PAGE num="001"
* * * * * * * * <GRID_LIST
* * * * * * * * * * * * <LINE num="001"
* * * * * * * * * * * * * * * * <nameChip Pearson</name
* * * * * * * * * * * * * * * * <cityKansas City</city
* * * * * * * * * * * * </LINE
* * * * * * * * * * * * <LINE num="002"
* * * * * * * * * * * * * * * * <nameJen Filson</name
* * * * * * * * * * * * * * * * <cityMission Hills</city
* * * * * * * * * * * * </LINE
* * * * * * * * </GRID_LIST
* * * * </DATA_PAGE
* * * * <DATA_PAGE num="002"
* * * * * * * * <GRID_LIST
* * * * * * * * * * * * <LINE num="003"
* * * * * * * * * * * * * * * * <nameJanice Downey</name
* * * * * * * * * * * * * * * * <cityWestwood Hills</city
* * * * * * * * * * * * </LINE
* * * * * * * * * * * * <LINE num="04"
* * * * * * * * * * * * * * * * <nameMike Stitt</name
* * * * * * * * * * * * * * * * <cityCharlotte</city
* * * * * * * * * * * * </LINE
* * * * * * * * </GRID_LIST
* * * * </DATA_PAGE
</MAIN_CATEGORY


Then, you can use code like


Sub AAA()
' Required reference:
' * Name: MSXML2
' * Description: Microsoft XML, v6.0
' * Typical location: C:\Windows\System32\msxml6.dll
' * GUID: {F5078F18-C551-11D3-89B9-0000F81FE221}
' * Major: 6 * *Minor: 0


Dim DOM As MSXML2.DOMDocument60
Dim FName As Variant
Dim DataPageList As MSXML2.IXMLDOMNodeList
Dim DataPage As MSXML2.IXMLDOMNodeList
Dim GridList As MSXML2.IXMLDOMNodeList
Dim Grid As MSXML2.IXMLDOMNodeList
Dim LinesList As MSXML2.IXMLDOMNodeList
Dim LineX As MSXML2.IXMLDOMNode
Dim Arr() As String
Dim N As Long
Dim M As Long
Dim P As Long
Dim K As Long
Dim R As Range
Dim J As Long


Set R = Range("C3") '<<<<< OUTPUT ON WORKSHEET
FName = Application.GetOpenFilename("XML Files (*.xml),*.xml")
If FName = False Then
* * Exit Sub
End If
Set DOM = New MSXML2.DOMDocument60
DOM.Load CStr(FName)
DOM.setProperty "SelectionLanguage", "XPath"


' get all data pages
Set DataPageList = DOM.SelectNodes("/MAIN_CATEGORY/DATA_PAGE")
For N = 0 To DataPageList.Length - 1
* * Set GridList = DataPageList.Item(N).SelectNodes("./GRID_LIST")
* * For M = 0 To GridList.Length - 1
* * * * Set LinesList = GridList.Item(M).SelectNodes("./LINE")
* * * * ReDim Arr(0 To LinesList.Length - 1)
* * * * K = 0
* * * * For P = 0 To LinesList.Length - 1
* * * * * * Set LineX = LinesList.Item(P)
* * * * * * Arr(K) = LineX.ChildNodes(0).Text & vbCrLf & _
* * * * * * * * LineX.ChildNodes(1).Text
* * * * * * K = K + 1
* * * * Next P
* * * * '<< write to worksheet
* * * * For J = 0 To UBound(Arr)
* * * * * * R.Offset(0, J).Value = Arr(J)
* * * * Next J
* * * * Set R = R.Offset(1, 0)
* * Next M
Next N
End Sub


This lists the contents between the <LINE and </LINE tags out to the
worksheet. *You'll have to modfy the code a bit to conform with your
actual data structure, but the code will be very muich like that shown
above.


Cordially,
Chip Pearson
Microsoft MVP 1998 - 2010
Pearson Software Consulting, LLCwww.cpearson.com
[email on web site]


On Mon, 25 Jan 2010 14:21:20 -0800 (PST), Ed from AZ


wrote:
This is posted to both the Word and Excel VBA NG because so far that's
the only way I can figure out to do this. *But wait - I throw in FSO
to make it even messier! *Using Word and Excel 2007.


We have a proprietary program that saves its working files as XML.
When a user needs to add a list of data, he's presented with either a
grid format that's a pain to work with, or a single-form-per-item page
that's easier to use but wearying after 35+ items. *What I want to do
is get the part of the XML to be revised into an Excel worksheet so
the user can work easier. *Because this program is proprietary, the
schema is not available to me, and since I'm not familiar with XML
except as text separated by tags, I wouldn't know how to use it.


So my bright idea so far is to open the file for reading in FSO, then
paste that into a new Word doc. *This gives me a text file with XML
tags, but without Word's interpretation of the XML. *Direct import
into Excel causes Excel to interpret the tags and buries the
information I want to get. *Parsing the text in Word, I can set ranges
using the different tags and work my way down to the info. *In my
current plan, I'm thinking of driving this from Excel.


The data is set up like so:
<MAIN_CATEGORY
<DATA_PAGE num="0001"
-- page info --
<GRID_LIST
<LINE num="0001"
-- line info --
<LINE num="0002"
-- line info --
etc.


The data I want is between the <LINE tags.


I could get all the data I need by looping through and reading each
data point from <LINE to </LINE into an array, then writing that
back into Excel. *Or I could simply capture each <GRID_LIST into a
range, save it out as a separate temp XML file, then import that into
Excel.


Is this really the best way to do this?? *Or is there a much easier
way? *Any recommendations?
Ed--

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
How can I restore part of a file? poohabear Charts and Charting in Excel 1 April 2nd 07 03:45 PM
get part of a file name from a cell value for use in a macro Amethyst Excel Programming 5 February 10th 06 06:06 PM
Reference a cell as part of a file name yobrokerboy Excel Programming 13 December 26th 05 09:45 PM
import part of a text file L Mehl Excel Programming 11 March 3rd 04 06:19 PM


All times are GMT +1. The time now is 02:00 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"