Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
use vba to open pdf then save as xml
I can open manually my pdf file then choose: File- Save As- Tables in
Excel Spreadsheet (*xml) from the drop-down list. After this I can open this file in Excel and use some VBA to extract what I need. The above works manually and as I now receive a bunch of pdf's every couple of days it would be nice to automate in VBA the creation of the xml files from the pdf's if possible So far I have the following code that runs in VBA and opens the pdf file but I am unable to figure out the next step of how I can now save it as the xml file. Any help appreciated! Cheers, Peter Option Explicit Declare Function ShellExecute Lib "shell32dll" Alias "ShellExecuteA" ( _ ByVal hwnd As Long, _ ByVal lpOperation as String, _ ByVal lpFile as String, _ ByVal lpParameters as String, _ ByVal lpDirectory as String, _ ByVal nShowCmd as Long) As Long Const SW_HIDE As Long = 0 Const SW_SHOWNORMAL As Long = 1 Const SW_SHOWMAXIMIZED As Long = 3 Const SW_SHOWMINIMIZED As Long = 2 sub pdf_to_xml() ShellExecute Application.hwnd "open", "C:\temp\Test.pdf", vbNullString, "C:\", SW_SHOWNORMAL End Sub |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
use vba to open pdf then save as xml
Peter Jamieson wrote:
I can open manually my pdf file then choose: File- Save As- Tables in Excel Spreadsheet (*xml) from the drop-down list. After this I can open this file in Excel and use some VBA to extract what I need. The above works manually and as I now receive a bunch of pdf's every couple of days it would be nice to automate in VBA the creation of the xml files from the pdf's if possible So far I have the following code that runs in VBA and opens the pdf file but I am unable to figure out the next step of how I can now save it as the xml file. Any help appreciated! Jesus. You need to take Outlook Express out back and shoot it. Option Explicit Declare Function ShellExecute Lib "shell32dll" Alias "ShellExecuteA" ( _ ByVal hwnd As Long, _ ByVal lpOperation as String, _ ByVal lpFile as String, _ ByVal lpParameters as String, _ ByVal lpDirectory as String, _ ByVal nShowCmd as Long) As Long Const SW_HIDE As Long = 0 Const SW_SHOWNORMAL As Long = 1 Const SW_SHOWMAXIMIZED As Long = 3 Const SW_SHOWMINIMIZED As Long = 2 sub pdf_to_xml() ShellExecute Application.hwnd "open", "C:\temp\Test.pdf", vbNullString, "C:\", SW_SHOWNORMAL End Sub Do you know anything about programming? The above looks like it was copy- pasted from elsewhere with little understanding of its purpose. Still... (Caveat: I'm using Excel 2000, which doesn't support PDF or XML. Some of my info I had to get off MSDN.) The easiest way to see how to automate this would be to find the "record macro" menu item (in 2000, it's Tools - Macro - Record New Macro...; no idea about modern versions but the web says "Developer tab, in the Code group, click Macros") and then run through the process once. Stop recording after you've saved the XML file, then open the VBA editor and look through your recorded code. Doing that for saving a normal workbook as text gives me this (after I removed unnecessary comments, named arguments, and unneeded args): Sub Macro1() Workbooks.Open "E:\MS download links.xls" ActiveWorkbook.SaveAs "E:\MS download links.txt", xlText End Sub You should be able to do something similar for PDF-XML. To do all PDFs in a given directory, you could do something like this: Sub convertPdfToXml() cd "C:\path\to\the\PDF files" fPdf = Dir("*.pdf") While Len(fPdf) Workbooks.Open fPdf fXml = Replace(fPdf, ".pdf", ".xlsx", Compa=vbTextCompare) ActiveWorkbook.SaveAs fXml, FileFormat:=xlOpenXMLWorkbook ActiveWorkbook.Close fPdf = Dir Wend End Sub This *might* work for you IF you do the following: - fix the path containing the PDF files ("C:\path\to\the\PDF files") - confirm the "Workbooks.Open" line works as-is - confirm the extension of the XML files (I used ".xlsx"; change as needed) - check the "ActiveWorkbook.SaveAs" FileFormat (maybe xlXMLSpreadsheet?) Since I can't test this, it's up to you... but this is at least a start in the right direction. -- Things don't get better as they get older. Look at your truck. Look at your roof. Look in the MIRROR! |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
use vba to open pdf then save as xml
Hi Auric__,
Thx for your suggestions. Unfortunately the: Workbooks.Open fPdf only gets what looks like some formatting syntax from the pdf file, none of the text. I tried this suggestion of yours in Excel 2007 and 2003. Your other idea to record a macro did not work: the code got to the directory OK but could not open the pdf so no progress. Appreciate you help, Cheers, Peter "Auric__" wrote in message 44.100... Peter Jamieson wrote: I can open manually my pdf file then choose: File- Save As- Tables in Excel Spreadsheet (*xml) from the drop-down list. After this I can open this file in Excel and use some VBA to extract what I need. The above works manually and as I now receive a bunch of pdf's every couple of days it would be nice to automate in VBA the creation of the xml files from the pdf's if possible So far I have the following code that runs in VBA and opens the pdf file but I am unable to figure out the next step of how I can now save it as the xml file. Any help appreciated! Jesus. You need to take Outlook Express out back and shoot it. Option Explicit Declare Function ShellExecute Lib "shell32dll" Alias "ShellExecuteA" ( _ ByVal hwnd As Long, _ ByVal lpOperation as String, _ ByVal lpFile as String, _ ByVal lpParameters as String, _ ByVal lpDirectory as String, _ ByVal nShowCmd as Long) As Long Const SW_HIDE As Long = 0 Const SW_SHOWNORMAL As Long = 1 Const SW_SHOWMAXIMIZED As Long = 3 Const SW_SHOWMINIMIZED As Long = 2 sub pdf_to_xml() ShellExecute Application.hwnd "open", "C:\temp\Test.pdf", vbNullString, "C:\", SW_SHOWNORMAL End Sub Do you know anything about programming? The above looks like it was copy- pasted from elsewhere with little understanding of its purpose. Still... (Caveat: I'm using Excel 2000, which doesn't support PDF or XML. Some of my info I had to get off MSDN.) The easiest way to see how to automate this would be to find the "record macro" menu item (in 2000, it's Tools - Macro - Record New Macro...; no idea about modern versions but the web says "Developer tab, in the Code group, click Macros") and then run through the process once. Stop recording after you've saved the XML file, then open the VBA editor and look through your recorded code. Doing that for saving a normal workbook as text gives me this (after I removed unnecessary comments, named arguments, and unneeded args): Sub Macro1() Workbooks.Open "E:\MS download links.xls" ActiveWorkbook.SaveAs "E:\MS download links.txt", xlText End Sub You should be able to do something similar for PDF-XML. To do all PDFs in a given directory, you could do something like this: Sub convertPdfToXml() cd "C:\path\to\the\PDF files" fPdf = Dir("*.pdf") While Len(fPdf) Workbooks.Open fPdf fXml = Replace(fPdf, ".pdf", ".xlsx", Compa=vbTextCompare) ActiveWorkbook.SaveAs fXml, FileFormat:=xlOpenXMLWorkbook ActiveWorkbook.Close fPdf = Dir Wend End Sub This *might* work for you IF you do the following: - fix the path containing the PDF files ("C:\path\to\the\PDF files") - confirm the "Workbooks.Open" line works as-is - confirm the extension of the XML files (I used ".xlsx"; change as needed) - check the "ActiveWorkbook.SaveAs" FileFormat (maybe xlXMLSpreadsheet?) Since I can't test this, it's up to you... but this is at least a start in the right direction. -- Things don't get better as they get older. Look at your truck. Look at your roof. Look in the MIRROR! |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
use vba to open pdf then save as xml
Peter Jamieson presented the following explanation :
I can open manually my pdf file then choose: File- Save As- Tables in Excel Spreadsheet (*xml) from the drop-down list. After this I can open this file in Excel and use some VBA to extract what I need. The above works manually and as I now receive a bunch of pdf's every couple of days it would be nice to automate in VBA the creation of the xml files from the pdf's if possible So far I have the following code that runs in VBA and opens the pdf file but I am unable to figure out the next step of how I can now save it as the xml file. Any help appreciated! Cheers, Peter Option Explicit Declare Function ShellExecute Lib "shell32dll" Alias "ShellExecuteA" ( _ ByVal hwnd As Long, _ ByVal lpOperation as String, _ ByVal lpFile as String, _ ByVal lpParameters as String, _ ByVal lpDirectory as String, _ ByVal nShowCmd as Long) As Long Const SW_HIDE As Long = 0 Const SW_SHOWNORMAL As Long = 1 Const SW_SHOWMAXIMIZED As Long = 3 Const SW_SHOWMINIMIZED As Long = 2 sub pdf_to_xml() ShellExecute Application.hwnd "open", "C:\temp\Test.pdf", vbNullString, "C:\", SW_SHOWNORMAL End Sub Why do you think Excel can open a PDF file? I'd think you need a software that is able to convert a PDF to your chosen format if capable of doing so. The structure of a PDF isn't something Excel or VBA can handle same as it would a text file! -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
use vba to open pdf then save as xml
Thx Garry,
Yes I am aware of that and have used various pdf to xl converters but none of them were able to perform well enough for what I need. I am not trying to open the pdf in Excel directly, I am trying to use VBA to manipulate the existing manual conversion process. When I manually convert the pdf with "Save As Tables in Excel Spreadsheet" (within the adobe exec) the resultant .xml files are then easily parsed by my VBA code, far more accurately than any of the bought or free pdf to excel converters I have tried. An alternative is maybe Perl or AutoIt but I was hoping to keep everything within the Excel/VBA project. Anyway, thanks for your interest, cheers, Peter "GS" wrote in message ... Peter Jamieson presented the following explanation : I can open manually my pdf file then choose: File- Save As- Tables in Excel Spreadsheet (*xml) from the drop-down list. After this I can open this file in Excel and use some VBA to extract what I need. The above works manually and as I now receive a bunch of pdf's every couple of days it would be nice to automate in VBA the creation of the xml files from the pdf's if possible So far I have the following code that runs in VBA and opens the pdf file but I am unable to figure out the next step of how I can now save it as the xml file. Any help appreciated! Cheers, Peter Option Explicit Declare Function ShellExecute Lib "shell32dll" Alias "ShellExecuteA" ( _ ByVal hwnd As Long, _ ByVal lpOperation as String, _ ByVal lpFile as String, _ ByVal lpParameters as String, _ ByVal lpDirectory as String, _ ByVal nShowCmd as Long) As Long Const SW_HIDE As Long = 0 Const SW_SHOWNORMAL As Long = 1 Const SW_SHOWMAXIMIZED As Long = 3 Const SW_SHOWMINIMIZED As Long = 2 sub pdf_to_xml() ShellExecute Application.hwnd "open", "C:\temp\Test.pdf", vbNullString, "C:\", SW_SHOWNORMAL End Sub Why do you think Excel can open a PDF file? I'd think you need a software that is able to convert a PDF to your chosen format if capable of doing so. The structure of a PDF isn't something Excel or VBA can handle same as it would a text file! -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
use vba to open pdf then save as xml
Peter,
Have you had any luck with this? I have a hobby of collecting acronyms for a new job I have taken have created an excel spreadsheet that will alphabetize and show the definition of an acronym (if I already have it) or I can post a new acronym or jump to any particular spot in my 'database'. But I am looking to make a leap to being able to open acronym containing files (pdfs, csvs, the like), extract all the new info into my database. Sincerely, Dave On Friday, December 02, 2011 1:34 AM Peter Jamieson wrote: I can open manually my pdf file then choose: File- Save As- Tables in Excel Spreadsheet (*xml) from the drop-down list. After this I can open this file in Excel and use some VBA to extract what I need. The above works manually and as I now receive a bunch of pdf's every couple of days it would be nice to automate in VBA the creation of the xml files from the pdf's if possible So far I have the following code that runs in VBA and opens the pdf file but I am unable to figure out the next step of how I can now save it as the xml file. Any help appreciated! Cheers, Peter Option Explicit Declare Function ShellExecute Lib "shell32dll" Alias "ShellExecuteA" ( _ ByVal hwnd As Long, _ ByVal lpOperation as String, _ ByVal lpFile as String, _ ByVal lpParameters as String, _ ByVal lpDirectory as String, _ ByVal nShowCmd as Long) As Long Const SW_HIDE As Long = 0 Const SW_SHOWNORMAL As Long = 1 Const SW_SHOWMAXIMIZED As Long = 3 Const SW_SHOWMINIMIZED As Long = 2 sub pdf_to_xml() ShellExecute Application.hwnd "open", "C:\temp\Test.pdf", vbNullString, "C:\", SW_SHOWNORMAL End Sub On Friday, December 02, 2011 8:22 AM Auric__ wrote: Peter Jamieson wrote: Jesus. You need to take Outlook Express out back and shoot it. Do you know anything about programming? The above looks like it was copy- pasted from elsewhere with little understanding of its purpose. Still... (Caveat: I am using Excel 2000, which does not support PDF or XML. Some of my info I had to get off MSDN.) The easiest way to see how to automate this would be to find the "record macro" menu item (in 2000, it is Tools - Macro - Record New Macro...; no idea about modern versions but the web says "Developer tab, in the Code group, click Macros") and then run through the process once. Stop recording after you have saved the XML file, then open the VBA editor and look through your recorded code. Doing that for saving a normal workbook as text gives me this (after I removed unnecessary comments, named arguments, and unneeded args): Sub Macro1() Workbooks.Open "E:\MS download links.xls" ActiveWorkbook.SaveAs "E:\MS download links.txt", xlText End Sub You should be able to do something similar for PDF-XML. To do all PDFs in a given directory, you could do something like this: Sub convertPdfToXml() cd "C:\path\to\the\PDF files" fPdf = Dir("*.pdf") While Len(fPdf) Workbooks.Open fPdf fXml = Replace(fPdf, ".pdf", ".xlsx", Compa=vbTextCompare) ActiveWorkbook.SaveAs fXml, FileFormat:=xlOpenXMLWorkbook ActiveWorkbook.Close fPdf = Dir Wend End Sub This *might* work for you IF you do the following: - fix the path containing the PDF files ("C:\path\to\the\PDF files") - confirm the "Workbooks.Open" line works as-is - confirm the extension of the XML files (I used ".xlsx"; change as needed) - check the "ActiveWorkbook.SaveAs" FileFormat (maybe xlXMLSpreadsheet?) Since I cannot test this, it is up to you... but this is at least a start in the right direction. -- Things do not get better as they get older. Look at your truck. Look at your roof. Look in the MIRROR! On Friday, December 02, 2011 12:00 PM Peter Jamieson wrote: Hi Auric__, Thx for your suggestions. Unfortunately the: Workbooks.Open fPdf only gets what looks like some formatting syntax from the pdf file, none of the text. I tried this suggestion of yours in Excel 2007 and 2003. Your other idea to record a macro did not work: the code got to the directory OK but could not open the pdf so no progress. Appreciate you help, Cheers, Peter On Friday, December 02, 2011 4:06 PM GS wrote: Peter Jamieson presented the following explanation : Why do you think Excel can open a PDF file? I'd think you need a software that is able to convert a PDF to your chosen format if capable of doing so. The structure of a PDF is not something Excel or VBA can handle same as it would a text file! -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc On Friday, December 02, 2011 7:45 PM Peter Jamieson wrote: Thx Garry, Yes I am aware of that and have used various pdf to xl converters but none of them were able to perform well enough for what I need. I am not trying to open the pdf in Excel directly, I am trying to use VBA to manipulate the existing manual conversion process. When I manually convert the pdf with "Save As Tables in Excel Spreadsheet" (within the adobe exec) the resultant .xml files are then easily parsed by my VBA code, far more accurately than any of the bought or free pdf to excel converters I have tried. An alternative is maybe Perl or AutoIt but I was hoping to keep everything within the Excel/VBA project. Anyway, thanks for your interest, cheers, Peter |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
use vba to open pdf then save as xml
Peter Jamieson wrote on 12/02/2011 01:34 ET :
I can open manually my pdf file then choose: File- Save As- Tables in Excel Spreadsheet (*xml) from the drop-down list. After this I can open this file in Excel and use some VBA to extract what I need. The above works manually and as I now receive a bunch of pdf's every couple of days it would be nice to automate in VBA the creation of the xml files from the pdf's if possible So far I have the following code that runs in VBA and opens the pdf file but I am unable to figure out the next step of how I can now save it as the xml file. Any help appreciated! Cheers, Peter Option Explicit Declare Function ShellExecute Lib "shell32dll" Alias "ShellExecuteA" ( _ ByVal hwnd As Long, _ ByVal lpOperation as String, _ ByVal lpFile as String, _ ByVal lpParameters as String, _ ByVal lpDirectory as String, _ ByVal nShowCmd as Long) As Long Const SW_HIDE As Long = 0 Const SW_SHOWNORMAL As Long = 1 Const SW_SHOWMAXIMIZED As Long = 3 Const SW_SHOWMINIMIZED As Long = 2 sub pdf_to_xml() ShellExecute Application.hwnd "open", "C:tempTest.pdf", vbNullString, "C:", SW_SHOWNORMAL End Sub did you find the solution, I am too trying to do the same.....inserting a pdf than I need the output as a single xml file. Regards.. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Looping Open and Save as CSV | Excel Programming | |||
whenever open or save it is very slow to open | Excel Discussion (Misc queries) | |||
Open-Save Strategy | Excel Programming | |||
Open and Save XML | Excel Discussion (Misc queries) | |||
open & save old xls file | Excel Programming |