Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
Dear all:
I am currently working on a rather large document that I need to format a certain specific way. The document includes nearly 57,000 lines of plain text that need to be indented according to a simple rule so that it can then be imported into an application as a tab- delimited file. My system is a Mac G5 running the latest version of Leopard (not Snow Leopard) which is 10.5.8 and I have Microsoft Office 2004 installed (haven't upgraded to 2008 and will probably wait until next version to upgrade). Below is an example of the type of file I now have: Body Regions;A01 Abdomen;A01.047 Abdominal Cavity;A01.047.025 Peritoneum;A01.047.025.600 Douglas' Pouch;A01.047.025.600.225 Mesentery;A01.047.025.600.451 Mesocolon;A01.047.025.600.451.535 Omentum;A01.047.025.600.573 Peritoneal Cavity;A01.047.025.600.678 Peritoneal Stomata;A01.047.025.600.700 Retroperitoneal Space;A01.047.025.750 Abdominal Wall;A01.047.050 Groin;A01.047.365 Inguinal Canal;A01.047.412 Umbilicus;A01.047.849 Back;A01.176 Lumbosacral Region;A01.176.519 Sacrococcygeal Region;A01.176.780 Breast;A01.236 Mammary Glands, Human;A01.236.249 This is just a small portion of the file I am working with. This is a "flat" text file but what I need is a "hierarchical" or "tree structure" file where each of these lines is indented with a number of "tab keystrokes" corresponding to its level as depicted by the alphanumeric code following the term. Example: Body Regions;A01 Abdomen;A01.047 Abdominal Cavity;A01.047.025 Peritoneum;A01.047.025.600 Douglas' Pouch;A01.047.025.600.225 Mesentery;A01.047.025.600.451 Mesocolon;A01.047.025.600.451.535 Omentum;A01.047.025.600.573 Peritoneal Cavity;A01.047.025.600.678 Peritoneal Stomata;A01.047.025.600.700 Retroperitoneal Space;A01.047.025.750 Abdominal Wall;A01.047.050 Groin;A01.047.365 Inguinal Canal;A01.047.412 Umbilicus;A01.047.849 Back;A01.176 Lumbosacral Region;A01.176.519 Sacrococcygeal Region;A01.176.780 Breast;A01.236 Mammary Glands, Human;A01.236.249 As you will notice each line has been indented using the tab key a number of times corresponding to the size of the alphanumeric code. Lines with an alphanumeric code containing only 3 characters stay on the left most position (no tabs). Example: Body Regions;A01 Lines with an alphanumeric code containing 7 characters (example: A01.047) are indented with one single tab keystroke. Example: Abdomen;A01.047 Lines with an alphanumeric code containing 11 characters (example: A01.047.025) are indented with two tab keystrokes. Example: Abdominal Cavity;A01.047.025 Lines with an alphanumeric code containing 15 characters (example: A01.047.025.600) are indented with three tab keystrokes. Example: Peritoneum;A01.047.025.600 And so on, and so forth. Once this step is finished and the file has been edited from a flat text to one with a hierarchical or tree structure using tab entries (to derive a tab-delimited file) the second part of the project involves the removal of certain characters and replacement with others. Basically I need the format to go from the existing: Abdominal Cavity;A01.047.025 Where the term is followed by a semicolon and alphanumeric code. To the following: Abdominal Cavity [A01.047.025] Where a space is added after the term, the semicolon is removed and brackets are placed at the beginning and end of the alphanumeric code. Considering the size of my document as I stated (little less than 57,000 entries) I would like to find a way to automate the process. How can these two processes be accomplished with the use of a macro ? I have never worked with macros before and any help would be extremely valuable and appreciated. I have both Excel 04 and Word 04 as part of the Office 2004 package in case the information is relevant. I am not sure which of the applications would be better suited for this task but am open to your suggestions. Also I would appreciate suggestions on textbooks that would cover VBA for Excel and Word. I would like to find books that would cover everything from A to Z but with an approach that makes it easy for a beginner to understand and work with VBA on Microsoft Office for the Mac. Any recommendations will be appreciated. Thank you in advance for your help. JRC |
#2
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
JRC,
I have assumed that your document is a text file. (If it isn't the code can be corrected to account for that.) The macro below will read in a file and add the tabs and add the spaces and brackets as you described. Run the macro, select the file with the data, and the macro will put the new file named "Output.txt" into the same folder. HTH, Bernie MS Excel MVP Sub AddTabsToFile() Dim FileName As String Dim FileNumIn As Integer Dim FileNumOut As Integer Dim ResultStr As String Dim myStr As String Dim myS As Variant Dim i As Integer FileName = Application.GetOpenFilename If FileName = "" Then End FileNumIn = FreeFile() Open FileName For Input As FileNumIn FileNumOut = FreeFile() Open "Output.txt" For Output Access Write As FileNumOut Do While Seek(FileNumIn) <= LOF(FileNumIn) 'Store One Line Of Text From File To Variable Line Input #FileNumIn, ResultStr myStr = "" myS = Split(ResultStr, ";") For i = 1 To (Len(myS(1)) - 3) / 4 myStr = myStr & vbTab Next i myStr = myStr & myS(0) & " [" & myS(1) & "]" Print #FileNumOut, myStr Loop Close FileNumIn Close FileNumOut End Sub "JRC" wrote in message ... Dear all: I am currently working on a rather large document that I need to format a certain specific way. The document includes nearly 57,000 lines of plain text that need to be indented according to a simple rule so that it can then be imported into an application as a tab- delimited file. My system is a Mac G5 running the latest version of Leopard (not Snow Leopard) which is 10.5.8 and I have Microsoft Office 2004 installed (haven't upgraded to 2008 and will probably wait until next version to upgrade). Below is an example of the type of file I now have: Body Regions;A01 Abdomen;A01.047 Abdominal Cavity;A01.047.025 Peritoneum;A01.047.025.600 Douglas' Pouch;A01.047.025.600.225 Mesentery;A01.047.025.600.451 Mesocolon;A01.047.025.600.451.535 Omentum;A01.047.025.600.573 Peritoneal Cavity;A01.047.025.600.678 Peritoneal Stomata;A01.047.025.600.700 Retroperitoneal Space;A01.047.025.750 Abdominal Wall;A01.047.050 Groin;A01.047.365 Inguinal Canal;A01.047.412 Umbilicus;A01.047.849 Back;A01.176 Lumbosacral Region;A01.176.519 Sacrococcygeal Region;A01.176.780 Breast;A01.236 Mammary Glands, Human;A01.236.249 This is just a small portion of the file I am working with. This is a "flat" text file but what I need is a "hierarchical" or "tree structure" file where each of these lines is indented with a number of "tab keystrokes" corresponding to its level as depicted by the alphanumeric code following the term. Example: Body Regions;A01 Abdomen;A01.047 Abdominal Cavity;A01.047.025 Peritoneum;A01.047.025.600 Douglas' Pouch;A01.047.025.600.225 Mesentery;A01.047.025.600.451 Mesocolon;A01.047.025.600.451.535 Omentum;A01.047.025.600.573 Peritoneal Cavity;A01.047.025.600.678 Peritoneal Stomata;A01.047.025.600.700 Retroperitoneal Space;A01.047.025.750 Abdominal Wall;A01.047.050 Groin;A01.047.365 Inguinal Canal;A01.047.412 Umbilicus;A01.047.849 Back;A01.176 Lumbosacral Region;A01.176.519 Sacrococcygeal Region;A01.176.780 Breast;A01.236 Mammary Glands, Human;A01.236.249 As you will notice each line has been indented using the tab key a number of times corresponding to the size of the alphanumeric code. Lines with an alphanumeric code containing only 3 characters stay on the left most position (no tabs). Example: Body Regions;A01 Lines with an alphanumeric code containing 7 characters (example: A01.047) are indented with one single tab keystroke. Example: Abdomen;A01.047 Lines with an alphanumeric code containing 11 characters (example: A01.047.025) are indented with two tab keystrokes. Example: Abdominal Cavity;A01.047.025 Lines with an alphanumeric code containing 15 characters (example: A01.047.025.600) are indented with three tab keystrokes. Example: Peritoneum;A01.047.025.600 And so on, and so forth. Once this step is finished and the file has been edited from a flat text to one with a hierarchical or tree structure using tab entries (to derive a tab-delimited file) the second part of the project involves the removal of certain characters and replacement with others. Basically I need the format to go from the existing: Abdominal Cavity;A01.047.025 Where the term is followed by a semicolon and alphanumeric code. To the following: Abdominal Cavity [A01.047.025] Where a space is added after the term, the semicolon is removed and brackets are placed at the beginning and end of the alphanumeric code. Considering the size of my document as I stated (little less than 57,000 entries) I would like to find a way to automate the process. How can these two processes be accomplished with the use of a macro ? I have never worked with macros before and any help would be extremely valuable and appreciated. I have both Excel 04 and Word 04 as part of the Office 2004 package in case the information is relevant. I am not sure which of the applications would be better suited for this task but am open to your suggestions. Also I would appreciate suggestions on textbooks that would cover VBA for Excel and Word. I would like to find books that would cover everything from A to Z but with an approach that makes it easy for a beginner to understand and work with VBA on Microsoft Office for the Mac. Any recommendations will be appreciated. Thank you in advance for your help. JRC |
#3
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
On Dec 4, 5:40*am, "Bernie Deitrick" <deitbe @ consumer dot org
wrote: JRC, I have assumed that your document is a text file. *(If it isn't the code can be corrected to account for that.) The macro below will read in a file and add the tabs and add the spaces and brackets as you described. Run the macro, select the file with the data, and the macro will put the new file named "Output.txt" into the same folder. HTH, Bernie MS Excel MVP Sub AddTabsToFile() Dim FileName As String Dim FileNumIn As Integer Dim FileNumOut As Integer Dim ResultStr As String Dim myStr As String Dim myS As Variant Dim i As Integer FileName = Application.GetOpenFilename If FileName = "" Then End FileNumIn = FreeFile() Open FileName For Input As FileNumIn FileNumOut = FreeFile() Open "Output.txt" For Output Access Write As FileNumOut Do While Seek(FileNumIn) <= LOF(FileNumIn) * *'Store One Line Of Text From File To Variable * *Line Input #FileNumIn, ResultStr * *myStr = "" * *myS = Split(ResultStr, ";") * *For i = 1 To (Len(myS(1)) - 3) / 4 * * * *myStr = myStr & vbTab * *Next i * *myStr = myStr & myS(0) & " [" & myS(1) & "]" * *Print #FileNumOut, myStr Loop Close FileNumIn Close FileNumOut End Sub "JRC" wrote in message ... Dear all: I am currently working on a rather large document that I need to format a certain specific way. The document includes nearly 57,000 lines of plain text that need to be indented according to a simple rule so that it can then be imported into an application as a tab- delimited file. My system is a Mac G5 running the latest version of Leopard (not Snow Leopard) which is 10.5.8 and I have Microsoft Office 2004 installed (haven't upgraded to 2008 and will probably wait until next version to upgrade). Below is an example of the type of file I now have: Body Regions;A01 Abdomen;A01.047 Abdominal Cavity;A01.047.025 Peritoneum;A01.047.025.600 Douglas' Pouch;A01.047.025.600.225 Mesentery;A01.047.025.600.451 Mesocolon;A01.047.025.600.451.535 Omentum;A01.047.025.600.573 Peritoneal Cavity;A01.047.025.600.678 Peritoneal Stomata;A01.047.025.600.700 Retroperitoneal Space;A01.047.025.750 Abdominal Wall;A01.047.050 Groin;A01.047.365 Inguinal Canal;A01.047.412 Umbilicus;A01.047.849 Back;A01.176 Lumbosacral Region;A01.176.519 Sacrococcygeal Region;A01.176.780 Breast;A01.236 Mammary Glands, Human;A01.236.249 This is just a small portion of the file I am working with. This is a "flat" text file but what I need is a "hierarchical" or "tree structure" file where each of these lines is indented with a number of "tab keystrokes" corresponding to its level as depicted by the alphanumeric code following the term. Example: Body Regions;A01 * * Abdomen;A01.047 * * * * *Abdominal Cavity;A01.047.025 * * * * * * * Peritoneum;A01.047.025.600 * * * * * * * * * *Douglas' Pouch;A01.047.025.600.225 * * * * * * * * * *Mesentery;A01.047.025.600.451 * * * * * * * * * * * * Mesocolon;A01.047.025.600.451.535 * * * * * * * * * *Omentum;A01.047.025.600.573 * * * * * * * * * * * * Peritoneal Cavity;A01.047.025.600.678 * * * * * * * * * *Peritoneal Stomata;A01.047.025.600.700 * * * * * * * Retroperitoneal Space;A01.047.025.750 * * * * *Abdominal Wall;A01.047.050 * * * * *Groin;A01.047.365 * * * * *Inguinal Canal;A01.047.412 * * * * *Umbilicus;A01.047.849 * * Back;A01.176 * * * * *Lumbosacral Region;A01.176.519 * * * * *Sacrococcygeal Region;A01.176.780 * * Breast;A01.236 * * * * *Mammary Glands, Human;A01.236.249 As you will notice each line has been indented using the tab key a number of times corresponding to the size of the alphanumeric code. Lines with an alphanumeric code containing only 3 characters stay on the left most position (no tabs). Example: Body Regions;A01 Lines with an alphanumeric code containing 7 characters (example: A01.047) are indented with one single tab keystroke. Example: * * Abdomen;A01.047 Lines with an alphanumeric code containing 11 characters (example: A01.047.025) are indented with two tab keystrokes. Example: * * * * *Abdominal Cavity;A01.047.025 Lines with an alphanumeric code containing 15 characters (example: A01.047.025.600) are indented with three tab keystrokes. Example: * * * * * * * Peritoneum;A01.047.025.600 And so on, and so forth. Once this step is finished and the file has been edited from a flat text to one with a hierarchical or tree structure using tab entries (to derive a tab-delimited file) the second part of the project involves the removal of certain characters and replacement with others. Basically I need the format to go from the existing: Abdominal Cavity;A01.047.025 Where the term is followed by a semicolon and alphanumeric code. To the following: Abdominal Cavity [A01.047.025] Where a space is added after the term, the semicolon is removed and brackets are placed at the beginning and end of the alphanumeric code. Considering the size of my document as I stated (little less than 57,000 entries) I would like to find a way to automate the process. How can these two processes be accomplished with the use of a macro ? I have never worked with macros before and any help would be extremely valuable and appreciated. I have both Excel 04 and Word 04 as part of the Office 2004 package in case the information is relevant. I am not sure which of the applications would be better suited for this task but am open to your suggestions. Also I would appreciate suggestions on textbooks that would cover VBA for Excel and Word. I would like to find books that would cover everything *from A to Z but with an approach that makes it easy for a beginner to understand and work with VBA on Microsoft Office for the Mac. Any recommendations will be appreciated. Thank you in advance for your help. JRC Hello, Bernie. Thank you very much for your help and the really detailed post. I have tried copying and pasting your code into Excel but have not been successful. As soon as I try to run it Excel gives me a error message stating there is something wrong with the syntax and highlighting the first line in yellow. I am not at all experienced with VBA so I am finding it difficult to locate the reason for the problem. Do you have any idea as to what is causing the problem ? Just in case this is helpful I tried the subroutines from John and JLatham but both caused the same or a similar type of error. Thank you again, JRC |
#4
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
I think Bernie has provided an excellent code solution for you, tighter than
the one I wrote and was going to present, and gets the job done very well. But I'll put mine up later as an alternative for 2 reasons: it shows a second solution for the same problem, and also echos the output to the text file on an Excel worksheet so you can get an idea of what was actually written (tabs show up as little squares in the worksheet). One good book for starting to learn VBA is Visual Basic Programming for the Absolute Beginner, which you can see on Amazon at http://www.amazon.com/Visual-Basic-P...761535535#noop I have also tried to put together an introductory booklet for those trying to learn VBA for Excel, and you're welcome to see if it might not add a little to other books you find on the subject: http://www.jlathamsite.com/Teach/VBA...troduction.pdf OK, the code I promised you: Sub ReadTextFile() 'reads .txt file and places it into an 'Excel worksheet and writes it to a 'tab delimited file named "tdfFile.csv" 'in the same folder with the source .txt file ' Const semicolon = ";" Const tabIndicator = "." Dim sourceFile As String ' the .txt file to be read Dim buffNum As Integer Dim rawData As String ' one line of data from the .txt file Dim splitText As Variant Dim tabCount As Integer Dim LC As Integer Dim rowCount As Long Dim outputText As String Dim delimitedFile As String Dim tdfBuffNum As Integer sourceFile = Application.GetOpenFilename If sourceFile = "False" Then Exit Sub ' user cancelled End If 'set up the output file tdfBuffNum = FreeFile() delimitedFile = "tdfFile.csv" delimitedFile = Left(sourceFile, InStrRev(sourceFile, _ Application.PathSeparator)) & delimitedFile Open delimitedFile For Output As #tdfBuffNum 'clear contents of active sheet just so we 'can show what we're doing on it ActiveSheet.Cells.ClearContents Cells(1, 1) = "Output written to: " & delimitedFile rowCount = 2 ' initialize 'start reading the source file buffNum = FreeFile() Open sourceFile For Input As #buffNum Do While Not EOF(buffNum) Line Input #buffNum, rawData 'line must have a semicolon in it to use If InStr(rawData, semicolon) 0 Then splitText = Split(rawData, semicolon) 'number of tabs needed depends on 'number of periods in the code part. 'Bernie's method is faster tabCount = 0 For LC = 1 To Len(splitText(1)) If Mid(splitText(1), LC, 1) = tabIndicator Then tabCount = tabCount + 1 End If Next outputText = String(tabCount, vbTab) & _ splitText(0) & " [" & splitText(1) & "]" Print #tdfBuffNum, outputText 'echo to worksheet Cells(rowCount, 1) = outputText rowCount = rowCount + 1 End If Loop Close #buffNum Close #tdfBuffNum End Sub "JRC" wrote: Dear all: I am currently working on a rather large document that I need to format a certain specific way. The document includes nearly 57,000 lines of plain text that need to be indented according to a simple rule so that it can then be imported into an application as a tab- delimited file. My system is a Mac G5 running the latest version of Leopard (not Snow Leopard) which is 10.5.8 and I have Microsoft Office 2004 installed (haven't upgraded to 2008 and will probably wait until next version to upgrade). Below is an example of the type of file I now have: Body Regions;A01 Abdomen;A01.047 Abdominal Cavity;A01.047.025 Peritoneum;A01.047.025.600 Douglas' Pouch;A01.047.025.600.225 Mesentery;A01.047.025.600.451 Mesocolon;A01.047.025.600.451.535 Omentum;A01.047.025.600.573 Peritoneal Cavity;A01.047.025.600.678 Peritoneal Stomata;A01.047.025.600.700 Retroperitoneal Space;A01.047.025.750 Abdominal Wall;A01.047.050 Groin;A01.047.365 Inguinal Canal;A01.047.412 Umbilicus;A01.047.849 Back;A01.176 Lumbosacral Region;A01.176.519 Sacrococcygeal Region;A01.176.780 Breast;A01.236 Mammary Glands, Human;A01.236.249 This is just a small portion of the file I am working with. This is a "flat" text file but what I need is a "hierarchical" or "tree structure" file where each of these lines is indented with a number of "tab keystrokes" corresponding to its level as depicted by the alphanumeric code following the term. Example: Body Regions;A01 Abdomen;A01.047 Abdominal Cavity;A01.047.025 Peritoneum;A01.047.025.600 Douglas' Pouch;A01.047.025.600.225 Mesentery;A01.047.025.600.451 Mesocolon;A01.047.025.600.451.535 Omentum;A01.047.025.600.573 Peritoneal Cavity;A01.047.025.600.678 Peritoneal Stomata;A01.047.025.600.700 Retroperitoneal Space;A01.047.025.750 Abdominal Wall;A01.047.050 Groin;A01.047.365 Inguinal Canal;A01.047.412 Umbilicus;A01.047.849 Back;A01.176 Lumbosacral Region;A01.176.519 Sacrococcygeal Region;A01.176.780 Breast;A01.236 Mammary Glands, Human;A01.236.249 As you will notice each line has been indented using the tab key a number of times corresponding to the size of the alphanumeric code. Lines with an alphanumeric code containing only 3 characters stay on the left most position (no tabs). Example: Body Regions;A01 Lines with an alphanumeric code containing 7 characters (example: A01.047) are indented with one single tab keystroke. Example: Abdomen;A01.047 Lines with an alphanumeric code containing 11 characters (example: A01.047.025) are indented with two tab keystrokes. Example: Abdominal Cavity;A01.047.025 Lines with an alphanumeric code containing 15 characters (example: A01.047.025.600) are indented with three tab keystrokes. Example: Peritoneum;A01.047.025.600 And so on, and so forth. Once this step is finished and the file has been edited from a flat text to one with a hierarchical or tree structure using tab entries (to derive a tab-delimited file) the second part of the project involves the removal of certain characters and replacement with others. Basically I need the format to go from the existing: Abdominal Cavity;A01.047.025 Where the term is followed by a semicolon and alphanumeric code. To the following: Abdominal Cavity [A01.047.025] Where a space is added after the term, the semicolon is removed and brackets are placed at the beginning and end of the alphanumeric code. Considering the size of my document as I stated (little less than 57,000 entries) I would like to find a way to automate the process. How can these two processes be accomplished with the use of a macro ? I have never worked with macros before and any help would be extremely valuable and appreciated. I have both Excel 04 and Word 04 as part of the Office 2004 package in case the information is relevant. I am not sure which of the applications would be better suited for this task but am open to your suggestions. Also I would appreciate suggestions on textbooks that would cover VBA for Excel and Word. I would like to find books that would cover everything from A to Z but with an approach that makes it easy for a beginner to understand and work with VBA on Microsoft Office for the Mac. Any recommendations will be appreciated. Thank you in advance for your help. JRC . |
#5
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
On Dec 4, 12:44*pm, JLatham wrote:
I think Bernie has provided an excellent code solution for you, tighter than the one I wrote and was going to present, and gets the job done very well.. * But I'll put mine up later as an alternative for 2 reasons: it shows a second solution for the same problem, and also echos the output to the text file on an Excel worksheet so you can get an idea of what was actually written (tabs show up as little squares in the worksheet). One good book for starting to learn VBA is Visual Basic Programming for the Absolute Beginner, which you can see on Amazon athttp://www.amazon.com/Visual-Basic-Programming-Absolute-Beginner/dp/0... I have also tried to put together an introductory booklet for those trying to learn VBA for Excel, and you're welcome to see if it might not add a little to other books you find on the subject:http://www.jlathamsite.com/Teach/VBA...VBA_AnIntroduc... OK, the code I promised you: Sub ReadTextFile() 'reads .txt file and places it into an 'Excel worksheet and writes it to a 'tab delimited file named "tdfFile.csv" 'in the same folder with the source .txt file ' * Const semicolon = ";" * Const tabIndicator = "." * Dim sourceFile As String ' the .txt file to be read * Dim buffNum As Integer * Dim rawData As String ' one line of data from the .txt file * Dim splitText As Variant * Dim tabCount As Integer * Dim LC As Integer * Dim rowCount As Long * Dim outputText As String * Dim delimitedFile As String * Dim tdfBuffNum As Integer * sourceFile = Application.GetOpenFilename * If sourceFile = "False" Then * * Exit Sub ' user cancelled * End If * 'set up the output file * tdfBuffNum = FreeFile() * delimitedFile = "tdfFile.csv" * delimitedFile = Left(sourceFile, InStrRev(sourceFile, _ * *Application.PathSeparator)) & delimitedFile * Open delimitedFile For Output As #tdfBuffNum * 'clear contents of active sheet just so we * 'can show what we're doing on it * ActiveSheet.Cells.ClearContents * Cells(1, 1) = "Output written to: " & delimitedFile * rowCount = 2 ' initialize * 'start reading the source file * buffNum = FreeFile() * Open sourceFile For Input As #buffNum * Do While Not EOF(buffNum) * * Line Input #buffNum, rawData * * 'line must have a semicolon in it to use * * If InStr(rawData, semicolon) 0 Then * * * splitText = Split(rawData, semicolon) * * * 'number of tabs needed depends on * * * 'number of periods in the code part. * * * 'Bernie's method is faster * * * tabCount = 0 * * * For LC = 1 To Len(splitText(1)) * * * * If Mid(splitText(1), LC, 1) = tabIndicator Then * * * * * tabCount = tabCount + 1 * * * * End If * * * Next * * * outputText = String(tabCount, vbTab) & _ * * * *splitText(0) & " [" & splitText(1) & "]" * * * Print #tdfBuffNum, outputText * * * 'echo to worksheet * * * Cells(rowCount, 1) = outputText * * * rowCount = rowCount + 1 * * End If * Loop * Close #buffNum * Close #tdfBuffNum End Sub "JRC" wrote: Dear all: I am currently working on a rather large document that I need to format a certain specific way. The document includes nearly 57,000 lines of plain text that need to be indented according to a simple rule so that it can then be imported into an application as a tab- delimited file. My system is a Mac G5 running the latest version of Leopard (not Snow Leopard) which is 10.5.8 and I have Microsoft Office 2004 installed (haven't upgraded to 2008 and will probably wait until next version to upgrade). Below is an example of the type of file I now have: Body Regions;A01 Abdomen;A01.047 Abdominal Cavity;A01.047.025 Peritoneum;A01.047.025.600 Douglas' Pouch;A01.047.025.600.225 Mesentery;A01.047.025.600.451 Mesocolon;A01.047.025.600.451.535 Omentum;A01.047.025.600.573 Peritoneal Cavity;A01.047.025.600.678 Peritoneal Stomata;A01.047.025.600.700 Retroperitoneal Space;A01.047.025.750 Abdominal Wall;A01.047.050 Groin;A01.047.365 Inguinal Canal;A01.047.412 Umbilicus;A01.047.849 Back;A01.176 Lumbosacral Region;A01.176.519 Sacrococcygeal Region;A01.176.780 Breast;A01.236 Mammary Glands, Human;A01.236.249 This is just a small portion of the file I am working with. This is a "flat" text file but what I need is a "hierarchical" or "tree structure" file where each of these lines is indented with a number of "tab keystrokes" corresponding to its level as depicted by the alphanumeric code following the term. Example: Body Regions;A01 * * *Abdomen;A01.047 * * * * * Abdominal Cavity;A01.047.025 * * * * * * * *Peritoneum;A01.047.025.600 * * * * * * * * * * Douglas' Pouch;A01.047.025.600.225 * * * * * * * * * * Mesentery;A01.047.025.600.451 * * * * * * * * * * * * *Mesocolon;A01.047.025.600.451.535 * * * * * * * * * * Omentum;A01.047.025.600.573 * * * * * * * * * * * * *Peritoneal Cavity;A01.047.025.600.678 * * * * * * * * * * Peritoneal Stomata;A01.047.025.600.700 * * * * * * * *Retroperitoneal Space;A01.047.025.750 * * * * * Abdominal Wall;A01.047.050 * * * * * Groin;A01.047.365 * * * * * Inguinal Canal;A01.047.412 * * * * * Umbilicus;A01.047.849 * * *Back;A01.176 * * * * * Lumbosacral Region;A01.176.519 * * * * * Sacrococcygeal Region;A01.176.780 * * *Breast;A01.236 * * * * * Mammary Glands, Human;A01.236.249 As you will notice each line has been indented using the tab key a number of times corresponding to the size of the alphanumeric code. Lines with an alphanumeric code containing only 3 characters stay on the left most position (no tabs). Example: Body Regions;A01 Lines with an alphanumeric code containing 7 characters (example: A01.047) are indented with one single tab keystroke. Example: * * *Abdomen;A01.047 Lines with an alphanumeric code containing 11 characters (example: A01.047.025) are indented with two tab keystrokes. Example: * * * * * Abdominal Cavity;A01.047.025 Lines with an alphanumeric code containing 15 characters (example: A01.047.025.600) are indented with three tab keystrokes. Example: * * * * * * * *Peritoneum;A01.047.025.600 And so on, and so forth. Once this step is finished and the file has been edited from a flat text to one with a hierarchical or tree structure using tab entries (to derive a tab-delimited file) the second part of the project involves the removal of certain characters and replacement with others. Basically I need the format to go from the existing: Abdominal Cavity;A01.047.025 Where the term is followed by a semicolon and alphanumeric code. To the following: Abdominal Cavity [A01.047.025] Where a space is added after the term, the semicolon is removed and brackets are placed at the beginning and end of the alphanumeric code. Considering the size of my document as I stated (little less than 57,000 entries) I would like to find a way to automate the process. How can these two processes be accomplished with the use of a macro ? I have never worked with macros before and any help would be extremely valuable and appreciated. I have both Excel 04 and Word 04 as part of the Office 2004 package in case the information is relevant. I am not sure which of the applications would be better suited for this task but am open to your suggestions. Also I would appreciate suggestions on textbooks that would cover VBA for Excel and Word. I would like to find books that would cover everything *from A to Z but with an approach that makes it easy for a beginner to understand and work with VBA on Microsoft Office for the Mac. Any recommendations will be appreciated. Thank you in advance for your help. JRC . Hello, JLatham. Thank you very much for your really helpful post and for the really good suggestions. I have downloaded the document you created and will study it either tonight or tomorrow night. Also I was very happy with the book suggestion and will order me a copy as I am sure it will help me. Just as I posted to John and Bernie, I am having problems running the routines you all have offered me. I am sure it has to do with my lack of experience with VBA. Do you have any ideas as to what might be causing this ? Thank you for your really helpful and detailed post. Best regards, JRC |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Basic Excel Formatting to Make a Readable Document | Excel Discussion (Misc queries) | |||
Conditional Formatting while sharing document | Excel Discussion (Misc queries) | |||
Series Formatting changes when I save and reopen a document | Charts and Charting in Excel | |||
How to create a formatted 'readable' document based on Excel document? | Excel Discussion (Misc queries) | |||
Broken formatting while pasting document | Excel Discussion (Misc queries) |