View Single Post
  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Bernie Deitrick Bernie Deitrick is offline
external usenet poster
 
Posts: 5,441
Default VBA for document formatting - HELP.

JRC,

I have assumed that your document is a text file. (If it isn't the code can be corrected to account
for that.)

The macro below will read in a file and add the tabs and add the spaces and brackets as you
described.

Run the macro, select the file with the data, and the macro will put the new file named "Output.txt"
into the same folder.

HTH,
Bernie
MS Excel MVP

Sub AddTabsToFile()
Dim FileName As String
Dim FileNumIn As Integer
Dim FileNumOut As Integer
Dim ResultStr As String
Dim myStr As String
Dim myS As Variant
Dim i As Integer

FileName = Application.GetOpenFilename
If FileName = "" Then End

FileNumIn = FreeFile()
Open FileName For Input As FileNumIn

FileNumOut = FreeFile()
Open "Output.txt" For Output Access Write As FileNumOut

Do While Seek(FileNumIn) <= LOF(FileNumIn)
'Store One Line Of Text From File To Variable
Line Input #FileNumIn, ResultStr
myStr = ""
myS = Split(ResultStr, ";")
For i = 1 To (Len(myS(1)) - 3) / 4
myStr = myStr & vbTab
Next i
myStr = myStr & myS(0) & " [" & myS(1) & "]"
Print #FileNumOut, myStr
Loop

Close FileNumIn
Close FileNumOut
End Sub



"JRC" wrote in message
...
Dear all:

I am currently working on a rather large document that I need to
format a certain specific way. The document includes nearly 57,000
lines of plain text that need to be indented according to a simple
rule so that it can then be imported into an application as a tab-
delimited file.

My system is a Mac G5 running the latest version of Leopard (not Snow
Leopard) which is 10.5.8 and I have Microsoft Office 2004 installed
(haven't upgraded to 2008 and will probably wait until next version to
upgrade).

Below is an example of the type of file I now have:

Body Regions;A01
Abdomen;A01.047
Abdominal Cavity;A01.047.025
Peritoneum;A01.047.025.600
Douglas' Pouch;A01.047.025.600.225
Mesentery;A01.047.025.600.451
Mesocolon;A01.047.025.600.451.535
Omentum;A01.047.025.600.573
Peritoneal Cavity;A01.047.025.600.678
Peritoneal Stomata;A01.047.025.600.700
Retroperitoneal Space;A01.047.025.750
Abdominal Wall;A01.047.050
Groin;A01.047.365
Inguinal Canal;A01.047.412
Umbilicus;A01.047.849
Back;A01.176
Lumbosacral Region;A01.176.519
Sacrococcygeal Region;A01.176.780
Breast;A01.236
Mammary Glands, Human;A01.236.249

This is just a small portion of the file I am working with. This is a
"flat" text file but what I need is a "hierarchical" or "tree
structure" file where each of these lines is indented with a number
of
"tab keystrokes" corresponding to its level as depicted by the
alphanumeric code following the term. Example:

Body Regions;A01
Abdomen;A01.047
Abdominal Cavity;A01.047.025
Peritoneum;A01.047.025.600
Douglas' Pouch;A01.047.025.600.225
Mesentery;A01.047.025.600.451
Mesocolon;A01.047.025.600.451.535
Omentum;A01.047.025.600.573
Peritoneal Cavity;A01.047.025.600.678
Peritoneal Stomata;A01.047.025.600.700
Retroperitoneal Space;A01.047.025.750
Abdominal Wall;A01.047.050
Groin;A01.047.365
Inguinal Canal;A01.047.412
Umbilicus;A01.047.849
Back;A01.176
Lumbosacral Region;A01.176.519
Sacrococcygeal Region;A01.176.780
Breast;A01.236
Mammary Glands, Human;A01.236.249

As you will notice each line has been indented using the tab key a
number of times corresponding to the size of the alphanumeric code.
Lines with an alphanumeric code containing only 3 characters stay on
the left most position (no tabs). Example:
Body Regions;A01

Lines with an alphanumeric code containing 7 characters (example:
A01.047) are indented with one single tab keystroke. Example:
Abdomen;A01.047

Lines with an alphanumeric code containing 11 characters (example:
A01.047.025) are indented with two tab keystrokes. Example:
Abdominal Cavity;A01.047.025

Lines with an alphanumeric code containing 15 characters (example:
A01.047.025.600) are indented with three tab keystrokes. Example:
Peritoneum;A01.047.025.600

And so on, and so forth.

Once this step is finished and the file has been edited from a flat
text to one with a hierarchical or tree structure using tab entries
(to derive a tab-delimited file) the second part of the project
involves the removal of certain characters and replacement with
others. Basically I need the format to go from the existing:

Abdominal Cavity;A01.047.025

Where the term is followed by a semicolon and alphanumeric code.

To the following:

Abdominal Cavity [A01.047.025]

Where a space is added after the term, the semicolon is removed and
brackets are placed at the beginning and end of the alphanumeric
code.

Considering the size of my document as I stated (little less than
57,000 entries) I would like to find a way to automate the process.

How can these two processes be accomplished with the use of a macro ?

I have never worked with macros before and any help would be extremely
valuable and appreciated. I have both Excel 04 and Word 04 as part of
the Office 2004 package in case the information is relevant. I am not
sure which of the applications would be better suited for this task
but am open to your suggestions.

Also I would appreciate suggestions on textbooks that would cover VBA
for Excel and Word. I would like to find books that would cover
everything from A to Z but with an approach that makes it easy for a
beginner to understand and work with VBA on Microsoft Office for the
Mac. Any
recommendations will be appreciated.

Thank you in advance for your help.


JRC