I have a huge text file that contains paragraphs
separated by dotted borders. Each border
is a line of stars such as: ****************
What is an efficient way to selectively extract
paragraphs and append them to a new text file?
I will only select a paragraph if it does not
contain profanity words or other select words.
For example, given this input:
------BEGIN INPUT-------
Hello. This is a short novel written
by someone.
********************
I dont give a **** who reads this.
********************
Hey, sometimes **** happens, but
you gotta keep going.
********************
The end!
********************
------END INPUT-------
The output should be:
OUTPUT FILE:
Hello. This is a short novel written
by someone.
The end!
You'll need to search each line for the specific profanity words you
want to filter out along with lines containing the asterisks! Once you
have a delimited list of profanity words you can loop through it using
a For..Each construct and the InStr() function.
You'll also need to load the file into an array and inner loop that
using a For..Next construct. If you find words then 'falg' that line's
array element to a single unlikely character (like "~" for example) and
use the Filter() function on the array to strip out elements containing
the 'flag' character. Once done you can write the array out to a
file...
Const sProfaneWords$ = "word1,word2,word3" '//and so on
Dim vWord, vData, n&
' This assumes the full path and filename is held in 'sFilename'
vData = Split(ReadTextFile(sFilename), vbCrLf)
For Each vWord In Split(sProfaneWords, ",")
For n = LBound(vData) to UBound(vData)
If (InStr(vData(n), vWord) 0) _
Or (InStr(vData(n), "*") 0) Then
vData(n) = "~": Exit For
Next 'n
Next 'vWord
vData = Filter(vData, "~", False)
WriteTextFile Join(vData, vbCrLf), sFilename
...which uses the following support routines...
Function ReadTextFile$(Filename$)
' Reads large amounts of data from a text file in one single step.
Dim iNum%
On Error GoTo ErrHandler
iNum = FreeFile(): Open Filename For Input As #iNum
ReadTextFile = Space$(LOF(iNum))
ReadTextFile = Input(LOF(iNum), iNum)
ErrHandler:
Close #iNum: If Err Then Err.Raise Err.Number, , Err.Description
End Function 'ReadTextFile()
Sub WriteTextFile(TextOut$, Filename$, _
Optional AppendMode As Boolean = False)
' Reusable procedure that Writes/Overwrites or Appends
' large amounts of data to a Text file in one single step.
' **Does not create a blank line at the end of the file**
Dim iNum%
On Error GoTo ErrHandler
iNum = FreeFile()
If AppendMode Then
Open Filename For Append As #iNum: Print #iNum, vbCrLf & TextOut;
Else
Open Filename For Output As #iNum: Print #iNum, TextOut;
End If
ErrHandler:
Close #iNum: If Err Then Err.Raise Err.Number, , Err.Description
End Sub 'WriteTextFile()
--
Garry
Free usenet access at
http://www.eternal-september.org
Classic
VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.
vb.general.discussion