
Robert Crandal[_3_]

Parse with newline "0D 0D 0A"
 
I have a text file that contains lots of sentences and
paragraphs. Paragraphs are separated by a "newline".

However, on close inspection of the text file with
a debugger, each newline is represented as
3 bytes: 0D 0D 0A in hex, or 13 13 10 in decimal.

Now, suppose I read the entire file into a string variable
as follows:

Dim s As String
s = ReadTextFile("C:\book.txt") ' user-defined function

At this point, the string variable "s" contains an entire
book of paragraphs separated by 3 bytes equal to
0D 0D 0A.
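
ReadTextFile is the poster's own function and is not shown in the thread. As a rough sketch of what such a helper might look like (the name ReadTextFileSketch, the local buffer, and the choice of binary mode are all assumptions, not the original code), reading in binary mode keeps every byte, including the extra 0D, exactly as it is on disk:

```vba
' Hypothetical stand-in for the user-defined ReadTextFile(); assumes an ANSI text file.
Function ReadTextFileSketch(ByVal sPath As String) As String
    Dim iFile As Integer
    Dim sBuffer As String

    iFile = FreeFile
    Open sPath For Binary Access Read As #iFile

    ' Size the buffer to the file length, then fill it in a single read.
    sBuffer = Space$(LOF(iFile))
    Get #iFile, , sBuffer
    Close #iFile

    ReadTextFileSketch = sBuffer
End Function
```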

How do I use the Split() function to extract each paragraph?

I have tried this code:

v = Split(s, vbCrLf)

But, it doesn't quite work. Is "vbCrLf" the same as
0D 0D 0A?

I'd appreciate any help. Garry, feel free to chime in as
well. ;)





Claus Busch

Hi Robert,

On Thu, 26 Mar 2015 03:40:45 -0700, Robert Crandal wrote:

v = Split(s, vbCrLf)


Try:
v = Split(s, Chr(13) & Chr(13) & Chr(10))
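
For reference, vbCrLf is only Chr(13) & Chr(10), so splitting on it alone leaves a stray Chr(13) at the end of each piece. A minimal sketch of the three-byte split, using a made-up sample string in place of the file contents:

```vba
Sub DemoSplitOnCrCrLf()
    ' Hypothetical sample standing in for the contents of the text file.
    Dim s As String
    s = "First paragraph." & vbCr & vbCrLf & "Second paragraph."

    ' vbCr & vbCrLf is the same three characters as Chr(13) & Chr(13) & Chr(10).
    Dim v As Variant
    v = Split(s, vbCr & vbCrLf)

    ' List each piece with its index.
    Dim i As Long
    For i = LBound(v) To UBound(v)
        Debug.Print i & ": " & v(i)
    Next i
End Sub
```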


Regards
Claus B.
--
Vista Ultimate / Windows7
Office 2007 Ultimate / 2010 Professional

GS[_2_]

The blank lines between paragraphs carry an extra Chr(13) left over
from the previous line. That means Claus' suggestion will split your
string into groups of sentences at the blank lines between those
groups. A better approach is to use a global-scope constant for the
blank-line delimiter...

Const sBlankLine$ = Chr(13) & Chr(13) & Chr(10)

...and use it like this...

s = Split(ReadTextFile("C:\book.txt"), sBlankLine)

...where I assume 's' is declared as Variant so that an array results.
IMO, though, it is better practice to name variables for the type of
data they hold when they aren't used as a 'floating' value (e.g. vData
better documents the contents imported from a file). This makes
debugging and future maintenance of the code (by you or others) easier,
because the purpose of each variable is clear without searching for its
definition.

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion



Claus Busch

Hi Robert,

On Thu, 26 Mar 2015 11:57:30 +0100, Claus Busch wrote:

v = Split(s, Chr(13) & Chr(13) & Chr(10))


Or replace Chr(13) with an empty string and split at Chr(10):

v = Split(Replace(s, Chr(13), ""), Chr(10))


Regards
Claus B.

GS[_2_]

Oops, I forgot you can't define constants that way! Better to go with...

Dim gsBlankLine$

...and in a startup routine, initialize all globals...

Sub InitGlobals()
    gsBlankLine = Chr(13) & Chr(13) & Chr(10)
    '..others
End Sub

...where the prefix "gs" denotes 'global scope string' var.
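
Chr() is a function call, which is why it can't appear in a Const expression. The built-in constants vbCr and vbCrLf can, though, so one possible alternative avoids the startup routine altogether. This is only a sketch: the name gsBLANK_LINE and the demo routine are made up for illustration, and the declaration is assumed to sit in a standard module.

```vba
' Module level, in a standard module.
' vbCr = Chr(13), vbCrLf = Chr(13) & Chr(10), so this is 0D 0D 0A.
Public Const gsBLANK_LINE As String = vbCr & vbCrLf

Sub DemoConstDelimiter()
    ' Hypothetical two-piece sample joined by the delimiter.
    Dim vData As Variant
    vData = Split("A" & gsBLANK_LINE & "B", gsBLANK_LINE)

    ' Report how many pieces the split produced.
    Debug.Print UBound(vData) - LBound(vData) + 1
End Sub
```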

--
Garry




GS[_2_]

"Claus Busch" wrote:

v = Split(s, Chr(13) & Chr(13) & Chr(10))

or replace Chr(13) with an empty string and split at Chr(10):

v = Split(Replace(s, Chr(13), ""), Chr(10))


Claus,
That might not work as expected if the text file contains groups of
sentences separated by a blank line, and you further want to parse each
sentence without 'breaking' the original structure.

--
Garry




GS[_2_]

"GS" wrote:

Claus,
That might not work as expected if the text file contains groups of
sentences separated by a blank line, and you further want to parse
each sentence without 'breaking' the original structure.


For clarity...

Data recorders output blocks of data (usually at fixed intervals), with
a set number of lines per block. Each 'write' will typically add a
trailing blank line to each block...

[single block structure]
**BEGIN**
Interval...
DataItem
DataItem ID
DataItem Description
DataItem Value

**END**

[multiple block structure]
**BEGIN**
Interval...
DataItem
DataItem ID
DataItem Description
DataItem Value

Interval...
DataItem
DataItem ID
DataItem Description
DataItem Value

**END**

...where the last line in the file will be blank, and each block consists
of 5 items of data followed by a trailing blank line.
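
As a rough sketch of parsing such a file without losing the block structure, you could split into blocks first and then split each block into lines. The delimiters below (vbCrLf between lines, a doubled vbCrLf between blocks) and the sample data are assumptions for illustration only; substitute whatever byte sequence your file really uses.

```vba
Sub DemoParseBlocks()
    ' Hypothetical sample: two blocks, each followed by a blank line.
    Dim sData As String
    sData = "Interval 1" & vbCrLf & "Value A" & vbCrLf & vbCrLf & _
            "Interval 2" & vbCrLf & "Value B" & vbCrLf & vbCrLf

    Dim vBlocks As Variant, vLines As Variant
    Dim i As Long, j As Long

    ' First pass: one array element per block.
    vBlocks = Split(sData, vbCrLf & vbCrLf)

    For i = LBound(vBlocks) To UBound(vBlocks)
        ' The trailing blank line leaves an empty last element; skip it.
        If Len(vBlocks(i)) > 0 Then
            ' Second pass: one array element per line within the block.
            vLines = Split(vBlocks(i), vbCrLf)
            For j = LBound(vLines) To UBound(vLines)
                Debug.Print "Block " & i & ", line " & j & ": " & vLines(j)
            Next j
        End If
    Next i
End Sub
```

Splitting in two passes keeps the lines of each block together, which a single split on Chr(10) would not.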

--
Garry




Robert Crandal[_3_]

"GS" wrote:

Dim gsBlankLine$

...and in a startup routine, initialize all globals...

Sub InitGlobals()
    gsBlankLine = Chr(13) & Chr(13) & Chr(10)
    '..others
End Sub


Awesome! Thank you Claus and Garry once again.



