Posted to microsoft.public.excel.programming
T_o_n_y
That makes sense to me

Nick,

What you've written makes sense to me, and that is why I've had trouble
concluding that it is a Unicode file. As I wrote in an earlier post, I'm
leaning toward it being a text file that has 6 Asc(0) characters inserted
into it.

The question remains: is there a way to have Excel import these characters
rather than completely ignoring them?

-Tony
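
One approach that is often suggested for this kind of file is to skip the text import entirely and read the raw bytes with Open ... For Binary. The following is only a minimal sketch, not a tested fix for this particular file. It assumes the special characters really are byte value 0 (as Nick reports), that lines end in CR+LF, and that it is acceptable to swap each 0 byte for a visible placeholder before the text goes into a cell. The Sub name, the variable names, and the choice of "y" as the placeholder are hypothetical; the path is the one used in Tom's macro.

```vba
Sub ImportKeepingNulls()
    ' Hypothetical placeholder: Asc("y") stands in for each byte 0
    Const PLACEHOLDER As Byte = 121
    Dim fnum As Integer
    Dim buf() As Byte
    Dim contents As String
    Dim fileLines() As String
    Dim i As Long

    ' Read the whole file as raw bytes, with no text-mode processing
    fnum = FreeFile
    Open "E:\Data1\W158.DAT" For Binary Access Read As #fnum
    If LOF(fnum) = 0 Then
        Close #fnum
        Exit Sub
    End If
    ReDim buf(0 To LOF(fnum) - 1)
    Get #fnum, , buf
    Close #fnum

    ' Replace every byte 0 with the placeholder so it survives in a cell
    For i = LBound(buf) To UBound(buf)
        If buf(i) = 0 Then buf(i) = PLACEHOLDER
    Next i

    ' Convert the single-byte data to a VBA string and split it into lines
    ' (assumes CR+LF line endings)
    contents = StrConv(buf, vbUnicode)
    fileLines = Split(contents, vbCrLf)

    ' One line of the file per row in column A of the active sheet
    For i = LBound(fileLines) To UBound(fileLines)
        Cells(i + 1, 1).Value = fileLines(i)
    Next i
End Sub
```

Because the placeholder occupies the same position as the original byte, fixed-position extraction with Mid should still line up, provided the placeholder cannot be confused with real data in the file.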

"NickHK" wrote:

Tony,
The "special characters" I see in your uploaded file are Asc(0). It is not a
Unicode file.

As for the "0" for every other cell, that is expected if you have all ANSI
text stored in a UNICODE format. The high-order byte of each character will
always be 0, as no values exceed decimal 255 (FF hex).

So do you have a Unicode file or not?

NickHK
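
The point about the alternating zeros can be checked without any file at all. VBA holds strings internally with two bytes per character, so assigning a String to a Byte array (as bytArray = strTest does in Tom's macro) produces a 0 after every plain ANSI character regardless of how the file was stored. A small sketch, with a hypothetical Sub name and test string:

```vba
Sub ShowStringBytes()
    Dim raw() As Byte
    Dim ansi() As Byte
    Dim i As Long

    ' Direct assignment copies VBA's internal two-byte-per-character form
    raw = "$1"
    For i = LBound(raw) To UBound(raw)
        Debug.Print raw(i);          ' values: 36, 0, 49, 0
    Next i
    Debug.Print

    ' StrConv with vbFromUnicode converts to one byte per character
    ansi = StrConv("$1", vbFromUnicode)
    For i = LBound(ansi) To UBound(ansi)
        Debug.Print ansi(i);         ' values: 36, 49
    Next i
    Debug.Print
End Sub
```

If that reading is right, the zeros in the macro output say nothing either way about whether the file on disk is Unicode.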

"T_o_n_y" wrote in message
...
Tom,

I'm perplexed at your response because I'm about as far from ignoring your
posts as possible. Indeed, I generally skip directly to your posts when on
this newsgroup since they are more helpful than anyone's, containing actual
sample code that can be used.

It's just that the output from the macro you sent me led me to the
conclusion that you were mistaken this time. The output shows "0" for every
other cell, which is not what I would expect from UNICODE with 2 bytes per
character. Furthermore, rather than revealing the presence of the special
characters, your macro also had them stripped away.

Here's what I mean. The file I've uploaded contains the following in the
first line,

$$158++yyyyyy++1++8++4.50 etc...

I've substituted + for spaces and y for the special characters above. As
you can see, there are 2 spaces followed by 6 special characters followed by
2 spaces. The output from your macro completely omits the 6 special
characters, if I'm reading it correctly.

As I wrote, I spent "another few hours" researching UNICODE in order to
investigate the possibility you raised...but nothing I found seemed to
confirm it. In addition, Excel has two different UNICODE types (UTF-8 and
UTF-7) which one can select in the text import wizard. I tried both of them
and neither gave me success in importing the special characters, as judged by
using C. Pearson's CellView add-in, which allows character-by-character
visualization of cell contents.

Thank you again for your help,
-Tony

"Tom Ogilvy" wrote:

Guess it was a waste of time trying to explain it to you. Did you bother to
read it?

--
Regards,
Tom Ogilvy


"T_o_n_y" wrote in message
...
I tried your macro, but unfortunately Excel still did not import the special
characters. Recall that there are 6 special characters between the $$158 and
the 1 8 in the first line of the file:
$$158 1 8 4.50 1.0000 0.8000 3.0010 1.5740

For that section, the output from your macro looked like this:
36 = $
0 =
36 = $
0 =
49 = 1
0 =
53 = 5
0 =
56 = 8
0 =
32 =
0 =
32 =
0 =
32 =
0 =
32 =
0 =
49 = 1
0 =
32 =
0 =
32 =
0 =
56 = 8

In other words, the 6 characters got stripped away again so that all you see
are the 2 spaces which appear on either side of the 6 special characters.

The only way I've found for Excel to even recognize that those characters
exist is to use the "Delimited" option during text import and specify
"spaces" as the delimiting character with the "Treat consecutive delimiters
as one" feature unchecked. Unfortunately, that method of importing would
mean a huge rework of my existing code.

I spent another few hours trying to research the UNICODE possibility you
mentioned, but still was unable to come up with anything.

At a loss...
-Tony

"Tom Ogilvy" wrote:

Put this in a workbook. Change the path to point to your file:

Sub ReadStraightTextFile()
    Dim strTest As String
    Dim bytArray() As Byte
    Dim intcount As Long
    Dim col As Long
    Dim i As Long

    Open "E:\Data1\W158.DAT" For Input As #1
    col = 0
    Do While Not EOF(1)
        Line Input #1, strTest
        col = col + 1
        ' A String assigned to a Byte array yields two bytes per character
        bytArray = strTest
        i = 0
        For intcount = LBound(bytArray) To UBound(bytArray)
            i = i + 1
            ' Write "code = character" for each byte; one file line per column
            Cells(i, col) = bytArray(intcount) & " = " & Chr(bytArray(intcount))
        Next
    Loop

    ' Close the file
    Close #1
End Sub


Make a blank sheet the active sheet, then run the macro.

It appears to me that the file is UNICODE. Unlike an ASCII file that has
one byte per character, a Unicode file has two bytes per character.

There are 8 bits to a byte, so an ASCII file can have 2^8 = 256
different/unique character codes. In a Unicode file, 2 bytes is 16 bits,
so 2^16 = 65536 possible unique characters.

I didn't see any actual characters that couldn't be represented by ASCII, so
you could read every odd character.


It appears that opening it in Excel automatically converts it to ASCII, so
you haven't lost any information, but if you want to edit it and write it
back out, you would need to save it as Unicode Text. I know that is an
option in at least xl2000 and I assume later.


--
Regards,
Tom Ogilvy

"T_o_n_y" wrote in message
...
Tom,

Thank you for your reply. I followed your procedure but only got four "32"s
in that blank section; that is, there are only 4 spaces there. This
confirms what I've suspected, namely, that Excel is simply not importing
those characters. I've also tried using C. Pearson's Cell View Add-in with
the same result (http://www.cpearson.com/excel/CellView.htm).

As you point out, the characters also get stripped when I cut and paste into
this forum. Therefore, I've emailed you separately the file I referred to as
an attachment (it's a text document called W158.DAT) sent from myother_acct.
If I knew how to post it to this forum, I would.

I appreciate your help...this is a frustrating problem for me. Is there a
way to import the text file character by character?

-Tony



"Tom Ogilvy" wrote:

Put your string in cell A1. Then in B1, or another cell in the first row, put
in this formula:

=CODE(MID($A$1,ROW(),1))

Assuming the above formula is in B1, put this in C1:

=CHAR(B1)

Now select B1:C1 and drag-fill down until the formula starts returning
#VALUE! errors.

The only thing between the characters in your post are ASCII code 32, which
is a space.

Possibly they didn't get carried forward in the email.
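
The worksheet-formula check above can also be sketched in VBA, which avoids filling formulas down by hand. This is an illustrative sketch only: the Sub name is hypothetical, and it assumes the string to inspect is in cell A1 of the active sheet, as in the formula version.

```vba
Sub DumpCharCodes()
    Dim s As String
    Dim i As Long

    ' Assumes the text to inspect is in A1 of the active sheet
    s = CStr(Range("A1").Value)

    ' Print each character's position and numeric code to the Immediate window
    For i = 1 To Len(s)
        Debug.Print i, AscW(Mid$(s, i, 1))
    Next i
End Sub
```

Like the formulas, this only reports what actually reached the cell, so characters dropped during import will not show up here either.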

--
Regards,
Tom Ogilvy



"T_o_n_y" wrote in message
...
I need to import text files into Excel without losing special characters.
I've tried several methods, but each time Excel imports the file, ignoring
those characters. The following is an example line, but what you can't see
are the 6 special characters which appear between the $$158 and the 1 8!

$$158 1 8 4.50 1.0000 0.8000 3.0010 1.5740

I know they are there, however, because I opened the document using Word,
which displays each of them as a y with 2 dots above it.

My Excel VBA code needs to import these characters so that it doesn't get
lost when extracting the data using the MID(,,,) function. The text files
were generated using old FORTRAN programs, and there are thousands of
them...my VBA routines need to access these files in order to modernize our
system.

Examples of what I've tried (all of these ignore the y characters):

Workbooks.OpenText Filename:=fname, Origin:=437, _
StartRow:=1, dataType:=xlFixedWidth, FieldInfo:=Array(0, 2)

With ActiveSheet.QueryTables.Add(Connection:="TEXT;" & fname, _
Destination:=Cells(2, Col))

Open FName For Input Access Read As #1
While Not EOF(1)
Line Input #1, WholeLine
Cells(RowNdx, ColNdx).Value = WholeLine
RowNdx = RowNdx + 1