Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 224
Default Reading file saved from Excel as Unicode text

I inherited some Excel code that does this:

Set fso = CreateObject("Scripting.FileSystemObject")
strTempFile = fso.GetSpecialFolder(2) & "\" & fso.GetTempName & ".txt"
'2 = TemporaryFolder.

ActiveWorkbook.SaveAs(Filename:=strTempFile, FileFormat:=xlText)

nSourceFile = FreeFile
Open strTempFile For Input As #nSourceFile
sText = Input$(LOF(1), 1)
Close

Then, sText is uploaded to a server.

That works fine for English, German, and French. But then we added
Japanese. Excel's xlText (tab delimited) file format changes all
Japanese characters into question marks. So, I changed that line to
save as Unicode tab delimited instead:

ActiveWorkbook.SaveAs(Filename:=strTempFile,
FileFormat:=xlUnicodeText)


The file looks great in Notepad; Japanese characters are preserved.
But, then I get Error 62 on this line:

sText = Input$(LOF(1), 1)


So, I changed that section to this:

Open strTempFile For Binary As #nSourceFile
sText = InputB$(LOF(1), 1)


With that, I don't get errors in VBA, but the backend server chokes on
extra characters that I guess must be invisible in Notepad.


So I tried this:


Set oTextStream = fso.OpenTextFile(strTempFile )
sText = oTextStream.ReadAll


The result is garbled; where the Japanese characters should be,
instead there are random characters like this:

K0Q0f0$B!&(Bc0_0$B!&%?(BL0$B!&%f(B0$B!&!V( B0$B%-(B0$B!&%#(B0$B%F(B0$B%M(B0g0



Any suggestions on how I can preserve the Japanese characters without
getting extra characters that are invisible in Notepad?



Thanks,

Greg
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 46
Default Reading file saved from Excel as Unicode text


"Greg Lovern" wrote:

<snip

The result is garbled; where the Japanese characters should be,
instead there are random characters like this:

K0Q0f0・c0_0・タL0・ユ0・「0*0・ィ0テ0 ネ0g0


That result isn't garbled - it looks exactly like a unicode string being
displayed in ascii, because it is one. UTF-16 would be a good educated guess.

VBA strings are not unicode, they're only ascii (well, 8-bit anyway). So if
you're handling UTF-16 strings then the VBA native string functions will be
useless to you (this may or may not apply to Excel 2007 strings, I've no
idea).

You need some unicode aware string tools, I'm not aware of any for vba but
that doesn't mean they don't exist. It's also a very complex area and I
wouldn't recommend anyone trying to write their own tools to handle it.


Put very approximately; UTF-16 works on words (pairs of bytes) to represent
a character. For example (byte values shown in hex) a space rendered as 20
in ascii and would be the byte pair 0020 in UTF-16*. A line break might be
000D which vba strings would interpret as two characters 00 (which would
display as a block or question mark as it has no ascii character to display)
and 0D the carriage return. Using 0D as the end of line marker (as is
correct) would leave the vba string handler with an extra byte 00 at the end
of the string. Culling this may be enough to get the string through but it's
not substitute for proper localisation functions.

There's more to it than that but it's enough to understand the scope of the
problem. Notepad knows about UTF-16 and a number of other text formats and
renders the text correctly, which is why it is fine there.


Not sure how much that actually helps but at least you know what's
happening...
  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 224
Default Reading file saved from Excel as Unicode text

On Apr 24, 3:31 pm, James Snell
wrote:
You need some unicode aware string tools, I'm not aware of any for vba


I was under the impression that FileSystemObject is Unicode-aware; is
that not correct?


proper localisation functions.


Are these available in a library, or do you mean rolling your own?


Greg
  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 46
Default Reading file saved from Excel as Unicode text

Sorry for the delay - I'm not always around to respond to posts.

FileSystemObject is unicode aware in that you can open a file with a unicode
name, the content that is returned by a stream is language neutral (it's just
a bytestream) which would need to be handled downstream. By the same token
you could oly open a unicode filename if you were able to represent that
unicode name in VBA.

I've not looked for existing libraries, on further inspection there are some
unicode functions available in vba such as ascw() & chrw(). It may be easier
to just open the file up as a hidden sheet / workbook.

The biggest issue is that you'd have to know what flavour of unicode the
server can speak (if any). If that's only ANSI (in which case you'd have to
rethink the use of japanese) if it's UTF-7 / UTF-8 then a script like
(http://www.planet-source-code.com/vb...44569&lngWId=1) might help.

Do you have the info about the server? Once you know what that handles then
you might be able to web-search for a conversion script...


"Greg Lovern" wrote:

On Apr 24, 3:31 pm, James Snell
wrote:
You need some unicode aware string tools, I'm not aware of any for vba


I was under the impression that FileSystemObject is Unicode-aware; is
that not correct?


proper localisation functions.


Are these available in a library, or do you mean rolling your own?


Greg

  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 224
Default Reading file saved from Excel as Unicode text

Thanks, we've been banging on this for a few days now and I've posted
a new question that reflects some little progress we've made and more
detail on the problem:

http://groups.google.com/group/micro...f2fb1df?hl=en#

It turns out the server requires Shift-JIS, not Unicode. But we're
still trying to go through Unicode, then convert the Unicode to Shift-
JIS, because Excel won't save as Shift-JIS. Details in the thread
referenced above. I hope you look at that post and comment.


Thanks,

Greg



On Apr 29, 5:10 am, James Snell
wrote:
Sorry for the delay - I'm not always around to respond to posts.

FileSystemObject is unicode aware in that you can open a file with a unicode
name, the content that is returned by a stream is language neutral (it's just
a bytestream) which would need to be handled downstream. By the same token
you could oly open a unicode filename if you were able to represent that
unicode name in VBA.

I've not looked for existing libraries, on further inspection there are some
unicode functions available in vba such as ascw() & chrw(). It may be easier
to just open the file up as a hidden sheet / workbook.

The biggest issue is that you'd have to know what flavour of unicode the
server can speak (if any). If that's only ANSI (in which case you'd have to
rethink the use of japanese) if it's UTF-7 / UTF-8 then a script like
(http://www.planet-source-code.com/vb...sp?txtCodeId=4...) might help.

Do you have the info about the server? Once you know what that handles then
you might be able to web-search for a conversion script...

"Greg Lovern" wrote:
On Apr 24, 3:31 pm, James Snell
wrote:
You need some unicode aware string tools, I'm not aware of any for vba


I was under the impression that FileSystemObject is Unicode-aware; is
that not correct?


proper localisation functions.


Are these available in a library, or do you mean rolling your own?


Greg




Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
VBA macro for reading text file into Excel Javed Khan Excel Discussion (Misc queries) 0 October 14th 09 06:29 PM
Saving double-byte characters as unicode text in a CSV file Michael Bond Excel Discussion (Misc queries) 4 January 29th 09 10:01 PM
Reading from a text file Jeff Excel Discussion (Misc queries) 1 November 8th 06 08:47 PM
reading from text file to excel file dgoel Excel Programming 0 April 18th 05 06:49 PM
how to produce a csv file from a unicode-saved text file SAL Excel Programming 0 September 29th 03 03:26 AM


All times are GMT +1. The time now is 01:31 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright 2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"