Export CSV dash is \226 (unicode)
Gord Dibben wrote:
You want to get rid of all the "pipes" that are delimiting the text?
The pipe character is found above the Enter key: Shift + \
Edit > Replace
What: Alt + 0124 (enter the digits via the numeric keypad)
With: nothing
Gord Dibben MS Excel MVP
On Fri, 13 Nov 2009 14:35:54 -0800, wjr wrote:
Received an excel file which needs to be exported into a plain ascii csv
file. It appears that the originator had smart quoting or some kind of
unicode setup for this spreadsheet. The system which requires the csv
file cannot use unicode characters. The - characters must be a simple
hex 2d type of dash as in:
| 00 nul| 01 soh| 02 stx| 03 etx| 04 eot| 05 enq| 06 ack| 07 bel|
| 08 bs | 09 ht | 0a nl | 0b vt | 0c np | 0d cr | 0e so | 0f si |
| 10 dle| 11 dc1| 12 dc2| 13 dc3| 14 dc4| 15 nak| 16 syn| 17 etb|
| 18 can| 19 em | 1a sub| 1b esc| 1c fs | 1d gs | 1e rs | 1f us |
| 20 sp | 21 ! | 22 " | 23 # | 24 $ | 25 % | 26 & | 27 ' |
| 28 ( | 29 ) | 2a * | 2b + | 2c , | 2d - | 2e . | 2f / |
| 30 0 | 31 1 | 32 2 | 33 3 | 34 4 | 35 5 | 36 6 | 37 7 |
| 38 8 | 39 9 | 3a : | 3b ; | 3c < | 3d = | 3e > | 3f ? |
| 40 @ | 41 A | 42 B | 43 C | 44 D | 45 E | 46 F | 47 G |
| 48 H | 49 I | 4a J | 4b K | 4c L | 4d M | 4e N | 4f O |
| 50 P | 51 Q | 52 R | 53 S | 54 T | 55 U | 56 V | 57 W |
| 58 X | 59 Y | 5a Z | 5b [ | 5c \ | 5d ] | 5e ^ | 5f _ |
| 60 ` | 61 a | 62 b | 63 c | 64 d | 65 e | 66 f | 67 g |
| 68 h | 69 i | 6a j | 6b k | 6c l | 6d m | 6e n | 6f o |
| 70 p | 71 q | 72 r | 73 s | 74 t | 75 u | 76 v | 77 w |
| 78 x | 79 y | 7a z | 7b { | 7c | | 7d } | 7e ~ | 7f del|
How to strip this out?
Not the problem at all. I was showing the ascii table for those who
don't know it.
Start from here in case you are confused.
Here is a real piece of data saved from excel as csv file. NOTE: it's
part of the first field, so don't get confused by the lack of a ','
character.
AS SEEN in EXCEL:
Animal Rights - General
AS SAVED TO CSV:
Animal Rights û General
Here is an octal dump of the saved part:
od -bc x
0000000 101 156 151 155 141 154 040 122 151 147 150 164 163 040 226 040
A n i m a l R i g h t s 226
0000020 107 145 156 145 162 141 154 040 040 040 012
G e n e r a l \n
0000033
As you can very clearly see from this, the - dash character is octal 226
(hex 96) and not a proper - (octal 055, hex 2d).
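For anyone without od handy, the same check can be sketched in Python. This is only an illustration: the function name find_non_ascii is made up for this example, and the sample bytes are typed in by hand from the octal dump above.

```python
def find_non_ascii(data: bytes):
    """Return (offset, byte value) pairs for every byte above 0x7f."""
    return [(i, b) for i, b in enumerate(data) if b > 0x7F]


# Bytes copied by hand from the od dump above (octal 226 == hex 96).
sample = b"Animal Rights \x96 General   \n"

for offset, value in find_non_ascii(sample):
    print(f"offset {offset}: octal {value:03o}, hex {value:02x}")
# prints: offset 14: octal 226, hex 96
```

To scan a real export, read the file in binary mode (open(path, "rb").read()) and pass the result to the function, so no decoding happens before the bytes are inspected.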
Here is what I should be getting:
0000000 101 156 151 155 141 154 040 122 151 147 150 164 163 040 055 040
A n i m a l R i g h t s -
0000020 107 145 156 145 162 141 154 012
G e n e r a l \n
0000030
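One way to get from the saved bytes to the wanted ones is to post-process the CSV after Excel writes it. The sketch below is a guess at a workable approach, not a tested fix for this particular file. It assumes the file was saved in Windows-1252, where byte 96 hex is the en dash, 97 the em dash, and 91 to 94 the curly quotes. The names CP1252_TO_ASCII, asciify and convert_file are invented for the example.

```python
# Assumption: the CSV is Windows-1252, so these single bytes are the
# "smart" punctuation marks. Each one is mapped to a plain ASCII character.
CP1252_TO_ASCII = bytes.maketrans(
    b"\x91\x92\x93\x94\x96\x97",  # curly single/double quotes, en/em dash
    b"''\"\"--",                  # ASCII apostrophe, double quote, hyphen (hex 2d)
)


def asciify(data: bytes) -> bytes:
    """Replace the mapped Windows-1252 bytes; all other bytes pass through."""
    return data.translate(CP1252_TO_ASCII)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write the cleaned copy to dst_path."""
    with open(src_path, "rb") as src:
        cleaned = asciify(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)


# The field from the dump above, before and after.
before = b"Animal Rights \x96 General   \n"
print(asciify(before))
# prints: b'Animal Rights - General   \n'
```

The sketch leaves the three trailing spaces in place, since those are a separate matter from the dash. Any other byte above 7f hex also passes through unchanged, so it is worth scanning the output again afterwards.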