Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Determine delimiter in csv or txt file
To all,
I would like to be able to programatically determine the delimiter within a txt or csv file so that the user is not faced with the Excel import text wizard.. The files come from several sources and the delimiter is likely to be tab, comma or semicolon and less likely a pipe or tilda. I'm thinking I need to count the number of each potential delimiter in a given number of rows probably excluding those that occur within quoted text. If you have any thoughts or know of any examples they'd be a great help Cheers Kieran |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Determine delimiter in csv or txt file
The following code will open the file and test for probable delimiters. If
Split returns an array of more than one element when splitting using a specific character as the delimiter, that character is a probable delimiter of the text on a single line of the input file. Change FName to the name of your file, and change PossibleDelimiters to include all the characters that might be a delimiter of the data. Sub AAA() Dim FNum As Integer Dim FName As String Dim Ndx As Long Dim InputLine As String Dim Arr As Variant Dim C As String Dim PossibleDelimiters As String PossibleDelimiters = ",;|~" & vbTab FName = "C:\Test.txt" '<<< CHANGE FNum = FreeFile Open FName For Input Access Read As #FNum Line Input #FNum, InputLine Close #FNum For Ndx = 1 To Len(PossibleDelimiters) C = Mid(PossibleDelimiters, Ndx, 1) Arr = Split(InputLine, C) If IsArray(Arr) = True Then If UBound(Arr) - LBound(Arr) + 1 1 Then Debug.Print "Likely Delimiter: ", C, Asc(C) End If End If Next Ndx End Sub -- Cordially, Chip Pearson Microsoft MVP - Excel Pearson Software Consulting, LLC www.cpearson.com (email address is on the web site) "Kieran H" wrote in message ps.com... To all, I would like to be able to programatically determine the delimiter within a txt or csv file so that the user is not faced with the Excel import text wizard.. The files come from several sources and the delimiter is likely to be tab, comma or semicolon and less likely a pipe or tilda. I'm thinking I need to count the number of each potential delimiter in a given number of rows probably excluding those that occur within quoted text. If you have any thoughts or know of any examples they'd be a great help Cheers Kieran |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Determine delimiter in csv or txt file
this looks for the most frequently occuring non letter, non number character
Sub ReadStraightTextFile() Dim sStr As String Dim LineofText As String Dim schr As String Dim s(0 To 127) As Long Dim sepChar As String, sepMax As Long Dim sName As Variant, i As Long Dim bStop As Boolean, s1 As String sName = Application.GetOpenFilename(FileFilter:="CSV Files (*.csv),*.csv") If sName = False Then Exit Sub Open sName For Input As #1 sStr = "" i = 1 Do While Not EOF(1) If i 2 Then Exit Do Line Input #1, LineofText sStr = LineofText i = i + 1 Loop Close #1 bStop = False For i = 1 To Len(sStr) schr = Mid(sStr, i, 1) If schr = """" Then bStop = Not bStop If Not bStop Then s(Asc(schr)) = s(Asc(schr)) + 1 s1 = s1 & schr End If Next sepMax = 0 sepChar = "," For i = 9 To 127 ' exclude double quote and space If i < 34 and i < 32 Then schr = Chr(i) If Not IsNumeric(schr) Then If UCase(schr) = LCase(schr) Then If s(i) sepMax Then sepChar = schr sepMax = s(i) End If End If End If End If Next MsgBox asc(sepChar) & ": " & sepChar End Sub It would be easily defeated by odd patterns !!!!!!!!!!!!!!!!,!!!!!!!!!!!,/////////// and some not so odd patterns 1/1/2007,1/2/2007,1/3/2007 but if you have some knowledge of the possibilities you might be able to craft it to a reasonable approach. -- Regards, Tom Ogilvy "Kieran H" wrote: To all, I would like to be able to programatically determine the delimiter within a txt or csv file so that the user is not faced with the Excel import text wizard.. The files come from several sources and the delimiter is likely to be tab, comma or semicolon and less likely a pipe or tilda. I'm thinking I need to count the number of each potential delimiter in a given number of rows probably excluding those that occur within quoted text. If you have any thoughts or know of any examples they'd be a great help Cheers Kieran |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Determine delimiter in csv or txt file
Haven't timed it, but this might be a faster way of doing this:
Sub test() MsgBox Asc(GuessDelimiter("C:\DelimTest.txt")) End Sub Function GuessDelimiter(strFile As String) As String Dim i As Byte Dim n As Long Dim x As Long Dim z As Byte Dim strText As String Dim arrDelimiters(1 To 5) As String strText = OpenTextFileToString(strFile) arrDelimiters(1) = "," arrDelimiters(2) = ";" arrDelimiters(3) = "|" arrDelimiters(4) = "~" arrDelimiters(5) = vbTab For i = 1 To 5 If i = 1 Then n = CountChar(arrDelimiters(i), strText) Else n = CountChar(arrDelimiters(i), strText, x) End If If n x Then x = n z = i End If Next i If z 0 Then GuessDelimiter = arrDelimiters(z) End If End Function Function OpenTextFileToString(ByVal strFile As String) As String Dim hFile As Long 'obtain file handle, open file and load into a string buffer hFile = FreeFile Open strFile For Input As #hFile OpenTextFileToString = Input$(LOF(hFile), hFile) Close #hFile End Function Function CountChar(strChar As String, _ strString As String, _ Optional lStopAtCount As Long = -1) As Long Dim lPos As Long Dim n As Long lPos = InStr(1, strString, strChar, vbBinaryCompare) If lPos = 0 Then CountChar = 0 Exit Function End If If lStopAtCount = -1 Then Do Until lPos = 0 lPos = InStr(lPos + 1, strString, strChar, vbBinaryCompare) n = n + 1 Loop Else Do Until lPos = 0 Or n lStopAtCount lPos = InStr(lPos + 1, strString, strChar, vbBinaryCompare) n = n + 1 Loop End If CountChar = n End Function You might make it a bit faster by altering the order of the delimiters in arrDelimiters. RBS "Kieran H" wrote in message ps.com... To all, I would like to be able to programatically determine the delimiter within a txt or csv file so that the user is not faced with the Excel import text wizard.. The files come from several sources and the delimiter is likely to be tab, comma or semicolon and less likely a pipe or tilda. I'm thinking I need to count the number of each potential delimiter in a given number of rows probably excluding those that occur within quoted text. If you have any thoughts or know of any examples they'd be a great help Cheers Kieran |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Determine delimiter in csv or txt file
Yes, I can see it makes sense to only look at the first line of the file for
several reasons. So, this would be better compared to my old function: Function GuessDelimiter(strFile As String) As String Dim i As Byte Dim n As Long Dim x As Long Dim z As Byte Dim hFile As Long Dim strFirstLine As String Dim arrDelimiters(1 To 5) As String hFile = FreeFile Open strFile For Input Access Read As #hFile Line Input #hFile, strFirstLine Close #hFile arrDelimiters(1) = "|" arrDelimiters(2) = "~" arrDelimiters(3) = ";" arrDelimiters(4) = vbTab arrDelimiters(5) = "," For i = 1 To 5 If i = 1 Then n = CountChar(arrDelimiters(i), strFirstLine) Else n = CountChar(arrDelimiters(i), strFirstLine, x) End If If n x Then x = n z = i End If Next i If z 0 Then GuessDelimiter = arrDelimiters(z) End If End Function RBS "Chip Pearson" wrote in message ... The following code will open the file and test for probable delimiters. If Split returns an array of more than one element when splitting using a specific character as the delimiter, that character is a probable delimiter of the text on a single line of the input file. Change FName to the name of your file, and change PossibleDelimiters to include all the characters that might be a delimiter of the data. Sub AAA() Dim FNum As Integer Dim FName As String Dim Ndx As Long Dim InputLine As String Dim Arr As Variant Dim C As String Dim PossibleDelimiters As String PossibleDelimiters = ",;|~" & vbTab FName = "C:\Test.txt" '<<< CHANGE FNum = FreeFile Open FName For Input Access Read As #FNum Line Input #FNum, InputLine Close #FNum For Ndx = 1 To Len(PossibleDelimiters) C = Mid(PossibleDelimiters, Ndx, 1) Arr = Split(InputLine, C) If IsArray(Arr) = True Then If UBound(Arr) - LBound(Arr) + 1 1 Then Debug.Print "Likely Delimiter: ", C, Asc(C) End If End If Next Ndx End Sub -- Cordially, Chip Pearson Microsoft MVP - Excel Pearson Software Consulting, LLC www.cpearson.com (email address is on the web site) "Kieran H" wrote in message ps.com... To all, I would like to be able to programatically determine the delimiter within a txt or csv file so that the user is not faced with the Excel import text wizard.. The files come from several sources and the delimiter is likely to be tab, comma or semicolon and less likely a pipe or tilda. I'm thinking I need to count the number of each potential delimiter in a given number of rows probably excluding those that occur within quoted text. If you have any thoughts or know of any examples they'd be a great help Cheers Kieran |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Determine delimiter in csv or txt file
On Feb 23, 10:18 pm, "RB Smissaert"
wrote: Haven't timed it, but this might be a faster way of doing this: Sub test() MsgBox Asc(GuessDelimiter("C:\DelimTest.txt")) End Sub Function GuessDelimiter(strFile As String) As String Dim i As Byte Dim n As Long Dim x As Long Dim z As Byte Dim strText As String Dim arrDelimiters(1 To 5) As String strText = OpenTextFileToString(strFile) arrDelimiters(1) = "," arrDelimiters(2) = ";" arrDelimiters(3) = "|" arrDelimiters(4) = "~" arrDelimiters(5) = vbTab For i = 1 To 5 If i = 1 Then n = CountChar(arrDelimiters(i), strText) Else n = CountChar(arrDelimiters(i), strText, x) End If If n x Then x = n z = i End If Next i If z 0 Then GuessDelimiter = arrDelimiters(z) End If End Function Function OpenTextFileToString(ByVal strFile As String) As String Dim hFile As Long 'obtain file handle, open file and load into a string buffer hFile = FreeFile Open strFile For Input As #hFile OpenTextFileToString = Input$(LOF(hFile), hFile) Close #hFile End Function Function CountChar(strChar As String, _ strString As String, _ Optional lStopAtCount As Long = -1) As Long Dim lPos As Long Dim n As Long lPos = InStr(1, strString, strChar, vbBinaryCompare) If lPos = 0 Then CountChar = 0 Exit Function End If If lStopAtCount = -1 Then Do Until lPos = 0 lPos = InStr(lPos + 1, strString, strChar, vbBinaryCompare) n = n + 1 Loop Else Do Until lPos = 0 Or n lStopAtCount lPos = InStr(lPos + 1, strString, strChar, vbBinaryCompare) n = n + 1 Loop End If CountChar = n End Function You might make it a bit faster by altering the order of the delimiters in arrDelimiters. RBS "KieranH" wrote in message ps.com... To all, I would like to be able to programatically determine the delimiter within a txt or csv file so that the user is not faced with the Excel import text wizard.. The files come from several sources and the delimiter is likely to be tab, comma or semicolon and less likely a pipe or tilda. I'm thinking I need to count the number of each potential delimiter in a given number of rows probably excluding those that occur within quoted text. If you have any thoughts or know of any examples they'd be a great help Cheers Kieran- Hide quoted text - - Show quoted text - This is excelent - Real world problems - Real world solutions My thanks and respect Cheers Kieran |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Determine delimiter in csv or txt file
No trouble.
As you don't really have to check the whole file look at my other posting and that of Chip Pearson. RBS "Kieran H" wrote in message ups.com... On Feb 23, 10:18 pm, "RB Smissaert" wrote: Haven't timed it, but this might be a faster way of doing this: Sub test() MsgBox Asc(GuessDelimiter("C:\DelimTest.txt")) End Sub Function GuessDelimiter(strFile As String) As String Dim i As Byte Dim n As Long Dim x As Long Dim z As Byte Dim strText As String Dim arrDelimiters(1 To 5) As String strText = OpenTextFileToString(strFile) arrDelimiters(1) = "," arrDelimiters(2) = ";" arrDelimiters(3) = "|" arrDelimiters(4) = "~" arrDelimiters(5) = vbTab For i = 1 To 5 If i = 1 Then n = CountChar(arrDelimiters(i), strText) Else n = CountChar(arrDelimiters(i), strText, x) End If If n x Then x = n z = i End If Next i If z 0 Then GuessDelimiter = arrDelimiters(z) End If End Function Function OpenTextFileToString(ByVal strFile As String) As String Dim hFile As Long 'obtain file handle, open file and load into a string buffer hFile = FreeFile Open strFile For Input As #hFile OpenTextFileToString = Input$(LOF(hFile), hFile) Close #hFile End Function Function CountChar(strChar As String, _ strString As String, _ Optional lStopAtCount As Long = -1) As Long Dim lPos As Long Dim n As Long lPos = InStr(1, strString, strChar, vbBinaryCompare) If lPos = 0 Then CountChar = 0 Exit Function End If If lStopAtCount = -1 Then Do Until lPos = 0 lPos = InStr(lPos + 1, strString, strChar, vbBinaryCompare) n = n + 1 Loop Else Do Until lPos = 0 Or n lStopAtCount lPos = InStr(lPos + 1, strString, strChar, vbBinaryCompare) n = n + 1 Loop End If CountChar = n End Function You might make it a bit faster by altering the order of the delimiters in arrDelimiters. RBS "KieranH" wrote in message ps.com... To all, I would like to be able to programatically determine the delimiter within a txt or csv file so that the user is not faced with the Excel import text wizard.. The files come from several sources and the delimiter is likely to be tab, comma or semicolon and less likely a pipe or tilda. I'm thinking I need to count the number of each potential delimiter in a given number of rows probably excluding those that occur within quoted text. If you have any thoughts or know of any examples they'd be a great help Cheers Kieran- Hide quoted text - - Show quoted text - This is excelent - Real world problems - Real world solutions My thanks and respect Cheers Kieran |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Using ~ as a delimiter for Excel file saves | Excel Discussion (Misc queries) | |||
Delimiter for csv and text file using ADO | Excel Programming | |||
Flat File with @ as delimiter | Excel Discussion (Misc queries) | |||
Adding a delimiter when importing a txt file | Excel Programming | |||
Tab Delimiter File | Excel Discussion (Misc queries) |