Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Ever wondered how to work with large text imports 256 columns?
Since I found a rather nice solution to a problem I had, I wanted to
share it - as is: I needed to import a rather large (ca. 50,000 rows) text file, which had a good bit over 256 columns, but only needed a few of those columns in an Excel sheet, then export most of the rest to a text file again, but without *some* columns. To "trick" Excel into doing the file reading for me, I used Workbooks.OpenText but instead of giving it the true column separator (tab), I told the function Tab:=False and Semicolon:=True (using a separator that did not occur at all in my data). That way, Excel read the whole file at once, into column 1 of a new workbook. 1-liner instead of bothering to do all the file reading yourself :) Then, for the columns I needed in an Excel Worksheet, I wrote a little VBA function to get those columns from a string that contains a whole tab-separated (or other separator) row, as stored in column 1 now. I then looped over the number of rows and assigned the return value of my function to the cell content in column 1 of my (new) target worksheet. Now you can use the Excel Range.TextToColumns function on that sheet and you got the first part of the job done. For the data I wanted to re-export, I wrote another function that saves a text file *directly* from the imported worksheets column 1, by looping over each row, re-assembling the line without the columns I wanted to remove, and then saving to a text file. Done. I processed 266 columns, 49195 rows, extracting the first 11 columns to an Excel Sheet, re-exporting all but 1 column to a text file, on a 2.21 Dual Core System with 2GB RAM and WinXP SP2 within about 16 seconds, that includes opening the original text file (28.9MB) and saving both the new Workbook and the re-export of the text without an unwanted column. For the code used in my case, see below if interested. HTH someone! Lars VBA Code used (might need some adaptation for your needs): ' Function: getFirstXColumns ' Purpose: return the first colNumber tab-separated values from _ ' strLine, including the trailing tabs ' Example Usage (returns "abc" & vbTab & "def" & vbTab): ' columnValues = getFirstXColumns ("abc" & vbTab & "def" & vbTab _ ' & "ghi" & vbTab, 2) Public Function getFirstXColumns(strLine As String, _ colNumber As Integer) As String Dim col As Integer Dim pos As Long pos = 0 For col = 1 To colNumber pos = InStr(pos + 1, strLine, vbTab) If (pos = 0) Then getFirstXColumns = strLine Exit Function End If Next col getFirstXColumns = Left(strLine, pos) End Function ' Function: saveTextWithoutColumn ' Purpose: consider column 1 of ws a tab-separated table, determine _ ' the position of the column indicated by colName and save the _ ' table to the indicated file without that column ' Example Usage: ' saveTextWithoutColumn ws, "myfilename.txt", "unwanted column name" Private Sub saveTextWithoutColumn(ws As Worksheet, file As String, _ colName As String) Dim buffer As String Dim col As Long, colNumber As Long Dim pos As Long, endpos As Long Dim lastRow As Long, row As Long Dim fileHandle As Integer Dim skipColStart As Long, skipColEnd As Long buffer = ws.Cells(1, 1).value pos = 1 endpos = 1 col = 0 Do While (endpos 0) ' while a new col separator was found (vbTab) endpos = InStr(pos, buffer, vbTab) If (endpos 0) Then col = col + 1 If (Mid(buffer, pos, endpos - pos) = colName) Then Exit Do End If Else col = 0 End If pos = endpos + 1 Loop colNumber = col ' Determine the highest used row number in the imported data lastRow = ws.Cells.Find("*", SearchOrder:=xlByRows, _ SearchDirection:=xlPrevious).row fileHandle = FreeFile() On Error Resume Next Kill file On Error GoTo 0 Open file For Binary Access Write As #fileHandle ' if-statement wrapped around whole loop in case of non-existant ' column for speed optimisation If colNumber = 0 Then For row = 1 To lastRow ' attach line break to line and write to file buffer = ws.Cells(row, 1).value & lineBreak Put #fileHandle, , buffer Next row Else For row = 1 To lastRow buffer = ws.Cells(row, 1).value ' get current line ' determine start and end positions of column to remove pos = 0 For col = 1 To colNumber - 1 pos = InStr(pos + 1, buffer, vbTab) If (pos = 0) Then Exit For Next col skipColStart = pos + 1 skipColEnd = InStr(skipColStart, buffer, vbTab) ' assemble & write line to file without column to be skipped buffer = Left(buffer, skipColStart - 1) _ & Mid(buffer, skipColEnd + 1) & lineBreak Put #fileHandle, , buffer Next row End If Close #fileHandle End Sub Sub handleData(path As String, filename As String) Dim wb As Workbook Dim ws As Worksheet Dim wsNew As Worksheet Dim lastRow As Long Dim row As Long Workbooks.OpenText path & filename & ".txt", Origin:=xlMSDOS, _ DataType:=xlDelimited, Tab:=False, Semicolon:=True, _ DecimalSeparator:=".", ThousandsSeparator:="," Set wb = Workbooks(filename & ".txt") Set ws = wb.Worksheets(filename) ' Determine the highest used row number in the imported data lastRow = ws.Cells.Find("*", SearchOrder:=xlByRows, _ SearchDirection:=xlPrevious).row Set wsNew = wb.Worksheets.Add() wsNew.Name = "my data" For row = 1 To lastRow ' first 11 columns wanted in my case - might differ for you wsNew.Cells(row, 1).value = getFirstXColumns( _ ws.Cells(row, 1).value, 11) Next row wsNew.Range(wsNew.Cells(1, 1), _ wsNew.Cells(lastRow, 1)).TextToColumns DataType:=xlDelimited, _ Tab:=True, DecimalSeparator:=".", ThousandsSeparator:="," saveTextWithoutColumn ws, "myfilename.txt", "unwanted column name" Application.DisplayAlerts = False ws.Delete Set ws = Nothing wb.SaveAs path & wsNew.Name & ".xls", xlNormal Application.DisplayAlerts = True wb.Close Set wsNew = Nothing Set wb = Nothing End Sub |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Imports Excel = Microsoft.Office.Interop.Excel doesn't work | Excel Programming | |||
need way to handle large csv imports that overflow cell size | Excel Discussion (Misc queries) | |||
repetitive text imports | Excel Programming | |||
ODBC imports numbers as text | Excel Programming | |||
Excel,Auto fit - does work for large text fields | Excel Discussion (Misc queries) |