|
|
Thread Tools | Display Modes |
#1
|
|||
|
|||
Is there a way to delete the beginning part of an array so it doesn't overflow?
Hi Guys,
So I need to import a large text file into an array. I only need the last 5,000 lines or so. Most of the files are ok, but occasionally (around 30000 lines) I get an overflow error. Since I only need the last five thousand lines or so, is there a way to delete the beginning part of the array as more is added after 5,000 lines, I.E. when line 5001 is imported line 1 gets erased? I tried the below but still got the overflow error. I wouldn't have to delete pieces of the array one at a time, either - I could say that at 10,000 the first 5000 get deleted, at 15,000 the next 5000 get deleted, etc. Or might I just be better off reading the file twice to count the lines? Code:
Do While Not fo.AtEndOfStream Count = Count + 1 ReDim Preserve AllArray(Count) AllArray(Count) = fo.ReadLine 'If Count > 5000 Then AllArray(Count - 5000) = "" Loop |
#2
|
||||
|
||||
Yeah, I've never used ReDim Preserve but as I understand the documentation that won't work in this case.
The simplest way may indeed be to read the file twice. It probably wouldn't slow your program down, much. A more complicated alternative, if you want to try it, would be to have two arrays, and every time you have to "delete" the first 20K-or-whatever entries you just ReDim the other array and copy over the last 5K lines. Since that involves all RAM, not disc I/O, it might be faster. Might be a pain to program reliably, though; I'd probably just read the file twice, unless it's so big that reading it twice slows your program down by 15% or more. But there's another possibility; how about using a Collection? That makes it much easier, because you can delete the first item, or indeed any item at all, from a Collection. It may be slower but it may not; I've had to store a lot of information in collections, from time to time, and I can't tell that they're any slower than arrays. |
#3
|
|||
|
|||
Thanks! While reprogramming, I found another problem: I had dimmed "count" as an integer, and apparently Excel doesn't think there's integers above 30,000! I now have the program read the file twice by closing and reopening it and only put the end of the file in the array, and I got it to run on a 70k line file on a seven year old laptop in only a few seconds.
|
#4
|
|||
|
|||
I also thought that I would post the code in case anyone else in the future tries a similar method:
Counts whole file (megacount is the new counter I made to create the entire file) Code:
Do While Not fo.AtEndOfStream 'Counts all lines fo.readline Megacount = Megacount + 1 Loop Code:
fo.Close Set fo = fso.OpenTextFile(FName, 1) Code:
countagain=0 'This is required to count what line the file is on during the second run Count=0 If Megacount < 10000 Then 'For small files under 10000 lines, entire array is imported Do While Not fo.AtEndOfStream 'Puts the rest of the file into an array after title line Count = Count + 1 ReDim Preserve AllArray(Count) AllArray(Count) = fo.readline Loop Else 'Bigger files above 10000 lines. 'Cycles through beginning of file without putting anything in array Do Until Countagain > Megacount - 9999 fo.readline Countagain = Countagain + 1 Loop Do While Not fo.AtEndOfStream 'Actually puts things in array, Count is the array counter ReDim Preserve AllArray(Count) AllArray(Count) = fo.readline Count = Count + 1 Loop End If |
#5
|
||||
|
||||
I kinda thought that reading the file twice wouldn't hurt you. If it gets to be more than a few megs in size, then it may be time to try a more complex method.
By the way, about integers greater than 32K; I think you'll find — let me look it up just to be sure .... yes, the Integer data type is stored in two bytes of storage and can therefore go "from -32,768 to 32,767". But the Long data type uses a full word (four bytes) of storage and can hold values "from -2,147,483,648 to 2,147,483,647". And in Office 2010 there's even a LongLong type that doubles storage yet again (8 bytes, 64 bits) and holds "-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807". So you can probably hold more data in your arrays than you feared. |
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Delete bookmark which doesn't have a name | rarulkumar | Word VBA | 2 | 04-19-2013 03:19 AM |
Control direction of resizing textarea on overflow | noodle | PowerPoint | 0 | 03-31-2013 04:52 AM |
Convert String Array to Integer Array from a User Input? | tinfanide | Excel Programming | 4 | 12-26-2012 08:56 PM |
Files beginning ~$ | RobinW | Word | 1 | 06-28-2012 05:35 PM |
Before the table but after beginning of doc? | Jaymond Flurrie | Word VBA | 2 | 08-16-2011 06:27 AM |