![]() |
#1
|
|||
|
|||
![]()
Hi
I have this source text (*.rtf) containing information in (sort of) tab-seperated columns. I want to copy it into a excel worksheet to do further analyzing. My problem is, that the tabulator signs 'moving' the text into columns are of different length in the .rtf/.doc file, but exporting it excel treats them the same: In the rtf file the signs look like this: -------------> ------------------------------------------> Reading them as separators excel treats them alike. I guess what I'm looking for is an easy way to (maybe w/search&replace) make the 'long' tabs into several tabs: -------------> ------------->------------->-------------> Real examples attached It must be a 'automatic' solution, since we're talking about a rtf file of 1000+ pages.. |
#2
|
||||
|
||||
![]()
You document has been formatted with different numbers of tab-stops for the different paragraphs, depending on what they contain. You can reformat the document with consistent tabbing with the following macro:
Code:
Sub Demo() Application.ScreenUpdating = False Dim i As Long, j As Long, para As Paragraph, Fmt As ParagraphFormat With ActiveDocument .DefaultTabStop = 0 j = .Paragraphs(1).TabStops.Count .Range.ParagraphFormat = .Paragraphs(1).Range.ParagraphFormat For i = 1 To .Paragraphs.Count While (Len(.Paragraphs(i).Range.Text) - Len(Replace(.Paragraphs(i).Range.Text, vbTab, ""))) < j .Paragraphs(i).Range.InsertBefore vbTab Wend Next End With Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#3
|
|||
|
|||
![]()
Thank you sooooo much!!! This works (almost) exactly how I imagined. The only problem is:
When there is nothing in the last "column" in the source text, the macro moves the content from the second to last "column" to this one instead, or, moves everything a column too much to the right (maybe that what it does). New example (1th page source text, 2nd page after macro) |
#4
|
||||
|
||||
![]()
Try the following code revision:
Code:
Sub Demo() Application.ScreenUpdating = False Dim i As Long, j As Long, k As Long, para As Paragraph, Fmt As ParagraphFormat, HPos As Single With ActiveDocument .DefaultTabStop = 0 j = .Paragraphs(1).TabStops.Count For i = 1 To .Paragraphs.Count For k = 1 To j If .Paragraphs(i).Range.Characters.Last.Information(wdHorizontalPositionRelativeToTextBoundary) < _ .Paragraphs(1).TabStops(k).Position Then .Paragraphs(i).Range.Characters.Last.InsertBefore vbTab Next While (Len(.Paragraphs(i).Range.Text) - Len(Replace(.Paragraphs(i).Range.Text, vbTab, ""))) < j .Paragraphs(i).Range.InsertBefore vbTab Wend Next .Range.ParagraphFormat = .Paragraphs(1).Range.ParagraphFormat End With Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#5
|
|||
|
|||
![]()
First of all thank you so much!!
Is it possible to add some code to the macro, making it possible to follow the progress - without delaying it too much? What I mean is, since the document is VERY long, literally thousands of pages, the process always takes a while (not to sat a very long time). It would be nice to be able to follow the process. Not by adding Code:
Application.ScreenUpdating = True Last edited by Flabbergaster; 10-29-2012 at 09:07 AM. Reason: misspelling |
#6
|
||||
|
||||
![]()
Try:
Code:
Sub Demo() Application.ScreenUpdating = False Dim i As Long, j As Long, k As Long, l As Long Dim para As Paragraph, Fmt As ParagraphFormat, HPos As Single With ActiveDocument .DefaultTabStop = 0 j = .Paragraphs(1).TabStops.Count l = .Paragraphs.Count For i = 1 To l Application.StatusBar = "Processing paragraph " & i & " of " & l For k = 1 To j If .Paragraphs(i).Range.Characters.Last.Information(wdHorizontalPositionRelativeToTextBoundary) < _ .Paragraphs(1).TabStops(k).Position Then .Paragraphs(i).Range.Characters.Last.InsertBefore vbTab Next While (Len(.Paragraphs(i).Range.Text) - Len(Replace(.Paragraphs(i).Range.Text, vbTab, ""))) < j .Paragraphs(i).Range.InsertBefore vbTab Wend If i Mod 100 = 0 Then DoEvents Next .Range.ParagraphFormat = .Paragraphs(1).Range.ParagraphFormat End With Application.StatusBar = "Done!!" Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#7
|
|||
|
|||
![]()
Hi again
You're a genius.... Thanks. Is there any tweaks one could add to make it run faster? I see now, that there is 275103 paragraphs and it takes appr. 1500 pr. 10 min now? |
#8
|
||||
|
||||
![]()
I've tweaked the code for about as much performance as can be eked out of it:
Code:
Sub Demo() Application.ScreenUpdating = True Dim i As Long, j As Long, k As Long, l As Long, m As Long Dim pRng As Paragraph, HPos As Single, StrStops As String With ActiveDocument .DefaultTabStop = 0 j = .Paragraphs(1).TabStops.Count For k = 1 To j StrStops = StrStops & "," & .Paragraphs(1).TabStops(k).Position Next l = .Paragraphs.Count With .Range.Duplicate For Each pRng In .Paragraphs i = i + 1 Application.StatusBar = "Processing paragraph " & i & " of " & l With pRng.Range m = 0 For k = 1 To j If .Characters.Last.Information(wdHorizontalPositionRelativeToTextBoundary) < _ Split(StrStops, ",")(k) Then m = j - k Exit For End If Next Select Case m Case 1: .Characters.Last.InsertBefore vbTab Case 2: .Characters.Last.InsertBefore vbTab & vbTab Case 3: .Characters.Last.InsertBefore vbTab & vbTab & vbTab Case 4: .Characters.Last.InsertBefore vbTab & vbTab & vbTab & vbTab Case 5: .Characters.Last.InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab Case 6: .Characters.Last.InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab & vbTab Case 7: .Characters.Last.InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab Case 8: .Characters.Last.InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab Case 9: .Characters.Last.InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab End Select m = Len(.Text) - Len(Replace(.Text, vbTab, "")) Select Case m Case j - 1: .InsertBefore vbTab Case j - 2: .InsertBefore vbTab & vbTab Case j - 3: .InsertBefore vbTab & vbTab & vbTab Case j - 4: .InsertBefore vbTab & vbTab & vbTab & vbTab Case j - 5: .InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab Case j - 6: .InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab & vbTab Case j - 7: .InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab Case j - 8: .InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab Case j - 9: .InsertBefore vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab & vbTab End Select If i Mod 10 = 0 Then DoEvents ActiveDocument.UndoClear End If End With Next .ParagraphFormat = .Paragraphs(1).Range.ParagraphFormat End With End With Application.StatusBar = "Done!!" Application.ScreenUpdating = True End Sub Ultimately, you'd do far better getting whatever's generating the source file to put all the required tabs into the file at the outset.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#9
|
|||
|
|||
![]()
Wow, thanks... Already much faster
Which part do I change to a) make the updating less frequent ( is it i = i + 1 -> i = i +, say, 10 or 25?? b) remove the progress report completely |
#10
|
||||
|
||||
![]()
To delete the progress report, comment out or delete the line:
Application.StatusBar = "Processing paragraph " & i & " of " & l To give, say, a % progress report requires adding more code to calculate the %. That calculation has its own overhead which, compared, updating the screen as per the existing progress report, takes only marginally less time. An alternative would be to relocate the existing status code to just above/below the 'DoEvents' line. That'll update the report only every 10th paragraph.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
![]() |
Tags |
columns, search/replace, tabulator |
Thread Tools | |
Display Modes | |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Advanced search and replace for bullet points | TishyMouse | Word | 5 | 02-17-2012 06:32 AM |
Any wildcards search and replace in Powerpoint 2010? | tinfanide | PowerPoint | 0 | 09-10-2011 02:17 AM |
![]() |
seteshpl | Word | 1 | 09-06-2011 01:35 AM |
Search Replace Copy | dblack7211 | Word | 0 | 05-05-2010 01:19 PM |
Search and Replace - Clear Search box | JostClan | Word | 1 | 05-04-2010 08:46 PM |