#1
I have a large (3 million+ word) document that I converted from RTF into a Word document.

At the start of many (maybe most) sections, the section title is repeated for some reason. So it has something like:

Section 1: How to do something
Section 1: How to do something
Paragraphs on how to do something.
Section 2: More details.
Section 2: More details.
Paragraphs of more details.

and so on.

I already figured out how to do a case-sensitive find/replace on "Section" that leaves the text alone but sets the style of those lines to Heading 1, which formatted them and added them to the navigation pane at the same time. Which was awesome!

Now I am hoping there is some way to ferret out all of the duplicate section headings and remove the extras. I found that if I right-click the title in the navigation pane I can delete it from there (and learned to make sure it was the first of the duplicates, as deleting the second would remove the doubled title PLUS all of the contents in that section!). But with nearly 5,000 sections total, and close to (or maybe more than) half of them duplicated, this would be a long process.

Is there any way to go through and do this with some variation of a find/replace?
#2
You could use a wildcard Find/Replace, where:
Find = (Section [0-9][!^13]@^13)\1
Replace = \1
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#3
Awesome, thanks! I knew there had to be a way.
Would you mind explaining what the various terms there mean? Specifically, the !^13, @^13, and \1? (Also, since the section numbers go up into the thousands, would I need to change [0-9] to [0-999], or would [0-9] cover any/all numbers?)
#4
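The [!^13]@^13 Find expression tells Word to find any string of one or more characters, other than a paragraph break, terminated by a paragraph break. Here ^13 is the paragraph break, [!^13] is any single character that is not a paragraph break, and @ means one or more of the preceding item.
The \1 Find expression tells Word to find exactly the same string as that found by the expression between the (). The [0-9] matches only the first digit; it doesn't matter to the Find what character follows that digit, provided it's not a paragraph break. Hence numbers 0-9999999... would be found without changing [0-9].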
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
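For readers more used to regular expressions, the explanation above can be sketched with a rough Python analogue. This is only an illustration, not how Word evaluates wildcards: it assumes paragraphs end with "\n" rather than Word's ^13, and the names `DUPLICATE_HEADING` and `remove_duplicate_headings` and the sample text are made up for the example.

```python
import re

# Rough analogue of the Word wildcard pattern (Section [0-9][!^13]@^13)\1.
# Assumption: paragraphs end with "\n" here, whereas Word uses ^13.
#   [0-9]   one digit right after "Section "
#   [^\n]+  one or more characters that are not a paragraph break
#   \n      the paragraph break that ends the heading
#   \1      the exact same text as the parenthesised group, repeated
DUPLICATE_HEADING = re.compile(r"(Section [0-9][^\n]+\n)\1")


def remove_duplicate_headings(text: str) -> str:
    """Replace each doubled heading paragraph with a single copy."""
    return DUPLICATE_HEADING.sub(r"\1", text)


# Hypothetical sample modelled on the first post.
sample = (
    "Section 1: How to do something\n"
    "Section 1: How to do something\n"
    "Paragraphs on how to do something.\n"
    "Section 1234: More details.\n"
    "Section 1234: More details.\n"
    "Paragraphs of more details.\n"
)

cleaned = remove_duplicate_headings(sample)
print(cleaned)
```

The multi-digit heading is handled because only the first digit has to match `[0-9]`; the remaining digits fall under "any character that is not a paragraph break".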
#5
So would
Find = (Section [!^13]@^13)\1
Replace = \1
also work?
#6
Hmm, Word found no results with that search string. I used, exactly,
Find = (Section [0-9][!^13]@^13)\1
Replace = \1
And even though I had one of the repeated section headings visible on screen, it reported that no results were found.
#7
Not if you require 'Section' to be followed by a number and not a letter.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
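The difference between the two Find expressions can be illustrated with the same kind of Python analogue. Again this is a sketch under assumptions: "\n" stands in for Word's ^13, and the two sample strings are hypothetical. Without the digit requirement, any doubled paragraph that starts with "Section " is matched, including ones where a letter follows.

```python
import re

# Analogue of (Section [0-9][!^13]@^13)\1 : a digit must follow "Section ".
with_digit = re.compile(r"(Section [0-9][^\n]+\n)\1")
# Analogue of (Section [!^13]@^13)\1 : any non-break characters may follow.
without_digit = re.compile(r"(Section [^\n]+\n)\1")

# Hypothetical doubled paragraphs: one numbered heading, one ordinary line.
numbered = "Section 7: Setup\nSection 7: Setup\n"
lettered = "Section notes follow\nSection notes follow\n"

for name, pattern in (("with digit", with_digit), ("without digit", without_digit)):
    print(name, bool(pattern.search(numbered)), bool(pattern.search(lettered)))
```

Both patterns catch the doubled numbered heading; only the looser one also catches the doubled line that begins "Section notes".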
#8
Would it matter that they are styled as Heading 1, and in the navigation pane? I wouldn't think it would matter, but...?
#9
Did you check the 'use wildcards' option?
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#10
Of course I hadn't, lol. Thanks again! Worked a treat!