Microsoft Office Forums

  #1  
02-06-2022, 04:17 PM
hephalumph (Windows 11, Office 2021)
Novice
Join Date: Feb 2022
Posts: 9
find/remove duplicates

I have a large (3 million+ word) document that I converted from RTF into a Word document.

At the start of many (maybe most) sections, the section title is repeated for some reason.

So it has something like:

Section 1: How to do something

Section 1: How to do something

Paragraphs on how to do something.

Section 2: More details.

Section 2: More details.



Paragraphs of more details

and so on.

I already figured out how to do a case-sensitive find/replace on "Section" that leaves the text alone but sets the style of those lines to Heading 1, which formatted them and added them to the navigation pane at the same time. Which was awesome!

Now I am hoping there is some way that I can ferret out all of the duplicate section headings and remove the extras.

I found if I right-click the title in the navigation pane I can delete it from there (and learned to make sure it was the first of the duplicates, as the second would delete the doubled title PLUS all of the contents in that section!).

But with nearly 5,000 sections total, and close to (or maybe more than) half of them duplicated, this would be a long process.

Is there any way to go through and do this with some variation of a find/replace?
  #2  
02-06-2022, 05:15 PM
macropod (Windows 10, Office 2016)
Administrator
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 22,342

You could use a wildcard Find/Replace, where:
Find = (Section [0-9][!^13]@^13)\1
Replace = \1
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
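
For anyone who wants to try the idea outside Word, the same pattern can be sketched with Python's `re` module. This is only an illustration, not macropod's method: it assumes the document text has been exported to a plain string in which each paragraph ends with `\n` (standing in for Word's `^13` paragraph mark), and the function name `collapse_duplicate_headings` is made up for this example.

```python
import re

# Rough Python counterpart of the Word wildcard search
#   Find    = (Section [0-9][!^13]@^13)\1
#   Replace = \1
# Assumption: paragraphs end with "\n" here instead of Word's ^13.
DUPLICATE_HEADING = re.compile(r"(Section [0-9][^\n]+\n)\1")


def collapse_duplicate_headings(text: str) -> str:
    """Replace a heading paragraph that is immediately repeated with one copy."""
    return DUPLICATE_HEADING.sub(r"\1", text)


sample = (
    "Section 1: How to do something\n"
    "Section 1: How to do something\n"
    "Paragraphs on how to do something.\n"
    "Section 2: More details.\n"
    "Section 2: More details.\n"
    "Paragraphs of more details\n"
)

cleaned = collapse_duplicate_headings(sample)
print(cleaned)
```

In this Python version, changing the trailing `\1` to `\1+` would also collapse a heading that appears three or more times in a row. Whether Word's wildcard engine accepts an equivalent is not something this thread covers.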
  #3  
02-06-2022, 06:58 PM
hephalumph

Awesome, thanks! I knew there had to be a way.
Would you mind explaining to me what the various terms are there? Specifically, the !^13, @^13, and \1?

(Also, since it goes up into the thousands, would I need to change [0-9] to [0-999], or would the [0-9] cover any/all numbers?)
  #4  
02-06-2022, 07:08 PM
macropod

The [!^13]@^13 Find expression tells Word to find any string of characters, other than a paragraph break, terminated by a paragraph break.

The \1 Find expression tells Word to find exactly the same string as that found by the expression between the ().

It doesn't matter to the Find what character follows the first digit, provided it's not a paragraph break. Hence numbers 0-9999999... would be found.
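
The pieces of the expression map fairly directly onto regular-expression syntax, which may help readers who know that notation better. The sketch below is an analogy in Python, not Word's own engine; it assumes `\n` stands in for the paragraph mark `^13`, and the sample strings are invented.

```python
import re

# Word wildcard piece  ->  rough Python regex counterpart
#   [0-9]              ->  [0-9]     exactly one digit
#   [!^13]             ->  [^\n]     any one character except a paragraph break
#   @                  ->  +         one or more of the preceding item
#   ^13                ->  \n        the paragraph break itself (assumed "\n" here)
#   ( ... )            ->  ( ... )   remember what was matched as group 1
#   \1                 ->  \1        the same text that group 1 matched, again
heading = re.compile(r"(Section [0-9][^\n]+\n)")
doubled = re.compile(r"(Section [0-9][^\n]+\n)\1")

# [0-9] only has to match the first digit; [^\n]+ swallows the rest of the
# line, so a four-digit section number is matched without changing the pattern.
long_number = heading.match("Section 4321: Appendix\n")
first_digit_only = long_number is not None

# \1 demands an exact repeat: a heading followed by a different heading
# does not count as a duplicate.
exact_repeat = doubled.search("Section 7: Intro\nSection 7: Intro\n") is not None
near_repeat = doubled.search("Section 7: Intro\nSection 8: Intro\n") is not None

print(first_digit_only, exact_repeat, near_repeat)
```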
  #5  
02-06-2022, 07:16 PM
hephalumph

So would
Find = (Section [!^13]@^13)\1
Replace = \1

also work?
  #6  
02-06-2022, 07:20 PM
hephalumph

Hmm, Word found no results with that search string. I used, exactly:
Find = (Section [0-9][!^13]@^13)\1
Replace = \1

And even though I had one of the repeated section headings visible on screen, it reported that no results were found.
  #7  
02-06-2022, 07:20 PM
macropod

Not if you require 'Section' to be followed by a number and not a letter.
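
To make the difference concrete, here is a small Python comparison of the two patterns, again treating `\n` as the paragraph mark. It is a sketch under that assumption, and the sample text, including the "Section A" heading, is invented for illustration.

```python
import re

# With the digit requirement (post #2) and without it (post #5).
with_digit = re.compile(r"(Section [0-9][^\n]+\n)\1")
without_digit = re.compile(r"(Section [^\n]+\n)\1")

sample = (
    "Section 3: Setup\n"
    "Section 3: Setup\n"
    "Section A: Glossary\n"
    "Section A: Glossary\n"
)

strict = with_digit.sub(r"\1", sample)
loose = without_digit.sub(r"\1", sample)

print(strict)  # the doubled "Section A" heading is left alone
print(loose)   # both doubled headings are collapsed
```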
  #8  
02-06-2022, 07:21 PM
hephalumph

Would it matter that they are styled as Heading 1, and in the navigation pane? I wouldn't think it would matter, but...?
  #9  
02-06-2022, 07:21 PM
macropod

Did you check the 'use wildcards' option?
  #10  
02-06-2022, 07:23 PM
hephalumph

Quote:
Originally Posted by macropod
Did you check the 'use wildcards' option?
Of course I hadn't, lol. Thanks again! Worked a treat!
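
The same kind of failure is easy to reproduce outside Word. With 'Use wildcards' unchecked, characters such as the parentheses, brackets and `@` are presumably searched for as ordinary text rather than read as pattern syntax, so nothing in the document matches. The Python sketch below mimics that by escaping the pattern with `re.escape`; it is an analogy under that assumption, not a description of Word's internals.

```python
import re

text = "Section 1: How to do something\nSection 1: How to do something\n"
pattern = r"(Section [0-9][^\n]+\n)\1"

# Pattern syntax honoured: comparable to ticking "Use wildcards".
as_pattern = re.search(pattern, text)

# Every character taken literally: comparable to leaving the box unticked.
as_literal = re.search(re.escape(pattern), text)

print(as_pattern is not None)  # True
print(as_literal is not None)  # False
```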

All times are GMT -7.