Microsoft Office Forums

How to find duplicate phrases/paragraphs in a long document

  #1  
12-26-2016, 05:33 PM
macropod (Windows 7 64bit, Office 2010 32bit)
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 22,513

You could use almost identical code for 'sentences' (note my previous caveat):
Code:
Sub FindDuplicateSentences()
Application.ScreenUpdating = False
Dim i As Long, RngSrc As Range, RngFnd As Range
Const Clr As Long = wdBrightGreen
Dim eTime As Single
eTime = Timer
Options.DefaultHighlightColorIndex = Clr
With ActiveDocument
  With .Range.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Forward = True
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    .Execute
  End With
  For i = 1 To .Sentences.Count
    If i Mod 100 = 0 Then DoEvents
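    On Error Resume Next
    Set RngSrc = .Sentences(i)
    ' Skip sentences already highlighted as duplicates
    If RngSrc.HighlightColorIndex <> Clr Then
      ' Search only the text that follows the current sentence
      Set RngFnd = .Range(RngSrc.End, .Range.End)
      ' Find.Text accepts at most 255 characters
      If Len(RngSrc.Text) < 256 Then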
        With RngFnd.Find
          .Text = RngSrc.Text
          .Replacement.Text = "^&"
          .Replacement.Highlight = True
          .Wrap = wdFindStop
          .Execute Replace:=wdReplaceAll
        End With
      Else
        With RngFnd
          With .Find
            .Text = Left(RngSrc.Text, 255)
            .Wrap = wdFindStop
            .Execute
          End With
          Do While .Find.Found
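            ' The match holds only the first 255 characters, so compare the
            ' whole VBA 'sentence' containing it with the source sentence
            If RngSrc.Text = .Sentences(1).Text Then
              RngSrc.HighlightColorIndex = Clr
              .Sentences(1).HighlightColorIndex = Clr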
            End If
            .Collapse wdCollapseEnd
            .Find.Execute
          Loop
        End With
      End If
    End If
  Next
End With
' Report time taken. Elapsed time calculation allows for execution to extend past midnight.
MsgBox "Finished. Elapsed time: " & (Timer - eTime + 86400) Mod 86400 & " seconds."
Application.ScreenUpdating = True
End Sub
I'd expect this to take somewhat longer, though. However, if you've already highlighted the duplicate paragraphs, execution should be a bit quicker, because sentences that are already highlighted are skipped.
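If run time becomes a problem, one alternative to the Find-based approach above would be a single pass that remembers each 'sentence' it has seen. This is only a minimal sketch under stated assumptions, not the method used in the macro above: it assumes Windows (for Scripting.Dictionary), the name HighlightRepeatedSentences is made up for illustration, trailing spaces and paragraph marks are ignored, comparison is case-insensitive, and only the second and later occurrences of a whole VBA 'sentence' are highlighted.

```vba
Sub HighlightRepeatedSentences()
' Hypothetical alternative sketch: one pass over the document, using a
' dictionary of sentence texts instead of a Find for every sentence.
Dim Seen As Object, Rng As Range, Key As String
Set Seen = CreateObject("Scripting.Dictionary") ' assumes Windows scripting runtime
Seen.CompareMode = vbTextCompare ' case-insensitive keys; set before adding any key
Application.ScreenUpdating = False
For Each Rng In ActiveDocument.Sentences
  ' Assumption: ignore paragraph marks and leading/trailing spaces when comparing
  Key = Trim$(Replace(Rng.Text, vbCr, ""))
  If Len(Key) > 0 Then
    If Seen.Exists(Key) Then
      ' Text already seen earlier in the document: flag this repeat
      Rng.HighlightColorIndex = wdBrightGreen
    Else
      Seen.Add Key, True
    End If
  End If
Next Rng
Application.ScreenUpdating = True
End Sub
```

Because it compares whole VBA 'sentences' only, this sketch would not flag a sentence whose text merely appears inside a longer one, so its results may differ from those of the macro above.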
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
  #2  
12-26-2016, 09:28 PM
iamgator (Windows 7 64bit, Office 2007)
Banned
 
Join Date: Dec 2016
Posts: 3

The second task took a little over 4500 seconds, but it was worth the wait. This task would have driven me nuts without your help, and I genuinely appreciate it. However, with the "sentences" macro I also received several false positives. Is this because the script also calculates similarity between sentences, so that sentences beyond a given threshold are automatically flagged as duplicates even without word-for-word duplication? That would be interesting to know.

Last edited by iamgator; 12-26-2016 at 09:32 PM. Reason: needed to add extra information
  #3  
12-27-2016, 01:34 AM
macropod (Windows 7 64bit, Office 2010 32bit)
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 22,513

Quote:
Originally Posted by iamgator
I also received several false positives. Is this because the script also calculates similarity between sentences, so that sentences beyond a given threshold are automatically flagged as duplicates even without word-for-word duplication? That would be interesting to know.
No, the macro does no similarity scoring; it only looks for the same text (ignoring case). There shouldn't be any false positives in terms of VBA 'sentences'. However, because of the limitations in what VBA counts as a sentence, parts of grammatical sentences may be highlighted even though the grammatical sentences as a whole differ. This could even lead to a situation where all of a grammatical sentence somewhere in the document is highlighted because each of its VBA 'sentence' parts is found elsewhere.
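To see where Word draws its 'sentence' boundaries in a passage that was flagged unexpectedly, a small helper along these lines could be run with that passage selected. It is only a sketch: the name ListVbaSentences is made up for illustration, and it simply prints each item of the selection's Sentences collection to the Immediate window (Ctrl+G in the VBA editor).

```vba
Sub ListVbaSentences()
' Illustrative helper (hypothetical name): prints each VBA 'sentence'
' that overlaps the current selection, one per line.
Dim Rng As Range, i As Long
For Each Rng In Selection.Sentences
  i = i + 1
  ' Brackets make trailing spaces and paragraph marks easier to spot
  Debug.Print i & ": [" & Rng.Text & "]"
Next Rng
End Sub
```

If one grammatical sentence prints as several numbered items, each of those items is what the macro searches for separately, which would explain highlighting that looks like a false positive.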
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Tags
macro, vba




All times are GMT -7.

