View Single Post
 
Old 02-06-2016, 02:09 PM
macropod's Avatar
macropod macropod is offline Windows 7 64bit Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,234
macropod has a brilliant futuremacropod has a brilliant futuremacropod has a brilliant futuremacropod has a brilliant futuremacropod has a brilliant futuremacropod has a brilliant futuremacropod has a brilliant futuremacropod has a brilliant futuremacropod has a brilliant futuremacropod has a brilliant futuremacropod has a brilliant future
Default Cleaning up Text Pasted from Websites, E-mails, PDFs etc.

When you paste text from a PDF, web site or an email, you may end up with a paragraph break at the end of every line within a logical paragraph, and two such breaks between logical paragraphs. Such text stubbornly refuses to honour justification, for example, because there's nothing to justify - it's all a series of one-line paragraphs. You should be able to see this if you have Word configured to display formatting marks on-screen. Clicking the symbol on the toolbar/home tab toggles this on/off.

The following series of wildcard Find/Replace actions cleans up text pasted from emails, websites, etc., that insert paragraph breaks at the end of every line. Note also that the process assumes there are at least two such paragraph breaks between the 'real' paragraphs.

To do a wildcard Find/Replace, open the Find/Replace dialogue, then click 'More' and click on the 'use wildcards' option.

Find = [ ^s^t]{1,}^13
Replace = ^p
Find = ([!^13^l])([^13^l])([!^13^l])
Replace = \1 \3
Find = [^s ]{2,}
Replace = ^32
Find = ([a-z])-[ ^s]{1,}([a-z])
Replace = \1\2
Find = [^13^l]{1,}
Replace = ^p

Note: Depending on your system's regional settings, you may need to replace all the commas in the above Find/Replace expressions with semi-colons. For example:
[ ^s^t]{1,}^13
becomes:
[ ^s^t]{1;}^13

The following macro automates the above Find/Replace sequence, as well as dealing with any internationalisation issues.
Code:
Sub CleanUpPastedText()
'Turn Off Screen Updating
Application.ScreenUpdating = False
Dim StrFR As String, i As Long
'Paired F/R expressions, each separated by |
StrFR = "[ ^s^t]{1,}^13|^p|([!^13^l])([^13^l])([!^13^l])|\1 \3|[^s ]{2,}| |([a-z])-[^s ]{1,}([a-z])|\1\2|[^13^l]{1,}|^p"
'Address any Internationalisation issues
If Application.International(wdListSeparator) = ";" Then
  StrFR = Replace(StrFR, ",", ";")
End If
With ActiveDocument.Range.Find
  .ClearFormatting
  .Replacement.ClearFormatting
  .Forward = True
  .Wrap = wdFindStop
  .Format = False
  .MatchAllWordForms = False
  .MatchSoundsLike = False
  .MatchWildcards = True
  'Process all F/R expressions
  For i = 0 To UBound(Split(StrFR, "|")) Step 2
    .Text = Split(StrFR, "|")(i)
    .Replacement.Text = Split(StrFR, "|")(i + 1)
    .Execute Replace:=wdReplaceAll
  Next
End With
'Restore Screen Updating
Application.ScreenUpdating = True
End Sub
For PC macro installation & usage instructions, see: Installing Macros
For Mac macro installation & usage instructions, see: Word:mac - Install a Macro

If you'd prefer to run the macro against just a selected range, change:
ActiveDocument
to:
Selection
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]