View Single Post
 
Old 04-13-2011, 03:09 PM
macropod's Avatar
macropod macropod is offline Windows 7 32bit Office 2000
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,963
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

Hi Metamag,

Often, when you paste text into Word from web sites, email or PDFs, you'll get a paragraph break at the end of each line within a logical paragraph, and two such breaks between logical paragraphs. Such text stubbornly refuses to honour justification, because there's nothing to justify - it's all a series on one-line paragraphs. You should be able to see this if you have Word configured to display formatting marks on-screen. Clicking the ¶ symbol on the toolbar/home tab toggles this on/off.

You can fix this kind of paragraph formatting using a series of wildcard Find/Replace actions, with:
Find = [ ^s^t]{1,}^13
Replace = ^p
Find = ([!^13])([^13])([!^13])
Replace = \1 \3
Find = [ ]{2,}
Replace = ^32
Find = ([a-z])-[ ]{1,}([a-z])
Replace = \1\2
Find = [^13]{1,}
Replace = ^p
Note that the above assumes there are at least two such paragraph breaks between the 'real' paragraphs.

A macro equivalent is:
Code:
Sub CleanUpPastedText()
'Turn Off Screen Updating
Application.ScreenUpdating = False
With ActiveDocument.Range.Find
  .ClearFormatting
  .Replacement.ClearFormatting
  .Forward = True
  .Wrap = wdFindStop
  .Format = False
  .MatchAllWordForms = False
  .MatchSoundsLike = False
  .MatchWildcards = True
  'Eliminate spaces & tabs before paragraph breaks.
  .Text = "[ ^s^t]{1,}^13"
  .Replacement.Text = "^p"
  .Execute Replace:=wdReplaceAll
  'Replace single paragraph breaks with a space
  .Text = "([!^13])([^13])([!^13])"
  .Replacement.Text = "\1 \3"
  'Replace all double spaces with single spaces
  .Execute Replace:=wdReplaceAll
  .Text = "[ ]{2,}"
  .Replacement.Text = " "
  'Delete hypens in hyphenated text formerly split across lines
  .Execute Replace:=wdReplaceAll
  .Text = "([a-z])-[ ]{1,}([a-z])"
  .Replacement.Text = "\1\2"
  .Execute Replace:=wdReplaceAll
  'Limit paragraph breaks to one per 'real' paragraph.
  .Text = "[^13]{1,}"
  .Replacement.Text = "^p"
  .Execute Replace:=wdReplaceAll
End With
'Restore Screen Updating
Application.ScreenUpdating = True
End Sub
For PC macro installation & usage instructions, see: http://www.gmayor.com/installing_macro.htm
For Mac macro installation & usage instructions, see: http://word.mvps.org/Mac/InstallMacro.html

If you'd prefer to run the macro against just a selected range, change:
ActiveDocument
to:
Selection
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]

Last edited by macropod; 04-13-2011 at 05:08 PM. Reason: Minor Code Fix
Reply With Quote