09-16-2021, 06:39 PM
bigjoec (Windows 10, Office 2019)

Quote:
Originally Posted by Peterson
If there is a paragraph mark at the end of every instance of a block of 8-point text, and the only attribute you need in order to find the text blocks is the font size, then you could use a wildcard find/replace, as follows:

Find:
(*)(^13)

In the Font section, set the size to 8 points

Replace:
\1 \2

(To be clear, there's a space between the 1 and the second slash)

Thanks, but unfortunately there are no paragraph marks. The 8-point text is currently fully embedded in the regular text, just with a different font size.

This is step one in reformatting it to get the page numbers out of the text and where they belong.

I have the regex to do what I want with the page numbers:
Code:
        .Text = "(p. )([0-9]{1,4})"
        .Replacement.Text = "^mPAGE \2^l"
The issue is that the body text also includes items of the form "p. ##", mostly references to other documents, so I can't run my find-replace blindly or it will screw up the body text. Fortunately, the internal page numbers are distinguished by being in 8-point font (the rest is in 12-point), so I can leverage that in the find-replace.
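
For anyone following along, here is a minimal sketch of how that font-size constraint might be combined with the wildcard search above. The macro name is made up for illustration, and it assumes the whole document should be searched; it is not the exact macro from my file.

```vba
Sub MovePageNumbers8Point()
    ' Hypothetical name; a sketch only, test on a copy of the document first.
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Font.Size = 8                      ' only match text that is 8-point
        .Text = "(p. )([0-9]{1,4})"         ' "p. " followed by 1 to 4 digits
        .Replacement.Text = "^mPAGE \2^l"   ' page break, "PAGE n", line break
        .MatchWildcards = True              ' required for the ( ) and { } syntax
        .Format = True                      ' honor the font-size constraint
        .Forward = True
        .Wrap = wdFindStop                  ' stop at the end of the document
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

With `.Format = True`, a match only counts if the entire matched string is 8-point, which is what keeps the 12-point "p. ##" references in the body text from being touched.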

However, I'm still not hitting 100% of the cases, because sometimes a document page number is followed immediately by a number in the body text (e.g. "... they sold p. 175,000 widgets at a price of..."). Find misses these: the wildcard pattern matches "p. 175", but that whole string doesn't meet the font-size constraint because the 5 is in 12-point font.

So I just want to insert a space every time the font changes away from 8 points. That will add a spurious space in some places where I don't want one, but I'm willing to accept that.
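
To make the idea concrete, here is a rough sketch of what such a pre-processing pass might look like. It assumes that a format-only Find (empty search text, size set to 8) returns each contiguous run of 8-point text, and the macro name is hypothetical. I haven't verified it against every edge case, so treat it as a starting point rather than a finished solution.

```vba
Sub AddSpaceAfter8PointRuns()
    ' Hypothetical name; a sketch only, test on a copy of the document first.
    Dim rng As Range
    Set rng = ActiveDocument.Content

    With rng.Find
        .ClearFormatting
        .Text = ""                  ' no text: search on formatting alone
        .Font.Size = 8              ' find runs of 8-point text
        .Format = True
        .Forward = True
        .Wrap = wdFindStop          ' do not wrap back to the start
        .MatchWildcards = False
    End With

    Do While rng.Find.Execute
        ' Guard: if the run reaches the end of the document there is
        ' nothing after it to separate, and looping further could hang.
        If rng.End >= ActiveDocument.Content.End Then Exit Do

        ' Put a space right after the 8-point run. The space may take on
        ' the 8-point formatting itself; that should not affect the later
        ' wildcard search, since a space cannot match [0-9].
        rng.InsertAfter " "

        ' Continue searching from just past the inserted space.
        rng.Collapse wdCollapseEnd
    Loop
End Sub
```

After this pass, "p. 175,000" would become "p. 17 5,000", so the digits that follow "p. " stop at the end of the 8-point run.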

Once I build in this pre-processing step, my existing macro should cover all cases.