11-26-2014, 08:58 PM
macropod (Windows 7 64bit, Office 2010 32bit)

A searchable .pdf file is one that has had an OCR conversion done, but the original scanned image sits in front of the text, so the appearance remains unchanged. If you save such a file as a Word document, you'll end up with the same arrangement in the document: what you'll see is the scanned images, not the text. To see the text, you'd delete the scanned images.

Assuming that saving the file in the "Text (OCR) to rich text (rtf)" format outputs just the text, any differences in characters will be because of the fonts and/or point sizes used, apart from what you'd expect via bold/italics. You can clear all of this via Ctrl-A (select all), then Ctrl-Space (reset character formatting) and Ctrl-Q (reset paragraph formatting). Apart from whatever formatting belongs to the various paragraph Styles, any shading, bold, italics, etc. should all disappear.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]