#1
Printing text clearly when converted from a .pdf file
I am using Office 2010. I have scanned a document and saved it as a .pdf file. Then, in Adobe, I converted the file to a Word file. The conversion went okay, although some of the characters are not as crisp as they would be had the document been prepared in Word to begin with. Is there a way to sharpen the characters in Word without specifying that they be bold-faced?
Gordon Houston, Texas
#2
Your description suggests you've only inserted the scanned page images into Word. To get the text, you actually need to run an OCR process on the scanned page images (you can do this with Adobe Acrobat Pro 8 [available as a free download from http://www.techspot.com/downloads/4683-adobe-acrobat-8-free.html - note the serial# mentioned there], or with the OCR packages that come with many scanners), then insert the text into Word. The inserted text will have all the qualities of normal text in a document. Do note that OCR conversions often require some post-conversion editing to correct OCR errors.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#3
Hello, Sir Paul --
Thanks for the prompt response. I have an HP All-in-One Printer (7200 series), and I used its HP Solution Center software to scan the document to a searchable .pdf file. I read this document into Adobe Acrobat 9 Standard, and performed a Save As to a Word .doc file. This I read into Word 2010 and converted to a .docx file.
Per your suggestion, I went the route (using the HP Solution Center software) of saving the file in the "Text (OCR) to rich text (rtf)" format. I then read this file into Word 2010, and saved it as a .docx file.
The quality of Adobe's conversion to Word of the original .pdf file was far superior to the .rtf route described. The editing for a 25-page document was considerably less tedious, despite some of the lightened characters. There was still some minor abnormal shading of characters using the .rtf format route, although it was somewhat better than that experienced in the Adobe route. However, I still may be missing something from the description that you provided.
Gordon
#4
A searchable .pdf file is one that has had an OCR conversion done, but the original scanned image sits in front of the text, so the appearance remains unchanged. If you save such a file as a Word document, you'll end up with the same arrangement in the document. What you'll see is the scanned images, not the text. To see the text, you'd delete the scanned images.
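To illustrate the image-in-front-of-text arrangement, here is a minimal Python sketch. It assumes the converted document is a .docx, whose body is stored as WordprocessingML in word/document.xml, and that the scanned pages appear there as `w:drawing` or `w:pict` elements inside runs. The sample fragment and the `strip_images` helper are hypothetical and heavily simplified; this is not what Word or Acrobat does internally.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, simplified stand-in for word/document.xml:
# one run holds a scanned image, the next holds the OCR text.
SAMPLE = f"""
<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:drawing/></w:r>
      <w:r><w:t>Recognized text from the OCR layer.</w:t></w:r>
    </w:p>
  </w:body>
</w:document>
"""


def strip_images(xml_text):
    """Remove image elements from runs; return (images removed, remaining text)."""
    root = ET.fromstring(xml_text.strip())
    image_tags = (f"{{{W}}}drawing", f"{{{W}}}pict")
    removed = 0
    for run in root.iter(f"{{{W}}}r"):
        for child in list(run):  # copy the child list so removal is safe
            if child.tag in image_tags:
                run.remove(child)
                removed += 1
    text = "".join(t.text or "" for t in root.iter(f"{{{W}}}t"))
    return removed, text


if __name__ == "__main__":
    count, text = strip_images(SAMPLE)
    print(count, "image(s) removed; text left:", text)
```

In Word itself the equivalent step is simply selecting and deleting the page images; the sketch only shows that the text is already present underneath them.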
Assuming that saving the file as a "Text (OCR) to rich text (rtf)" format outputs just the text, any differences in characters will be because of the fonts and/or point sizes used, apart from what you'd expect via bold/italics. You can clear all of this via Ctrl-A, then Ctrl-Space and Ctrl-Q. Apart from whatever formatting belongs to the various paragraph Styles, any shading, bold, italics, etc. should all disappear.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#5
I tried this approach on my Adobe-to-Word converted document, and it performed exactly as you indicated - sharp, clear text. Unfortunately, one heck of a lot of re-formatting comes with the job, so I suppose that it is a matter of aesthetics versus efficiency!
Thanks for your inputs - most helpful!
#6
Follow-up:
If one highlights just sections of the text, and then uses Control + space, but not the Control + Q function, one achieves the character clarity, and does not lose the formatting. What is the purpose of the Control + Q entry?
GB
#7
Ctrl-Space removes any character-level formatting (e.g. bold, italic) that doesn't match the paragraph Style.
Ctrl-Q removes any paragraph-level formatting (e.g. indents, alignment) that doesn't match the paragraph Style.
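The difference can be modelled on the underlying .docx markup. The following Python sketch is only an approximation: it assumes direct character formatting is stored in a run's `w:rPr` element and direct paragraph formatting in the paragraph's `w:pPr` element, alongside the `w:pStyle` style reference. The sample paragraph, the "BodyText" style name, and the helper functions are hypothetical; Word's own handling (for example of character styles) is more involved.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(tag):
    """Return the namespace-qualified name for a w: element."""
    return f"{{{W}}}{tag}"


# Hypothetical paragraph: styled "BodyText", with direct centring and indent,
# and one run carrying direct bold and grey shading.
SAMPLE = f"""
<w:p xmlns:w="{W}">
  <w:pPr>
    <w:pStyle w:val="BodyText"/>
    <w:jc w:val="center"/>
    <w:ind w:left="720"/>
  </w:pPr>
  <w:r>
    <w:rPr><w:b/><w:shd w:val="clear" w:fill="D9D9D9"/></w:rPr>
    <w:t>Scanned sentence</w:t>
  </w:r>
</w:p>
"""


def reset_character_formatting(paragraph):
    """Rough analogue of Ctrl-Space: drop direct run properties."""
    for run in paragraph.iter(q("r")):
        for rpr in run.findall(q("rPr")):
            run.remove(rpr)


def reset_paragraph_formatting(paragraph):
    """Rough analogue of Ctrl-Q: drop direct paragraph properties, keep the style."""
    ppr = paragraph.find(q("pPr"))
    if ppr is None:
        return
    for child in list(ppr):  # copy the child list so removal is safe
        if child.tag != q("pStyle"):
            ppr.remove(child)


def direct_formatting(paragraph):
    """List the local names of the formatting elements still present."""
    names = []
    for props in list(paragraph.iter(q("pPr"))) + list(paragraph.iter(q("rPr"))):
        names.extend(child.tag.split("}")[1] for child in props)
    return names


if __name__ == "__main__":
    p = ET.fromstring(SAMPLE.strip())
    print(direct_formatting(p))   # before any reset
    reset_character_formatting(p)
    print(direct_formatting(p))   # bold and shading gone, layout kept
    reset_paragraph_formatting(p)
    print(direct_formatting(p))   # only the paragraph style reference remains
```

This matches the behaviour described in the follow-up question: resetting only the character level clears the odd shading while the paragraph's indents and alignment stay in place.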
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]