#1
Frames, Frames, Frames: How to eliminate them but keep the text?
I'm new to Word 2010. I have a 600-page document that I've assembled by scanning a huge document in groups of 50 pages. The software that I used (Readiris) put frames around every paragraph and graphic.
Now I want to reassemble the original document by inserting each group of 50 pages at the end of the previous one. Unfortunately, I can't find a way to get my insertion point outside of the last frame on the last page. If I insert the additional 50 pages inside the last frame on the last page, it skews the text and makes a mess. HELP! PLEASE!!
#2
First make sure that these are actually frames and not text boxes or text boundaries. One way to be sure is to work in a copy of your document: use Ctrl-A to select all text in the document and then apply the Body Text style. If they are frames, the frames should disappear. That may make a hash of your formatting because the conversion program was using those frames for something.
You are not going to want to hear this, but my recommendation when faced with a converted document that you want to edit is to paste the text into Notepad, copy it from Notepad into a new Word document, and format that document using Styles. (See: Understanding Styles in Microsoft Word.)
I have not used Readiris. All methods of conversion into Word (including Word itself) make a hash of formatting, creating documents that are virtually impossible to edit. This is usually because of changing margins by paragraph but can be for other reasons. The formatting is a nightmare. Numbering goes crazy. It certainly does not sound like Readiris is any exception.
In addition, scanned documents get their text through OCR. While OCR has improved dramatically, it is still far from perfect. You will want to proofread and double-proofread your text.
See also: More on Frames and Textboxes in Microsoft Word
#3
It is also possible that you have tables rather than frames. If so, the Table Tools tabs should appear on your Ribbon whenever your insertion point is inside a table. They will not be visible when it is outside a table. Even if the Ribbon itself is not showing, the yellow Table Tools tab will still display.
#4
I'd agree with Charles: cleaning the results of a scan, while always tedious, is quicker and easier outside Word. You can also insert pseudotags in the text file (for example, prefix a slug of text you recognise as a level two heading with a string like [h2]) and then use them to apply styles by search/replace.
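To make the pseudotag idea concrete, here is a minimal Python sketch of the text-file side of it. It only splits each line into a style name and the remaining text; it does not touch Word. The tag-to-style mapping, the default style, and the function names are all assumptions made up for illustration, so adjust them to whatever tags and styles you actually use.

```python
import re

# Hypothetical mapping from pseudotag to Word style name (an assumption).
TAG_STYLES = {"h1": "Heading 1", "h2": "Heading 2", "h3": "Heading 3"}
DEFAULT_STYLE = "Body Text"  # style assumed for untagged lines

# Matches a leading pseudotag such as "[h2]" plus any spaces after it.
TAG_PATTERN = re.compile(r"^\[(\w+)\]\s*")


def split_pseudotag(line):
    """Return (style_name, text) for one line of the cleaned-up text file."""
    match = TAG_PATTERN.match(line)
    if match and match.group(1).lower() in TAG_STYLES:
        return TAG_STYLES[match.group(1).lower()], line[match.end():].strip()
    # Unknown or missing tag: leave the text alone and use the default style.
    return DEFAULT_STYLE, line.strip()


def tag_lines(text):
    """Split text into (style_name, paragraph) pairs, skipping blank lines."""
    return [split_pseudotag(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    sample = "[h2]Installing the pump\nRemove the four bolts.\n[h3] Torque values\nSee table."
    for style, paragraph in tag_lines(sample):
        print(f"{style}: {paragraph}")
```

The resulting pairs could then drive whatever you use to build the Word document. The point is only that a tag at the start of a line is cheap to type while proofreading and easy to find again later.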
Good luck!
#5
The Outcome
Thanks guys for your guidance. The boxes did turn out to be frames. And I did find the technique for removing the frames. But as you said, no matter how good the scan program is, everything turns out to be a big mess.
This document is almost 700 pages long (with all of its supplemental parts), and has hundreds of graphics. And while Readiris actually did a great job of scanning and OCR'ing, it is way too big and daunting a chore to mess with. I did manage to find the bulk of the original Word document that I had saved to CD many years ago, but it is missing many pages that I wrote and illustrated at a later date. So, since I have a printer / scanner / faxer / etc. with a sheet feeder that holds about 100 pages, and it doesn't care if I scan to a file or scan to another pile of paper, and since I only need one copy right now, I'm just going to make one paper copy of this document at a time for now. Thank you so much for sharing your expertise!
#6
You are welcome.
#7
If you have Adobe Acrobat Pro (available as a free download from http://www.techspot.com/downloads/46...at-8-free.html - note the serial number mentioned there), you can use it to both OCR the file and save it to Word. The resulting Word file may still have the page images, but it's a simple matter to use a macro to delete them (though if you have graphics you want to keep, that may not be such a good idea for the pages concerned), leaving behind the text, probably not in frames. In any event, once OCR'd by Acrobat, the text and graphics (some cropping may be needed) can be copied/pasted into Word.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#8
Hi Paul,
I had never tried copying multi-page selections of text from a pdf with Acrobat. I have Acrobat X Standard. When I tried pasting from a pdf print of a web page, I got the typical problems with pasting directly from a web page, like every line becoming a paragraph. I then tried copying from "Word 2003 Visual Basic Programming" by John Low, which is an e-book, and got essentially gibberish like: ________'_'__'________7______F_ #A ___9'___E=____'_______G_ _A ____'__'_______"_
I then tried on a scanned police report and got very poor formatting, including loss of columns and a paragraph mark at the end of every line. I guess I would not recommend it, but then I am not using the Pro version. I found this site to download Acrobat 8 Pro and will try that.
#9
Quote:
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#10
Well, that was exciting. I am leaving Acrobat 8 Pro installed on my system but switched the default back to Acrobat X Standard. The results of copying from the pdf files were the same with Acrobat 8 as with X.
#11
I'd have been surprised if it were otherwise; changing the software you use to access your PDFs doesn't change their content, and that's where the paragraph breaks are.
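Since the breaks live in the PDF content, one workaround is to clean them up after pasting. Below is a rough Python sketch that merges hard-wrapped lines back into paragraphs. The heuristic is an assumption, not anything Acrobat or Word does: a line is taken to end a paragraph if it is blank, or if it ends in sentence punctuation and is noticeably shorter than the longest line. The function name and the `min_fill` threshold are hypothetical, and this will not help with the gibberish case, where the characters themselves are wrong.

```python
def rejoin_lines(text, min_fill=0.75):
    """Merge hard-wrapped lines into paragraphs using a simple heuristic.

    A paragraph ends at a blank line, or at a line that ends with sentence
    punctuation and is shorter than min_fill times the longest line.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    width = max((len(ln) for ln in lines), default=0)
    paragraphs, current = [], []
    for ln in lines:
        if not ln:
            # Blank line: close whatever paragraph is in progress.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(ln)
        # Short line ending in punctuation: assume the paragraph stops here.
        if ln.endswith((".", "!", "?", ":")) and len(ln) < min_fill * width:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


if __name__ == "__main__":
    pasted = (
        "The software put frames around every\n"
        "paragraph and graphic in the file.\n"
        "Short end.\n"
        "A second paragraph starts here and runs\n"
        "to the end."
    )
    for paragraph in rejoin_lines(pasted):
        print(paragraph)
        print()
```

Expect it to guess wrong now and then, for example on a paragraph whose last line happens to fill the full width, so the output still needs proofreading.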
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]