#1
A little back-story: I'm recreating some documents from PDFs (a bunch of Open University math textbooks that were supplied in printed format and as downloadable PDFs - except the PDFs are just images, and as such are not searchable, highlightable or able to be hyperlinked in any meaningful way). So I OCR'd the files, which is OK for the text blocks but garbles the formulae and diagrams. I've got a decent part of the way through the first textbook, matching fonts and creating styles, but then I had a bright idea: why don't I export the PDF pages as TIFF images, and as I work on each page, set the matching TIFF file as the page background? That should make it easier to match the layout.

The best option, I decided, would be to record a macro that would let me pick a file, size it to fit the page and set it to go behind the text. Should be relatively painless... right? Wrong.

I click the record macro button, name it and assign a button. I click Insert... Picture and browse to the file. I try to right-click on the image. Nothing. No context menu appears at all (it seems I can't get a context menu anywhere - not just on the picture). So I click the Ribbon Picture tab, and then the "Advanced Layout" arrow at the bottom right of the Size section. I click the middle "Text Wrapping" tab and select "Behind Text". Click OK. Word promptly crashes and asks me to send an error report (which I do - M$ might as well work for their money).

I tried a new document, different images, a different PC. Same each time. It seems you cannot change the text wrapping property of an image while recording a macro (actually, it seems that just opening and then closing the Advanced Layout window will cause Word to crash). Surely I can't have been the only person to run into this problem?

Thanks in advance for any help!
#2
IMHO, you'd do better to use a 'proper' PDF to Doc conversion package. Be that as it may, if you insert the 'PDF' page image, then immediately run the following macro, it will be resized to the page size and positioned behind the text.
Code:
    Sub FormatPDFPage()
        ' Resize the selected picture to the page and send it behind the text.
        Application.ScreenUpdating = False
        With Selection
            ' An inline picture must become a floating shape before it can be wrapped.
            If .InlineShapes.Count > 0 Then .InlineShapes(1).ConvertToShape
            If .ShapeRange.Count = 0 Then Exit Sub
            With .ShapeRange(1)
                .LockAspectRatio = msoTrue
                .Width = Selection.Sections(1).PageSetup.PageWidth
                ' If the page-width picture is too tall, shrink it to the page height.
                If .Height > Selection.Sections(1).PageSetup.PageHeight Then
                    .Height = Selection.Sections(1).PageSetup.PageHeight
                End If
                .WrapFormat.Type = wdWrapBehind
            End With
        End With
        Application.ScreenUpdating = True
    End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#3
Thanks Paul. I'm going to try some of those ideas in updating the macros I put together yesterday from bits I found on the Internet. I particularly like the use of PageWidth and PageHeight. I ended up using 2 macros - one to select and insert the Image, and the other to scale and position it.
As far as converting the PDF is concerned, this is pretty much a last resort. The main problem is that the PDF is just made up of flat images - there's not much for a conversion tool to work with. ABBYY FineReader made the best attempt, using the OCR'd text behind the PDF image (outputting as PDF), but for some reason it reduces the image quality, which wasn't brilliant to begin with - and of course I couldn't see how well it had managed to interpret the text. Outputting as a Word document showed that it hadn't done particularly well at all, even with a load of manual training of the OCR engine. Eventually I gave up, chose unformatted text from the OCR and started recreating the documents manually.

Anyway, this is the code I've come up with, in two parts because it seems that the picture doesn't gain focus after the first macro inserts it, and so using With Selection fails.

Quote:
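For comparison, here is a minimal single-macro sketch. It is not the code referred to above. It avoids the focus problem by never relying on the selection: `Shapes.AddPicture` returns the floating shape it inserts, so that object can be sized and wrapped directly. The macro name `InsertPageBackground` and the file filter are assumptions for illustration, and the sketch is untested against these particular documents.

```vba
Sub InsertPageBackground()
    ' Hypothetical macro name; a sketch under assumptions, not a tested solution.
    Dim dlg As FileDialog
    Dim shp As Shape
    Dim ps As PageSetup

    ' Let the user pick one image file (the filter list is an assumption).
    Set dlg = Application.FileDialog(msoFileDialogFilePicker)
    With dlg
        .AllowMultiSelect = False
        .Filters.Clear
        .Filters.Add "Page images", "*.tif; *.tiff; *.png; *.jpg"
        If .Show <> -1 Then Exit Sub   ' user cancelled the dialog
    End With

    Set ps = Selection.Sections(1).PageSetup

    ' AddPicture hands back the new floating shape, so the picture
    ' does not need to be selected before it is formatted.
    Set shp = ActiveDocument.Shapes.AddPicture( _
        FileName:=dlg.SelectedItems(1), LinkToFile:=False, _
        SaveWithDocument:=True, Anchor:=Selection.Range)

    With shp
        .LockAspectRatio = msoTrue
        .Width = ps.PageWidth
        If .Height > ps.PageHeight Then .Height = ps.PageHeight
        ' Pin the picture to the top-left corner of the page.
        .RelativeHorizontalPosition = wdRelativeHorizontalPositionPage
        .RelativeVerticalPosition = wdRelativeVerticalPositionPage
        .Left = 0
        .Top = 0
        .WrapFormat.Type = wdWrapBehind
    End With
End Sub
```

Place the cursor on the page being rebuilt before running it, since the picture is anchored at the current cursor position. The sizing logic is the same as in the macro from post #2; the only design change is working on the returned `Shape` instead of `Selection`.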
#4
I've just attached a page from the original PDF and my attempt at recreating it in Word - just to give an idea as to why OCR conversion wasn't really having much success. 16 pages down, approximately 600 to go...
#5
Have you approached the Uni for a searchable version? They might be quite happy to provide one once they're made aware of the issue. Reproducing a 600-page document puts you at risk of copyright violation. Certainly not something they'd see as falling within a 'fair use' provision.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#6
True - although I'm not intending to distribute it. It'll just be me, and my Kindle; and recompiling the document is forcing me to read it very carefully (which can't hurt) and also making me brush up on my Word / Math typing skills.