#1
Extract pictures from Word
Hi
Looking for a pointer or two here. I have a Word document with a large number of pictures in it - too many to handle manually or to export as an HTML file. Unless there's a simple way to do this non-programmatically, this is how I see it going:
1. Find the first Heading 1 (e.g. "myFirstHeading1").
2. Find the first picture after that point.
3. Export it to jpg (or whatever), with the graphic named after the heading and suffixed with 1, e.g. "myFirstHeading1#1.jpg".
4. Find the next graphic and export it, e.g. "myFirstHeading1#2.jpg".
5. Continue until the next Heading 1. Let's call it "mySecondHeading1".
6. The next graphics will therefore be something like mySecondHeading1#1.jpg, then mySecondHeading1#2.jpg, mySecondHeading1#3.jpg.
Whilst I have a very good understanding of Excel VBA, I don't have a clue how to turn this into a working Word macro.
Thanks
Martin
#2
Assuming the document is in the docx or docm format:
1. Copy the file.
2. Change the copy's extension from docx or docm to zip.
3. Open the zip archive.
4. Extract the images from the zip archive.
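The steps above can also be scripted. What follows is a minimal sketch in Python (not VBA), on the assumption that the pictures sit in the word/media/ folder inside the archive, which is where Word normally stores them. The function name extract_media is hypothetical. Because a docx or docm file is already a zip archive, code can open it directly without renaming the copy. The files keep Word's internal names (image1.png and so on), not names based on headings.

```python
import zipfile
from pathlib import Path


def extract_media(docx_path, out_dir):
    """Copy every file under word/media/ out of a docx/docm archive.

    Returns the list of written paths. File names are Word's internal
    ones (e.g. image1.png), not the original names of the pictures.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Assumption: embedded pictures live in the word/media/ folder.
            if name.startswith("word/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(archive.read(name))
                written.append(target)
    return written
```

Calling extract_media with the path of the document and the name of an output folder should leave one file per embedded picture in that folder.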
__________________
Cheers,
Paul Edstein [Fmr MS MVP - Word]
#3
Thanks for replying. I'm afraid that wouldn't work any more than saving as an HTML document. The steps I listed above have a requirement to name the graphics. Clever solution leveraging Word's Open XML format, but as I said, won't work here.
Cheers though
Martin
#4
Unfortunately the graphics are not stored in the document with their original file names, so what you ask is not likely.
See http://www.gmayor.com/extract_images_from_word.htm
__________________
Graham Mayor - MS MVP (Word) (2002-2019)
Visit my web site for more programming tips and ready made processes www.gmayor.com
#5
You refer to a series of headings. Do these always use Word's built-in Heading Styles? If not, finding which heading an image relates to would be difficult.
Even if you are using heading Styles, extracting and naming the images the way you want will be problematic. For starters, images may be inserted in-line with the text or as floating objects. If you have a mix, or if they're inserted as the latter, their relative location on a page may have nothing to do with where they're anchored. Consequently, getting the numbering right would pose a real challenge.
__________________
Cheers,
Paul Edstein [Fmr MS MVP - Word]
#6
Gmayor, I think you misunderstood my request. Thanks for replying though.
#7
Macropod
I manually set each section to use a Heading 1 style. That bit, whilst time-consuming, was doable, and I can guarantee that it's consistent. So to reiterate, I'm after the correct VBA to both name and extract the pictures with the correct Heading 1 plus suffix, e.g. "H1#1.jpg", "H1#2.jpg", "H1#3.jpg", "H2#1.jpg", "H2#2.jpg", where H1, H2, etc. are all Heading 1 style headings from the document in question. I'd need to loop through each section, extract the pictures and add the IDs as part of the process.
1. Looping through cells in Excel is a piece of cake, but I can't see how to do it in Word.
2. I also don't know how to select and export a picture, giving it a chosen meaningful name as part of the process.
Thanks
Martin
#8
I hadn't misunderstood your request. There is no process available in VBA to do what you actually stated that you require. I simply pointed out a method that is as close as you are likely to get to recovering the original format of the images.
However, some years ago Stephen Lebans produced a function for older DOC format documents that would save a selected image in BMP format. This is not the same as the original image, but a facsimile of it that may fulfil your requirements. The document containing the code is available in a zip file linked from http://www.lebans.com/msword.htm. The document in the zip can be opened in current Word versions and, when saved in DOTM format, still appears to work with images in XML format documents. Paul has pointed out the pitfalls of targeting the images. If you can overcome those, you may be able to loop through the images and process them using the macro function. I guess it rather depends on whether BMP is an acceptable alternative to the original format.
__________________
Graham Mayor - MS MVP (Word) (2002-2019)
Visit my web site for more programming tips and ready made processes www.gmayor.com
#9
Bmp is quite adequate. Thanks.
#10
Cross-posted at: https://www.excelforum.com/word-prog...from-word.html
For cross-posting etiquette, please read: http://www.excelguru.ca/content.php?184
__________________
Cheers,
Paul Edstein [Fmr MS MVP - Word]
#11
Yes it was. No-one answered so I reposted.
#12
Martin
What you are asking sounds doable but would take more time than I have available at the moment. Macropod raised the question of inline vs floating shapes, which would need to be answered, but the basic (aircode) approach I would use is to loop through the graphics and temporarily apply the Heading 1 style at each one to work out the number of the previous Heading 1. You could also use a counter to keep track of how many graphics have already been found and reset it when the heading number increases. My initial musings are:
Code:
Sub ExportPicts()
    Dim aShp As InlineShape, i As Integer, iPrev As Integer
    Dim iCounter As Integer, sPath As String, sName As String
    Dim vStyle As Variant
    sPath = ActiveDocument.Path & Application.PathSeparator
    For Each aShp In ActiveDocument.InlineShapes
        With aShp.Range.Paragraphs(1)
            vStyle = .Style    'remember the paragraph's own style name
            'temporarily make this paragraph a Heading 1 to read its number
            '(assumes Heading 1 is a numbered style)
            .Style = "Heading 1"
            i = Val(.Range.ListFormat.ListString) - 1
            .Style = vStyle    'put the original style back
        End With
        'restart the picture count whenever the heading number changes
        If i <> iPrev Then
            iCounter = 0
            iPrev = i
        End If
        iCounter = iCounter + 1
        Debug.Print i, iCounter
        'not sure on the code to export inlineshape
    Next aShp
End Sub
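The counter idea described above can be sketched independently of Word. This is Python rather than VBA, and it covers only the naming. It assumes the document has already been reduced to an ordered list of ("heading", text) and ("picture", None) items, which is a hypothetical representation, and it exports nothing. The function name picture_names is likewise made up for the illustration.

```python
def picture_names(items, ext="jpg"):
    """Return one file name per picture, numbered within its heading.

    items is an ordered list of (kind, text) tuples, where kind is
    "heading" or "picture" (a hypothetical stand-in for walking the
    document). Pictures before the first heading are skipped.
    """
    names = []
    heading = None
    counter = 0
    for kind, text in items:
        if kind == "heading":
            heading = text
            counter = 0  # restart numbering under each new heading
        elif kind == "picture" and heading is not None:
            counter += 1
            names.append(f"{heading}#{counter}.{ext}")
    return names


doc = [
    ("heading", "myFirstHeading1"),
    ("picture", None),
    ("picture", None),
    ("heading", "mySecondHeading1"),
    ("picture", None),
]
print(picture_names(doc))
# ['myFirstHeading1#1.jpg', 'myFirstHeading1#2.jpg', 'mySecondHeading1#1.jpg']
```

The same reset-on-new-heading pattern would apply in a VBA loop once the heading for each picture can be identified.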
__________________
Andrew Lockton
Chrysalis Design, Melbourne Australia