![]() |
|
#1
|
|||
|
|||
|
Can anyone help me with this?
I'm trying to take hundreds of word docs and compile them into a spread sheet. Each word doc has 4-10 works cited on the first page. Looking like The Madison Times, WI, “Insurance Exchanges Will Extend Coverage to Millions,” September 23, 2013, By Freddie Allen Each word doc is for a specific date and it's file name is that date. Example "September 19, 2013" I just need the spread sheet to list the date that is for the whole document and the publications that are listed in bold and italicized. Here is an example of what one of the word docs look like. [IMG] [/IMG]
|
|
#2
|
||||
|
||||
|
You can do this, sceneitguitar. But you're going to have to write a VBA program to do it, and you're going to have to start by collecting the data from the Word documents using the Word Application object. Oh, the code can reside in Excel, if you want to do it that way; but it's nevertheless going to have to talk to the Word object. That means you'll have to learn how to automate in Word as well as in Excel.
I've done both, and although I know Excel objects a lot better than the Word object, it's not too hard; it's just very, very different. Are the folks on the Word forum any use for this? Because unless you're a self-starter, you're going to have to ask them a lot of questions about how VBA/Word works. |
|
#3
|
|||
|
|||
|
Well I'm glad it can be done, but it sounds like big undertaking. Thanks for the info. I'm going to really try to do this.
|
|
#4
|
||||
|
||||
|
My general process would be something like this:
0) In your Excel program, a) start a new worksheet to contain the list you're creating, and b) fire up the Word application. 1a) If your Excel document already has the list of Word documents (path and file name), then your VBA program just to go down the list one at a time. For each Word document in the list, execute steps 2 and 3. 1b) But if you expect your program to create the list of Word documents from scratch, then you must have some way of deciding which Word documents should go on the list and which should not. So you'll have your program look up all the Word documents in a certain folder (or folders), picking out those whose file name fits a certain pattern (having a date in the name, at least) and ignoring the rest. Maybe other criteria will go into that decision too. For each Word document that makes the cut, execute steps 2 and 3. 2) Add the Word document's date to your list. 3) Use the Word Application object to open the current Word document. Pick out the section that has the citations in it. Pick out the desired information from each citation (name of publication and title of article, I think you said) and list it with the Word document's date. The part that has to happen in Word is at the part about opening the Word document, finding the citations section at the beginning and picking out a long character string that represents all the citations. You may want to use Word also to pick out each citation as a separate paragraph; I probably would. I don't use VBA/Word much, but I have done a little. If you need help with that, or with parts of the rest of the program, I would ask in the Word forum first, but if they're no help there (I'm not predicting; I have no idea), feel free to ask here next. In the latter case, though, do it in the Excel-programming forum; you'll probably have better luck there. |
|
#5
|
||||
|
||||
|
The Excel code to do this would be something like:
Code:
Option Explicit
Public FSO As Object 'a FileSystemObject
Public oFolder As Object 'the folder object
Public oSubFolder As Object 'the subfolders collection
Public oFiles As Object 'the files object
Dim LRow As Long
Dim wdApp As Word.Application
Dim wdDoc As Word.Document
Dim XlWkSht As Excel.Worksheet
Sub Main()
' Minimise screen flickering
Application.ScreenUpdating = False
MsgBox ActiveSheet.Name
Set XlWkSht = ActiveSheet
MsgBox ActiveSheet.Name
LRow = XlWkSht.Cells.SpecialCells(xlCellTypeLastCell).Row + 1
Set wdApp = New Word.Application
wdApp.Visible = True
Dim StrFolder As String
' Browse for the starting folder
StrFolder = GetTopFolder
If StrFolder = "" Then Exit Sub
' Search the top-level folder
Call GetFolder(StrFolder & "\")
' Search the subfolders for more files
Call SearchSubFolders(StrFolder)
' Return control of status bar to Excel
Application.StatusBar = ""
'Close Word
wdApp.Quit
' Restore screen updating
Application.ScreenUpdating = True
Set wdDoc = Nothing: Set wdApp = Nothing: Set XlWkSht = Nothing
End Sub
Function GetTopFolder() As String
GetTopFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then GetTopFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function
Sub SearchSubFolders(strStartPath As String)
If FSO Is Nothing Then
Set FSO = CreateObject("scripting.filesystemobject")
End If
Set oFolder = FSO.GetFolder(strStartPath)
Set oSubFolder = oFolder.subfolders
For Each oFolder In oSubFolder
Set oFiles = oFolder.Files
' Search the current folder
Call GetFolder(oFolder.Path & "\")
' Call ourself to see if there are subfolders below
SearchSubFolders oFolder.Path
Next
End Sub
Sub GetFolder(StrFolder As String)
Dim strFile As String
strFile = Dir(StrFolder & "*.doc")
' Process the files in the folder
While strFile <> ""
' Update the status bar is just to let us know where we are
Application.StatusBar = StrFolder & strFile
Call UpdateWkBk(StrFolder, strFile)
strFile = Dir()
Wend
End Sub
Sub UpdateWkBk(StrFolder As String, strFile As String)
Dim StrRef As String
' Open the document
Set wdDoc = wdApp.Documents.Open(StrFolder & strFile, AddToRecentFiles:=False, ReadOnly:=False, Format:=wdOpenFormatAuto, Visible:=False)
With wdDoc
If .ProtectionType = wdNoProtection Then
' Find and Extract the data
With .Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "^13[!^13]@" & strFile & "[!^13]{1,}"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchWildcards = True
.Execute
End With
Do While .Find.Found
.Start = .Start + 1
If .Words.First.Font.Bold = True And .Words.First.Font.Italic = True Then
.Copy
With XlWkSht
.Cells(LRow, 1).Value = strFile
.Cells(LRow, 2).Paste
End With
LRow = LRow + 1
End If
.Collapse wdCollapseEnd
.Find.Execute
Loop
End With
End If
'Close the document
.Close SaveChanges:=False
End With
' Let Word do its housekeeping
DoEvents
End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
|
|
|
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
Creating a list in Word from an Excel spreadsheet
|
gingernut | Mail Merge | 1 | 03-19-2013 02:12 PM |
| Hundreds of Docs | quarkrad | Office | 5 | 01-15-2012 02:27 PM |
Inserting excel spreadsheet into word doc
|
educpara58 | Excel | 2 | 07-28-2011 01:22 AM |
Re: From Excel to Word - empty docs?
|
Jaymond Flurrie | Word VBA | 2 | 05-08-2011 04:36 PM |
| Word Fill-in form with Excel spreadsheet | karik | Word | 0 | 01-11-2010 08:45 AM |