#1
|
|||
|
|||
Hundreds of word docs compiled into Excel Spreadsheet
Can anyone help me with this?
I'm trying to take hundreds of word docs and compile them into a spread sheet. Each word doc has 4-10 works cited on the first page. Looking like The Madison Times, WI, “Insurance Exchanges Will Extend Coverage to Millions,” September 23, 2013, By Freddie Allen Each word doc is for a specific date and it's file name is that date. Example "September 19, 2013" I just need the spread sheet to list the date that is for the whole document and the publications that are listed in bold and italicized. Here is an example of what one of the word docs look like. [IMG][/IMG] |
#2
|
||||
|
||||
You can do this, sceneitguitar. But you're going to have to write a VBA program to do it, and you're going to have to start by collecting the data from the Word documents using the Word Application object. Oh, the code can reside in Excel, if you want to do it that way; but it's nevertheless going to have to talk to the Word object. That means you'll have to learn how to automate in Word as well as in Excel.
I've done both, and although I know Excel objects a lot better than the Word object, it's not too hard; it's just very, very different. Are the folks on the Word forum any use for this? Because unless you're a self-starter, you're going to have to ask them a lot of questions about how VBA/Word works. |
#3
|
|||
|
|||
Well I'm glad it can be done, but it sounds like big undertaking. Thanks for the info. I'm going to really try to do this.
|
#4
|
||||
|
||||
My general process would be something like this:
0) In your Excel program, a) start a new worksheet to contain the list you're creating, and b) fire up the Word application. 1a) If your Excel document already has the list of Word documents (path and file name), then your VBA program just to go down the list one at a time. For each Word document in the list, execute steps 2 and 3. 1b) But if you expect your program to create the list of Word documents from scratch, then you must have some way of deciding which Word documents should go on the list and which should not. So you'll have your program look up all the Word documents in a certain folder (or folders), picking out those whose file name fits a certain pattern (having a date in the name, at least) and ignoring the rest. Maybe other criteria will go into that decision too. For each Word document that makes the cut, execute steps 2 and 3. 2) Add the Word document's date to your list. 3) Use the Word Application object to open the current Word document. Pick out the section that has the citations in it. Pick out the desired information from each citation (name of publication and title of article, I think you said) and list it with the Word document's date. The part that has to happen in Word is at the part about opening the Word document, finding the citations section at the beginning and picking out a long character string that represents all the citations. You may want to use Word also to pick out each citation as a separate paragraph; I probably would. I don't use VBA/Word much, but I have done a little. If you need help with that, or with parts of the rest of the program, I would ask in the Word forum first, but if they're no help there (I'm not predicting; I have no idea), feel free to ask here next. In the latter case, though, do it in the Excel-programming forum; you'll probably have better luck there. |
#5
|
||||
|
||||
The Excel code to do this would be something like:
Code:
Option Explicit Public FSO As Object 'a FileSystemObject Public oFolder As Object 'the folder object Public oSubFolder As Object 'the subfolders collection Public oFiles As Object 'the files object Dim LRow As Long Dim wdApp As Word.Application Dim wdDoc As Word.Document Dim XlWkSht As Excel.Worksheet Sub Main() ' Minimise screen flickering Application.ScreenUpdating = False MsgBox ActiveSheet.Name Set XlWkSht = ActiveSheet MsgBox ActiveSheet.Name LRow = XlWkSht.Cells.SpecialCells(xlCellTypeLastCell).Row + 1 Set wdApp = New Word.Application wdApp.Visible = True Dim StrFolder As String ' Browse for the starting folder StrFolder = GetTopFolder If StrFolder = "" Then Exit Sub ' Search the top-level folder Call GetFolder(StrFolder & "\") ' Search the subfolders for more files Call SearchSubFolders(StrFolder) ' Return control of status bar to Excel Application.StatusBar = "" 'Close Word wdApp.Quit ' Restore screen updating Application.ScreenUpdating = True Set wdDoc = Nothing: Set wdApp = Nothing: Set XlWkSht = Nothing End Sub Function GetTopFolder() As String GetTopFolder = "" Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0) If (Not oFolder Is Nothing) Then GetTopFolder = oFolder.Items.Item.Path Set oFolder = Nothing End Function Sub SearchSubFolders(strStartPath As String) If FSO Is Nothing Then Set FSO = CreateObject("scripting.filesystemobject") End If Set oFolder = FSO.GetFolder(strStartPath) Set oSubFolder = oFolder.subfolders For Each oFolder In oSubFolder Set oFiles = oFolder.Files ' Search the current folder Call GetFolder(oFolder.Path & "\") ' Call ourself to see if there are subfolders below SearchSubFolders oFolder.Path Next End Sub Sub GetFolder(StrFolder As String) Dim strFile As String strFile = Dir(StrFolder & "*.doc") ' Process the files in the folder While strFile <> "" ' Update the status bar is just to let us know where we are Application.StatusBar = StrFolder & strFile Call UpdateWkBk(StrFolder, strFile) strFile = Dir() Wend End Sub Sub UpdateWkBk(StrFolder As String, strFile As String) Dim StrRef As String ' Open the document Set wdDoc = wdApp.Documents.Open(StrFolder & strFile, AddToRecentFiles:=False, ReadOnly:=False, Format:=wdOpenFormatAuto, Visible:=False) With wdDoc If .ProtectionType = wdNoProtection Then ' Find and Extract the data With .Range With .Find .ClearFormatting .Replacement.ClearFormatting .Text = "^13[!^13]@" & strFile & "[!^13]{1,}" .Replacement.Text = "" .Forward = True .Wrap = wdFindStop .Format = False .MatchWildcards = True .Execute End With Do While .Find.Found .Start = .Start + 1 If .Words.First.Font.Bold = True And .Words.First.Font.Italic = True Then .Copy With XlWkSht .Cells(LRow, 1).Value = strFile .Cells(LRow, 2).Paste End With LRow = LRow + 1 End If .Collapse wdCollapseEnd .Find.Execute Loop End With End If 'Close the document .Close SaveChanges:=False End With ' Let Word do its housekeeping DoEvents End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
Thread Tools | |
Display Modes | |
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Creating a list in Word from an Excel spreadsheet | gingernut | Mail Merge | 1 | 03-19-2013 02:12 PM |
Hundreds of Docs | quarkrad | Office | 5 | 01-15-2012 02:27 PM |
Inserting excel spreadsheet into word doc | educpara58 | Excel | 2 | 07-28-2011 01:22 AM |
Re: From Excel to Word - empty docs? | Jaymond Flurrie | Word VBA | 2 | 05-08-2011 04:36 PM |
Word Fill-in form with Excel spreadsheet | karik | Word | 0 | 01-11-2010 08:45 AM |