#1
|
|||
|
|||
Capturing numbered headings
I have a macro that captures statments like "shall" in a word document and dumps them into a .csv file. Right now it captures the sentence and sometimes the heading number of a paragraph (not always0. I really want the macro to not only capture the sentence, but also the paragraph or subparagrap numbers and associate it with each sentence. Below is what I have right now. Any suggestions
Code:
Sub ExtractSentence() Dim ExcelApp As Object Dim ExcelWB As Object Dim aRange As Range Dim intRowCount As Integer Dim MyColumn As String, Lastrow As Long Set ExcelApp = CreateObject("Excel.Application") ExcelApp.Visible = True Set ExcelWB = ExcelApp.Workbooks.Add intRowCount = 1 Set aRange = ActiveDocument.Range With aRange.Find Do .Text = "shall" .Execute If .Found Then aRange.Expand Unit:=wdSentence aRange.Copy aRange.Collapse wdCollapseEnd ExcelWB.Sheets(1).Cells(intRowCount, 1).Select ExcelWB.Sheets(1).Paste intRowCount = intRowCount + 1 End If Loop While .Found intRowCount = intRowCount + 1 End With Set aRange = ActiveDocument.Range With aRange.Find Do .Text = "will" .Execute If .Found Then aRange.Expand Unit:=wdSentence aRange.Copy aRange.Collapse wdCollapseEnd ExcelWB.Sheets(1).Cells(intRowCount, 1).Select ExcelWB.Sheets(1).Paste intRowCount = intRowCount + 1 End If Loop While .Found intRowCount = intRowCount + 1 End With Set aRange = ActiveDocument.Range With aRange.Find Do .Text = "Must" .Execute If .Found Then aRange.Expand Unit:=wdSentence aRange.Copy aRange.Collapse wdCollapseEnd ExcelWB.Sheets(1).Cells(intRowCount, 1).Select ExcelWB.Sheets(1).Paste intRowCount = intRowCount + 1 End If Loop While .Found intRowCount = intRowCount + 1 End With Set aRange = ActiveDocument.Range With aRange.Find Do .Text = "Should" .Execute If .Found Then aRange.Expand Unit:=wdSentence aRange.Copy aRange.Collapse wdCollapseEnd ExcelWB.Sheets(1).Cells(intRowCount, 1).Select ExcelWB.Sheets(1).Paste intRowCount = intRowCount + 1 End If Loop While .Found intRowCount = intRowCount + 1 End With Set aRange = ActiveDocument.Range With aRange.Find Do .Text = "provide" .Execute If .Found Then aRange.Expand Unit:=wdSentence aRange.Copy aRange.Collapse wdCollapseEnd ExcelWB.Sheets(1).Cells(intRowCount, 1).Select ExcelWB.Sheets(1).Paste intRowCount = intRowCount + 1 End If Loop While .Found intRowCount = intRowCount + 1 End With Set aRange = Nothing fileextstr = ".csv": fileformatnum = 6 ExcelApp.DisplayAlerts = False ExcelWB.SaveAs "C:\users\joe.valenzuela\desktop\shredder.csv" & fileextstr, fileformatnum ExcelApp.DisplayAlerts = True End Sub Last edited by macropod; 05-06-2014 at 04:00 PM. Reason: Added code tags & formatting |
#2
|
||||
|
||||
In what way are the paragraph or subparagraph numbers associated with each sentence? Do they actually prefix the paragraph or could they be in some other paragraph? Are these numbers created via Word's auto paragraph numbering tools, or are they manually-typed?
It would also be helpful if you could attach a document to a post with some representative data (delete anything sensitive). You do this via the paperclip symbol on the 'Go Advanced' tab at the bottom of this screen.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#3
|
|||
|
|||
In response to the your questions: the sentences are within a Paragraph or subparagraph (so they are not necessarily in direct position with a sentance). they are, however, unique (no duplicate paragraphs). Typically these are created via word‘s style headings. Sometimes, however, these documents are converted to a PDF and I have to convert them back to word (where they are just simple text in front of the paragraph). Attached is a sample document that was released as a word document
|
#4
|
||||
|
||||
None of the target paragraphs in your attachment is numbered, though some have bullet points. The only numbered paragraphs are the headings - and they use list-level numbering. There are also no numbered paragraphs associated with the Appendices. Assuming all your documents are formatted that way, try:
Code:
Sub Demo() Application.ScreenUpdating = False Dim ExcelApp As Object, ExcelWB As Object, ExcelWS As Object Dim StrFnd As String, StrOut As String, i As Long, j As Long Dim Rng As Range Set ExcelApp = CreateObject("Excel.Application") ExcelApp.Visible = False Set ExcelWB = ExcelApp.Workbooks.Add Set ExcelWS = ExcelWB.Sheets(1) StrFnd = "shall,will,must,may,should,provide" For i = 0 To UBound(Split(StrFnd, ",")) With ActiveDocument.Range With .Find .ClearFormatting .Replacement.ClearFormatting .Text = Split(StrFnd, ",")(i) .Replacement.Text = "" .Forward = True .Wrap = wdFindStop .MatchCase = False .MatchWholeWord = False .MatchWildcards = False .MatchSoundsLike = False .MatchAllWordForms = False .Execute End With Do While .Find.Found j = j + 1 Set Rng = .Duplicate With Rng .Expand Unit:=wdSentence If Asc(.Characters.Last.Text) < 33 Then .End = .End - 1 If .Characters.Last.Next.Text = LCase(.Characters.Last.Next.Text) Then .MoveEnd wdSentence, 1 End If While .Words.First.Previous.Text Like "[eig]" .MoveStart wdSentence, -1 Wend If .Characters.Last.Text Like "[" & vbCr & Chr(11) & vbTab & "]" Then .End = .End - 1 StrOut = Trim(.Text) End With ExcelWS.Cells(j, 2).Value = StrOut StrOut = "" If .Paragraphs.First.Range.ListParagraphs.Count = 1 Then StrOut = .Paragraphs.First.Range.ListFormat.ListString End If If Not .Paragraphs.First.Range.ListFormat.ListString Like "[0-9]*" Then While (Not .Paragraphs.First.Range.ListFormat.ListString Like "[0-9]*") And _ (Not .Paragraphs.First.Range.Words.First Like "Appendix*") .MoveStart wdParagraph, -1 Wend If .Paragraphs.First.Range.Words.First Like "Appendix*" Then StrOut = "Appendix " & Split(.Paragraphs.First.Range.Text, " ")(1) & StrOut Else StrOut = .Paragraphs.First.Range.ListFormat.ListString & StrOut End If End If ExcelWS.Cells(j, 1).Value = StrOut .Collapse wdCollapseEnd '.Start = Rng.End .Find.Execute Loop End With Next With ExcelApp .Visible = True .DisplayAlerts = False ExcelWB.SaveAs "C:\users\" & Environ("UserName") & "\desktop\shredder.csv", 6 .DisplayAlerts = True End With Set ExcelWS = Nothing: Set ExcelWB = Nothing: Set ExcelApp = Nothing: Set Rng = Nothing Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#5
|
|||
|
|||
Paul,
Sorry to Necromancy this thread, but I am working off your Macro and have encountered an issue. I am attempting to use your macro for the same thing the original poster was, however, I have noticed that if the document is converted from a .PDF the numbering for the headers gets messed up. So instead of actually outputting the items to Excel the Word document simply freezes up and quits responding. Is there a macro I could create that will re-format the .PDF to .DOC document so that this macro will work on the file? I could not find a way to send a private message. Thank you for the reply! |
#6
|
||||
|
||||
I suggest you start a new thread for this. Without actually seeing the problem document, it can be difficult for anyone to diagnose the issue. Accordingly, it would be helpful if you could attach a document to a post with some representative data (delete anything sensitive). You can attach documents via the paperclip symbol on the 'Go Advanced' tab at the bottom of this screen.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Layout: Numbered long headings on level 10 or higher | Katarina | Word | 3 | 05-07-2014 02:56 PM |
Issue with Numbered Lists in Headings | yllakay | Word | 2 | 09-24-2012 10:28 AM |
numbered headings | Caroline | Word | 5 | 03-14-2011 09:09 AM |
time capturing | aligahk06 | Excel | 0 | 04-18-2010 11:53 PM |
XML marking of Auto-numbered Headings | crose | Word | 0 | 12-17-2009 09:55 PM |