Microsoft Office Forums

Go Back   Microsoft Office Forums > >

Reply
 
Thread Tools Display Modes
  #1  
Old 05-06-2014, 01:35 PM
jbvalen jbvalen is offline Capturing numbered headings Windows 7 64bit Capturing numbered headings Office 2010 64bit
Novice
Capturing numbered headings
 
Join Date: May 2014
Posts: 2
jbvalen is on a distinguished road
Default Capturing numbered headings

I have a macro that captures statments like "shall" in a word document and dumps them into a .csv file. Right now it captures the sentence and sometimes the heading number of a paragraph (not always0. I really want the macro to not only capture the sentence, but also the paragraph or subparagrap numbers and associate it with each sentence. Below is what I have right now. Any suggestions


Code:
Sub ExtractSentence() 
    Dim ExcelApp As Object 
    Dim ExcelWB As Object
    Dim aRange As Range 
    Dim intRowCount As Integer 
    Dim MyColumn As String, Lastrow As Long 
 
    Set ExcelApp = CreateObject("Excel.Application") 
    ExcelApp.Visible = True 
    Set ExcelWB = ExcelApp.Workbooks.Add 
    intRowCount = 1 
    Set aRange = ActiveDocument.Range 
    With aRange.Find 
        Do 
            .Text = "shall" 
            .Execute 
            If .Found Then 
                aRange.Expand Unit:=wdSentence 
                aRange.Copy 
                aRange.Collapse wdCollapseEnd 
                ExcelWB.Sheets(1).Cells(intRowCount, 1).Select 
                ExcelWB.Sheets(1).Paste 
                intRowCount = intRowCount + 1 
            End If 
        Loop While .Found 
        intRowCount = intRowCount + 1 
    End With 
    Set aRange = ActiveDocument.Range 
    With aRange.Find 
        Do 
            .Text = "will" 
            .Execute 
            If .Found Then 
                aRange.Expand Unit:=wdSentence 
                aRange.Copy 
                aRange.Collapse wdCollapseEnd 
                ExcelWB.Sheets(1).Cells(intRowCount, 1).Select 
                ExcelWB.Sheets(1).Paste 
                intRowCount = intRowCount + 1 
            End If 
        Loop While .Found 
        intRowCount = intRowCount + 1 
    End With 
    Set aRange = ActiveDocument.Range 
    With aRange.Find 
        Do 
            .Text = "Must" 
            .Execute 
            If .Found Then 
                aRange.Expand Unit:=wdSentence 
                aRange.Copy 
                aRange.Collapse wdCollapseEnd 
                ExcelWB.Sheets(1).Cells(intRowCount, 1).Select 
                ExcelWB.Sheets(1).Paste 
                intRowCount = intRowCount + 1 
            End If 
        Loop While .Found 
        intRowCount = intRowCount + 1 
    End With 
    Set aRange = ActiveDocument.Range 
    With aRange.Find 
        Do 
            .Text = "Should" 
            .Execute 
            If .Found Then 
                aRange.Expand Unit:=wdSentence 
                aRange.Copy 
                aRange.Collapse wdCollapseEnd 
                ExcelWB.Sheets(1).Cells(intRowCount, 1).Select 
                ExcelWB.Sheets(1).Paste 
                intRowCount = intRowCount + 1 
            End If 
        Loop While .Found 
        intRowCount = intRowCount + 1 
    End With 
    Set aRange = ActiveDocument.Range 
    With aRange.Find 
        Do 
            .Text = "provide" 
            .Execute 
            If .Found Then 
                aRange.Expand Unit:=wdSentence 
                aRange.Copy 
                aRange.Collapse wdCollapseEnd 
                ExcelWB.Sheets(1).Cells(intRowCount, 1).Select 
                ExcelWB.Sheets(1).Paste 
                intRowCount = intRowCount + 1 
            End If 
        Loop While .Found 
        intRowCount = intRowCount + 1 
    End With 
    Set aRange = Nothing 
    fileextstr = ".csv": fileformatnum = 6 
    ExcelApp.DisplayAlerts = False 
    ExcelWB.SaveAs "C:\users\joe.valenzuela\desktop\shredder.csv" & fileextstr, fileformatnum 
    ExcelApp.DisplayAlerts = True 
End Sub

Last edited by macropod; 05-06-2014 at 04:00 PM. Reason: Added code tags & formatting
Reply With Quote
  #2  
Old 05-06-2014, 04:05 PM
macropod's Avatar
macropod macropod is offline Capturing numbered headings Windows 7 32bit Capturing numbered headings Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,962
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

In what way are the paragraph or subparagraph numbers associated with each sentence? Do they actually prefix the paragraph or could they be in some other paragraph? Are these numbers created via Word's auto paragraph numbering tools, or are they manually-typed?

It would also be helpful if you could attach a document to a post with some representative data (delete anything sensitive). You do this via the paperclip symbol on the 'Go Advanced' tab at the bottom of this screen.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Reply With Quote
  #3  
Old 05-07-2014, 06:14 AM
jbvalen jbvalen is offline Capturing numbered headings Windows 7 64bit Capturing numbered headings Office 2010 64bit
Novice
Capturing numbered headings
 
Join Date: May 2014
Posts: 2
jbvalen is on a distinguished road
Default

In response to the your questions: the sentences are within a Paragraph or subparagraph (so they are not necessarily in direct position with a sentance). they are, however, unique (no duplicate paragraphs). Typically these are created via word‘s style headings. Sometimes, however, these documents are converted to a PDF and I have to convert them back to word (where they are just simple text in front of the paragraph). Attached is a sample document that was released as a word document
Attached Files
File Type: doc Attachment J-6A PWS Template 1July2013.doc (350.5 KB, 22 views)
Reply With Quote
  #4  
Old 05-07-2014, 06:52 PM
macropod's Avatar
macropod macropod is offline Capturing numbered headings Windows 7 32bit Capturing numbered headings Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,962
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

None of the target paragraphs in your attachment is numbered, though some have bullet points. The only numbered paragraphs are the headings - and they use list-level numbering. There are also no numbered paragraphs associated with the Appendices. Assuming all your documents are formatted that way, try:
Code:
Sub Demo()
Application.ScreenUpdating = False
Dim ExcelApp As Object, ExcelWB As Object, ExcelWS As Object
Dim StrFnd As String, StrOut As String, i As Long, j As Long
Dim Rng As Range
Set ExcelApp = CreateObject("Excel.Application")
ExcelApp.Visible = False
Set ExcelWB = ExcelApp.Workbooks.Add
Set ExcelWS = ExcelWB.Sheets(1)
StrFnd = "shall,will,must,may,should,provide"
For i = 0 To UBound(Split(StrFnd, ","))
  With ActiveDocument.Range
    With .Find
      .ClearFormatting
      .Replacement.ClearFormatting
      .Text = Split(StrFnd, ",")(i)
      .Replacement.Text = ""
      .Forward = True
      .Wrap = wdFindStop
      .MatchCase = False
      .MatchWholeWord = False
      .MatchWildcards = False
      .MatchSoundsLike = False
      .MatchAllWordForms = False
      .Execute
    End With
    Do While .Find.Found
      j = j + 1
      Set Rng = .Duplicate
      With Rng
        .Expand Unit:=wdSentence
        If Asc(.Characters.Last.Text) < 33 Then .End = .End - 1
        If .Characters.Last.Next.Text = LCase(.Characters.Last.Next.Text) Then
          .MoveEnd wdSentence, 1
        End If
        While .Words.First.Previous.Text Like "[eig]"
          .MoveStart wdSentence, -1
        Wend
        If .Characters.Last.Text Like "[" & vbCr & Chr(11) & vbTab & "]" Then .End = .End - 1
        StrOut = Trim(.Text)
      End With
      ExcelWS.Cells(j, 2).Value = StrOut
      StrOut = ""
      If .Paragraphs.First.Range.ListParagraphs.Count = 1 Then
        StrOut = .Paragraphs.First.Range.ListFormat.ListString
      End If
      If Not .Paragraphs.First.Range.ListFormat.ListString Like "[0-9]*" Then
        While (Not .Paragraphs.First.Range.ListFormat.ListString Like "[0-9]*") And _
          (Not .Paragraphs.First.Range.Words.First Like "Appendix*")
          .MoveStart wdParagraph, -1
        Wend
        If .Paragraphs.First.Range.Words.First Like "Appendix*" Then
          StrOut = "Appendix " & Split(.Paragraphs.First.Range.Text, " ")(1) & StrOut
        Else
          StrOut = .Paragraphs.First.Range.ListFormat.ListString & StrOut
        End If
      End If
      ExcelWS.Cells(j, 1).Value = StrOut
      .Collapse wdCollapseEnd
      '.Start = Rng.End
      .Find.Execute
    Loop
  End With
Next
With ExcelApp
  .Visible = True
  .DisplayAlerts = False
  ExcelWB.SaveAs "C:\users\" & Environ("UserName") & "\desktop\shredder.csv", 6
  .DisplayAlerts = True
End With
Set ExcelWS = Nothing: Set ExcelWB = Nothing: Set ExcelApp = Nothing: Set Rng = Nothing
Application.ScreenUpdating = True
End Sub
Note: In addition to streamlining your code, I've done a fair bit of work to ensure the code captures complete sentences and their references.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Reply With Quote
  #5  
Old 05-04-2017, 10:17 AM
lvalx lvalx is offline Capturing numbered headings Windows 10 Capturing numbered headings Office 2010 64bit
Novice
 
Join Date: May 2017
Posts: 3
lvalx is on a distinguished road
Default

Paul,

Sorry to Necromancy this thread, but I am working off your Macro and have encountered an issue. I am attempting to use your macro for the same thing the original poster was, however, I have noticed that if the document is converted from a .PDF the numbering for the headers gets messed up. So instead of actually outputting the items to Excel the Word document simply freezes up and quits responding. Is there a macro I could create that will re-format the .PDF to .DOC document so that this macro will work on the file? I could not find a way to send a private message. Thank you for the reply!
Reply With Quote
  #6  
Old 05-04-2017, 05:03 PM
macropod's Avatar
macropod macropod is offline Capturing numbered headings Windows 7 64bit Capturing numbered headings Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,962
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

I suggest you start a new thread for this. Without actually seeing the problem document, it can be difficult for anyone to diagnose the issue. Accordingly, it would be helpful if you could attach a document to a post with some representative data (delete anything sensitive). You can attach documents via the paperclip symbol on the 'Go Advanced' tab at the bottom of this screen.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
Layout: Numbered long headings on level 10 or higher Katarina Word 3 05-07-2014 02:56 PM
Capturing numbered headings Issue with Numbered Lists in Headings yllakay Word 2 09-24-2012 10:28 AM
Capturing numbered headings numbered headings Caroline Word 5 03-14-2011 09:09 AM
time capturing aligahk06 Excel 0 04-18-2010 11:53 PM
XML marking of Auto-numbered Headings crose Word 0 12-17-2009 09:55 PM

Other Forums: Access Forums

All times are GMT -7. The time now is 06:35 AM.


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2024, vBulletin Solutions Inc.
Search Engine Optimisation provided by DragonByte SEO (Lite) - vBulletin Mods & Addons Copyright © 2024 DragonByte Technologies Ltd.
MSOfficeForums.com is not affiliated with Microsoft