Microsoft Office Forums

Go Back   Microsoft Office Forums > >

Reply
 
Thread Tools Display Modes
  #1  
Old 10-26-2010, 08:52 AM
markos97 markos97 is offline Table of contents Windows XP Table of contents Office 2000
Novice
Table of contents
 
Join Date: Oct 2010
Location: near Posen
Posts: 1
markos97 is on a distinguished road
Default Table of contents

Hello!


I want to create TOC of a word document. Someone on a polish forum help
me with this, but partially. In this moment "my" code output the following
results: words, frequency of occurence, and page number.
Code:
Option Explicit
Option Compare Text
Sub Test()
'based on: http://www.vbaexpress.com/kb/getarticle.php?kb_id=727
'based on: http://www.authorsden.com/categories/article_top.asp?catid=20&id=39307
    Dim lngRes                    As Long
    lngRes = WordCountAndPages(ThisDocument)
    Application.ScreenRefresh
    MsgBox "There were " & CStr(lngRes) & " different words ", vbOKOnly, "Finished"
End Sub
Function WordCountAndPages(SourceDoc As Word.Document, _
                           Optional ByVal sExcludes = "[the][a][of][is][to][for][by][be][and][are]", _
                           Optional ByVal lmaxwords As Long = 50000) As Long

    ' "[ale][ani][aby][do][od][czy][za][ze][przed][po]" 'excludes for example for polish
    On Error GoTo WordCountAndPages_Error

    Dim NewDoc                    As Word.Document
    Dim TmpRange                  As Word.Range
    Dim aWord                     As Object
    '----------------------------------------
    Dim tmpName                   As String
    Dim strSingleWord             As String

    Dim lngCurrentPage            As Long
    Dim lngPageCount              As Long
    Dim lngWordNum                As Long                  'Number of unique words
    Dim lngttlwds                 As Long                  'Total words in the document
    Dim j                         As Long
    Dim k                         As Long
    Dim w                         As Long
    Dim bTmpFound                 As Boolean               'Temporary flag

    ReDim arrWordList(1 To 1) As String                    'Array to hold unique words
    ReDim arrWordCount(1 To 1) As Long                     'Frequency counter for unique words
    ReDim arrPageW(1 To 1) As String                       'Pages unique words

    lngCurrentPage = 1   'we started to count from a first page to next pages

    With SourceDoc
        If ActiveDocument.FullName <> .FullName Then SourceDoc.Activate    ' because below selection
        Set TmpRange = .Range
        ' document's page count (maybe must refresh)
        lngPageCount = .Content.ComputeStatistics(wdStatisticPages) 'we counted number of pages
    End With
    '--------------
    ' The item in the Words collection includes both the word and the spaces after the word
    ' The Count property for this collection in a document returns the number of items in the main story only.
    ' Also, the Count property includes punctuation and paragraph marks in the total.
    lngttlwds = TmpRange.Words.Count                       '  SourceDoc.Words.Count
    '---------------------
    System.Cursor = wdCursorWait

    Do Until lngCurrentPage > lngPageCount
        If lngCurrentPage = lngPageCount Then
            TmpRange.End = SourceDoc.Range.End             'last page (there won't be a next page)
        Else
            'Find the beginning of the next page
            'Must use the Selection object. The Range.Goto method will not work on a page
            Selection.GoTo wdGoToPage, wdGoToAbsolute, lngCurrentPage + 1  'page, next page
            'Set the end of the range to the point between the pages
            TmpRange.End = Selection.Start
        End If
        '------------------------------------
        For Each aWord In TmpRange.Words
            ' 160 is non breaking space

                strSingleWord = LCase(Replace(Trim(aWord.Text), Chr(160), "")) 'lower character to upper character
                'maybe remove lower case form code, like this: (Replace(Trim(aWord.Text), Chr(160), ""))
                Select Case True
                Case Len(strSingleWord) = 1  'number of characters in single word
                ' if =1 then ignore "a" "&" if = 2 ignore for example: "an" "of" "is" "to" "by" "be"
                Case strSingleWord < "a" Or strSingleWord > "z"    'maybe remove this line form code
                Case InStr(1, sExcludes, "[" & strSingleWord & "]", vbTextCompare)
                Case Else
                    bTmpFound = False
                    For j = 1 To lngWordNum
                        If StrComp(arrWordList(j), strSingleWord, vbTextCompare) = 0 Then
                            arrWordCount(j) = arrWordCount(j) + 1
                            If (arrPageW(j) & "," Like "*," & CStr(lngCurrentPage) & ",*") = False Then
                            arrPageW(j) = arrPageW(j) & "," & CStr(lngCurrentPage)
                            End If
                            bTmpFound = True
                            Exit For
                        End If
                    Next j
                    If Not bTmpFound Then
                        lngWordNum = lngWordNum + 1
                        ReDim Preserve arrWordList(1 To lngWordNum)
                        ReDim Preserve arrWordCount(1 To lngWordNum)
                        ReDim Preserve arrPageW(1 To lngWordNum)
                        arrWordList(lngWordNum) = strSingleWord
                        arrWordCount(lngWordNum) = 1
                        arrPageW(lngWordNum) = arrPageW(lngWordNum) & "," & CStr(lngCurrentPage)
                    End If
                    If lngWordNum > lmaxwords - 1 Then
                        MsgBox "Too many words.", vbOKOnly
                        Exit For
                    End If
            End Select
            lngttlwds = lngttlwds - 1
            StatusBar = "Remaining: " & lngttlwds & ", Unique: " & lngWordNum
        Next aWord
        '------------------------------------
        lngCurrentPage = lngCurrentPage + 1      'move to the next page
        TmpRange.Collapse wdCollapseEnd          'go to the next page
    Loop
    '------------------------------------
    If lngWordNum > 0 Then
        tmpName = SourceDoc.AttachedTemplate.FullName    'output results
        Set NewDoc = Application.Documents.Add(Template:=tmpName, NewTemplate:=False)

        Selection.ParagraphFormat.TabStops.ClearAll
        Application.ScreenUpdating = False
        With Selection
            For j = 1 To lngWordNum
                .TypeText Text:=arrWordList(j) & vbTab & CStr(arrWordCount(j)) & vbTab & Mid(arrPageW(j), 2) & vbNewLine
            Next j
        End With

        Set TmpRange = NewDoc.Range
        TmpRange.ConvertToTable Separator:=wdSeparateByTabs
        With NewDoc.Tables(1)
            .Sort ExcludeHeader:=False, _
                  FieldNumber:="Kolumna 1", SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
                  FieldNumber2:="", _
                  FieldNumber3:="", _
                  CaseSensitive:=False, LanguageID:=wdPolish, IgnoreDiacritics:=False
            .Borders(wdBorderTop).LineStyle = wdLineStyleNone
            .Borders(wdBorderLeft).LineStyle = wdLineStyleNone
            .Borders(wdBorderBottom).LineStyle = wdLineStyleNone
            .Borders(wdBorderRight).LineStyle = wdLineStyleNone
            .Borders(wdBorderHorizontal).LineStyle = wdLineStyleNone
            .Borders(wdBorderVertical).LineStyle = wdLineStyleNone
            .Borders(wdBorderDiagonalDown).LineStyle = wdLineStyleNone
            .Borders(wdBorderDiagonalUp).LineStyle = wdLineStyleNone
        End With
    End If
    WordCountAndPages = lngWordNum
    '---------------------
WordCountAndPages_Exit:
    On Error Resume Next
    Set aWord = Nothing
    Set TmpRange = Nothing
    Set NewDoc = Nothing
    System.Cursor = wdCursorNormal
    Application.ScreenUpdating = True

    Exit Function

WordCountAndPages_Error:
    MsgBox "Błąd : ( " & Err.Number & " ) " & Err.Description & vbCrLf & _
           "Procedura  : " & "WordCountAndPages", vbExclamation
    Resume WordCountAndPages_Exit
End Function
I also need another column with page numbers, where the page number looks like that:
1-3, 5, 6-8. Now the page numbers looks like this: 1,2,3,5,6,7,8. When the
page numbers are consecutive (sequential), beetwen a page numbers I need to
add a hyphen (dashes). Solution for this problem, I found at below page,
but for an awk language:
http://docstore.mik.ua/orelly/unix/s...12/combine.idx
and also:
http://docstore.mik.ua/orelly/unix3/sedawk/ch12_02.htm

In another column I also need a number of page numbers. For example, when a
some word appear on pages 1,3,5,7, in output results in another column will
be 4.
And the last case: my code found all words in document. I need also search
for the keywords, for example this code do it:
http://microsoft-personal-applicatio...3/Default.aspx

I have also a some problems with some lines of my code:
strSingleWord < "a" Or strSingleWord > "z" 'fist character of word must
be a letter, this line ignore a special word, like "3w" "4you"
"3ware" "3love" "2day" , and so on...

I have also problem with some words, like: "@SCHL" "m&m", and so
on..., something line of code ignore this words.

Anyone help me with this?

You can also visit my thread at polish forum:
http://www.coderscity.pl/ftopic35919.html

Thanks for your attention!
Regards

Last edited by markos97; 10-27-2010 at 06:31 AM.
Reply With Quote
Reply

Tags
index, numbers, page

Thread Tools
Display Modes


Similar Threads
Thread Thread Starter Forum Replies Last Post
Custom Table of Contents NJPhillips Word 1 06-11-2010 12:39 PM
Table of contents Table of contents after formatting! erika Word 1 06-10-2009 10:40 AM
Table of contents automation through VB 6.0 Developer Word 0 06-09-2009 06:30 AM
Table of Contents Formatting Rick5150 Word 1 03-16-2009 11:10 AM

Other Forums: Access Forums

All times are GMT -7. The time now is 11:25 PM.


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2024, vBulletin Solutions Inc.
Search Engine Optimisation provided by DragonByte SEO (Lite) - vBulletin Mods & Addons Copyright © 2024 DragonByte Technologies Ltd.
MSOfficeForums.com is not affiliated with Microsoft