#1
|
|||
|
|||
Macro to highlight unused definitions and undefined terms
Hello everyone,
I work on contracts and I’m looking for help in designing/drafting a macro. The contracts that I work on tend to contain a definitions section which attaches specific meanings to words. Those definitions can then be used in clauses throughout the contract. E.g.: Definition “Building” means the building at 1 New Street; Use The Building shall be accessible on a 24/7 basis. Drafting conventions There are a number of different conventions used to define definitions, but we can assume: - each word of a definition will be capitalised and the complete term will be enclosed in quote marks; and - each use of the defined term will be an exact match of the words in the definition and the capitalisation, but won’t be enclosed in quote marks. The problem When drafting long and complex documents you generally run into two problems: 1. You create definitions which are subsequently not used. E.g. “Blue Building” means the blue coloured building at 2 New Street; then this definition isn’t used elsewhere in the document. 2. You use words throughout the contract which should be definitions (i.e. they follow the same conventions as a definition) but are not actually defined. E.g. Continuing with the example above: “The Building and the Surrounding Zone shall be maintained to an adequate standard.” Building is defined (as above) but “Surrounding Zone”, whilst capitalised, doesn’t have a definition. The Requirement I want to: A. Create a macro which places in square brackets ([ ]) and highlights definitions which have been created and not used. B. Create a macro which generates a report of all capitalised words which don’t have a corresponding definition. I don’t have a coding background, but I’ve taught myself some VBA by hacking around and I have developed some useful tools. However, an efficient way to achieve the above has eluded me so far! For (A) I thought that it would be possible to use Word’s search function to loop through each instance of “[wildcard]” and test if a result occurs more than once. If it doesn’t, highlight it. However, I can’t seem to find a way to test if a result occurs more than once. This might also be defeated if a closing quotation mark is missing. For (B) I’m very lost on this one. It could: (i) extract a specified definitions section into a column in Excel; (ii) extract all capitalised words into a separate column (“Second Column”); (iii) deduplicate the Second Column; (iv) highlight all entries in the Second Column which do not have a matching entry in the other column. However, this strikes me as something that is likely to break on large documents. In addition, I’m not sure how to handle definitions which contain more than one word (but it could be assumed that any strings of capitalised words form part of the same definition). I know this way would result in false positives, but it should be fairly simple to manually remove those after the list is prepared. Any help or guidance this would be greatly appreciated. |
#2
|
|||
|
|||
For A) you can use a wildcard search for
(“)([A-Z])(*)( {1,}) which should find any smart opening quotation mark followed by a Capital letter, other text then one or More spaces. You can then search for subsequent words by moving the range using the .movestartuntil, until a character in the range cset:= "[ABCDEFGHIJKLMNOPQRSTUVXYZ]" is found ("[A-Z]" won't work unlike in find/replace) and then moving the end of the range using .moveenduntil cset=" " The critical bit in the above search is to enter the opening quote by typing 0147 on the numeric keypad with the alt key held down. (ALT 0147). The equivalent closing quote is ALT 0148 Add the word to a scripting dictionary so that you can check easily if a term has already been found and if not add it to a second scripting dictionary. Hopefully the above should get you on your way but if you are still flummoxed I'll try and cobble together some code. |
#3
|
||||
|
||||
You might try the second macro in: https://www.msofficeforums.com/word/...html#post32031. That code creates a list of the quoted terms, including their page references. You need only check those that occur on just the one page.
That said, having defined terms like “Building” and “Blue Building” in the same document is poor practice and the macro in the above link won't differentiate 'Building' references from 'Blue Building' references. Coding for such differentiation would require a lot more work. As for expressions such as 'Surrounding Zone', for which no definition is provided, if you have a separate document that lists such terms so their presence can be checked for, you could use the first macro in the above link then compare the entries in its output for the presence of items not listed in the table produced by the second macro. Otherwise, you'll undoubtedly run into problems coding a macro to ignore words starting sentences - but only when they're not part of such an expression.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#4
|
|||
|
|||
That was an entertaining puzzle
The code below should be in the right ball park for what you described above. As Macropod mentioned there can be edge cases and I've not attempted to resolve these. The following text will help you understand what these edge cases are as I developed the code using the text. Have fun Quote:
Code:
Option Explicit Public Const open_smart_quote As String = "“" ' Smart quote entered using ALT+0147 Public Const not_open_smart_quote As String = "([!“])" Public Const close_smart_quote As String = "”" ' Smart quote entered using ALT+0148 Public Const capitalised_word As String = "([A-Z])(*)" ' Any sequence of characters starting with a Capital letter Public Const end_of_capitalised_word As String = "([^13 ,.-:;])" ' Possible characters that end a definition if closing smart quote omitted Public Const breaking_space As String = " " Public Const lcase_letters As String = "abcdefghijklmnopqrstuvwxyz" Public Const ucase_letters As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" Public search_doc As Word.Document Public quoted_definitions As Scripting.Dictionary Sub annotate_definitions_in_a_doc() Set search_doc = ActiveDocument Set quoted_definitions = New Scripting.Dictionary compile_list_of_quoted_definitions check_that_definitions_are_used highlight_unused_definitions highlight_capitalised_phrases_with_no_definition End Sub Sub compile_list_of_quoted_definitions() Dim search_range As Word.range Set search_range = search_doc.StoryRanges(wdMainTextStory) With search_range.find .ClearFormatting .Replacement.ClearFormatting .MatchWildcards = True .Wrap = wdFindStop .text = open_smart_quote & capitalised_word & end_of_capitalised_word .Execute Do Until search_range.text = vbNullString update_search_range_to_whole_capitalised_phrase search_range update_search_range_to_capitalised_phrase_text search_range do_actions_for_quoted_text search_range search_range.Move unit:=wdWord .Execute Loop End With End Sub Sub do_actions_for_quoted_text(this_range As Word.range) If Not this_range.Next(unit:=wdCharacter).text = close_smart_quote Then ' comment out the line below if you don't want comments in the document search_doc.Comments.Add range:=this_range.Duplicate, text:="Missing closing quote" ' Uncomment the next line to add a closing smart quote if one is missing. ' this_range.InsertAfter text:=close_smart_quote End If quoted_definitions.Add key:=this_range.text, Item:=True End Sub Sub update_search_range_to_whole_capitalised_phrase(this_range As Word.range) ' We didn't include the close_smart_quote in the search but if present ' it will be the second to last charcter in the search range ' so we will only extend the range if no close_smart_quote is present in this_range.text If InStr(this_range.text, close_smart_quote) = 0 Then ' The last character of this_range will be a character from end_of_capitalised_word. ' We only want to extend the range if the last character is a breaking_space ' Two seperate if statemdnts are used because this provides greater flexibility ' if we need to make tweaks later on. If this_range.Characters.Last.text = breaking_space Then Do While InStr(ucase_letters, this_range.Next(unit:=wdWord).Characters.First.text) > 0 this_range.MoveEnd unit:=wdWord, Count:=1 Loop End If End If End Sub Sub update_search_range_to_capitalised_phrase_text(this_range As Word.range) ' This subroutine is used on two occasions with slightly different requirements ' consequently we test for the relevant characters before adjusting the range ' as in the second use the characters definitiely won't be included in the range If this_range.Characters.First.text = open_smart_quote Then this_range.MoveStart unit:=wdCharacter, Count:=1 ' Eliminate the start_smart_quote End If ' check if they are present before adjusting the end of the range. If InStr(end_of_capitalised_word, this_range.Characters.Last.text) > 0 Then this_range.MoveEnd unit:=wdCharacter, Count:=-1 ' Eliminate the terminating character End If If this_range.Characters.Last.text = close_smart_quote Then this_range.MoveEnd unit:=wdCharacter, Count:=-1 ' If present remove the close_smart_quote End If ' this_range should now encompass only the text of the definition End Sub Sub check_that_definitions_are_used() Dim definition As Variant Dim search_range As Word.range ' We can now use the scripting.dictionary of defined terms to check if they ' actually occur in the document ' if we don't find a definition used in the document then the item value ' of the definition in the dictionary will be set to false. For Each definition In quoted_definitions.keys Set search_range = search_doc.StoryRanges(wdMainTextStory) With search_range.find .ClearFormatting .Replacement.ClearFormatting .MatchWildcards = True .MatchCase = True .Wrap = wdFindStop .text = not_open_smart_quote & definition & end_of_capitalised_word .Execute If Not .Found Then quoted_definitions(definition) = False End If End With Next End Sub Sub highlight_unused_definitions() Dim definition As Variant Dim search_range As Word.range Dim search_text As String Dim replace_text As String replace_text = "[\1\2]" For Each definition In quoted_definitions.keys If Not quoted_definitions(definition) Then Set search_range = search_doc.StoryRanges(wdMainTextStory) search_range.Select search_text = "(" & open_smart_quote & ")" & "(" & definition & ")" ' & end_of_capitalised_word With search_range.find .ClearFormatting .Replacement.ClearFormatting .MatchWildcards = True .Wrap = wdFindStop .text = search_text .Replacement.text = replace_text .Replacement.Highlight = True .Execute replace:=wdReplaceOne Set search_range = Nothing ' if preferred you could add here the code to add a comment rqather than highlight the text End With End If Next End Sub Sub highlight_capitalised_phrases_with_no_definition() Dim search_range As Word.range Set search_range = search_doc.StoryRanges(wdMainTextStory) With search_range.find .ClearFormatting .Replacement.ClearFormatting .MatchWildcards = True .Wrap = wdFindStop .text = capitalised_word & end_of_capitalised_word .Execute Do Until search_range.text = vbNullString update_search_range_to_whole_capitalised_phrase search_range update_search_range_to_capitalised_phrase_text search_range do_actions_for_capitalised_phrase search_range search_range.Move unit:=wdWord .Execute Loop End With End Sub Sub do_actions_for_capitalised_phrase(this_range As Word.range) If Not quoted_definitions.Exists(this_range.text) Then search_doc.Comments.Add range:=this_range.Duplicate, text:="Capitalised phrase with no corresponding definition" ' alternatively, if a highlihght if preferred over a comment ' this_range.HighlightColorIndex=wdRed End If End Sub |
#5
|
|||
|
|||
There's always a niggle.
I cut and pasted the test text back into my word document to ensure that the smart quotes came out right. As a consequence I had two copies of the text in my document. The code then promptly fell over as I'd forgotten to add a test for duplicate definitions. You will need to replace the subroutine below in place of the equivalent subroutine in the code I initially provided. Code:
Sub do_actions_for_quoted_text(this_range As Word.range) ' Check to see if a definition we have just found already exists If quoted_definitions.Exists(this_range.text) Then search_doc.Comments.Add range:=this_range, text:="Duplicate definition?" Exit Sub End If If Not this_range.Next(unit:=wdCharacter).text = close_smart_quote Then ' comment out the line below if you don't want comments in the document search_doc.Comments.Add range:=this_range.Duplicate, text:="Missing closing quote" ' Uncomment the next line to add a closing smart quote if one is missing. ' this_range.InsertAfter text:=close_smart_quote End If quoted_definitions.Add key:=this_range.text, Item:=True End Sub |
#6
|
|||
|
|||
Hi, i tried inserting the code into Microsoft Word Visual Basic Editor and i get the error:
Constants, fixed-length strings, arrays, user-defined types, and Declare statements not allowed as Public members of an object module How do i get around this? Where should i insert the code? |
#7
|
|||
|
|||
The error message implies you have pasted the code into a class module rather than a normal module. If you want to keep the code in a class module you will have to change the constant definitions from Public to Private.
|
#8
|
|||
|
|||
Thanks Slaycock.
I have amended the code to be used with a normal.dotm file. Now i have this error which is attached in the screenshot. How should i solve this? |
#9
|
|||
|
|||
You need to add a reference to the scripting runtime. In the VBA IDE goto Tools.References and ensure that the check box for the Microsoft Scripting Runtime is ticked.
|
#10
|
|||
|
|||
Yes that worked. Now i get this error for the following line of code:
|
#11
|
|||
|
|||
Another error for the first macro
|
#12
|
|||
|
|||
This is for the third macro
|
#13
|
|||
|
|||
I'm sorry you are having problems. I put the code into a new word document with the sample text, added the reference to Scripting.Runtime and it works perfectly for me.
If you want to report the code here we can check if you copied it correctly. |
#14
|
|||
|
|||
Thanks for helping. What do you mean by adding the reference to Scripting.Runtime?
Code:
Option Explicit Public Const open_smart_quote As String = "“" ' Smart quote entered using ALT+0147 Public Const not_open_smart_quote As String = "([!“])" Public Const close_smart_quote As String = "”" ' Smart quote entered using ALT+0148 Public Const capitalised_word As String = "([A-Z])(*)" ' Any sequence of characters starting with a Capital letter Public Const end_of_capitalised_word As String = "([^13 ,.-:;])" ' Possible characters that end a definition if closing smart quote omitted Public Const breaking_space As String = " " Public Const lcase_letters As String = "abcdefghijklmnopqrstuvwxyz" Public Const ucase_letters As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" Public search_doc As Word.Document Public quoted_definitions As Scripting.Dictionary Sub annotate_definitions_in_a_doc() Set search_doc = ActiveDocument Set quoted_definitions = New Scripting.Dictionary compile_list_of_quoted_definitions check_that_definitions_are_used highlight_unused_definitions highlight_capitalised_phrases_with_no_definition End Sub Sub compile_list_of_quoted_definitions() Dim search_range As Word.Range Set search_range = search_doc.StoryRanges(wdMainTextStory) With search_range.Find .ClearFormatting .Replacement.ClearFormatting .MatchWildcards = True .Wrap = wdFindStop .Text = open_smart_quote & capitalised_word & end_of_capitalised_word .Execute Do Until search_range.Text = vbNullString update_search_range_to_whole_capitalised_phrase search_range update_search_range_to_capitalised_phrase_text search_range do_actions_for_quoted_text search_range search_range.Move unit:=wdWord .Execute Loop End With End Sub Sub do_actions_for_quoted_text(this_range As Word.Range) ' Check to see if a definition we have just found already exists If quoted_definitions.Exists(this_range.Text) Then search_doc.Comments.Add Range:=this_range, Text:="Duplicate definition?" Exit Sub End If If Not this_range.Next(unit:=wdCharacter).Text = close_smart_quote Then ' comment out the line below if you don't want comments in the document search_doc.Comments.Add Range:=this_range.Duplicate, Text:="Missing closing quote" ' Uncomment the next line to add a closing smart quote if one is missing. ' this_range.InsertAfter Text:=close_smart_quote End If quoted_definitions.Add Key:=this_range.Text, Item:=True End Sub Sub update_search_range_to_whole_capitalised_phrase(this_range As Word.Range) ' We didn't include the close_smart_quote in the search but if present ' it will be the second to last charcter in the search range ' so we will only extend the range if no close_smart_quote is present in this_range.text If InStr(this_range.Text, close_smart_quote) = 0 Then ' The last character of this_range will be a character from end_of_capitalised_word. ' We only want to extend the range if the last character is a breaking_space ' Two seperate if statemdnts are used because this provides greater flexibility ' if we need to make tweaks later on. If this_range.Characters.Last.Text = breaking_space Then Do While InStr(ucase_letters, this_range.Next(unit:=wdWord).Characters.First.Text) > 0 this_range.MoveEnd unit:=wdWord, Count:=1 Loop End If End If End Sub Sub update_search_range_to_capitalised_phrase_text(this_range As Word.Range) ' This subroutine is used on two occasions with slightly different requirements ' consequently we test for the relevant characters before adjusting the range ' as in the second use the characters definitiely won't be included in the range If this_range.Characters.First.Text = open_smart_quote Then this_range.MoveStart unit:=wdCharacter, Count:=1 ' Eliminate the start_smart_quote End If ' check if they are present before adjusting the end of the range. If InStr(end_of_capitalised_word, this_range.Characters.Last.Text) > 0 Then this_range.MoveEnd unit:=wdCharacter, Count:=-1 ' Eliminate the terminating character End If If this_range.Characters.Last.Text = close_smart_quote Then this_range.MoveEnd unit:=wdCharacter, Count:=-1 ' If present remove the close_smart_quote End If ' this_range should now encompass only the text of the definition End Sub Sub check_that_definitions_are_used() Dim definition As Variant Dim search_range As Word.Range ' We can now use the scripting.dictionary of defined terms to check if they ' actually occur in the document ' if we don't find a definition used in the document then the item value ' of the definition in the dictionary will be set to false. For Each definition In quoted_definitions.keys Set search_range = search_doc.StoryRanges(wdMainTextStory) With search_range.Find .ClearFormatting .Replacement.ClearFormatting .MatchWildcards = True .MatchCase = True .Wrap = wdFindStop .Text = not_open_smart_quote & definition & end_of_capitalised_word .Execute If Not .Found Then quoted_definitions(definition) = False End If End With Next End Sub Sub highlight_unused_definitions() Dim definition As Variant Dim search_range As Word.Range Dim search_text As String Dim replace_text As String replace_text = "[\1\2]" For Each definition In quoted_definitions.keys If Not quoted_definitions(definition) Then Set search_range = search_doc.StoryRanges(wdMainTextStory) search_range.Select search_text = "(" & open_smart_quote & ")" & "(" & definition & ")" ' & end_of_capitalised_word With search_range.Find .ClearFormatting .Replacement.ClearFormatting .MatchWildcards = True .Wrap = wdFindStop .Text = search_text .Replacement.Text = replace_text .Replacement.Highlight = True .Execute Replace:=wdReplaceOne Set search_range = Nothing ' if preferred you could add here the code to add a comment rqather than highlight the text End With End If Next End Sub Sub highlight_capitalised_phrases_with_no_definition() Dim search_range As Word.Range Set search_range = search_doc.StoryRanges(wdMainTextStory) With search_range.Find .ClearFormatting .Replacement.ClearFormatting .MatchWildcards = True .Wrap = wdFindStop .Text = capitalised_word & end_of_capitalised_word .Execute Do Until search_range.Text = vbNullString update_search_range_to_whole_capitalised_phrase search_range update_search_range_to_capitalised_phrase_text search_range do_actions_for_capitalised_phrase search_range search_range.Move unit:=wdWord .Execute Loop End With End Sub Sub do_actions_for_capitalised_phrase(this_range As Word.Range) If Not quoted_definitions.Exists(this_range.Text) Then search_doc.Comments.Add Range:=this_range.Duplicate, Text:="Capitalised phrase with no corresponding definition" ' alternatively, if a highlihght if preferred over a comment ' this_range.HighlightColorIndex=wdRed End If End Sub |
#15
|
|||
|
|||
Scripting.Dictionaries don't exist as part of VBA. They are provided by the Scripting.Runtime. To use scripting.dictionaries in VBA you either have to add a reference to the scripting.runtime to VBA so VBA knows what scripting.dictionaries are and can create them, or you create a late bound object e.g. CreateObject("Scripting.Dictionary") whereby VBA will ask any libraries registered in Windows if they know what a scripting.dictionary is and can they very kindly create one.
When developing macros its best to add a library reference so that you get full advantage of any intellisense provided for the references objects and methods. This is why I provided the instructions on how to register the scripting.runtime. If you are unable to register the Scripting.runtime then make the following changes. Code:
Public quoted_definitions as Scripting.Dictionary Code:
Public quoted_definitions As Object Code:
Set quoted_definitions =New Scripting.Dictionary Code:
Set quoted_definitions = CreateObject("Scripting.Dictionary") |
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Macro that deletes unused styles | thejollyroger | Word VBA | 14 | 12-04-2017 09:00 AM |
Best Practice for Indexing Multiple Word Terms and Sub-Terms | jhy001 | Word | 4 | 11-06-2017 02:08 PM |
Macro Question: Need help making a macro to highlight the first word in every sentence | LadyAna | Word | 1 | 12-06-2014 10:39 PM |
Wild card to highlight capitalised terms? | bertietheblue | Word | 2 | 02-08-2013 04:44 PM |
find - reading highlight - highlight all / highlight doesn't stick when saved | bobk544 | Word | 3 | 04-15-2009 03:31 PM |