Hello,
I currently use a REGEX-based macro in MS Word to extract all uppercase words and generate a separate document listing these words in a table. This has been particularly helpful for creating acronym lists, such as those used in appendices. However, I would like to modify the existing macro to reduce "false positives," that is, words that are in uppercase but are not actual acronyms.
To illustrate, I have attached two sample documents along with their respective outputs:
1. "Sample Document with Acronyms v01"
Contents: This document contains 10 uppercase acronyms highlighted in green. There is also a mixed-case acronym ("IoT," standing for "Internet of Things") marked in yellow. The current macro does not recognize mixed-case acronyms.
Process:
- Open the document.
- Select "View" from the Office ribbon.
- Navigate to "Macros" > "View Macros" and run "ExtractAcronymsToNewDocument."
- The macro generates a new Word document listing all found uppercase words in a table, with columns for "Acronym," "Definition" (left blank), and "Page" (indicating the first occurrence).
Outcome: This method works well but misses the mixed-case acronym ("IoT").
2. "Sample Document with Acronyms v02"
Contents: This is a copy of v01 with added words ("TABLE OF CONTENTS," "PARA #1," "PARA #2") marked in red. These uppercase headers are not acronyms but are extracted by the macro.
Process: Running the macro produces a new document that lists the 10 green acronyms but also includes the 4 additional uppercase header words (false positives).
Issues with the Current Macro:
Over-Inclusion: The macro extracts all uppercase words, including headers and other non-acronym text.
Under-Inclusion: It does not extract acronyms that use mixed-case formatting, such as "IoT."
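To make the two issues concrete, here is a small sketch in Python (used only for illustration, since the macro itself is VBA and is not reproduced here). The pattern `\b[A-Z]{2,}\b` is my assumption about what the macro effectively matches, and the sample text is made up to mimic the v02 document.

```python
import re

# ASSUMPTION: stand-in for the macro's pattern, i.e. whole words made of
# two or more capital letters. The real macro's pattern may differ.
ASSUMED_PATTERN = re.compile(r"\b[A-Z]{2,}\b")

# Made-up sample: two headers, one mixed-case acronym, one uppercase acronym.
sample = (
    "TABLE OF CONTENTS\n"
    "PARA #1\n"
    "The Internet of Things (IoT) uses a Local Area Network (LAN)."
)


def unique_in_order(items):
    """Drop repeats while keeping first-occurrence order."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


found = unique_in_order(ASSUMED_PATTERN.findall(sample))
print(found)
```

With this assumed pattern, the header words TABLE, OF, CONTENTS, and PARA are picked up alongside LAN (over-inclusion), while IoT is skipped because it contains a lowercase letter (under-inclusion).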
Questions:
Q1: How can the REGEX macro be adjusted to only include uppercase words within parentheses, potentially reducing false positives?
Q2: Is there a way to modify the macro to extract mixed-case acronyms like "IoT," in addition to the standard uppercase acronyms?
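As a rough idea of what I am after, the two questions could be sketched with patterns like the ones below. This is Python, used only to try out the regex ideas. Both patterns and the sample text are my own guesses and not part of the existing macro, and I have not checked whether they carry over unchanged to the regex engine used from VBA.

```python
import re

# Q1 idea (hypothetical pattern): two or more capital letters enclosed
# directly in parentheses, e.g. "(LAN)". The group captures the acronym only.
UPPER_IN_PARENS = re.compile(r"\(([A-Z]{2,})\)")

# Q2 idea (hypothetical pattern): a parenthesised token that starts with a
# capital letter and contains at least two capitals overall, e.g. "(IoT)".
MIXED_IN_PARENS = re.compile(r"\(([A-Z][A-Za-z]*[A-Z][A-Za-z]*)\)")

# Made-up sample text with headers and some parenthesised non-acronyms.
sample = (
    "TABLE OF CONTENTS\n"
    "PARA #1\n"
    "The Internet of Things (IoT) uses a Local Area Network (LAN). "
    "Headers (see above) and (Things) should be ignored."
)

q1_hits = UPPER_IN_PARENS.findall(sample)
q2_hits = MIXED_IN_PARENS.findall(sample)
print(q1_hits)
print(q2_hits)
```

On this sample, the Q1 pattern keeps only LAN and the Q2 pattern keeps both IoT and LAN, while the headers and the ordinary parenthesised words are dropped. The trade-off is that an acronym never introduced in parentheses would be missed, and the Q2 pattern would also accept a parenthesised name with an internal capital letter.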
Even a solution that addresses only Q1 would be highly valuable, as the current macro generates an overwhelming number of false positives in lengthy documents.
Thank you for considering my request, and I appreciate any guidance you can provide!