#1
Auto index all words?
Hi all. Firstly, my apologies: I'm sure Google would have the answers for me, but I have no idea what the correct terminology is.
I have used OCR to digitize an old legal document and then manually reformatted it. I've figured out how to use headings to create a table of contents, which is working as expected. Now I want to auto-create a comprehensive alphabetical word list at the end of the document, with the page numbers where each word appears. I always thought this was an 'index', but Google (and/or Word) seems to disagree somewhat.
Whatever it is actually called, I'm guessing that Word probably has a feature to generate one with little effort, and would probably be smart enough to ignore common prepositions, conjunctions, etc. (although I could manually delete those from the full list if required). It's also likely that the relevant page numbers will change as the document continues to be improved, so an automated, 'updateable' solution is essential.
A simple list of the page numbers where each word appears would be a good start, but it would be fantastic if I could get a result that shows the heading under which each instance appears, as well as the page number(s), so that readers can easily pick out the appearances of the word most relevant to their enquiry. An example might look like this:
Licence
    2. Constitution of the Trust Fund...7
    10. Powers, Duties and Obligations of the Trustee...15
    10. Powers, Duties and Obligations of the Trustee...18
Vesting
    1. Definitions...5
    11. The Income of the Fund...20
    11. The Income of the Fund...21
    12. The Period of the Trust and Termination Thereof...21
    14. Variation of Trust...23
Any ideas how I can achieve such a thing?
#2
Every word, every place, will result in an index that is pretty much useless.
The Index tool is what you want to use, and you can use it with a concordance file that covers your entire document. See: Indices - Complex Documents
The part about headings requires a bit more work. See, though, from the above: Quote:
#3
Thanks Charles, I will investigate those avenues.
Just to be clear though, this is a case in which being able to locate every instance of a word is important to the investigating reader. The use of section headings would be a big help in locating particular instances, of course, but often it is the way a term is used elsewhere in the document that can really throw sand in the gears!
#4
So, you want page numbers for each "the," "a," "and," etc.?
Look into the concordance, but do take a look at John McGhie's comments. I am familiar with transcripts indexed this way and it can be useful, yes. The Concordance feature should help.
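For anyone building the concordance list outside Word, here is a minimal Python sketch of one way to do it. It assumes the document text has been saved as plain text, that the concordance file is a two-column table (text to search for, then the index entry), and that matching is case-sensitive, so each capitalisation found is kept as its own search term. The function name `concordance_rows` and the stop-word list are hypothetical, chosen for illustration only.

```python
import re

# Hypothetical stop-word list; extend it to suit the document.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "or", "by", "for"}

def concordance_rows(text, stop_words=STOP_WORDS):
    """Return sorted (search text, index entry) pairs, one per distinct word form."""
    # Collect every distinct word form, allowing internal apostrophes and hyphens.
    forms = set(re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text))
    # Drop stop words regardless of how they are capitalised.
    kept = [w for w in forms if w.lower() not in stop_words]
    kept.sort(key=lambda w: (w.lower(), w))
    # Map every capitalisation of a word to one capitalised index entry.
    return [(w, w.capitalize()) for w in kept]

sample = "The Trustee may vary the Trust, and the trustee holds the Licence."
for search_text, entry in concordance_rows(sample):
    print(f"{search_text}\t{entry}")
```

The tab-separated output can be pasted into a Word document and converted to a two-column table. Filtering the stop words here saves deleting them from the finished index by hand.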
#5
|
||||
|
||||
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#6
Thanks again.
No, I don't want "and" and "the" etc., but I can remove those manually from the list if Word isn't equipped to do it automatically. I do want pretty much all other words though: even seemingly innocuous and pointless ones like "prior" and "was" and "theretofor" actually only appear in a small number of places each, and could easily be the core of some argument, or at least a guiding star for locating a particular phrase of contention. The more comprehensive the index, the more useful it will be; there is little advantage in removing anything other than to save paper and toner in the printed copies.
I have succeeded with the concordance document, thank you, and now have a large, standard index with a list of page numbers for each word.
My planned next step, when I have another couple of hours to devote to it, is to copy the index back out of Word and back into Excel, and edit the table to create a separate line for each word/page pair. Then it should be simple to sort by page number, add a third column with the section heading that each page number belongs to, then sort alphabetically again. That should give me a result something like my example in post #1:
Licence … 2. Constitution of the Trust Fund … 7
Licence … 10. Powers, Duties and Obligations of the Trustee … 15
Licence … 10. Powers, Duties and Obligations of the Trustee … 18
Vesting … 1. Definitions … 5
Vesting … 11. The Income of the Fund … 20
Vesting … 11. The Income of the Fund … 21
Vesting … 12. The Period of the Trust and Termination Thereof … 21
Vesting … 14. Variation of Trust … 23
Unfortunately this will no longer be a "live" index, so I'll have to repeat the process any time I make further changes affecting the layout, but it seems like what I am doing here is outside of Word's design, so perhaps that's just the price I need to pay. It shouldn't be too onerous, I hope.
Thanks again for the links and input, and if anyone can think of a better way to get there (especially if it preserves the ability to update automatically) I'd love to hear it!
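Since the Excel step has to be repeated after every layout change, it may be worth scripting. Below is a rough Python sketch of the same idea: one row per word/page pair, with the heading looked up from the page each section starts on. The function name `rows_by_heading` and the heading start pages in the sample are assumptions for illustration, not taken from the actual document. One limitation to note: with page numbers alone, a page that holds the end of one section and the start of the next is ambiguous, and this sketch always picks the last heading that starts on or before the page.

```python
from bisect import bisect_right

def rows_by_heading(index, headings):
    """index: {word: [page, ...]}; headings: [(start_page, title), ...].

    Returns (word, heading title, page) rows sorted by word, then page.
    """
    headings = sorted(headings)
    starts = [start for start, _ in headings]
    rows = []
    for word in sorted(index, key=str.lower):
        for page in sorted(set(index[word])):
            # Last heading whose start page is on or before this page.
            pos = bisect_right(starts, page) - 1
            title = headings[pos][1] if pos >= 0 else "(before first heading)"
            rows.append((word, title, page))
    return rows

# Hypothetical start pages, for illustration only.
headings = [
    (5, "1. Definitions"),
    (7, "2. Constitution of the Trust Fund"),
    (15, "10. Powers, Duties and Obligations of the Trustee"),
    (20, "11. The Income of the Fund"),
]
index = {"Vesting": [5, 20], "Licence": [7, 15, 18]}

for word, title, page in rows_by_heading(index, headings):
    print(f"{word} … {title} … {page}")
```

Resolving the shared-page ambiguity would need the position of each word within the page, which a standard index does not record.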