Thanks for the explanation. That is a very different requirement to indexing every word in a document. In the case of a Names and Places index, it would make more sense to plan to index only words that start with a capital letter. This would remove a lot of the unwanted words immediately.
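As a rough illustration of the capital-letter filter, here is a minimal sketch in Python rather than a Word macro. The pattern, the function name `capitalised_words` and the sample sentence are all my own assumptions for the example, not anything built into Word.

```python
import re

# Assumed pattern: a capital letter followed by letters, apostrophes or hyphens.
CAPITALISED = re.compile(r"\b[A-Z][A-Za-z'-]*")

def capitalised_words(text):
    """Return the distinct capitalised words in text, sorted alphabetically."""
    return sorted(set(CAPITALISED.findall(text)))

# Made-up sample text for illustration only.
sample = "Alan Smith walked from Stratford-on-Avon to London. He met Richard the Lionheart."
print(capitalised_words(sample))
# ['Alan', 'He', 'Lionheart', 'London', 'Richard', 'Smith', 'Stratford-on-Avon']
```

Note that 'He' gets through because it starts a sentence, and 'Alan' and 'Smith' come out as two separate words, so the result is only a candidate list that still needs manual review.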
You would also need to do a lot of work to handle multiple words that together form a name, e.g.
'Alan Smith' should be indexed as "Smith, Alan" instead of two separate terms
'Rip Van Winkle' should be indexed as "Van Winkle, Rip"
'Stratford-on-Avon' should be indexed as a single term
'Stratford-upon-Avon' should use the same indexing term as the other variant
'Richard the Lionheart' should be indexed as a single term, etc.
The concordance file can handle the correct assignments, but the rules a macro would need in order to manage this automatically would be very complex.
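To show why a simple rule falls short, here is a hedged sketch in Python (not a Word macro). The names `CONCORDANCE`, `naive_invert` and `index_entry` are hypothetical, and treating 'Stratford-upon-Avon' as the preferred variant is just my assumption for the example.

```python
# Hypothetical hand-maintained lookup: text as it appears -> index entry.
CONCORDANCE = {
    "Alan Smith": "Smith, Alan",
    "Rip Van Winkle": "Van Winkle, Rip",
    "Stratford-on-Avon": "Stratford-upon-Avon",    # assumed preferred variant
    "Stratford-upon-Avon": "Stratford-upon-Avon",
    "Richard the Lionheart": "Richard the Lionheart",
}

def naive_invert(name):
    """Naive rule: move the last word to the front ("First Last" -> "Last, First")."""
    words = name.split()
    if len(words) < 2:
        return name
    return f"{words[-1]}, {' '.join(words[:-1])}"

def index_entry(name):
    """Prefer the hand-maintained table; fall back to the naive rule."""
    return CONCORDANCE.get(name, naive_invert(name))

print(naive_invert("Rip Van Winkle"))   # Winkle, Rip Van
print(index_entry("Rip Van Winkle"))    # Van Winkle, Rip
```

The naive rule works for 'Alan Smith' but gets 'Rip Van Winkle' and 'Richard the Lionheart' wrong, which is why the explicit table ends up doing the real work.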
Personally, I wouldn't be approaching this task from the direction that you are taking. The index you end up with would be meaningless unless you manually fix all the compound names to index them as a group.
I would think that you should be adding to your concordance file manually. Concordance files typically include all indexable terms, and may include many terms that don't actually appear in 'this' document but might appear in similar documents. You might be able to automate the initial population of a concordance file, e.g. by importing a comprehensive list from another source such as a postcode list or a family tree.
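As a minimal sketch of that initial population step, the Python below turns a list of names into tab-separated rows that could be pasted into Word and converted to the two-column table a concordance file uses (text to find on the left, index entry on the right). The `people` list and the function name `concordance_tsv` are made up for illustration; a real import would read from your own source.

```python
import csv
import io

# Hypothetical data, standing in for an export from a family tree.
people = [("Alan", "Smith"), ("Rip", "Van Winkle")]

def concordance_tsv(people):
    """Return tab-separated rows: text to find, then the index entry."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for forename, surname in people:
        writer.writerow([f"{forename} {surname}", f"{surname}, {forename}"])
    return buffer.getvalue()

print(concordance_tsv(people))
```

This only works cleanly because the source already stores forename and surname separately, so no guessing about where the surname starts is needed.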
__________________
Andrew Lockton
Chrysalis Design, Melbourne Australia