Quote:
Originally Posted by macropod
Why would you build a tool for that, when Word already has a built-in comment facility?
|
Let me briefly explain our goal so that the tool hopefully makes more sense. Imagine our goal is to catalog species-specific information in every one of our documents. A species is tied to a genus, which is tied to a family, then to an infraorder, order, class, phylum, and kingdom. So when a piece of text says that 'Dolomedes Tenebrosus only live in climates of 80 degrees and higher' we want to capture Animals->Arthropods->Arachnids->Spiders->True Spiders->Nursery Web Spiders->Fishing Spiders->Dolomedes Tenebrosus->'only live in climates of 80 degrees and higher.' We then assign an ID to that chain and can see which of our 40,000 other documents contain that exact same statement/chain. That way, when one day we find out that they can actually live in climates of 60 degrees and higher, instead of going through 40,000 documents manually looking for that statement, we have a library that tells us where each piece of this text is located.
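To make the ID/library idea a bit more concrete, here is a rough Python sketch. It is not our actual tool (that lives in Word); the names `tracking_id`, `Library`, `record`, and `find` and the document file names are made up for illustration, and hashing the chain plus the statement is only one assumed way an ID could be assigned.

```python
import hashlib
from collections import defaultdict

SEP = "->"


def tracking_id(chain, statement):
    """Derive a stable ID from the full hierarchy chain plus the statement text.

    Assumption: the same chain + statement should always map to the same ID.
    """
    key = SEP.join(chain) + SEP + statement.strip().lower()
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


class Library:
    """Maps each tracking ID to the set of documents that contain it."""

    def __init__(self):
        self._docs_by_id = defaultdict(set)

    def record(self, doc_name, chain, statement):
        tid = tracking_id(chain, statement)
        self._docs_by_id[tid].add(doc_name)
        return tid

    def find(self, chain, statement):
        tid = tracking_id(chain, statement)
        return sorted(self._docs_by_id.get(tid, set()))


chain = ["Animals", "Arthropods", "Arachnids", "Spiders", "True Spiders",
         "Nursery Web Spiders", "Fishing Spiders", "Dolomedes Tenebrosus"]
statement = "only live in climates of 80 degrees and higher"

library = Library()
# Hypothetical file names, purely for the example.
library.record("doc_00017.docx", chain, statement)
library.record("doc_31250.docx", chain, statement)

# Every document that contains this exact statement/chain.
print(library.find(chain, statement))
```

When the "80 degrees" statement turns out to be wrong, a single `find` call gives the list of documents to revisit instead of a manual search through all of them.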
With that being said, our team was looking at each piece of text in a Word document, then going into Excel and typing in all of those classes over and over again, even though any one document typically only discusses infraorder/family-level information and lower. So now I have it set up so that they stay in Word without going into Excel. When they first open the document, they declare the kingdom, phylum, class, and order that the document starts with via a user form. Keep in mind that the only thing likely to be explicitly stated in the document is the species text and maybe the species name. Everything higher than that still needs to be logged but is not explicitly stated, so it's up to the inputter to intelligently comprehend and label that hierarchy.
With everything up to order already established, they highlight the first piece of species-specific information they want to capture, hit alt-q, and a user form pops up. The UF understands that they want to record this information and that it is tied to Animals->Arthropods->Arachnids->Spiders, but it doesn't yet know what this piece of text is tied to for infraorder, family, genus, or species. So the UF displays infraorder options given that order = Spiders. They choose one, and it automatically updates to show family options for infraorder = True Spiders, and so on all the way down to finally choosing the species. At this point, now that the program understands that the text being captured belongs to Dolomedes Tenebrosus -> Fishing Spiders -> etc. all the way up to kingdom: Animals, it can now assign an ID to that piece of text. Then 8 comments pop up on the screen, one for each new change to the hierarchy the user has made. Now, more than likely the next piece of text will be related to another species that falls under family: Nursery Web Spiders, so when they highlight that piece of text and hit alt-e, it opens the same user form but only at the genus level. They choose Water Spiders for the genus and Medes Nocturnus for the species. This time, only three new comments show up on the screen: the genus change, the species change, and the text that they have highlighted.
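The "only comment on what changed" behaviour could be sketched roughly like this in Python. Again, this is an illustration and not the tool itself; `LEVELS` and `changed_levels` are hypothetical names, and the sketch only works out which hierarchy levels need a new comment (the comment for the highlighted text itself would be added separately).

```python
LEVELS = ["kingdom", "phylum", "class", "order",
          "infraorder", "family", "genus", "species"]


def changed_levels(previous, current):
    """Return (level, value) pairs from the first level that differs downward.

    Assumption: once a level changes, every level below it must be restated.
    `previous` is None when nothing has been captured yet.
    """
    for i in range(len(LEVELS)):
        if previous is None or previous[i] != current[i]:
            return list(zip(LEVELS[i:], current[i:]))
    return []


first = ["Animals", "Arthropods", "Arachnids", "Spiders", "True Spiders",
         "Nursery Web Spiders", "Fishing Spiders", "Dolomedes Tenebrosus"]
second = first[:6] + ["Water Spiders", "Medes Nocturnus"]

# First capture: nothing recorded yet, so every level is new.
print(len(changed_levels(None, first)))

# Second capture: same family, so only genus and species are restated.
print(changed_levels(first, second))
```

For the second capture this yields the genus and species changes, which together with the comment for the highlighted text gives the three comments described above.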
They then continue through the entire document without switching screens; they need only highlight the text they want to capture and use the arrow keys in the user form to quickly move in and out of the levels of the hierarchy they are working with. When they are finished with their 40-page document, they hit a button that exports everything to the main library, which leads to your next comment.
**Edit** to clarify: the tool I created uses Word's comment facility and inserts those comments into the document. It does the above, though, to intelligently add multiple comments that classify the highlighted text.
Thanks for this link. I have been able to create something very similar, and the export to Excel works well.
Quote:
Perhaps you could explain why you think you'd need to recreate the document. Sure, you'd need to update the workbook, but why recreate the document? Surely you'd simply add the comments to the existing document, rather than making another one?
|
I think by now you have figured out that we are not actually cataloging animal species and text attributed to them. We are working on something else that follows the same general address/tagging system. With that being said, there is an incredible amount of ambiguity within these documents, and no two documents are alike. Nothing is specified clearly, and it is up to the intelligence of the inputter to understand what something is and how to define it. You can imagine how much text is in a 40-page document; multiply that by 40,000 documents and you get an immense amount of text to index. When managers review the work and have questions about it, it will be much easier for them to load the document with all of its comments in place and clearly understand the line of thinking that generated the address for that piece of text. They can see the rest of the text surrounding that one piece of text and gain context.
Quote:
Instead of re-processing all documents, the only ones that would need re-processing are those with a save date later than the last date you updated the Excel workbook. Having identified those documents, it then becomes a matter of deleting any existing comments for them, then re-populating the workbook with the new details. Quite straightforward, really.
|
So hopefully my responses to your questions thus far have provided some additional details and scope on what we are trying to do. As I mentioned before, we already have 40k original approved documents, which do not change at all, saved to our intranet. If a change is made to one document, then we will have 40,001 documents. With that being said, if I want a manager to be able to pull up an inputter's document with all the comments on it, then I believe there are only two ways of doing it:
1) Pull the document from the intranet, put all the comments on it, run the extraction tool, and then save the document in its entirety to the local shared drive, with the comments naturally saved in place as well.
2) Pull the document from the intranet, put all the comments on it, run the extraction tool (this time the extraction tool also saves all the comments and their locations relative to that specific document in a text file), then delete the local version of the document with the comments loaded on it. Then, when a manager wants to look at the commented document, they click a link in Excel for that item. It pulls down the document from the intranet automatically, finds the text file associated with the comments for that document, reads all the comments and their locations from that text file, and loads all those comments into the document automatically.
Option 1 definitely works and is much simpler. However, option 1 means that we will be saving 40k documents in their entirety locally when we already have them saved on the intranet. This just seems quite redundant and a waste of space.
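For what it's worth, the "text file" half of option 2 could look something like the Python sketch below. This is only an illustration under a few assumptions: the location of a comment is stored as character offsets, the text file is JSON, and `Comment`, `save_comments`, `load_comments`, and the file name are all made-up names. Re-attaching the comments inside Word itself is not shown.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict


@dataclass
class Comment:
    start: int  # character offset where the highlighted text begins
    end: int    # character offset just past the highlighted text
    text: str   # the comment body


def save_comments(path, comments):
    """Write the comments and their locations to a small text (JSON) file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump([asdict(c) for c in comments], f, indent=2)


def load_comments(path):
    """Read the comments back so they can be re-attached to the document."""
    with open(path, encoding="utf-8") as f:
        return [Comment(**item) for item in json.load(f)]


# Stand-in for the text of a document pulled from the intranet.
document_text = "Dolomedes Tenebrosus only live in climates of 80 degrees and higher."
phrase = "only live in climates of 80 degrees and higher"
start = document_text.index(phrase)
comments = [Comment(start, start + len(phrase), "species: Dolomedes Tenebrosus")]

# Hypothetical file name; one such file would sit alongside each document's record.
sidecar = os.path.join(tempfile.mkdtemp(), "doc_00017.comments.json")
save_comments(sidecar, comments)

# Later: reload the file and locate each commented span in the original text.
restored = load_comments(sidecar)
for c in restored:
    print(repr(document_text[c.start:c.end]), "<-", c.text)
```

Stored offsets like these only stay valid because the originals on the intranet never change; if they could change, the file would need something sturdier than raw positions.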
----------------------
Ok, so I know that was a lot to take in, and I really appreciate your input thus far. I also appreciate what you do on this forum, considering it seems like you have single-handedly answered almost everyone's posts on here to date, so I definitely value your input. I'm looking forward to hearing back from you!
Cheers
-michael