Recreating a document with comments?

mrlemmer11 · #1 06-25-2015, 08:09 AM

Hello All,

So, my company has roughly 40,000 documents (some 1 page, some 100 pages long), all stored on the intranet.

Our team needs to go through all of these documents and obtain certain information from them. So we start by going to the intranet, pulling one form down and opening it.

I have already built a tool that, through keystrokes, applies different comments where someone indicates.

After they are done, all the comments and their associations are extracted into Excel.

My question is this: let's say a week later, someone decides they want to review/change some of those comments. How can I recreate that document with all the comments located in order? Obviously the user can just save the form with their comments on it. However, as stated above, we are talking about 40,000 documents, some 100 pages long, already stored on a server somewhere. I don't think it's practical to resave all of these comment-applied documents again and use up all that space.

My idea is to write the .scope / .range / .comments(x) / .author all to a text file. Then when someone wants to recreate the document with the comments on it, they pull down the original document, hit an applyComments macro and voilà. However, I can't figure out how to have the macro/VBA understand where to put the comments inside of the document. Order isn't a problem; the issue is just where in the document each comment goes. I thought about searching the document's text for a match to the .range of the comment; however, some of these comments won't have a range... so that won't work.

I thought about bookmarks... .comments(28) goes immediately after bookmark28... but I run into the same problem: how do the bookmarks know where to go?

As I'm writing this, I'm thinking that I may be able to count the characters in between each comment and build it that way. Since the original document is not allowed to be changed or edited without creating a new file for public release, I don't need to worry about a fresh copy of it having its characters change. So maybe something like: from start of document to .comment(1) = 309 characters, from .comment(1) to .comment(2) = 48 characters. Then when I write each comment's attributes to file, I also save how many characters are in between each one. On a fresh document download, on launch of the macro, it can use that info to populate the comments. I'm kinda thinking that is my best route as of right now.
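The character-counting idea can be sketched as follows. This is only a language-neutral illustration in Python, not the VBA macro itself, and it assumes each comment is treated as a single anchor point; the function names are made up for the example. It shows that the gap counts (309, 48, ...) are equivalent to absolute character offsets from the start of the document, which are usually simpler to store.

```python
from itertools import accumulate


def gaps_to_offsets(gaps):
    """Turn gap counts (start of document to comment 1, comment 1 to
    comment 2, ...) into absolute character offsets."""
    return list(accumulate(gaps))


def offsets_to_gaps(offsets):
    """Inverse: absolute offsets back to the gaps between anchors."""
    previous = [0] + offsets[:-1]
    return [cur - prev for prev, cur in zip(previous, offsets)]


# The figures from the post: 309 characters to the first comment,
# then 48 more characters to the second.
gaps = [309, 48]
offsets = gaps_to_offsets(gaps)
print(offsets)
```

In Word's object model, a comment's Scope is a Range with Start and End character positions, so storing those two numbers per comment would likely serve the same purpose as the gap counts. A comment with no selected text would then just have Start equal to End.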

----

So with that, I ask for your thoughts and opinions on how I can accomplish this. Confused or need more info? Reply and ask away and I shall answer. Thanks in advance, everyone!

macropod · #2 06-25-2015, 09:43 PM

Quote:

Originally Posted by mrlemmer11

I have already built a tool that, through keystrokes, applies different comments where someone indicates.

Why would you build a tool for that, when Word already has a built-in comment facility?

Quote:

Originally Posted by mrlemmer11

After they are done, all the comments and their associations are extracted into Excel.

You can get code to do this for Word's comments, at: http://answers.microsoft.com/en-us/o...3-3131ab68809c

Quote:

Originally Posted by mrlemmer11

My question is this: let's say a week later, someone decides they want to review/change some of those comments. How can I recreate that document with all the comments located in order?

Perhaps you could explain why you think you'd need to recreate the document. Sure, you'd need to update the workbook, but why recreate the document? Surely you'd simply add the comments to the existing document, rather than making another one?

Quote:

Originally Posted by mrlemmer11

Obviously the user can just save the form with their comments on it. However, as stated above, we are talking about 40,000 documents, some 100 pages long, already stored on a server somewhere. I don't think it's practical to resave all of these comment-applied documents again and use up all that space.

Instead of re-processing all documents, the only ones that would need re-processing are those with a save date later than the last date you updated the Excel workbook. Having identified those documents, it then becomes a matter of deleting any existing comments for them, then re-populating the workbook with the new details. Quite straightforward, really.
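The date comparison described above could look roughly like this. It is a minimal sketch in Python rather than VBA, and the file names and timestamps are invented for the example; in practice the save dates would come from the file system or the document properties.

```python
from datetime import datetime


def needs_reprocessing(saved_times, last_export):
    """Return the names of documents saved after the last workbook export."""
    return sorted(
        name for name, saved in saved_times.items() if saved > last_export
    )


# Hypothetical save dates for two documents.
saved_times = {
    "form_a.docx": datetime(2015, 6, 20, 9, 0),
    "form_b.docx": datetime(2015, 6, 26, 14, 30),
}
last_export = datetime(2015, 6, 25, 8, 0)

stale = needs_reprocessing(saved_times, last_export)
print(stale)
```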

mrlemmer11 · #3 06-26-2015, 11:19 AM

Quote:

Originally Posted by macropod

Why would you build a tool for that, when Word already has a built-in comment facility?

Let me briefly explain our goal so the tool hopefully makes more sense. Imagine our goal is to catalog 'Species' specific information in every one of our documents. Well, species are tied to Genus, which is tied to Family, to infraorder, to order, to class, to phylum, to kingdom. So when a piece of text says that 'Dolomedes Tenebrosus only live in climates of 80 degrees and higher' we want to capture Animals->Arthropods->Arachnids->Spiders->True Spiders->Nursery Web Spiders->Fishing Spiders->Dolomedes Tenebrosus->'only live in climates of 80 degrees and higher.' We then assign an ID to that tracking and can see in which of our 40,000 other documents that exact same statement/chain is located. That way, when one day we find out that they can actually live in climates of 60 degrees and higher, instead of going through 40,000 documents manually looking for that statement, we have a library built that tells us where each piece of this text is located.
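To make the library idea concrete, here is a rough Python sketch of such an index. The class name, method names and document names are all hypothetical, and the real library lives in Excel; the point is only that the same chain plus text always maps to the same ID, and each ID remembers which documents contain it.

```python
from collections import defaultdict


class StatementIndex:
    """Maps a (hierarchy chain, text) pair to one ID and tracks
    which documents that statement appears in."""

    def __init__(self):
        self._ids = {}                      # (chain, text) -> ID
        self._locations = defaultdict(set)  # ID -> document names

    def record(self, chain, text, document):
        key = (tuple(chain), text)
        # Reuse the existing ID if this statement was seen before.
        statement_id = self._ids.setdefault(key, len(self._ids) + 1)
        self._locations[statement_id].add(document)
        return statement_id

    def documents_for(self, statement_id):
        return sorted(self._locations.get(statement_id, ()))


chain = ["Animals", "Arthropods", "Arachnids", "Spiders", "True Spiders",
         "Nursery Web Spiders", "Fishing Spiders", "Dolomedes Tenebrosus"]
text = "only live in climates of 80 degrees and higher"

index = StatementIndex()
first = index.record(chain, text, "doc_0001.docx")   # hypothetical names
second = index.record(chain, text, "doc_0417.docx")
print(first, second, index.documents_for(first))
```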

With that being said, our team had been looking at each piece of text from a Word document, and then going into Excel and typing in all of those classes over and over again, even though, more than likely, that one specific document typically only discusses Infraorder/Family related information and lower. So now I have it set up so that they stay in Word without going into Excel. When they first open the document, they declare the kingdom, phylum, class, and order that the document starts with via a user form. Now, keep in mind, the only thing that is likely fully declared in the document is the species text and maybe the species name. Everything higher than that still needs to be logged, but is not specifically declared, so it's up to the inputter to intelligently comprehend and label that hierarchy.

With everything up to Order already established, they highlight the first piece of species-specific information they want to capture, hit alt-q, and a user form pops up. The UF understands they want to record this information and that it is tied to Animals->Arthropods->Arachnids->Spiders, but it doesn't yet know what this piece of text is tied to for infraorder, family, genus or even species. So the UF displays infraorder options, considering that order = spiders. They choose, and it automatically updates to show Family options for infraorder = True Spiders, all the way down to finally choosing the species. At this point, now that the program understands that the text being captured belongs to Dolomedes Tenebrosus -> Fishing Spider -> etc. all the way up to kingdom:animals, it can now assign an ID to that piece of text. Then 8 comments pop up on the screen, one for each new change to the hierarchy the user has made. Now more than likely the next piece of text will be related to another species which falls under Family:Nursery Web spiders... so when they highlight that piece of text and hit alt-e, it opens the same user form but only at the Genus level; they choose Water Spiders for the Genus, and Medes Nocturnus for the species. This time, only three new comments show up on the screen: the genus change, the species change and the text that they have highlighted.
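The "only comment on what changed" behaviour can be sketched like this. It is a simplified Python model, not the actual tool: the function name and comment wording are invented, and it only reproduces the second capture described above (genus change, species change, highlighted text), not the exact count for the first one.

```python
LEVELS = ["kingdom", "phylum", "class", "order",
          "infraorder", "family", "genus", "species"]


def comments_for_capture(previous, current, text):
    """One comment per hierarchy level whose value changed since the
    previous capture, followed by one comment for the highlighted text."""
    changed = [
        f"{level}: {current[level]}"
        for level in LEVELS
        if previous.get(level) != current[level]
    ]
    return changed + [f"text: {text}"]


previous = {
    "kingdom": "Animals", "phylum": "Arthropods", "class": "Arachnids",
    "order": "Spiders", "infraorder": "True Spiders",
    "family": "Nursery Web Spiders", "genus": "Fishing Spiders",
    "species": "Dolomedes Tenebrosus",
}
# Second capture: same family, different genus and species.
current = dict(previous, genus="Water Spiders", species="Medes Nocturnus")

new_comments = comments_for_capture(previous, current, "highlighted text")
print(new_comments)
```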

They then continue through the entire document without switching screens and need only to highlight the text they want to capture and use arrow keys in the user form to quickly move in and out of the classes of hierarchy they are working with. When they are finished with their 40 page document, they hit a button which exports this to the main library, which leads to your next comment.

**Edit** to clarify: the tool I created utilizes Word's comment facility and inputs those comments on the document. The tool does the above, though, to intelligently add multiple comments to classify the highlighted text.

Quote:

You can get code to do this for Word's comments, at: http://answers.microsoft.com/en-us/o...3-3131ab68809c

Thanks for this link. I have been able to create something very similar, and the export to Excel works well.

Quote:

Perhaps you could explain why you think you'd need to recreate the document. Sure, you'd need to update the workbook, but why recreate the document? Surely you'd simply add the comments to the existing document, rather than making another one?

I think by now you have figured out that we are not actually cataloging animal species and text attributed to them. We are working on something else that follows that general address/tagging system. With that being said, there is an incredible amount of ambiguity within these documents, and no two documents are alike. Nothing is specified clearly, and it is up to the intelligence of the inputter to understand what and how to define something. You can imagine how the amount of text in a 40 page document, times 40,000 documents, leads to an immense amount of indexing. When managers review the work and have questions about it, it will be much easier for them to load the document with all the comments on it, so they can clearly understand the line of thinking that generated the address for that piece of text. They can see the rest of the text surrounding that one piece of text and gain context.

Quote:

Instead of re-processing all documents, the only ones that would need re-processing are those with a save date later than the last date you updated the Excel workbook. Having identified those documents, it then becomes a matter of deleting any existing comments for them, then re-populating the workbook with the new details. Quite straightforward, really.

So hopefully my responses to your questions thus far have provided some additional details and scope on what we are trying to do. As I mentioned before, we already have 40k original approved documents that do not change at all saved to our intranet. If a change is made to one document, then we will have 40,001 documents. With that being said, if I want a manager to be able to pull up an inputter's document with all the comments on it, then I believe there are only two ways of doing it:
1) pull a document from the intranet, put all the comments on it, run the extraction tool, and then save the document in its entirety to the local shared drive, with the comments naturally saved in place as well.
2) pull the document from the intranet, put all the comments on it, run the extraction tool (this time the extraction tool also saves all the comments and their locations relative to that specific document in a text file), then delete the local version of the document with the comments loaded on it. Then when a manager wants to look at the comment-formatted document, they click a link in Excel for that item; it pulls down the document from the intranet automatically, finds the text file associated with the comments for that document, reads all the comments and their locations from that text file, and loads all those comments into the document automatically.
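Option 2 could be modelled roughly as below. This is a Python sketch of the save/restore round trip only, under the assumption that each comment is stored with a start and end character position; the JSON layout, field names, file name and sample comments are all invented for illustration, and plain text stands in for the Word document.

```python
import json
import os
import tempfile


def save_sidecar(path, comments):
    """Write the comment records for one document to a small text file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(comments, handle, indent=2)


def load_sidecar(path):
    """Read the comment records back."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def scoped_text(document_text, comment):
    """The text a comment is attached to; empty if nothing was selected."""
    return document_text[comment["start"]:comment["end"]]


document_text = ("Dolomedes Tenebrosus only live in climates of "
                 "80 degrees and higher.")
comments = [
    {"start": 0, "end": 20, "author": "inputter1",
     "text": "species: Dolomedes Tenebrosus"},
    {"start": 21, "end": 21, "author": "inputter1",
     "text": "note with no selected text"},
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "form_a.comments.json")
    save_sidecar(path, comments)
    restored = load_sidecar(path)

print(scoped_text(document_text, restored[0]))
```

Because the original documents never change, the stored positions should stay valid for any fresh download. In the real macro, each restored record would presumably be turned back into a range and passed to Word's comment-adding call.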

Option 1 definitely works and is much simpler. However, option 1 means that we will be saving 40k documents in their entirety locally when we already have them saved on the intranet. This just seems quite redundant and a waste of space.

----------------------

Ok, so I know that was a lot to take in, and I really appreciate your input thus far. I also appreciate what you do on this forum, considering it seems like you have single-handedly answered almost everyone's post on here to date, so I definitely value your input. I'm looking forward to hearing back from you!

Cheers
-michael

macropod · #4 06-29-2015, 05:04 AM

If you have the comments, plus the inputters' details and the file names & paths in the Excel workbook, and you want to find a given set of comments by a given inputter, surely you'd only need to go through the workbook looking for matches between the comments you're looking for and the inputter's details? Only then, if you need to see the context, would you need to open any files, and then only those with the matching comments and inputter. I can't see any reason to re-process all 40k files just to do that.
