I am trying to clean up text in Word before pasting into an HTML editor, and I keep getting what I believe are remnants of cross-reference information.
Attached is a sample document containing the text "a1 b2", where the numbers are superscript.
Notice how the superscript 2 picks up extra markup once it is pasted into the HTML editor. I don't know where to find this information in Word so that I can remove it.
Here's what I'm seeing when the text is pasted into HTML:
Code:
<p>a1 b<a name="_Ref111185707">2</a></p>
How can I remove this _Ref... bookmark information in Word before copying the text?
I was able to see the problem after extracting the .docx file and looking at word/document.xml, where the bookmark and _Ref information is apparent. I'm attaching a screenshot of this file with the problem area highlighted.
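
In case it helps anyone reproduce this, here is a minimal Python sketch of the same inspection without unzipping by hand. It only lists the _Ref bookmark names found in word/document.xml and does not modify the file. The function name find_ref_bookmarks is my own, and the pattern assumes the bookmark is stored as a w:bookmarkStart element whose w:name starts with _Ref followed by digits, which is what I see in my sample.

```python
import re
import zipfile

# Assumption: the bookmark appears as <w:bookmarkStart ... w:name="_Ref<digits>" .../>
REF_BOOKMARK = re.compile(r'<w:bookmarkStart\b[^>]*\bw:name="(_Ref\d+)"[^>]*>')


def find_ref_bookmarks(docx):
    """Return the names of all _Ref bookmarks in word/document.xml.

    `docx` may be a path to a .docx file or a binary file-like object.
    """
    # A .docx file is a ZIP archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(docx) as archive:
        xml = archive.read("word/document.xml").decode("utf-8")
    return REF_BOOKMARK.findall(xml)
```

Calling find_ref_bookmarks("sample.docx") (the file name is a placeholder for the attached document) should return a list of the bookmark names, so for my sample I would expect it to include _Ref111185707.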