#1
Can't Read .htm Files
I am saving a web page as .htm (HTML only) and attempting to read it in Word 2007. I used to be able to read it by right-clicking the file and choosing to open it with Word. At that point, I could extract information using VBA. Today, when I try this, Word opens and tells me the document has 0 characters. If I then save it from Word as a web page, the saved file is still substantial in size; moreover, I can open the saved file in IE and it shows a lot of content. Evidently the information is still present in Word even though it refuses to display it.

Can anyone explain what's going on? How can I get at the content with VBA?

I've observed that this does not occur with a different web site; there I can follow the same steps and see the page when I open it in Word. Using a hex editor, I've compared a file that opens successfully in Word with one that doesn't. At least the first 200 characters seem to match, though I haven't checked them all.
#2
There are a lot of situations where Word is unable to display the content of a web page. However, I can't say I've ever seen that message (has 0 characters) before.
As a workaround, you can try selecting and copying the web page while it is displayed in your web browser, then pasting it into a blank Word document. You could also try saving the web page as a text file (with a .txt extension) and then opening it in Word. Your existing VBA code is not likely to work, though.
#3
I've finally figured out what the problem is by examining the structure of the HTML code. The code that Word will not read has multiple <html> and <body> tags, improperly nested and terminated. The code that works is not perfect, either: it contains two closing tags for both <html> and <body>. I guess I didn't realize just how sloppy a site's code can be and still be displayed adequately. Apparently Word's rendering of HTML is less tolerant of errors than IE's. I've been able to get Word to read these files by rewriting them to eliminate the offending multiple tags. Thanks for all your comments and suggestions.
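For anyone hitting the same problem, here is a rough sketch of the kind of rewrite described above. It is written in Python rather than VBA, and it is not the exact procedure used in this thread; it only illustrates one way to do it. The function names are made up for this example. It assumes the stray tags are plain <html> and <body> tags, keeps the first opening tag and the last closing tag of each kind, and drops the rest. Being regex-based, it does not know about comments or scripts, so a tag that appears inside one of those would be counted too.

```python
import re

# Matches opening or closing <html>/<body> tags, with or without attributes.
_TAG = re.compile(r"<\s*(/?)\s*(html|body)\b[^>]*>", re.IGNORECASE)


def count_structural_tags(markup):
    """Count <html>, </html>, <body> and </body> tags (a quick diagnostic)."""
    counts = {"html": 0, "/html": 0, "body": 0, "/body": 0}
    for match in _TAG.finditer(markup):
        counts[match.group(1) + match.group(2).lower()] += 1
    return counts


def strip_duplicate_structural_tags(markup):
    """Keep the first opening and last closing <html>/<body> tag; drop the others."""
    matches = list(_TAG.finditer(markup))
    keep = set()  # start offsets of the tags that should survive
    for name in ("html", "body"):
        opens = [m for m in matches
                 if m.group(2).lower() == name and not m.group(1)]
        closes = [m for m in matches
                  if m.group(2).lower() == name and m.group(1)]
        if opens:
            keep.add(opens[0].start())
        if closes:
            keep.add(closes[-1].start())
    # Replace every matched tag that is not in the keep set with nothing.
    return _TAG.sub(lambda m: m.group(0) if m.start() in keep else "", markup)
```

To use it, read the saved .htm file into a string, check `count_structural_tags` to see whether any count is above 1, pass the string through `strip_duplicate_structural_tags`, and write the result to a new file before opening that file in Word. Whether Word then accepts the file depends on what else is wrong with the page, so treat this as a starting point.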