Microsoft Office Forums

  #1  
Old 11-06-2023, 06:58 PM
Xavier (Windows XP, Office 2016)
Advanced Beginner
Join Date: Jul 2023
Posts: 52
Saved file as Webpage, filtered, and now can't open original file as RTF

I am working on a very large document (a court transcript) which I hope to post online.

It has been in a Word document, but because it was scanned in, I had to make numerous changes to clean up the OCR.

I have done that (which has taken ages) and saved it. Then, because I wanted to see what it would look like as a Word-created webpage, I saved it as a webpage (under a different filename) in a different folder. It looked pretty crappy, as all the bold text, page breaks, and other formatting seem to have been stripped out of it.



I figured this was no big deal, because I could open up the previous saved version, before I saved it as HTML, and just try a different way.

But when I open up the other versions, even in RTF mode, they still look like the webpage version, which is really frustrating.

Does anyone have any ideas to fix this?
  #2  
Old 11-07-2023, 04:30 AM
Charles Kenyon (Windows 11, Office 2021)
Moderator
Join Date: Mar 2012
Location: Sun Prairie, Wisconsin
Posts: 9,140

Word is most comfortable working in its own file format.

When you save as a filtered web page you shrink the size of the web page by stripping out most of Word's document structure.
If you re-open that webpage in Word, it is starting from scratch and acts like it is a converted file, not a Word file.

Word is not designed to be a webpage creator/editor.
It is a word processor.

PDF files can be edited in Word, sort of…
How was the file created originally, and by which program? It could have been created from a scan or from a picture taken with a phone camera. Those are pictures of words saved as PDFs. It is like having a picture of a car: you can see the car in the picture, but you can't change the timing of the engine in that picture. Likewise, you can't change the order of the text or otherwise edit it when all you have is a picture of text. Word can open such a file, but it can't edit it. You have a Word file that contains a picture of text rather than text.

In that case, you need to convert the picture to text. This process is known as optical character recognition (OCR). It is built into Adobe Acrobat (but not the free Acrobat Reader) and is also in Office OneNote. Most scanner software comes with an OCR component as well.
See: How to OCR a PDF in OneNote
Once converted into text, it can be edited in Word, but there will still be formatting anomalies.

If you simply want to write on the document (but not in it) you can add a Text Box floating on top of the document layer, whether or not it has been put through the OCR process.

It looks like you have gone through the OCR process and have text rather than a picture.

Web pages or Word documents that have been saved as PDF will not need the OCR process; they retain their text, although not all of their Word structure and formatting. Documents created as PDF from other programs will likely be even more problematic.

Finally, documents converted from PDF (or really any other format) to Word can be tough to edit because the conversion process never has a one-to-one matching of how formatting is done under the hood. This means that a converted document will seldom be formatted in a way that uses Word's features well. One example is multiple section breaks used to change margins, where in Word you would simply change the paragraph indent (see: Margins and Indents in Word). Another example is that text formatting in Word is best done using Styles, and a converted document will not use them; it will all be direct formatting. That can make a huge difference in how easy it is to edit (see: The Importance of Styles in Microsoft Word).

If possible, find the file from which the PDF was created and edit that file, using the program that created it. Then, if you need it in Word format and it is not, convert it directly to Word. This will cut out one conversion process and make for fewer editing problems.

When I really need the document in Word format and intend to do much editing, I create a new Word file and paste the content into it as plain text. Then I format it to match the original using Styles for the formatting as much as possible. This takes time; for me, it is worth it and saves a lot of frustration.

==============================================================================
Cross-posted at: Changed format to Webpage, how to change it back? - Microsoft Community Hub
For cross-posting etiquette, please read: A Message to Forum Cross-Posters
  #3  
Old 11-07-2023, 06:45 PM
Xavier (Windows XP, Office 2016)
Advanced Beginner

Thanks for your considered response, which I found interesting.

It didn't quite answer my question, though. In case anyone in the future has a similar issue: I discovered that I needed to change the view back to Print Layout rather than Web Layout (accessible in the View tab). Because I was in Web Layout when working on the webpage version, Word retained that setting whenever I opened another Word document.

In relation to your other comments, yes, I would love to have had the original PDFs, but unfortunately I am dealing with more than 9,000 pages (four times the size of a Bible) of typed content from the 1980s, so not much chance of that.

I've been saving it as an RTF file, but the next step when I finish checking all the OCR will be to try and find a good way to convert it all to HTML that retains most of the Word formatting. I've been using Convertio but I'm sure there are better options around.
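
For the HTML step mentioned above, here is a minimal Python sketch of one possible starting point. It assumes the checked transcript has been exported from Word as plain text with a blank line between paragraphs, which is an assumption and not something stated in this thread. The function name `paragraphs_to_html` is hypothetical. It keeps only paragraph breaks and does not preserve bold, page breaks, or any other Word formatting, so treat it as a baseline to build on rather than a finished converter.

```python
import html


def paragraphs_to_html(text: str, title: str = "Transcript") -> str:
    """Wrap blank-line-separated paragraphs in <p> tags inside a minimal HTML page.

    Hypothetical helper: assumes plain-text input, keeps paragraph breaks only.
    """
    # Normalise Windows and old Mac line endings so the split below is consistent.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")

    body = []
    for block in normalised.split("\n\n"):
        block = block.strip()
        if not block:
            continue  # skip empty blocks left by runs of blank lines
        # Collapse internal line wraps to single spaces, then escape &, <, >
        # so transcript text is never interpreted as markup.
        escaped = html.escape(" ".join(block.split()))
        body.append(f"<p>{escaped}</p>")

    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>\n"
        "<body>\n" + "\n".join(body) + "\n</body></html>\n"
    )
```

Reading the exported text file, passing its contents to the function, and writing the result to an `.html` file would give a page that browsers can display with plain, unstyled paragraphs.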
  #4  
Old 11-07-2023, 08:10 PM
Charles Kenyon (Windows 11, Office 2021)
Moderator

Just keep the Word file as a backup in case you want to edit it in the future.
PDF and RTF files do not retain all of Word's document structure.
All times are GMT -7.

MSOfficeForums.com is not affiliated with Microsoft