04-04-2024, 10:09 PM
DruidCtba (Windows 11, Office 2021)
Novice
 
Join Date: Apr 2024
Posts: 4

One thing I noticed, my friend Charles Kenyon: it seems the text got mixed up with the original structure of the image-based PDF after I ran OCR in ABBYY FineReader and saved the result as a Word document. In Word, near those "black smudges next to some pages," a drag icon appears, and if I press DEL at that point, it deletes the whole drawing, leaving only the text.

What I thought, and sorry if I'm wrong, was this: in Word the text is in Times New Roman, which isn't bad, so here I would solve the problem of the pages with two columns. I'm not sure "columns" is the right term. I mean that some pages are double and others are single. Then someone would tell me how to change these double pages into single pages, as that alone would be a step forward in fixing this ebook.

The ebook came from the internet. I assume someone scanned the entire book as images and decided to make it available like that, which is why I used ABBYY FineReader, as I find its OCR one of the best I've dealt with. Its screen has three areas: on the left, all the original pages as images; on the right, the OCR result, which turned out very well; and below, a larger, zoomed-in pane, though I'm not sure which part it shows.

I thought that saving the document as a new PDF after OCR would give me a plain text PDF. Unfortunately not: it is a text PDF, but with the same structure as the characters in the image. I swear I'm still wondering why on earth a program would do this. I believed it would save only the text, like a book you buy online, or at least give me that option when saving the OCR'd PDF, but I couldn't find such an option in the program.

But thank you very much for your response. It clarified a lot of what I already suspected, though I never thought that I wouldn't be able to change the double pages into single pages, or that saving to Word would somehow create a pattern that Word itself couldn't handle. In reality, they are double pages that I can't edit.

Regards,

José Roberto Chaurais.

Last edited by DruidCtba; 04-04-2024 at 10:10 PM. Reason: translation from Portuguese to English