#1
Indention below paragraph mark...
Hello friends,
I am running OCR on a scanned document, and the indention and paragraphs must match the original, but I am having trouble achieving this. After OCR, when a paragraph ends with a line that is shorter than the standard (full) line length, I need an indention below it, but I cannot work out how to do that. Could you help with VBA or some other way of accomplishing this?
#2
See the Cleaning up Text Pasted from Websites, E-mails, PDFs etc. 'Sticky' thread at the top of the Word forum: https://www.msofficeforums.com/word/...-pdfs-etc.html
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#3
Thanks, but it does not work, as my Word document does not come from a PDF or a website. It comes from Tesseract OCR. This OCR is quite good at recognizing characters, but it makes some errors.
So, I need a new layout (indention) for a paragraph when the last line of the previous paragraph is not full length. I also need to delete paragraph marks only for the last paragraph on each page (where one page ends and the next starts). Thanks in advance...
#4
The source is irrelevant. In your attachment, every line has been rendered as a separate paragraph - which is exactly the same as sometimes happens with data extracted from PDFs via OCR. As long as your document remains formatted that way, you will not be able to get the layout you desire.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#5
Could this be done by character counting, so that if a line has fewer than, for example, 55 characters, a new paragraph indention is made?
#6
You could use a wildcard Find/Replace to do something like that, inserting an empty paragraph after every line containing less than 56 characters. For example:
Find = ^13[!^13]{1,55}^13
Replace = ^&^p
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
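For anyone who wants to try the same idea outside Word, here is a minimal Python sketch of the rule described in post #6, assuming the OCR output has been saved as plain text with one OCR line per text line. The function name `add_blank_after_short_lines` and the `max_len` parameter are made up for illustration and are not part of Word or Tesseract.

```python
def add_blank_after_short_lines(text: str, max_len: int = 55) -> str:
    """Insert an empty line after every non-empty line of at most max_len characters.

    Assumption: each OCR line is one line of plain text, separated by "\n".
    """
    out = []
    for line in text.split("\n"):
        out.append(line)
        # A line of 1..max_len characters is treated as the short last
        # line of a paragraph, so an empty line is added below it.
        if 1 <= len(line) <= max_len:
            out.append("")
    return "\n".join(out)


if __name__ == "__main__":
    sample = "x" * 60 + "\nlast short line\n" + "y" * 60
    print(add_blank_after_short_lines(sample))
```

Note that this sketch checks every line on its own, so its result may differ from a single Replace All pass in Word, where the Find pattern also needs a paragraph mark in front of the line.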
#7
And how can I implement "fewer than 56 characters and ends with '.'"?
Is it possible?
#8
How would that be relevant? Paragraphs never end with '.' - they only ever end with paragraph breaks.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#9
Not the paragraph, but the last word (the line is at most 55 characters long and its last character is ".").
#10
And what about sentences that have less than 56 characters? Or sentences with abbreviations ending in '.' but those abbreviations don't end the sentence? You really haven't explained what you're trying to achieve.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#11
What I need: if a sentence has fewer than 56 characters and its last character is ".", insert an empty paragraph below it. Otherwise, I have another Find/Replace (Find = ([a-z])^13; Replace = \1) and I will use that.
#12
There is no reliable way for Find/Replace (or VBA) to know how long a sentence is.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#13
When I OCR some PDFs and the source is of good quality, each new paragraph (indention, layout) starts after an empty paragraph, and for those cases I know how to use Find/Replace. But sometimes lines (paragraphs) are not separated from the paragraphs above or below, so the only way I can tell them apart is by the lines that are not full length (fewer than 56 characters and ending with "."). I therefore need some Find/Replace code for that purpose. Is it possible?
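The condition described here (a line shorter than 56 characters that also ends with ".") can be sketched in Python, again assuming the OCR text is available as plain text with one OCR line per text line. The function name `add_blank_after_final_short_lines` and its `max_len` parameter are hypothetical, and the sketch does not address the caveat raised in post #10: a short line ending in an abbreviation such as "etc." would also be treated as the end of a paragraph.

```python
def add_blank_after_final_short_lines(text: str, max_len: int = 55) -> str:
    """Insert an empty line after lines that are short AND end with a period.

    Assumption: each OCR line is one line of plain text, separated by "\n".
    """
    out = []
    for line in text.split("\n"):
        out.append(line)
        stripped = line.rstrip()  # ignore trailing spaces left by the OCR
        # Short line ending in "." is taken as the last line of a paragraph.
        if 1 <= len(stripped) <= max_len and stripped.endswith("."):
            out.append("")
    return "\n".join(out)
```

Short lines that do not end with "." are left alone, so they can still be handled by a separate line-joining step.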
#14
I already gave you the Find/Replace code for lines less than 56 characters. Use it.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]