#1
How do I sort text that isn’t in table or list form, or extract specific text from paragraphs?

I have been tasked with extracting the names of books and authors from a huge Word document that is in paragraph format, not a list or a table. The document was made by copying and pasting text from a website into Word. It consists of paragraphs of short summaries of books and stories by many different authors. I don’t need the descriptions of the works, just the names of the books and the authors. The plan is to create an index showing users where on the website a specific work may be found, and to identify the authors whose work the site contains. Is there a way to do this other than by manually copying and pasting the pertinent sections? Many thanks.
#2
If the file has been prepared correctly, there will be separate styles for book names and authors … even without that, there might be some recognisable structure. Can you post a few sample entries?
#3
Text pasted from web pages, imported, or scanned often arrives with messy formatting. Web pages can carry styles, though, and some of those may survive the paste.
If I were the one constructing this document, I would have formatted the names using styles. Click on a book name or author and press Shift+F1 to display the applied formatting and styles. At the bottom of the pane, click on the option to show the style source. If there is a distinctive (character) style for these items, it will show at the top of the pane. If there is such a style, you can search for it, and Find will select the next piece of text formatted using that style. You can then record a macro that copies that text and finds the next occurrence.
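If a character style does turn out to be applied, the same "find everything in this style" idea can also be tried outside Word. Below is a minimal sketch in Python, not a tested recipe for this particular document. It relies on a .docx file being a zip archive whose main text lives in `word/document.xml`. The style IDs `BookTitle` and `Author`, the sample names, and the function names are made up for illustration. Note that the ID stored in the file can differ from the style name Word shows, and that Word may split one name across several runs, so the output may need joining by hand.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def text_in_style(document_xml, style_id):
    """Return the text of every run whose character style ID equals style_id."""
    root = ET.fromstring(document_xml)
    hits = []
    for run in root.iter(f"{{{W}}}r"):
        style = run.find("w:rPr/w:rStyle", NS)
        if style is not None and style.get(f"{{{W}}}val") == style_id:
            # A run can hold several <w:t> pieces; join them in order.
            hits.append("".join(t.text or "" for t in run.findall("w:t", NS)))
    return hits


def text_in_style_from_docx(path, style_id):
    """Read word/document.xml out of a .docx file and scan it."""
    with zipfile.ZipFile(path) as archive:
        return text_in_style(archive.read("word/document.xml"), style_id)


# Tiny hand-written sample with hypothetical style IDs, standing in for a real file.
sample = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    '<w:r><w:rPr><w:rStyle w:val="BookTitle"/></w:rPr><w:t>Night Trains</w:t></w:r>'
    '<w:r><w:t xml:space="preserve"> by </w:t></w:r>'
    '<w:r><w:rPr><w:rStyle w:val="Author"/></w:rPr><w:t>John Roe</w:t></w:r>'
    '</w:p></w:body></w:document>'
)
print(text_in_style(sample, "BookTitle"))
print(text_in_style(sample, "Author"))
```

For a real file, `text_in_style_from_docx("books.docx", "BookTitle")` would be the call to try, with the path and style ID swapped for whatever the document actually uses.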
#4
I was wondering whether any textual consistency, such as {name of author}. {name of book}, might, just perhaps, make the content amenable to grep and a text editor. The OP could then add style tags, move the content back to Word, and use search/replace to apply styles.
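To make the grep idea concrete, here is a rough sketch in Python. It assumes, purely for illustration, that each paragraph of the document (saved as plain text) starts with "{name of author}. {name of book}." followed by the summary. The sample entries, the pattern, and the function name are invented, since no real entries have been posted yet. An author written with initials (such as "J. Smith") would break this pattern, and any ordinary paragraph that happens to open with two short sentences would be picked up by mistake, so the results would still need checking by eye.

```python
import re

# Assumed (hypothetical) layout: "<author>. <title>. <summary ...>"
ENTRY = re.compile(r"^\s*(?P<author>[^.]+?)\.\s+(?P<title>[^.]+?)\.(?:\s|$)")


def extract_entries(text):
    """Return (author, title) pairs from paragraphs that fit the assumed layout."""
    entries = []
    for paragraph in text.splitlines():
        match = ENTRY.match(paragraph)
        if match:
            entries.append(
                (match.group("author").strip(), match.group("title").strip())
            )
    return entries


# Invented sample text; the middle paragraph does not fit the layout.
sample = (
    "Jane Doe. The Quiet Harbour. A short novel about a lighthouse keeper.\n"
    "This paragraph has no full stop so it is skipped\n"
    "John Roe. Night Trains. Linked stories set on sleeper services.\n"
)

# Tab-separated output pastes straight into a Word table or a spreadsheet.
for author, title in extract_entries(sample):
    print(f"{author}\t{title}")
```

Once a few real entries are available, the pattern could be adjusted to whatever separator the site actually uses.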
#5
And hope the style is not also used for other (non-author) text...
Using Word since 1986 - ah, Word 3. Those were the days.