#1
|
|||
|
|||
Extract Between the Parens
Attached is SCHEMA content sample and an example of the needed output.
Need the ELEMENT NAME extracted out and the associated TYPE if one is present. The word file also includes the other 2 mods I used to get content shaved down to this point.. If there's a better, faster way to accomplish this with one step rather than now being on the 3rd step - I'm open to any edits or chunking altogether if needed.. The extracted output can go into a Word table (inside or outside of the source) or thrown into a new Excel file. Thanks! |
#2
|
||||
|
||||
Quote:
Try: Code:
Sub Demo() Application.ScreenUpdating = False Dim MyData As DataObject With ActiveDocument.Range .Font.Hidden = True With .Find .ClearFormatting .Replacement.ClearFormatting .Format = True .Font.Hidden = True .Replacement.Font.Hidden = False .Forward = True .MatchWildcards = True .Wrap = wdFindContinue .Text = "\<xs:element name[!\>]@\>" .Replacement.Text = "^&" .Execute Replace:=wdReplaceAll .Format = False .Text = "m[ai][nx]Occurs[!\>]@\>" .Replacement.Text = ">" .Execute Replace:=wdReplaceAll .Text = "\<xs:element name=([!\>]@)type=([!\>]@)\>" .Replacement.Text = "\1^t\2^p" .Execute Replace:=wdReplaceAll .Text = "\<xs:element name=([!\>]@)\>" .Replacement.Text = "\1^t^p" .Execute Replace:=wdReplaceAll .Text = " ([^13^t])" .Replacement.Text = "\1" .Execute Replace:=wdReplaceAll .Text = Chr(34) .Replacement.Text = "" .Execute Replace:=wdReplaceAll End With .Copy Set MyData = New DataObject MyData.GetFromClipboard .Text = vbNullString .Text = MyData.GetText With .Find .Text = "^13[ ^t]*^13" .Replacement.Text = "^p" .Execute Replace:=wdReplaceAll .Text = "[^13]{2,}" .Replacement.Text = "^p" .Execute Replace:=wdReplaceAll End With If .Characters.First <> vbCr Then .InsertBefore vbCr .InsertBefore "NAME" & vbTab & "TYPE" End With ActiveDocument.Range.ConvertToTable vbTab, 2 Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#3
|
|||
|
|||
Quote:
Forgive me, it's been a long time, I've forgotten: Where I go to add a refc to the MS Forms Object? I was thinking it was under File>Options>?? and there was a long list of items that could be check-marked from a list... (what I meant by the "2 mods" was that the example file that I had attached included 2 modules visible in the VBA side, that showed how I had come the current snip of content) |
#4
|
||||
|
||||
In the VBE - Tools|References.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#5
|
|||
|
|||
Ah! Found my checkbox list -- but not finding it in the list for some reason -- looked in both the MS area and Microsoft (see attachment)
Any ideas? |
#6
|
|||
|
|||
Here's the ones I DO have turned on from previous projects (I moved them to the top a long time ago so I could quickly see what was active) (see attached)
|
#7
|
||||
|
||||
Not sure why it's missing. Try:
Code:
Sub Demo() Application.ScreenUpdating = False With ActiveDocument.Range .Font.Hidden = True With .Find .ClearFormatting .Replacement.ClearFormatting .Format = True .Font.Hidden = True .Replacement.Font.Hidden = False .Forward = True .MatchWildcards = True .Wrap = wdFindContinue .Text = "\<xs:element name[!\>]@\>" .Replacement.Text = "^&" .Execute Replace:=wdReplaceAll .Format = False .Text = "m[ai][nx]Occurs[!\>]@\>" .Replacement.Text = ">" .Execute Replace:=wdReplaceAll .Text = "\<xs:element name=([!\>]@)type=([!\>]@)\>" .Replacement.Text = "\1^t\2^p" .Execute Replace:=wdReplaceAll .Text = "\<xs:element name=([!\>]@)\>" .Replacement.Text = "\1^t^p" .Execute Replace:=wdReplaceAll .Text = " ([^13^t])" .Replacement.Text = "\1" .Execute Replace:=wdReplaceAll .Text = Chr(34) .Replacement.Text = "" .Execute Replace:=wdReplaceAll End With End With With ActiveDocument.Range .Cut .PasteSpecial DataType:=wdPasteText With .Find .MatchWildcards = True .Wrap = wdFindContinue .Text = "^13[ ^t]*^13" .Replacement.Text = "^p" .Execute Replace:=wdReplaceAll .Text = "[^13]{2,}" .Replacement.Text = "^p" .Execute Replace:=wdReplaceAll End With End With With ActiveDocument.Range If .Characters.First <> vbCr Then .InsertBefore vbCr .InsertBefore "NAME" & vbTab & "TYPE" .ConvertToTable vbTab, 2 End With Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#8
|
|||
|
|||
Well, tried to run it on my 94 pg doc and got Run-time error 98 - (with just a box but no highlighted error to know where)
Thought, hmm- maybe it's overwhelmed, so I scaled the word file down to 2 pgs Got it again, and this time it provided a highlighted area.. (see attached) |
#9
|
||||
|
||||
OK, try:
Code:
Sub Demo() Application.ScreenUpdating = False With ActiveDocument.Range .Font.Hidden = True With .Find .ClearFormatting .Replacement.ClearFormatting .Format = True .Font.Hidden = True .Replacement.Font.Hidden = False .Forward = True .MatchWildcards = True .Wrap = wdFindContinue .Text = "\<xs:element name[!\>]@\>" .Replacement.Text = "^&" .Execute Replace:=wdReplaceAll DoEvents .Text = "*^13" .Replacement.Text = "^p" .Execute Replace:=wdReplaceAll DoEvents .Format = False .Text = "m[ai][nx]Occurs[!\>]@\>" .Replacement.Text = ">" .Execute Replace:=wdReplaceAll DoEvents .Text = "\<xs:element name=([!\>]@)type=([!\>]@)\>" .Replacement.Text = "\1^t\2^p" .Execute Replace:=wdReplaceAll .Text = "\<xs:element name=([!\>]@)\>" .Replacement.Text = "\1^t^p" .Execute Replace:=wdReplaceAll DoEvents .Text = " ([^13^t])" .Replacement.Text = "\1" .Execute Replace:=wdReplaceAll .Text = Chr(34) .Replacement.Text = "" .Execute Replace:=wdReplaceAll .Text = "^13[ ^t]*^13" .Replacement.Text = "^p" .Execute Replace:=wdReplaceAll .Text = "[^13]{2,}" .Replacement.Text = "^p" .Execute Replace:=wdReplaceAll DoEvents End With If .Characters.First <> vbCr Then .InsertBefore vbCr .InsertBefore "NAME" & vbTab & "TYPE" .ConvertToTable vbTab, 2 End With Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#10
|
|||
|
|||
Well, no errors -- seemed to run through without a hitch with one huge exception:
All content in the Word doc is gone. Not sure where it went. I forgot to mention this was also occurring with the previous runs.. ? Would it help to use something like this to have it dump the results into a new docx? (this was old code I used a long time ago to take something from one Word doc and dump it into a new doc) Code:
Sub CopyHighlightedTextToADifferentDoc() 'THIS FINDS HIGHLIGHTED TEXT AND PLACES A COPY OF IT ON A SEPARATE NEW DOCUMENT IN LIST FORM FOR REVIEW Dim ThisDoc As Document Dim ThatDoc As Document Dim r As Range Set ThisDoc = ActiveDocument Set r = ThisDoc.Range Set ThatDoc = Documents.Add With r.Find .Text = "" .Highlight = True Do While .Execute(Forward:=True) = True ThatDoc.Range.InsertAfter r.Text & vbCrLf r.Collapse 0 Loop End With End Sub 'All highlighted text in the ActiveDocument are copied (with an added paragraph mark to separate them) to a new document. |
#11
|
||||
|
||||
All versions of the code have run just fine on the content in the attachment to your first post, generating the same 2-column output table as the final output. If you're not getting any output, adding a new document and sending that to it isn't going to help. Evidently, your real data differ from what you posted.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#12
|
|||
|
|||
Hmm, no that doesn't make sense -- I even tried downloading the actual file that I have attached in the first post and saved it to a diff location - then pasted the last batch of code into a new Module 3 -- ran it and everything within the file disappeared and it shows it to be a 1 page document.
Alternatively, I tried moving the code to the main ThisDocument area and tried to run it again and the exact same result occurred -- everything was deleted and only a single white page was left behind. Where did you place the code when you tested with the attached file? Can you attach the file you used to test with - where it successfully extracted? Do you think it has something to do with that MS Forms object not being present? This is really odd -- makes no sense |
#13
|
|||
|
|||
Here's what the file looks like after I run the code (see attached)
|
#14
|
||||
|
||||
Try changing:
With .Find to: With Selection.Find
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#15
|
||||
|
||||
This, too, seems to work OK:
Code:
Sub Demo() Application.ScreenUpdating = False With ActiveDocument.Range With .Find .ClearFormatting .Replacement.ClearFormatting .Format = False .Forward = True .Wrap = wdFindContinue .MatchWildcards = True .Text = "\>*^13" .Replacement.Text = ">^p" .Execute Replace:=wdReplaceAll DoEvents .Text = "m[ai][nx]Occurs[!\>]@\>" .Replacement.Text = ">" .Execute Replace:=wdReplaceAll DoEvents .Text = "\<xs:element name=([!\>]@)type=([!\>]@)\>" .Replacement.Text = "\1^t\2^p" .Execute Replace:=wdReplaceAll .Text = "\<xs:element name=([!\>]@)\>" .Replacement.Text = "\1^t^p" .Execute Replace:=wdReplaceAll DoEvents .Text = " ([^13^t])" .Replacement.Text = "\1" .Execute Replace:=wdReplaceAll .Text = Chr(34) .Replacement.Text = "" .Execute Replace:=wdReplaceAll .Text = "^13[ ^t]*^13" .Replacement.Text = "^p" .Execute Replace:=wdReplaceAll .Text = "[^13]{2,}" .Replacement.Text = "^p" .Execute Replace:=wdReplaceAll DoEvents End With If .Characters.First <> vbCr Then .InsertBefore vbCr .InsertBefore "NAME" & vbTab & "TYPE" .ConvertToTable vbTab, 2 End With Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
Tags |
extract, parens |
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Extract name and address. | donlincolnmsof | Word VBA | 1 | 11-10-2017 12:59 PM |
VBA code to extract strings | twozedz | Excel Programming | 1 | 05-15-2016 06:00 AM |
Extract tables as images | didijaba | Word VBA | 2 | 05-06-2014 06:14 PM |
Extract Photographs | The Gap | PowerPoint | 4 | 04-22-2010 07:00 AM |
Extract from String using Wildcard | whousedmy | Word | 0 | 05-21-2009 01:35 AM |