08-25-2019, 12:59 AM
donlincolnmsof (Windows 7 64bit, Office 2003)
Extract data from a Word file.

Hello

I'm looking for a macro that will extract data from an HTML file. The code below used to do the job, but the markup in the HTML file has changed, so the macro's search text no longer matches and it doesn't work anymore. The macro was pretty fast. If anyone can fix this I would really appreciate it. Attached are the input file with the raw data and the output file showing what the result should look like.

Below are the unique search keys that mark where the data appears.

<span itemprop="streetAddress">
<a href="/name/
<dt class="col-md-4">Phone Number</dt>
<dt class="col-md-4">Email Address
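As a rough sketch of how one of these keys could be used, the helper below returns whatever text sits between a start marker and an end marker. TextBetween and DemoTextBetween are hypothetical names, the sample HTML line is made up, and the end marker "</span>" is only an assumption, since the real layout is in the attached input file and the value may not directly follow every key.

```vba
Option Explicit

' Hypothetical helper: returns the text found between sStart and sEnd
' in sSource, searching from position lPos (1-based). Returns "" if
' either marker is missing. On success lPos is moved past sEnd so the
' next call continues from there.
Function TextBetween(ByVal sSource As String, ByVal sStart As String, _
                     ByVal sEnd As String, ByRef lPos As Long) As String
    Dim lFrom As Long, lTo As Long
    TextBetween = ""
    If lPos < 1 Then lPos = 1
    lFrom = InStr(lPos, sSource, sStart, vbTextCompare)
    If lFrom = 0 Then Exit Function
    lFrom = lFrom + Len(sStart)              ' first character after the key
    lTo = InStr(lFrom, sSource, sEnd, vbTextCompare)
    If lTo = 0 Then Exit Function
    TextBetween = Trim$(Mid$(sSource, lFrom, lTo - lFrom))
    lPos = lTo + Len(sEnd)
End Function

Sub DemoTextBetween()
    Dim sHtml As String, lPos As Long
    ' Made-up sample line, not taken from the attached input file
    sHtml = "<span itemprop=""streetAddress"">1 Example St</span>"
    lPos = 1
    ' Prints: 1 Example St
    Debug.Print TextBetween(sHtml, "<span itemprop=""streetAddress"">", "</span>", lPos)
End Sub
```

Calling it in a loop with the same lPos variable would walk through every record in the text. For the Phone Number and Email Address keys the end marker would have to be chosen after looking at the input file.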

Thanks.


Macro that was written earlier
===============================

Sub ExtractData()   ' placeholder name, the Sub line was not part of the pasted code
    Dim oSource As Document
    Dim oDoc As Document
    Dim oRng As Range, oParaRng As Range
    Dim lngP As Long
    Dim sName As String, sAdd As String, sPhone As String, sExtract As String

    Set oSource = ActiveDocument
    Set oRng = oSource.Range
    ' New document that receives one tab-separated line per record
    Set oDoc = Documents.Add
    oDoc.Range.Font.Name = "Courier New"
    oDoc.Range.Font.Size = 10

    With oRng.Find
        ' Search key that matched the old HTML
        Do While .Execute(FindText:="<div class=" & Chr(34) & "c-people-result__address" & Chr(34) & ">")
            ' Widen the match to take in the surrounding paragraphs
            oRng.MoveEnd wdParagraph, 2
            oRng.MoveStart wdParagraph, -4
            For lngP = 1 To oRng.Paragraphs.Count
                Select Case lngP
                    Case 1   ' name
                        Set oParaRng = oRng.Paragraphs(lngP).Range
                        oParaRng.End = oParaRng.End - 1   ' drop the paragraph mark
                        sName = Trim(Replace(oParaRng.Text, "</a>", ""))
                        sExtract = sName
                    Case 4   ' address: text after the first ">"
                        Set oParaRng = oRng.Paragraphs(lngP).Range
                        oParaRng.End = oParaRng.End - 1
                        oParaRng.MoveStartUntil ">"
                        oParaRng.Start = oParaRng.Start + 1
                        sAdd = Replace(oParaRng.Text, "</div>", "")
                        sExtract = sExtract & vbTab & sAdd
                    Case 5   ' phone: text from the first "("
                        Set oParaRng = oRng.Paragraphs(lngP).Range
                        oParaRng.End = oParaRng.End - 1
                        oParaRng.MoveStartUntil "("
                        sPhone = Replace(oParaRng.Text, "</div>", "")
                        sExtract = sExtract & vbTab & sPhone
                End Select
            Next lngP
            oDoc.Range.InsertAfter Trim(sExtract) & vbCr
            ' Continue searching after the record just processed
            oRng.Collapse wdCollapseEnd
        Loop
    End With
End Sub
Attached Files
File Type: doc input file 0825.doc (122.0 KB, 11 views)
File Type: doc output data 0825.doc (23.5 KB, 11 views)