06-28-2019, 12:38 PM
donlincolnmsof (Windows 7 64bit, Office 2003)
Advanced Beginner
 
Join Date: Oct 2011
Posts: 36
Extract data from HTML File.

Hello

I'm looking for a macro that will extract data from an HTML file. The code below used to do the job, and it ran pretty fast, but the markup the macro searches for in the HTML file has changed and the macro no longer works. If anyone can fix this, I would really appreciate it. Attached are the input file with the raw data and the output file showing what the result should look like.

Thanks.

Code:
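Sub ExtractHtmlData()   ' Sub name is arbitrary; the original snippet had no wrapper.
  Application.ScreenUpdating = False
  Dim StrOut As String, wdDoc As Document
  With ActiveDocument.Range
    ' Wildcard search for one record: link text, then the address and phone divs.
    With .Find
      .ClearFormatting
      .Replacement.ClearFormatting
      .Text = "^34\>[!\<]@\</a\>^13[ ]@\</h3\>*address^34\>[!\<]@\</div\>*phone^34\>[!\<]@\</div\>"
      .Replacement.Text = ""
      .Forward = True
      .Wrap = wdFindStop
      .Format = False
      .MatchWildcards = True
      .Execute
    End With
    Do While .Find.Found
      ' First field: second line of the text preceding the first </a>.
      StrOut = StrOut & Trim(Split(Split(.Text, "</a>")(0), vbCr)(1)) & vbTab
      ' Second field: text after the second ">" between </a> and the first </div>.
      StrOut = StrOut & Split(Split(Split(.Text, "</a>")(1), "</div>")(0), ">")(2) & vbTab
      ' Third field: only taken when the match contains no <span>.
      If InStr(.Text, "<span>") = 0 Then
        StrOut = StrOut & Split(Split(Split(.Text, "</a>")(1), "</div>")(1), ">")(1)
      End If
      StrOut = StrOut & vbCr
      ' Move the range start forward by the position of the second field,
      ' collapse the range, and search again from there.
      .MoveStart wdCharacter, InStr(.Text, Split(Split(Split(.Text, "</a>")(1), "</div>")(0), ">")(2))
      .Collapse wdCollapseStart
      .Find.Execute
    Loop
  End With
  ' Write the tab-separated results to a new document.
  Set wdDoc = Documents.Add
  wdDoc.Range.Text = StrOut
  Application.ScreenUpdating = True

  ' Signal that the macro has finished.
  Dim I As Integer
  For I = 1 To 100   ' Loop 100 times.
    Beep   ' Sound a tone.
  Next I
End Sub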
Attached Files
File Type: doc sample output.doc (19.0 KB, 14 views)
File Type: doc input file.doc (290.0 KB, 14 views)

Last edited by macropod; 06-28-2019 at 06:09 PM. Reason: Added code tags