Thread: PDF to Excel
12-24-2013, 06:25 PM
macropod (Windows 7 32bit, Office 2010 32bit)

Quote:
Originally Posted by shanemarkley
There are a couple fields that are still off (mainly the first item under each heading)
Easily enough fixed:
Code:
Sub ParsePDFData()
Application.ScreenUpdating = False
With ActiveDocument.Range
  .Paragraphs.First.Range.Delete
  .Paragraphs.First.Range.Text = "Code" & vbTab & "Product" & vbTab & "Size" & vbTab & "Sales Start" & vbTab & "Sales End" & vbTab & "Price" & vbTab & "Old Price" & vbCr
  With .Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchWildcards = True
    .Text = "^13[!^13]@^13[!^13]@^13[!^13]@^13^12^13[!^13]@^13"
    .Replacement.Text = "^p"
    .Execute Replace:=wdReplaceAll
    DoEvents
    'Replacement.Text is still "^p": trim trailing spaces, then collapse runs of paragraph marks
    .Text = "[ ]{1,}^13"
    .Execute Replace:=wdReplaceAll
    .Text = "^13{2,}"
    .Execute Replace:=wdReplaceAll
    DoEvents
    .Text = "^13[0-9]{1,2}/[0-9]{1,2}/[0-9]{4} Page [0-9]{1,4}*notice.^13"
    .Replacement.Text = ""
    .Execute Replace:=wdReplaceAll
    .Text = "(^13)([A-Z][!^13]@)^13([A-Z0-9][!$^13]@)^13([0-9]{1,}) ([!$]@$[0-9.]{4,}>)"
    .Replacement.Text = "\1\4 \2 \3 \5"
    .Execute Replace:=wdReplaceAll
    DoEvents
    .Text = "(^13[0-9]{1,}>)([!$]@)($[!^13]{1,})"
    .Replacement.Text = "\1^t\2^t^t^t^t\3"
    .Execute Replace:=wdReplaceAll
    DoEvents
    .Text = "([0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}) ([0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}) ^t^t^t"
    .Replacement.Text = "^t^t\1^t\2"
    .Execute Replace:=wdReplaceAll
    DoEvents
    .Text = "(EACH )(^t)"
    .Replacement.Text = "\2\1"
    .Execute Replace:=wdReplaceAll
    'Replacement.Text is still "\2\1", so the tab is moved ahead of ML/L sizes as well
    .Text = "([0-9.]{2,5}[ ML]{2,4})(^t)"
    .Execute Replace:=wdReplaceAll
    DoEvents
    .Text = "($[0-9.]{4,}) ($[!^13]{1,})"
    .Replacement.Text = "\1^t\2"
    .Execute Replace:=wdReplaceAll
    DoEvents
    .Text = "^t[ ]{1,}"
    .Replacement.Text = "^t"
    .Execute Replace:=wdReplaceAll
    .Text = "[ ]{1,}^t"
    .Execute Replace:=wdReplaceAll
    DoEvents
  End With
  .Copy
End With
Call Export
Application.ScreenUpdating = True
End Sub
 
Sub Export()
Dim xlApp As Object, xlWkBk As Object
Set xlApp = CreateObject("Excel.Application")
With xlApp
  .Visible = True
  .ScreenUpdating = False
  Set xlWkBk = .Workbooks.Add
  With xlWkBk.Sheets(1)
    .Range("A1").PasteSpecial Paste:=-4163 'xlPasteValues
    .Columns.AutoFit
    .Columns("A:A").ColumnWidth = 8
    .Range("A2").Select
    .Columns("C:C").HorizontalAlignment = -4152 'xlRight (Excel constants aren't defined in Word when late binding)
  End With
  With .ActiveWindow
      .SplitColumn = 0
      .SplitRow = 1
      .FreezePanes = True
  End With
  .ScreenUpdating = True
End With
Set xlWkBk = Nothing: Set xlApp = Nothing
End Sub
Note: There are a few extra enhancements to deal with wrapped lines in the source that come out as disjointed paragraphs in the text file. There's also a header row for the output.
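For anyone wanting to see what a single pass does before running the whole macro, here is a minimal sketch that applies just the trailing-space clean-up to a scratch document. The sub name TrimDemo and the sample text are made up for illustration (the code 4079 is borrowed from the Jim Beam example in this thread); the Find settings are the same ones the macro uses.

```vba
Sub TrimDemo()
' Illustration only: the sub name and sample text are assumptions, not part of the macro above.
Dim doc As Document
Set doc = Documents.Add
' Two paragraphs, the first with trailing spaces before its paragraph mark
doc.Range.Text = "4079 Jim Beam   " & vbCr & "750 ML"
With doc.Range.Find
  .ClearFormatting
  .Replacement.ClearFormatting
  .Format = False
  .MatchWildcards = True
  .Text = "[ ]{1,}^13"        ' one or more spaces followed by a paragraph mark
  .Replacement.Text = "^p"    ' put back a bare paragraph mark
  .Execute Replace:=wdReplaceAll
End With
Debug.Print doc.Range.Text    ' inspect the result in the Immediate window
doc.Close SaveChanges:=wdDoNotSaveChanges
End Sub
```

The trailing spaces after "Jim Beam" should be gone in the printed text. On systems whose regional list separator is a semicolon, the quantifier may need to be written {1;} instead of {1,}.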
Quote:
My ultimate goal is to run a script that will pull all of the updated prices for each item and copy them to the Cost/Inventory Sheet. Is this something that you could help me write as well?
Does that mean you're really only concerned with the items that have the two prices?
Quote:
I am thinking it would go something like this:

1. Search through the outputted text for the item "code". Ex. Jim Beam would be "4079".
2. Once the item is found in the outputted text, see if there are any changes in the "Price" field based on the price in the Cost/Inventory Sheet.
3. If that price is changed, update the price in the Cost/Inventory Sheet and highlight it to show there was a change.
4. Repeat this process for each item in the inventory
As coded, the macro sends its output to a new Excel file. Obviously some changes would be required to output the results to a file that already exists. Indeed, in such a scenario it might be better to run the macro from Excel, let Excel automate a Word session for the parsing, then just do the necessary updating.
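Once the parsed rows are sitting in the same workbook as the Cost/Inventory Sheet, the four steps quoted above might look something like the following in Excel VBA. Treat it as a rough sketch rather than a finished solution: the sub name, the sheet names "Parsed" and "Inventory", and the inventory layout (code in column A, price in column B) are all assumptions, since I haven't seen your workbook. The only thing taken from the macro above is the output layout, where Code is column A and Price is column F.

```vba
Sub UpdatePrices()
' Sketch only: sheet names and the inventory column positions are assumptions.
Dim wsSrc As Worksheet, wsInv As Worksheet
Dim r As Long, lastRow As Long, hit As Variant
Set wsSrc = ThisWorkbook.Worksheets("Parsed")      ' hypothetical: holds the macro's output
Set wsInv = ThisWorkbook.Worksheets("Inventory")   ' hypothetical: the Cost/Inventory Sheet
lastRow = wsInv.Cells(wsInv.Rows.Count, "A").End(xlUp).Row
For r = 2 To lastRow                               ' row 1 is assumed to be a header
  ' Step 1: look for this inventory code in column A of the parsed output
  hit = Application.Match(wsInv.Cells(r, "A").Value, wsSrc.Columns("A"), 0)
  If Not IsError(hit) Then
    ' Steps 2-3: compare the prices; update and highlight only if they differ
    If wsInv.Cells(r, "B").Value <> wsSrc.Cells(hit, "F").Value Then
      wsInv.Cells(r, "B").Value = wsSrc.Cells(hit, "F").Value
      wsInv.Cells(r, "B").Interior.Color = vbYellow
    End If
  End If
Next r                                             ' Step 4: repeat for every inventory item
End Sub
```

Note that Match is type-sensitive: if the codes are stored as text on one sheet and as numbers on the other, nothing will be found, so the two columns would need to be made consistent first.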
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]