Microsoft Office Forums

12-15-2013, 01:35 AM
macropod (Windows 7 32bit, Office 2010 32bit)

I'd suggest saving the PDF as a text file, then opening it in Word and running the following Word macro:
Code:
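Sub ParsePDFData()
' Reworks the active document into tab-delimited rows, copies them and calls Export.
Application.ScreenUpdating = False
With ActiveDocument.Range
  ' Delete the first two paragraphs of the document.
  .Paragraphs.First.Range.Delete
  .Paragraphs.First.Range.Delete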
  With .Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchWildcards = True
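    ' Remove the three lines before each page break (^12), the break itself and the line after it.
    .Text = "^13[!^13]@^13[!^13]@^13[!^13]@^13^12^13[!^13]@^13"
    .Replacement.Text = "^p"
    .Execute Replace:=wdReplaceAll
    ' Same again for any page break that has no line after it.
    .Text = "^13[!^13]@^13[!^13]@^13[!^13]@^13^12^13"
    .Replacement.Text = "^p"
    .Execute Replace:=wdReplaceAll
    ' Strip spaces that precede a paragraph mark.
    .Text = "[ ]{1,}^13"
    .Replacement.Text = "^p"
    .Execute Replace:=wdReplaceAll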
    DoEvents
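    ' Collapse runs of paragraph marks into one.
    .Text = "^13{2,}"
    .Replacement.Text = "^p"
    .Execute Replace:=wdReplaceAll
    DoEvents
    ' On lines that start with a number: one tab after the number, two tabs before the first "$".
    .Text = "(^13[0-9]{1,}>)([!$]@)($[!^13]{1,})"
    .Replacement.Text = "\1^t\2^t^t\3"
    .Execute Replace:=wdReplaceAll
    DoEvents
    ' Put a tab before each date.
    .Text = "[0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}"
    .Replacement.Text = "^t^&"
    .Execute Replace:=wdReplaceAll
    DoEvents
    ' Move "EACH " from before the pair of tabs to between them.
    .Text = "(EACH )(^t)(^t$)"
    .Replacement.Text = "\2\1\3"
    .Execute Replace:=wdReplaceAll
    DoEvents
    ' After a number of 3+ digits and the field that follows it, insert two extra tabs
    ' when the next field does not start with a digit.
    .Text = "([0-9]{3,}^t[!^t]@^t)([!0-9])"
    .Replacement.Text = "\1^t^t\2"
    .Execute Replace:=wdReplaceAll
    DoEvents
    ' Split two "$" amounts separated by a space into separate columns.
    .Text = "($[0-9.]{4,}) ($[!^13]{1,})"
    .Replacement.Text = "\1^t\2"
    .Execute Replace:=wdReplaceAll
    DoEvents
    ' Trim spaces after tabs.
    .Text = "^t[ ]{1,}"
    .Replacement.Text = "^t"
    .Execute Replace:=wdReplaceAll
    DoEvents
    ' Trim spaces before tabs.
    .Text = "[ ]{1,}^t"
    .Replacement.Text = "^t"
    .Execute Replace:=wdReplaceAll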
  End With
  .Copy
End With
Call Export
Application.ScreenUpdating = True
End Sub
 
Sub Export()
' Late binding: no reference to the Excel object library is required.
Dim xlApp As Object, xlWkBk As Object
Set xlApp = CreateObject("Excel.Application")
xlApp.Visible = True
xlApp.ScreenUpdating = False
Set xlWkBk = xlApp.Workbooks.Add
With xlWkBk.Sheets(1)
  ' Paste the text copied from Word, starting at A1.
  .Range("A1").PasteSpecial Paste:=-4163 'xlPasteValues
  .Columns.AutoFit
  .Columns("A:A").ColumnWidth = 8
  .Range("A1").Select
End With
xlApp.ScreenUpdating = True
' Release the object references; Excel stays open with the new workbook.
Set xlWkBk = Nothing: Set xlApp = Nothing
End Sub
The result will be an Excel worksheet containing the data, all nicely aligned. As coded, the only substantive difference from the PDF is that the red values and crossed-out values are shifted one column to the right, so that all the current prices are in the same column.
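
To see how a single wildcard step behaves before running the whole macro, something like the following minimal sketch may help. It applies only the price-splitting step from ParsePDFData to a scratch document. The function name DemoPriceSplit and the sample row are made up for illustration and are not taken from the actual PDF.

```vba
' Illustrative only: DemoPriceSplit and the sample row are hypothetical.
Function DemoPriceSplit() As String
    Dim doc As Document
    ' Work in a scratch document so the real data is untouched.
    Set doc = Documents.Add
    doc.Range.Text = "12345 Widget EACH $12.50 $10.00"   ' made-up sample row
    With doc.Range.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchWildcards = True
        ' Same pattern and replacement as the price-splitting step in ParsePDFData.
        .Text = "($[0-9.]{4,}) ($[!^13]{1,})"
        .Replacement.Text = "\1^t\2"
        .Execute Replace:=wdReplaceAll
    End With
    ' Return the edited text, then discard the scratch document.
    DemoPriceSplit = doc.Range.Text
    doc.Close SaveChanges:=wdDoNotSaveChanges
End Function
```

Running `Debug.Print Replace(DemoPriceSplit(), vbTab, "<TAB>")` in the Immediate window makes the inserted tab visible. The other steps can be tried the same way by swapping in their .Text and .Replacement.Text values.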

Note: With 800 pages of data to process, the code will take some time to complete.
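
If you want to know how long the run actually takes, a wrapper along these lines is one option. It is only a sketch: the file path, RunParseWithTiming and ElapsedSeconds are hypothetical names, and it assumes ParsePDFData is in the same project.

```vba
' Illustrative only: the path and both procedure names are hypothetical.

' Seconds since startTime, allowing for Timer resetting to 0 at midnight.
Function ElapsedSeconds(ByVal startTime As Single) As Single
    Dim e As Single
    e = Timer - startTime
    If e < 0 Then e = e + 86400
    ElapsedSeconds = e
End Function

Sub RunParseWithTiming()
    ' Assumed location of the text file saved from the PDF; change to suit.
    Const SRC_PATH As String = "C:\Temp\PriceList.txt"
    Dim t As Single
    If Dir(SRC_PATH) = "" Then
        MsgBox "File not found: " & SRC_PATH
        Exit Sub
    End If
    ' Open the file as plain text; the opened document becomes the active one.
    Documents.Open FileName:=SRC_PATH, ConfirmConversions:=False, _
        AddToRecentFiles:=False, Format:=wdOpenFormatText
    t = Timer
    ParsePDFData        ' the macro from this post
    Debug.Print "ParsePDFData took " & Format(ElapsedSeconds(t), "0.0") & " s"
End Sub
```

Trying it first on a text file holding only a few pages gives a rough idea of what to expect from the full 800 pages.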
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]

Last edited by macropod; 12-15-2013 at 04:15 AM. Reason: Enhanced XL output