#1
Thank you for your response. All of the conversion formats I have seen thus far are too inconsistent even to write code to clean up. PDF to Excel is the program I found that comes the closest, but it puts the retail pricing and sales pricing in different columns randomly. I attached an example of what I am referring to.
The ideal situation is to include VBA code to download a fresh copy of the PDF every month, but that is more of a nice-to-have feature than a necessity for this project.
#2
This you call "random"? Actually this looks cleaner than I expected. I do see what you mean about it not putting the important data in consistent columns, but your program should be able to sort that out with very little difficulty; it's just a matter of determining the pattern and explaining it to your program.
Let's see what we can figure out. Each table starts with the word "Code", always in column 1, so your program can find the start of each table by searching for the next appearance of "Code" in column 1; if you determine that it sometimes appears in other columns, you can look there too. In the same row, the other column headings are consistent even though they're in varying columns, so your program can determine where to find the Code, the Price and the other data. The code and the description are sometimes in separate columns and sometimes combined, but that's easy to figure out. And the column for the sale price has no header; but the column is missing only when there is no sale, and when it's present it's always between Sales End and Price.

Figuring out the length of the table is the only tricky part, and that only slightly. In most cases the end of the table is marked by a blank cell in the Code column, but in two of the tables there are sections where the Code and description are in the next column. Maybe if both columns are empty, that's the end of a table? No, in more than one table the footnote ("Retail price include...") is up against the table with no blank space intervening. Ah, here we go: in the Price column, "Page n" always appears at the end of the table.

Ok, let me play with this and come up with a way to turn this data into something more rational. Or maybe this gives you the right idea without a demo?
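To make the pattern concrete, here is a rough sketch of that scan. It is written in Python rather than VBA purely for illustration, and it is not the code used for the attached demo. It assumes the converted sheet has already been read into a list of rows, each a list of cell strings. The function names, the output keys, and the handling of combined code/description cells (split at the first space, so codes are assumed to contain no spaces) are all hypothetical choices, as is the rule that rows without a price are footnotes or spacers.

```python
from typing import Dict, List, Optional

Row = List[str]


def cell(row: Row, col: Optional[int]) -> str:
    """Return the stripped text at `col`, or "" when the column is absent."""
    if col is None or col < 0 or col >= len(row):
        return ""
    return row[col].strip()


def parse_tables(rows: List[Row]) -> List[Dict[str, str]]:
    """Collect one record per priced row from every table in `rows`."""
    records: List[Dict[str, str]] = []
    i = 0
    while i < len(rows):
        # A table starts where column 1 (index 0) holds the word "Code".
        if cell(rows[i], 0) != "Code":
            i += 1
            continue

        # Headings sit in varying columns, so look each one up by its text.
        header = {t.strip(): c for c, t in enumerate(rows[i]) if t.strip()}
        price_col = header.get("Price")
        end_col = header.get("Sales End")
        i += 1
        if price_col is None:
            continue  # not a usable header row

        # The sale price has no heading; when present it sits between
        # "Sales End" and "Price" (assumed here: exactly one column wide).
        sale_col = None
        if end_col is not None and price_col - end_col == 2:
            sale_col = end_col + 1

        # "Page n" in the Price column marks the end of the table; a new
        # "Code" header also stops the scan in case the marker is missing.
        while (
            i < len(rows)
            and not cell(rows[i], price_col).startswith("Page")
            and cell(rows[i], 0) != "Code"
        ):
            row = rows[i]
            i += 1
            price = cell(row, price_col)
            if not price:
                continue  # assumed: footnotes and spacer rows carry no price
            # The code is normally in column 1, sometimes shifted one right.
            code_col = 0 if cell(row, 0) else 1
            text = cell(row, code_col)
            if not text:
                continue
            code, _, description = text.partition(" ")
            known = (end_col, sale_col, price_col)
            if not description and code_col + 1 not in known:
                # Assumed: a lone code means the description is next door.
                description = cell(row, code_col + 1)
            records.append({
                "code": code,
                "description": description,
                "sales_end": cell(row, end_col),
                "sale_price": cell(row, sale_col),
                "price": price,
            })
    return records
```

Calling `parse_tables(rows)` returns a flat list of dictionaries, one per product line, with an empty `sale_price` wherever a table has no sale column. The same loop structure should carry over to VBA by walking the used range row by row, though the details will depend on how the converter lays out each page.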
#3
There, take a look at that. It takes a bit of work to write this sort of thing, and of course every time the publisher changes the layout you may have to adjust your work, but this is the general idea; and to my way of thinking it's worth the effort if you're going to have to run it multiple times.
#4
Bob, that is definitely what I am looking for! Could you let me know the code you used to clean it up that way so I could try running it through the entire file to see if it works? Thank you again very much for your help with this!
Tags: adobe, conversion, pdf