Hello, I need some advice:
I am working on Canadian Gazettes, which are in PDF format, have structured filenames, and number some 14,700. The filenames give the issue date but NO indication of page number. Typically the 2nd page carries a page number (2, ii, or, most often, one carried on from the last issue), but with issues up to 150 years old, some are unreadable. Each issue averages 43 pages, but not all are complete: some have duplicated and/or missing pages, and some issues are missing altogether. I have not found an easy method of identifying which issues carry on the numbering sequence, so the last/next issue can be a guess. Typically there are 70 issues in a year, with at least 52 using the sequence.
Currently I do not think I need two-thirds of these, but the annual indexes list page numbers for the indexed items that I am interested in, so I need page numbers to find the right PDF file. Because of the missing/extra pages, I accept that I will not be able to get the PDF to open at exactly the right page.
What I need is a structure/method that allows me to find the issue/PDF page given a page number, and that will adjust as corrected page numbers are entered. I also understand that including calculated fields can create problems.
What I have now is:
StartPage (Integer): the actual first page number.
NumberofPages (Byte): the number of pages in the PDF, as given by a PDF summary report.
Correction (Byte): a value (typically 0) that sums the missing or duplicate pages.
StartPage can be calculated by summing these three fields for the last “standard” issue.
Correction can be calculated as the next issue's StartPage minus this issue's StartPage and NumberofPages.
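To make the arithmetic concrete, here is a rough Python sketch of the two calculations and the page lookup as I understand them. It is not my actual database: the Issue class, the function names and the sample filenames are made up for illustration, and I am assuming Correction is a signed value (positive for missing pages, negative for duplicated ones) and that the list covers a single numbering sequence sorted by StartPage.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Issue:
    filename: str         # hypothetical filename, for illustration only
    start_page: int       # StartPage
    number_of_pages: int  # NumberofPages
    correction: int = 0   # Correction (assumed signed: + missing, - duplicated)


def next_start_page(issue):
    """StartPage of the following issue: the sum of the three fields."""
    return issue.start_page + issue.number_of_pages + issue.correction


def correction_from_next(issue, next_start):
    """Correction: next issue's StartPage minus this StartPage and NumberofPages."""
    return next_start - issue.start_page - issue.number_of_pages


def find_issue(issues, page):
    """Return the issue whose StartPage is the largest one not above `page`.

    `issues` must be sorted by start_page and belong to one numbering sequence.
    Returns None if the page comes before the first issue.
    """
    starts = [i.start_page for i in issues]
    idx = bisect_right(starts, page) - 1
    return issues[idx] if idx >= 0 else None


issues = [
    Issue("issue_a.pdf", start_page=1, number_of_pages=43),
    Issue("issue_b.pdf", start_page=44, number_of_pages=40, correction=2),
    Issue("issue_c.pdf", start_page=86, number_of_pages=45),
]
hit = find_issue(issues, 50)
# Rough position inside the PDF; missing/extra pages can shift it.
offset = 50 - hit.start_page + 1
```

With these made-up numbers, page 50 falls in issue_b.pdf at roughly the 7th PDF page, which is as close as I expect to get given the missing and duplicated pages.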
Is there a better way of doing this than marking which issues have checked data and recalculating the issues after the now-corrected issue?
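For reference, the "mark checked and recalculate" approach I have in mind looks roughly like the Python sketch below. The names (Issue, checked, recalculate) are hypothetical, Correction is assumed to be signed, and the issues are assumed to be in date order within one numbering sequence with the first one checked. Note that when a checked StartPage disagrees with the running total, this sketch puts the whole difference into the Correction of the issue just before it, which is only a guess if several unchecked issues lie in between.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    issue_date: str        # taken from the filename
    number_of_pages: int   # NumberofPages
    start_page: int = 0    # StartPage (trusted only when checked is True)
    correction: int = 0    # Correction (assumed signed)
    checked: bool = False  # True when StartPage was read from the PDF itself


def recalculate(issues):
    """Walk the issues in date order and refresh the calculated values.

    Checked issues act as anchors: their StartPage is kept, and the previous
    issue's Correction is set so that the sequence arrives at that StartPage.
    Unchecked issues get their StartPage from the running total.
    """
    expected = None
    prev = None
    for issue in issues:
        if issue.checked:
            if prev is not None:
                prev.correction = (
                    issue.start_page - prev.start_page - prev.number_of_pages
                )
        else:
            if expected is None:
                raise ValueError("the first issue in a sequence must be checked")
            issue.start_page = expected
        expected = issue.start_page + issue.number_of_pages + issue.correction
        prev = issue
    return issues
```

Run again after each newly checked StartPage is entered, this would update every later unchecked issue in the same sequence, which is the recalculation step I am hoping to avoid or simplify.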
Thanks for looking,
Neil