Page 1 of 2 12 LastLast
Results 1 to 15 of 22
  1. #1
    diegomarino is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Feb 2018
    Posts
    434

    Import tables from PDF

    hi, i have a pdf in which there are one of two tables that i need to import.

    I've made already a procedure that:

    1) using excel with macro i import the table i need (then i make some data polish)
    2) i import the excel in access

    Now, there are few problems:



    A) table names sometimes changes and i have to make it manually
    B) since sometimes i need to import more than one table i can have some problems.

    Is there any way to import a pdf table directly in access?

  2. #2
    Micron is offline Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,425
    Not that I know of, but it's interesting that you can read a pdf and add it to a spreadsheet. It would seem that if you have code to do that, you might be able to modify it so that it works in Access.

    If you are importing into Access in step 2, why can't you just create a new import spec and add it to your existing Access table? Then the table name won't matter.
    Or perhaps use Automation to get Access to bring the data over. Automation might even be able to run your Excel code and in doing so, pass the name of the Access table so that it has the name you want it to have.
    The more we hear silence, the more we begin to think about our value in this universe.
    Paraphrase of Professor Brian Cox.

  3. #3
    diegomarino is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Feb 2018
    Posts
    434
    thanks,
    for "table name" i meant the name of the table in PDF, not in access.
    I was wondering if there were a way to import data from PDF to MS access DIRECTLY.

  4. #4
    Micron is offline Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,425
    I was wondering if there were a way to import data from PDF to MS access DIRECTLY.
    I'll reiterate:
    "It would seem that if you have code to do that, you might be able to modify it so that it works in Access."
    Maybe someone here has code that will do that in Access and eventually they may chime in. Or you could post what you have and perhaps that will prevent you getting a lot of questions about your object names and such. Or you could post your db and wb and eliminate a lot of questions that way.
    The more we hear silence, the more we begin to think about our value in this universe.
    Paraphrase of Professor Brian Cox.

  5. #5
    Micron is offline Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,425
    using google, I found this article, but it is for a Mac.
    youtube.com/watch?v=gMYdBfl61is
    That link led me to this
    https://www.nairaland.com/4304376/ex...anned-pdf-form
    I got 108 hits on my search though.
    Some suggest putting the pdf table into a csv because that seems doable. Then using the csv to create a table.
    The more we hear silence, the more we begin to think about our value in this universe.
    Paraphrase of Professor Brian Cox.

  6. #6
    diegomarino is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Feb 2018
    Posts
    434
    thanks both.
    The code i use is in excel, but i just registered a macro importing a pdf file, that is allowed in excel, so i don't think it will be helpful in access.

    this is the routine in excel, don't look at table references that i messed up.
    I think maybe i should post it in an excel forum, cause i'm thinking it's impossible to jump directly in access

    Code:
    Sub ImportaDati()   Dim qr As Queries
    ' ImportaDati Macro
       Dim FL As String
       Dim Num As Integer
       
       Dim qry As WorkbookQuery 'elimino le connessioni precedenti
    
    
       For Each qry In ThisWorkbook.Queries
          
          qry.Delete
    
    
       Next qry
    
    
         FL = Range("L1")
        
      If Range("L3") = "autostrade" Then
       
          ActiveWorkbook.Queries.Add Name:="Table006 (Page 2) (2)", Formula:= _
          "let" & Chr(13) & "" & Chr(10) & "    Origine = Pdf.Tables(File.Contents(""C:\Users\" & Environ("username") & "\OneDrive - Indipendente\Fisco\telepass\tele\" & FL & """), [Implementation=""1.3""])," & Chr(13) & "" & Chr(10) & "    Table006 = Origine{[Id=""Table006""]}[Data]," & Chr(13) & "" & Chr(10) & "    #""Intestazioni alzate di livello"" = Table.PromoteHeaders(Table006, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Modificato tipo"" = Table.Tra" & _
          "nsformColumnTypes(#""Intestazioni alzate di livello"",{{""DATA"", type text}, {""ORA USC."", type text}, {""SRV"", type text}, {""DESCRIZIONE"", type text}, {""Column5"", type text}, {""CLASSE"", type text}, {""IMPORTO"", type number}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Modificato tipo"""
          ActiveWorkbook.Worksheets.Add
           With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
               "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table006 (Page 2) (2)"";Extended Properties=""""" _
               , Destination:=Range("$A$1")).QueryTable
               .CommandType = xlCmdSql
               .CommandText = Array("SELECT * FROM [Table006 (Page 2) (2) ]")
               .RowNumbers = False
               .FillAdjacentFormulas = False
               .PreserveFormatting = True
               .RefreshOnFileOpen = False
               .BackgroundQuery = True
               .RefreshStyle = xlInsertDeleteCells
               .SavePassword = False
               .SaveData = True
               .AdjustColumnWidth = True
               .RefreshPeriod = 0
               .PreserveColumnInfo = True
               .ListObject.DisplayName = "Table006__Page_2"
               .Refresh BackgroundQuery:=False
           End With
           
           On Error GoTo Extsub ' vedo se c'è anche un'altra tabella che si chiama table 007 e viene dopo la principale che ho appena importato sopra
           
           ActiveWorkbook.Queries.Add Name:="Table007 (Page 2) (2)", Formula:= _
          "let" & Chr(13) & "" & Chr(10) & "    Origine = Pdf.Tables(File.Contents(""C:\Users\" & Environ("username") & "\OneDrive - Indipendente\Fisco\telepass\tele\" & FL & """), [Implementation=""1.3""])," & Chr(13) & "" & Chr(10) & "    Table006 = Origine{[Id=""Table006""]}[Data]," & Chr(13) & "" & Chr(10) & "    #""Intestazioni alzate di livello"" = Table.PromoteHeaders(Table006, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Modificato tipo"" = Table.Tra" & _
          "nsformColumnTypes(#""Intestazioni alzate di livello"",{{""DATA"", type text}, {""ORA USC."", type text}, {""SRV"", type text}, {""DESCRIZIONE"", type text}, {""Column5"", type text}, {""CLASSE"", type text}, {""IMPORTO"", type number}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Modificato tipo"""
          ActiveWorkbook.Worksheets.Add
           With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
               "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table006 (Page 2) (2)"";Extended Properties=""""" _
               , Destination:=Range("$A$1")).QueryTable
               .CommandType = xlCmdSql
               .CommandText = Array("SELECT * FROM [Table007 (Page 2) (2)]")
               .RowNumbers = False
               .FillAdjacentFormulas = False
               .PreserveFormatting = True
               .RefreshOnFileOpen = False
               .BackgroundQuery = True
               .RefreshStyle = xlInsertDeleteCells
               .SavePassword = False
               .SaveData = True
               .AdjustColumnWidth = True
               .RefreshPeriod = 0
               .PreserveColumnInfo = True
               .ListObject.DisplayName = "Table007__Page_2"
               .Refresh BackgroundQuery:=False
           End With
           
       ElseIf Range("L3") = "parcheggi" Then
              ActiveWorkbook.Queries.Add Name:="Table006 (Page 2) (2)", Formula:= _
            "let" & Chr(13) & "" & Chr(10) & "    Origine = Pdf.Tables(File.Contents(""C:\Users\" & Environ("username") & "\OneDrive - Indipendente\Fisco\telepass\tele\" & FL & """), [Implementation=""1.3""])," & Chr(13) & "" & Chr(10) & "    Table001 = Origine{[Id=""Table001""]}[Data]," & Chr(13) & "" & Chr(10) & "    #""Intestazioni alzate di livello"" = Table.PromoteHeaders(Table001, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Modificato tipo"" = Table.Tran" & _
            "sformColumnTypes(#""Intestazioni alzate di livello"",{{""DATA/ORA SOSTA"", type text}, {""DESCRIZIONE"", type text}, {""Column3"", type text}, {""IMPORTO"", type number}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Modificato tipo"""
        ActiveWorkbook.Worksheets.Add
        With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
            "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table001 (Page 1)"";Extended Properties=""""" _
            , Destination:=Range("$A$1")).QueryTable
            .CommandType = xlCmdSql
            .CommandText = Array("SELECT * FROM [Table006 (Page 2)]")
            .RowNumbers = False
            .FillAdjacentFormulas = False
            .PreserveFormatting = True
            .RefreshOnFileOpen = False
            .BackgroundQuery = True
            .RefreshStyle = xlInsertDeleteCells
            .SavePassword = False
            .SaveData = True
            .AdjustColumnWidth = True
            .RefreshPeriod = 0
            .PreserveColumnInfo = True
            .ListObject.DisplayName = "Table006__Page_2"
            .Refresh BackgroundQuery:=False
        End With
    
    
          
       End If

  7. #7
    Micron is offline Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,425
    I hope you know the etiquette if you cross post. If not, that means you post a link in each forum to advise anyone so that they don't waste time making the same suggestions.

    I suspect you don't know what I meant by automation because you seem to be dismissing the idea even though it is possible to control one Office application from another.
    The more we hear silence, the more we begin to think about our value in this universe.
    Paraphrase of Professor Brian Cox.

  8. #8
    diegomarino is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Feb 2018
    Posts
    434
    don't worry, for excel i posted in an italian forum.

    Your suspect is ok, cause i don't know what automation is, i thought you meant the use of VBA.
    I think i have to study this automation, then

    EDIT. Wait a minute, i'm thinking about another approach: open the PDF in edge control , copy the data on the table and add it to a local table, as i do for websites. I think this is promising

  9. #9
    Micron is offline Very Inert Person
    Windows 10 Access 2016
    Join Date
    Jun 2014
    Location
    Ontario, Canada
    Posts
    13,425
    great site for automation with Excel and Access
    http://www.accessmvp.com/KDSnell/EXCEL_MainPage.htm

    Sounds like you have a promising idea there - as long as everyone who would use the db doesn't mind using Edge. I guess I would put up with that if I had to, but I much prefer to not use it.
    The more we hear silence, the more we begin to think about our value in this universe.
    Paraphrase of Professor Brian Cox.

  10. #10
    diegomarino is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Feb 2018
    Posts
    434
    thanks
    i'll keep the post update in case i find a good solution.
    Actually, i really love edge control and all the stuff you can do that, it helps a lot when you have bad sources or no downloadable sources and you have to rely on data shown on a site.

    PS looking at your useful links i realized that i do know automation, but it's not the answer unfortunely. However they are really really useful for other needs that i have

  11. #11
    madpiet is offline Expert
    Windows 10 Office 365
    Join Date
    Feb 2023
    Posts
    566
    Now, there are few problems:

    A) table names sometimes changes and i have to make it manually
    B) since sometimes i need to import more than one table i can have some problems.

    Where do the table names change? Are the changes limited to a few variations? (Then you use a "translation" table in PowerQuery an run a bunch of Renames of your columns, or strip off the column names altogether and just use the column number to map it.

    More than one table from the same PDF file? PDFs have a Tables collection, I think. (I still think I'd do this with PowerQuery... Ken Snell is great at Excel, especially PowerQuery... can you post the structure of the tables you need to import? (Or a few rows, if the data isn't confidential?) It's just easier to mock up that way.

  12. #12
    diegomarino is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Feb 2018
    Posts
    434
    hi, the table name changed this week for the first time, they are usually Table006 and table007 when they are two (rarely but can happen), now i got table001.
    powerquery is a good option , but i've to dig in.

    here an ex.

    Click image for larger version. 

Name:	Screenshot 2024-05-28 212532.png 
Views:	37 
Size:	56.6 KB 
ID:	51824

  13. #13
    madpiet is offline Expert
    Windows 10 Office 365
    Join Date
    Feb 2023
    Posts
    566
    Quote Originally Posted by diegomarino View Post
    hi, the table name changed this week for the first time, they are usually Table006 and table007 when they are two (rarely but can happen), now i got table001.
    powerquery is a good option , but i've to dig in.

    here an ex.

    Click image for larger version. 

Name:	Screenshot 2024-05-28 212532.png 
Views:	37 
Size:	56.6 KB 
ID:	51824
    Some days I swear I forget how to read! To answer your question, "can I just do all this in Access?" Maybe you can, but it would be infinitely easier to use Excel/PowerQuery to find / read / union the tables you're interested in than to try it in Access. PowerQuery doesn't use VBA at all (so you're not stuck with converting everything to text and then parsing etc)... but it's only available in Excel or PowerBI, and PowerBI can't really export, so... that leaves Excel. In Excel it would be trivial if the column names in your tables are consistent. If not, (and the column order is consistent), you can strip the column names (unpromote, then remove). Append all the tables together, then re-add the column names.

    If you can post a dummy PDF, I can try it for you.

    From Excel, Get Data, From File, From PDF (pretty much like you did.)
    Remove Top 1 rows, remove rightmost column (keep first 5).
    Remove Bottom 1 rows.

    I'm not 100% sure on the table names part. Are they consistent in your PDF files which ones you keep?
    Last edited by madpiet; 05-28-2024 at 07:55 PM.

  14. #14
    diegomarino is offline Competent Performer
    Windows 10 Access 2016
    Join Date
    Feb 2018
    Posts
    434
    Dati_patrimonio_marzo2024.pdfwell, i'll give you some other example. They change the structure just right now, curse them

    i need the tables like that, there are a few of that, and i need tham to put all data togheter, but i can do this part by myself.

    if you can make a powerquery to extract all these tables reaaly thanks, the main goal however is to make it automatic, with the less work possible


  15. #15
    madpiet is offline Expert
    Windows 10 Office 365
    Join Date
    Feb 2023
    Posts
    566

    Reading a bunch of tables with PowerQuery

    Transform PDFS Italian.zipWhich ones? Is it always the same tables?

    Here's an example of one, just to prove it can be done. (You can specify as many tables as you want, but you have to make sure the column names match and are in the correct order).

    Then once you get all that appended to one Excel file, using PQ, you can connect to that Excel file and import the contents.

Page 1 of 2 12 LastLast
Please reply to this thread with any new information or opinions.

Similar Threads

  1. Replies: 6
    Last Post: 09-08-2019, 03:18 PM
  2. CSV Import to multiple tables
    By dniezby in forum Programming
    Replies: 2
    Last Post: 03-22-2017, 06:25 AM
  3. Replies: 2
    Last Post: 02-28-2017, 01:34 AM
  4. Import XML into one-to-many tables
    By Jay Myhre in forum Import/Export Data
    Replies: 10
    Last Post: 02-25-2012, 06:42 PM
  5. Import 1 spreadsheet into two tables
    By Matthieu in forum Import/Export Data
    Replies: 4
    Last Post: 02-03-2010, 08:19 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums