Results 1 to 9 of 9
  1. #1
    KO_Baystate is offline Novice
    Windows 8 Access 2010 32bit
    Join Date
    Feb 2014
    Posts
    5

    extract email from large text file


    I am trying to find a way to extract an email from a large text file that is an output from our email system. I would like to be able to extract the email address using a query or collection of queries. I have been able to extract all of the text that contains the @ symbol. From their I created a query expression:IE: Mid([field1],InStrRev([field1]," ")) that captures some but not everything I need. The "testdata" file is the raw text file and "testdata1" is the output after I imported the text file into Access, ran a query to only include lines with the "@" symbol and then ran the expression IE: Mid([field1],InStrRev([field1]," "))


    Thanks

    Kevin
    Attached Files Attached Files

  2. #2
    orange's Avatar
    orange is offline Moderator
    Windows XP Access 2003
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,849
    What is the objective of all this parsing? What are you going to do with the output?

  3. #3
    KO_Baystate is offline Novice
    Windows 8 Access 2010 32bit
    Join Date
    Feb 2014
    Posts
    5
    Hi there,

    I work for a promotional products company that sends out a newsletter to subscribed customers only. We sometimes get responses from the email newsletters providing alternative email addresses that we would like to add to our list email list. I can assure you that this isn't to spam or provide newsletters to people that don't want them. I don't know what assurances I could provide you to make this seem like this isn't for malicious purposes. To give everyone a better idea of what i mean by promotional products here is the website for the business i work for:http://www.baystate.com/

  4. #4
    orange's Avatar
    orange is offline Moderator
    Windows XP Access 2003
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,849
    No, I wasn't trying to see if this was spam related. I wondered if you were vetting email addresses with clients or similar. Or putting these into a table -- I was wondering where other things like Name, Address,Phone would come from or be matched.
    Are you a programmer --vba?

  5. #5
    KO_Baystate is offline Novice
    Windows 8 Access 2010 32bit
    Join Date
    Feb 2014
    Posts
    5
    Thanks for your response. The reason I thought you might be hinting at spamming or something similar is that I have seen the topic come up in other forums that have touched on this subject. I would describe the purpose as vetting emails to some extent but I don't think it will be matched with other information in any other tables. The proposed output will be compared to other emails already existing in the table and will be added if they aren't already existing. I haven't done programming for a very long time but I tried to approach this problem using vba. Even so, I am interested in finding a solution using Access if possible. I have changed the expression I was using slightly to Expr1: Mid([field1],InStrRev([field1]," ",InStr([field1],"@")+1)). this expression works okay but i'd like to be able to remove all the text from the in front of and behind the email address. The email addresses are bracketed by these characters <, >, (, ), ", .

  6. #6
    orange's Avatar
    orange is offline Moderator
    Windows XP Access 2003
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,849
    I created a function that will pull email addresses out of an input string. It uses regular expressions (regex) which is based on pattern matching. I am including it here, along with a test routine I created that uses some of your test data with a few adjustments. I've added extra email addresses separated by spaces within some of the test data records.

    If you decide to use (or adjust to your needs) the function, you need to have a reference to the Microsoft VBScript Regular Expression library. Regex is great for pattern matching, but there is nothing wrong with the Mid(),InStr(),InStrRev(),Left(),Right(), Trim() functions either.

    I could not find a good example using Access vba where there was a search and extract of matches to a pattern. So this will serve as a sample.

    Code:
    '---------------------------------------------------------------------------------------
    ' Procedure : Emailfinder
    ' Author    : Jack
    ' Date      : 18/02/2014
    ' Purpose   : This will get valid email addresses from a atring. [Valid == matches the pattern]
    '  Found "best regex pattern for email validation" at
    '  http://blog.trojanhunter.com/2012/09/26/the-best-regex-to-validate-an-email-address
    ' Pattern:  [-0-9a-zA-Z.+_]+@[-0-9a-zA-Z.+_]+\.[a-zA-Z]{2,4}
    '
    ' This routine will return multiple email addresses when there are multiples in the string.
    '
    ''                                ---------------------------------------
    '*** Requires a reference to the  Microsoft VBScript Regular Expressions  library
    '                                 _______________________________________
    '---------------------------------------------------------------------------------------
    '
    Function Emailfinder(T As String) As String
              Dim MyRE As Object
    10        Set MyRE = New Regexp
              Dim MyMatches As MatchCollection
              Dim MyResult As String
              'set the email pattern
    20        On Error GoTo Emailfinder_Error
    30        Emailfinder = ""  'set to empty string
    40        MyRE.Pattern = "[-0-9a-zA-Z.+_]+@[-0-9a-zA-Z.+_]+\.[a-zA-Z]{2,4}"
    50        MyRE.Global = True
    60        MyRE.ignorecase = True
    70        Set MyMatches = MyRE.Execute(T)
    80        If MyMatches.Count > 0 Then
    90            For Each MyMatch In MyMatches
    100               MyResult = MyResult & MyMatch.Value & vbCrLf
    110           Next
    120           Emailfinder = MyResult  ' one or more valid email addresses
    130       Else
    140           Emailfinder = ""    'empty string
    150       End If
    160       On Error GoTo 0
    170       Exit Function
    
    Emailfinder_Error:
    
    180       MsgBox "Error " & Err.number & " (" & Err.Description & ") in procedure Emailfinder of Module Module1"
    End Function
    Here is the test routine I used that includes some of your test data.
    Code:
    Sub sometest()
        Dim s(14) As String 'test data
        Dim i As Integer
       On Error GoTo sometest_Error
    
        s(0) = "h="
        s(1) = " 1="
        s(2) = " 10pt;>.</span></p>=jim.XX@samoa.net.au<p= ==mmgf  hank@gmail.com gerry@Att.net"
        s(3) = "(301)277-="
        s(4) = " <test@oceanfire.com;bill@gmail.com>"
        s(5) = "<test@oceanfire.com>;"
        s(6) = "Test@ promotioncorner.com"
        s(7) = "<test@oceanfire.com>"
        s(8) = "<20140206130446.BB05F2995@diligence.cnchost.com>"
        s(9) = "<test@promotioncorner.com>"
        s(10) = "<test@wadecorporategifts.com>"
        s(11) = "<0Lh8Cl-1VPCF507ZX-00oYM5@mx.perfora.net>"
        s(12) = "<test@vectorpromo.com>"
        s(13) = "-0500"
        s(14) = " Test@ promotioncorner.com"
    
        For i = 0 To UBound(s)
            If Emailfinder(s(i)) <> "" Then
                Debug.Print Emailfinder(s(i)) & "  " & i
            End If
        Next i
    
       On Error GoTo 0
       Exit Sub
    
    sometest_Error:
    
        MsgBox "Error " & Err.number & " (" & Err.Description & ") in procedure sometest of Module Module1"
    End Sub
    Output"
    Code:
    jim.XX@samoa.net.au
    hank@gmail.com
    gerry@Att.net
      2
    test@oceanfire.com
    bill@gmail.com
      4
    test@oceanfire.com
      5
    test@oceanfire.com
      7
    20140206130446.BB05F2995@diligence.cnchost.com
      8
    test@promotioncorner.com
      9
    test@wadecorporategifts.com
      10
    0Lh8Cl-1VPCF507ZX-00oYM5@mx.perfora.net
      11
    test@vectorpromo.com
      12
    Good luck with your project
    Last edited by orange; 02-19-2014 at 07:31 AM. Reason: Forgot to show the output??????

  7. #7
    KO_Baystate is offline Novice
    Windows 8 Access 2010 32bit
    Join Date
    Feb 2014
    Posts
    5
    Thanks Orange!!

    This looks like it will work!!

  8. #8
    orange's Avatar
    orange is offline Moderator
    Windows XP Access 2003
    Join Date
    Sep 2009
    Location
    Ottawa, Ontario, Canada; West Palm Beach FL
    Posts
    16,849
    No problem. Happy to help. Post back if you need something.

  9. #9
    ItsMe's Avatar
    ItsMe is offline Sometimes Helpful
    Windows 7 64bit Access 2010 32bit
    Join Date
    Aug 2013
    Posts
    7,862
    Nice use of the string array Orange

Please reply to this thread with any new information or opinions.

Similar Threads

  1. Extract text from each word document in a file
    By mercapto in forum Programming
    Replies: 11
    Last Post: 03-12-2013, 10:29 PM
  2. Replies: 13
    Last Post: 11-07-2012, 03:14 PM
  3. Replies: 7
    Last Post: 10-31-2012, 04:58 PM
  4. Extract file attachment from email to Access
    By bet in forum Import/Export Data
    Replies: 3
    Last Post: 02-02-2011, 07:35 PM
  5. How to extract names from 1 large field
    By Tepin in forum Access
    Replies: 1
    Last Post: 12-04-2006, 01:14 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums