Overview
Yes, Mailparser allows you to extract data from PDF email attachments.
You can use this to pull:
- Text content
- Table data
- Specific values from PDF documents
Parsing PDFs works similarly to parsing data from the email body.
How It Works
Mailparser processes the email and its attachments together.
- The PDF is ingested with the email
- You set your parsing rule to use the attachment as the data source
- The PDF content is converted into text, which you can then filter and extract
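To make the "convert to text, then filter" idea concrete, here is a minimal Python sketch. It is not Mailparser's implementation, and you do not need to write code to use the feature. The sample text and the extract_after_label helper are hypothetical and only stand in for what a converted PDF and a simple filter might look like.

```python
import re
from typing import Optional

# Hypothetical text, standing in for the output of a PDF-to-text conversion.
pdf_text = """ACME Corp
Invoice Number: INV-2041
Date: 2024-03-15
Total Due: $1,250.00
"""


def extract_after_label(text: str, label: str) -> Optional[str]:
    """Return the rest of the line that follows `label:`, or None if absent."""
    pattern = rf"^{re.escape(label)}:[ \t]*(.+)$"
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


invoice_number = extract_after_label(pdf_text, "Invoice Number")
total_due = extract_after_label(pdf_text, "Total Due")
```

A label that does not appear in the text returns None, which is roughly what an empty parsing result would look like.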
How to Use It
Step 1: Send an Email with a PDF Attachment
Forward or send an email with a .pdf file to your Mailparser inbox.
Step 2: Create a Parsing Rule
- Open your Mailparser inbox
- Go to the Rules tab
- Click Create New Rule
Step 3: Set Data Source to Attachment
- In your rule, set the data source to Attachment
Step 4: Extract Text from the PDF
- Choose File content (Text)
- Set Filter by Type to PDF
Step 5: Refine and Extract Data
Apply additional filters to isolate the data you need, such as:
- Specific fields
- Numbers or values
- Table rows
You can chain multiple filters just like you would with email content.
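If it helps to picture how chained filters behave, the sketch below models each filter as a small Python function whose output feeds the next one. This is only an illustration under assumed names (keep_lines_containing, text_after, keep_digits_and_dot, and run_chain are all hypothetical) and does not reflect how Mailparser works internally.

```python
import re
from functools import reduce
from typing import Callable, List

Filter = Callable[[str], str]


def keep_lines_containing(keyword: str) -> Filter:
    """Keep only the lines that contain `keyword`."""
    def apply(text: str) -> str:
        return "\n".join(line for line in text.splitlines() if keyword in line)
    return apply


def text_after(marker: str) -> Filter:
    """Keep everything after the first occurrence of `marker`."""
    def apply(text: str) -> str:
        _, found, rest = text.partition(marker)
        return rest if found else ""
    return apply


def keep_digits_and_dot() -> Filter:
    """Drop every character except digits and the decimal point."""
    return lambda text: re.sub(r"[^0-9.]", "", text)


def run_chain(text: str, filters: List[Filter]) -> str:
    """Apply the filters in order, passing each result to the next filter."""
    return reduce(lambda acc, f: f(acc), filters, text)


# Hypothetical extracted text from a PDF invoice.
sample = "Subtotal: $1,100.00\nTax: $150.00\nTotal Due: $1,250.00\n"

total = run_chain(
    sample,
    [keep_lines_containing("Total Due"), text_after(":"), keep_digits_and_dot()],
)
```

Each step narrows the text a little further, which mirrors the advice to start broad and refine step by step.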
Extract Table Data
If your PDF contains structured tables:
- Choose File content (Table Cells)
- Set Filter by Type to PDF
This will return rows and columns that can be further filtered.
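As a rough picture of what "rows and columns that can be further filtered" means, the Python sketch below treats the table as a list of rows, drops empty rows, and maps each row to its column headers. The cell values are made up for illustration, and the actual shape of Mailparser's table output may differ.

```python
# Hypothetical table cells, standing in for a table extracted from a PDF.
rows = [
    ["Item", "Qty", "Unit Price"],
    ["Widget", "4", "12.50"],
    ["Gadget", "1", "99.00"],
    ["", "", ""],  # empty rows are common in extracted tables
]

# Treat the first row as the header and the rest as data.
header, *body = rows

# Skip rows where every cell is blank, then pair each cell with its column name.
records = [
    dict(zip(header, row))
    for row in body
    if any(cell.strip() for cell in row)
]

# Example of a further refinement: compute a line total per item.
line_totals = {
    rec["Item"]: int(rec["Qty"]) * float(rec["Unit Price"]) for rec in records
}
```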
Tips
- Use Filter by Type when an email contains multiple attachments
- Start by extracting all text, then refine step by step
- Ensure your PDFs have a consistent layout for best results
- Encrypted or image-only PDFs may not parse correctly
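To illustrate why filtering by type matters when an email carries several attachments, here is a small sketch using Python's standard email package. It builds a sample message in memory and keeps only the PDF parts. The pdf_attachments helper and the sample message are hypothetical, and this is not how Mailparser selects attachments internally.

```python
from email.message import EmailMessage
from typing import List


def pdf_attachments(msg: EmailMessage) -> List[EmailMessage]:
    """Return the attachments that look like PDFs (by MIME type or file name)."""
    selected = []
    for part in msg.iter_attachments():
        filename = (part.get_filename() or "").lower()
        if part.get_content_type() == "application/pdf" or filename.endswith(".pdf"):
            selected.append(part)
    return selected


# Build a sample email with one PDF and one image attachment.
msg = EmailMessage()
msg["Subject"] = "March invoice"
msg.set_content("See attached.")
msg.add_attachment(
    b"%PDF-1.4 placeholder",
    maintype="application",
    subtype="pdf",
    filename="invoice.pdf",
)
msg.add_attachment(
    b"\x89PNG placeholder",
    maintype="image",
    subtype="png",
    filename="logo.png",
)

pdf_names = [part.get_filename() for part in pdf_attachments(msg)]
```

Without the type check, a logo or signature image would be passed to the same rule as the invoice.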
When to Use This Feature
Use PDF parsing when:
- Important data is only available in attachments
- You receive invoices, reports, or confirmations as PDFs
- You want to automate data extraction from documents
Summary
Mailparser supports parsing data from PDF attachments by converting them into text and applying filters.
By selecting Attachment as the data source and using the appropriate filters, you can extract structured data from PDFs just like you would from an email.