
Import PDF files into Xero, with line-item detail

Xero is good at attaching PDFs to records. It is not good at reading them.

Hubdoc, Xero's built-in capture tool, pulls header-level data from PDFs: vendor, total, date, invoice number. It doesn't extract line items, and Xero has publicly said line-item extraction isn't on the roadmap.

EntryRocket reads the full content of your PDFs and imports the data into Xero as structured records: invoices, bills, or bank statement lines, with every line item intact. Not OCR. Not "best guess". The exact data, every time.

100% accuracy on data extraction. Not OCR, not "best guess".

Full detail: line items, not just headers.

<1 min per file, every time after that.

Why 100% accuracy on PDFs is even possible

No OCR. No guessing. The exact data.

PDF extraction has a reputation for being unreliable, because most PDF tools on the market rely on OCR. OCR guesses at characters from a visual layer, which means misread digits, transposed amounts, and confidence scores instead of certainty. That's why those tools build a human review step into every document.

EntryRocket doesn't use OCR. Your reader (the parser we build for your specific PDF layout) works on computer-generated PDFs and parses their underlying text and structure directly, reading the data as the software that produced the PDF stored it. There's no guessing involved, so there's no review step needed. The output matches the input, every time.
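To make the contrast with OCR concrete, here is a minimal sketch of what parsing a known layout can look like once the text layer has been read. It is an illustration under stated assumptions, not EntryRocket's actual reader: the column layout, the regular expressions, and the names `parse_invoice_text` and `reconciles` are all hypothetical. The point is that amounts are read as exact decimal strings, so the parsed lines can be checked against the document's own total.

```python
import re
from decimal import Decimal

# Hypothetical fixed layout: "<description>  <qty>  <unit price>  <line total>"
LINE_RE = re.compile(
    r"^(?P<desc>.+?)\s{2,}(?P<qty>\d+)\s+(?P<unit>\d+\.\d{2})\s+(?P<total>\d+\.\d{2})$"
)
TOTAL_RE = re.compile(r"^Total\s+(?P<total>\d+\.\d{2})$")


def parse_invoice_text(text):
    """Return (line_items, stated_total) from text already read out of a PDF's text layer."""
    items, stated_total = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        total_match = TOTAL_RE.match(line)
        if total_match:
            stated_total = Decimal(total_match["total"])
            continue
        m = LINE_RE.match(line)
        if m:  # header rows and anything off-layout are skipped
            items.append({
                "description": m["desc"],
                "quantity": int(m["qty"]),
                "unit_amount": Decimal(m["unit"]),
                "line_total": Decimal(m["total"]),
            })
    return items, stated_total


def reconciles(items, stated_total):
    """True only if each line multiplies out and the lines sum to the stated total."""
    lines_ok = all(i["quantity"] * i["unit_amount"] == i["line_total"] for i in items)
    return lines_ok and sum(i["line_total"] for i in items) == stated_total


# Made-up sample text, standing in for the text layer of a supplier invoice.
SAMPLE = """\
Description        Qty   Unit    Total
Widget A           2     10.00   20.00
Hosting (monthly)  1     49.50   49.50
Total                            69.50
"""

items, stated_total = parse_invoice_text(SAMPLE)
print(len(items), stated_total, reconciles(items, stated_total))
```

Because the characters come straight from the file rather than from an image, a check like `reconciles` either passes exactly or flags a layout the parser does not understand; there is no confidence score in between.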

That's the difference between "we extracted roughly what the document says" and "we extracted exactly what the document says".

Where native Xero tools stop

Xero's PDF story has three layers:

1. Attach a PDF to a record

Useful for audit trail. Doesn't extract any data.

2. Hubdoc data capture

Extracts vendor, total, date, and invoice number. No line items. No detail on complex bank statements or payment processor reports.

3. Third-party line-item extractors (Dext, AutoEntry, Datamolino)

These do extract line items, typically priced per document, with manual review built into the workflow.

Each of those has a place. EntryRocket fits a different one: recurring PDFs with known structure, where you want the full content imported without a per-document review step.

How EntryRocket works

What EntryRocket does with PDFs

Your reader is built around the exact PDF layout you receive. It reads every line, maps each one to the right Xero account, and creates the corresponding records. Because the reader knows the structure of your specific PDF, it doesn't need human review for each document.
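As a rough illustration of the "map each line to the right Xero account" step, the sketch below turns parsed line items into a bill-shaped payload. Everything specific here is an assumption made for the example: the keyword rules, the account codes, the supplier name, and the helper names `account_for` and `build_bill` are hypothetical, and the field names only follow the general shape of Xero's invoice records, so check them against Xero's API documentation before relying on them.

```python
from decimal import Decimal

# Hypothetical keyword -> account code rules. Real codes come from your own chart of accounts.
ACCOUNT_RULES = [
    ("hosting", "463"),
    ("freight", "425"),
]
DEFAULT_ACCOUNT = "429"  # hypothetical fallback account


def account_for(description):
    """Pick an account code from the first rule whose keyword appears in the description."""
    text = description.lower()
    for keyword, code in ACCOUNT_RULES:
        if keyword in text:
            return code
    return DEFAULT_ACCOUNT


def build_bill(contact_name, invoice_number, date, items):
    """Assemble a bill-shaped dict (assumed field names) from parsed line items."""
    return {
        "Type": "ACCPAY",  # a bill, i.e. an accounts-payable invoice
        "Contact": {"Name": contact_name},
        "InvoiceNumber": invoice_number,
        "Date": date,
        "LineItems": [
            {
                "Description": item["description"],
                "Quantity": item["quantity"],
                "UnitAmount": str(item["unit_amount"]),  # keep exact decimal text
                "AccountCode": account_for(item["description"]),
            }
            for item in items
        ],
    }


# Made-up line items, as a reader might produce them from a supplier PDF.
parsed_items = [
    {"description": "Hosting (monthly)", "quantity": 1, "unit_amount": Decimal("49.50")},
    {"description": "Widget A", "quantity": 2, "unit_amount": Decimal("10.00")},
]

bill = build_bill("Example Supplier Ltd", "INV-001", "2024-03-31", parsed_items)
print(bill["LineItems"])
```

A rule table like this works only because the PDF layout and its vocabulary are known in advance, which is the same reason a per-document review step is not needed.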

How it works

Three steps, then it's automatic

1. Send us a sample PDF

Send us a sample PDF you're currently processing by hand.

2. We build your reader

We build your reader in 2-3 business days, trained on your file's specific layout.

3. You email PDFs in

From then on, you email PDFs in. Xero has the structured data within a minute.
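For readers curious what the email step involves mechanically, here is a small sketch of pulling PDF attachments out of an incoming message using Python's standard `email` package. It is an assumption-laden illustration, not a description of EntryRocket's intake service: the function name `pdf_attachments` and the demo message are invented for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def pdf_attachments(raw_email):
    """Return (filename, bytes) for every PDF attached to a raw email message."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    found = []
    for part in msg.iter_attachments():
        if part.get_content_type() == "application/pdf":
            found.append((part.get_filename(), part.get_payload(decode=True)))
    return found


# Build a demo message with one PDF-typed attachment (placeholder bytes, not a real PDF).
demo = EmailMessage()
demo["Subject"] = "March statement"
demo.set_content("Statement attached.")
demo.add_attachment(
    b"%PDF-1.7 demo bytes",
    maintype="application",
    subtype="pdf",
    filename="statement.pdf",
)

attachments = pdf_attachments(demo.as_bytes())
print([name for name, _ in attachments])
```

Each attachment found this way would then be handed to the reader built for that layout.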

What gets created

Depending on the PDF and your workflow, your reader can create:

  • Bank statement lines from PDF bank statements
  • Bills from supplier PDF invoices
  • Invoices from sales PDFs
  • Batch deposits from payment processor PDFs
  • Contacts for new vendors or customers

Common use cases

  • Bank statements from banks without a Xero direct feed
  • Credit card statements with line-level detail
  • Payment processor payout statements from PayPal, Stripe, Square, or regional processors
  • Supplier invoices that arrive as PDFs and need line items in Xero
  • Platform statements from marketplaces and booking platforms

For payment processors specifically, see also PayPal to Xero and Stripe to Xero.

FAQ

Frequently asked questions

About importing PDF files into Xero with EntryRocket.

Does EntryRocket do OCR?

No, and that's the point. EntryRocket reads the text layer of computer-generated PDFs directly, which is why extraction is 100% accurate. OCR tools (Hubdoc, Dext, AutoEntry, and others) guess at characters from a visual layer and need manual review to catch their mistakes. EntryRocket doesn't need that review step.

What counts as a computer-generated PDF?

A PDF produced by software: bank statements, payment processor reports, e-commerce exports, ERP outputs, accounting exports. If you can select and copy text from the PDF, it's computer-generated and EntryRocket can read it. If your files are scans of paper receipts or photographs, OCR tools like Hubdoc, Dext, or AutoEntry are the right fit.

Isn't Hubdoc free with Xero? Why would I need this?

Hubdoc is excellent for receipt capture with header data: vendor, total, date, invoice number. Xero has confirmed line-item extraction is not on the Hubdoc roadmap. If you need line-item detail from PDF statements or invoices, EntryRocket fills that gap.

What PDF layouts do you support?

Your reader is built for your specific layout. That's why setup is 2-3 business days instead of instant: the reader is tailored to the file you actually receive, which is also why accuracy is deterministic rather than probabilistic.

What if the PDF layout changes?

Small changes are usually handled automatically. Major layout changes require a reader update, which we do as part of ongoing support.

Stop rekeying PDFs into Xero

Send us a sample PDF and we'll tell you whether a reader is a fit, and what it would do with your file.