BeanHub Inbox: Automate Your Financial Document Management with AI-Powered Email Processing

April 21, 2025

beancount

automation

llm

new-feature

Great news for all BeanHub users and AI-powered automation enthusiasts! We’re thrilled to announce the beta release of our latest feature, BeanHub Inbox, a game-changing tool that uses AI to streamline the management of your invoices, receipts, and other financial documents. With BeanHub Inbox, you can forward emails to a dedicated address like janedoe+myrepo@inbox.beanhub.io, and our open-source email processing engine will automatically extract key values and archive them for you. Best of all, you can run the extraction locally with our open-source tools, keeping you in full control of your data.

Here’s a demo video of BeanHub’s new open-source tool for extracting values from email files using an LLM:

Why BeanHub Inbox?

Automation has been at the heart of BeanHub’s mission, from BeanHub Connect to Direct Connect. But managing invoices and receipts manually remains a tedious and error-prone task. Consider the risk of an audit—critical financial documents can easily get lost if not proactively organized. Our users have shared amazing experiences using AI to enhance their BeanHub workflows. We’ve seen firsthand how powerful it can be.

BeanHub Inbox is our step toward a fully automated accounting book powered by AI—one that practically runs itself. Yet, we remain committed to our core value of data ownership. That’s why we’ve designed BeanHub Inbox to give you complete control, with an open-source email processing engine you can run locally if you choose.

BeanHub Inbox in Action

BeanHub Inbox is easy to use. First, find the new “Inbox” item in the left-hand menu of your repository. On the email list page, you will see a text box showing the inbox email address for that repository.

For example, if the username is beanhub and the repository name is sample, the inbox email address will be

beanhub+sample@inbox.beanhub.io

All BeanHub Inbox email addresses follow the pattern:

<username>+<repository_name>@inbox.beanhub.io

You can also add tags to the email address separated by the + symbol like this:

beanhub+sample+SaaS+GitHub@inbox.beanhub.io

All emails sent to this address will be tagged with SaaS and GitHub. We will talk more about tags later.
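To make the address pattern concrete, here is a minimal Python sketch that builds and splits an inbox address. The `InboxAddress` class and `parse_inbox_address` function are hypothetical helpers written for illustration, not part of beanhub-inbox or BeanHub CLI, and the sketch assumes that usernames, repository names, and tags never contain a + character themselves.

```python
from dataclasses import dataclass

INBOX_DOMAIN = "inbox.beanhub.io"


@dataclass(frozen=True)
class InboxAddress:
    """Hypothetical container for the parts of a BeanHub Inbox address."""

    username: str
    repository: str
    tags: tuple = ()

    def __str__(self) -> str:
        # <username>+<repository_name>[+<tag>...]@inbox.beanhub.io
        local_part = "+".join((self.username, self.repository, *self.tags))
        return f"{local_part}@{INBOX_DOMAIN}"


def parse_inbox_address(address: str) -> InboxAddress:
    """Split an inbox address into username, repository name, and tags."""
    local_part, _, domain = address.partition("@")
    if domain.lower() != INBOX_DOMAIN:
        raise ValueError(f"not a BeanHub Inbox address: {address!r}")
    parts = local_part.split("+")
    # Assumption: none of the parts may be empty or contain "+".
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"expected <username>+<repository_name>: {address!r}")
    username, repository, *tags = parts
    return InboxAddress(username, repository, tuple(tags))
```

For example, `parse_inbox_address("beanhub+sample+SaaS+GitHub@inbox.beanhub.io")` yields the username beanhub, the repository sample, and the tags SaaS and GitHub.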

Now, let’s say you have received an email receipt from CircleCI. You can forward it directly to this email address.

Proton Mail email draft UI with beanhub+sample+CircleCI@inbox.beanhub.io as the To column

After a short while, usually less than a minute, your email will appear in the email list. You can view it there or delete it if you want. Besides showing the email on the dashboard, we also make a Git commit for you automatically, with the email written as a .eml file to inbox-data/default/{{ id }}.eml by default.

Diff shows new .eml file added by BeanHub

You can change where received emails are archived by editing the .beanhub/inbox.yaml configuration file. If no such config file is provided, a default one like this will be used:

inbox:
# rules for archiving received emails to "inbox-data/default/{{ id }}.eml"
- action:
    output_file: inbox-data/default/{{ id }}.eml

inputs:
# read email files for the import pipeline
- match: inbox-data/default/*.eml

imports:
# extract values into CSV file at "import-data/inbox/default.csv"
- actions:
  - extract:
      output_csv: import-data/inbox/default.csv

You can add your own rules to archive files to different locations based on matching conditions. For example, to archive receipts from GitHub to inbox-data/github/{{ id }}.eml and receipts from CircleCI to inbox-data/circleci/{{ id }}.eml, you can write the inbox rules like this:

inbox:
- match:
    # match GitHub tag
    tags: ["GitHub"]
  action:
    # output to github folder
    output_file: inbox-data/github/{{ id }}.eml
- match:
    # match CircleCI tag
    tags: ["CircleCI"]
  action:
    # output to circle folder
    output_file: inbox-data/circleci/{{ id }}.eml
- action:
    # write all others to the default folder
    output_file: inbox-data/default/{{ id }}.eml

# ... input & import rules
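As a rough mental model of how rules like these pick an archive path, here is a Python sketch. It is not the beanhub-inbox implementation. It assumes rules are tried top to bottom with the first match winning, that a rule without a match section always applies, that a tags match requires every listed tag to be present (compared case-sensitively), and that {{ id }} is the only template variable. The `resolve_output_file` function and the `RULES` list are hypothetical names for illustration.

```python
# The inbox rules from the YAML above, written as plain Python data
# so the sketch does not need a YAML parser.
RULES = [
    {
        "match": {"tags": ["GitHub"]},
        "action": {"output_file": "inbox-data/github/{{ id }}.eml"},
    },
    {
        "match": {"tags": ["CircleCI"]},
        "action": {"output_file": "inbox-data/circleci/{{ id }}.eml"},
    },
    {"action": {"output_file": "inbox-data/default/{{ id }}.eml"}},
]


def resolve_output_file(rules, email_id, email_tags):
    """Return the archive path of the first matching rule, or None."""
    for rule in rules:
        wanted_tags = rule.get("match", {}).get("tags", [])
        # Assumption: every tag listed in the rule must be on the email;
        # an empty list (no match section) therefore always matches.
        if set(wanted_tags) <= set(email_tags):
            template = rule["action"]["output_file"]
            # Assumption: simple substitution stands in for real templating.
            return template.replace("{{ id }}", email_id)
    return None
```

Under these assumptions, an email tagged CircleCI resolves to the circleci folder, while an email with no tags falls through to the default folder.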

We open-sourced the engine for processing emails as beanhub-inbox, and we use the same library in BeanHub CLI and BeanHub server. The same config file affects not only how the BeanHub server archives your email files but also how your local commands archive files with BeanHub CLI. To dump email files from BeanHub to your local file system, you need to ensure that you have BeanHub CLI installed. You can do so by running

pip install beanhub-cli==3.0.0b2

As you may have noticed, this is a beta release, so we pinned the version by adding ==3.0.0b2 at the end. The official release will come later once it’s ready. Or, if you prefer uv (we highly recommend it), you can also run

uv pip install beanhub-cli==3.0.0b2

Then, you can run the following to log into your BeanHub account

bh login

Next, run the following to dump emails to your local file system

bh inbox dump

The command is smart enough to use your local .beanhub/inbox.yaml rules to find the output file paths, check those paths locally, and pull only the missing ones from the server.

$ bh inbox dump
[04:38:49] INFO     The inbox doc at .beanhub/inbox.yaml does not exist, use default config                  main.py:161
           INFO     The email 01JS566CJFQZFH04CREW4BW5HP archive output path                                 main.py:134
                    inbox-data/default/01JS566CJFQZFH04CREW4BW5HP.eml already exists, skip
           INFO     The email 01JS56P9K55DRW7NE5T2E3NWRE archive output path                                 main.py:134
                    inbox-data/default/01JS56P9K55DRW7NE5T2E3NWRE.eml already exists, skip
           INFO     The email 01JS56SBWSAR3VWT1GRHK5A2B7 archive output path                                 main.py:134
                    inbox-data/default/01JS56SBWSAR3VWT1GRHK5A2B7.eml already exists, skip
           INFO     The email 01JS56X0S64VP5GDK54D1WYMNC archive output path                                 main.py:134
                    inbox-data/default/01JS56X0S64VP5GDK54D1WYMNC.eml already exists, skip
           INFO     The email 01JS574AX1GC0H8SZVC4J9WAVB archive output path                                 main.py:134
                    inbox-data/default/01JS574AX1GC0H8SZVC4J9WAVB.eml already exists, skip
[04:38:50] INFO     Created dump e1f33938-dfa8-4bcd-8c2c-8b217da29c2f with public_key                        main.py:427
                    NsTSCQyAEQRzNI0KTVp8GUz7z6DBtgnkEY4zRaWrKG8=, email_count=2,
                    workdir=/home/fangpen/workspace/beanhub/sample, waiting for updates ...
[04:38:54] INFO     Decrypting downloaded file ...                                                           main.py:469
           INFO     Writing email 01JS570HGPGR465WPDENS1BGMR to                                            file_io.py:35
                    inbox-data/default/01JS570HGPGR465WPDENS1BGMR.eml
           INFO     Writing email 01JS573NTQPHPCSFYDXJHJ6AWJ to                                            file_io.py:35
                    inbox-data/default/01JS573NTQPHPCSFYDXJHJ6AWJ.eml
           INFO     done                                                                                     main.py:490

And then you can see your email files in the local folder:

$ tree inbox-data/default/
inbox-data/default
├── 01JS566CJFQZFH04CREW4BW5HP.eml
├── 01JS56P9K55DRW7NE5T2E3NWRE.eml
├── 01JS56SBWSAR3VWT1GRHK5A2B7.eml
├── 01JS56X0S64VP5GDK54D1WYMNC.eml
├── 01JS570HGPGR465WPDENS1BGMR.eml
├── 01JS573NTQPHPCSFYDXJHJ6AWJ.eml
└── 01JS574AX1GC0H8SZVC4J9WAVB.eml

1 directory, 7 files
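The skip-or-download decision visible in the log output can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual BeanHub CLI code: `find_missing_emails` is a hypothetical helper, and it assumes a single output template for every email, whereas the real command resolves paths through the rules in .beanhub/inbox.yaml.

```python
from pathlib import Path

DEFAULT_TEMPLATE = "inbox-data/default/{{ id }}.eml"


def find_missing_emails(email_ids, workdir, template=DEFAULT_TEMPLATE):
    """Return the IDs whose archive file does not exist under workdir yet."""
    missing = []
    for email_id in email_ids:
        # Assumption: plain substitution of the {{ id }} placeholder.
        path = Path(workdir) / template.replace("{{ id }}", email_id)
        if not path.exists():
            # Only these emails would need to be pulled from the server.
            missing.append(email_id)
    return missing
```

Emails whose files already exist are skipped, which is why rerunning the dump only fetches what is new.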

The beauty of our system is that it emphasizes data ownership for users. We keep all data sources in their original format and process them from there. That way, you can always process them again with your preferred tool if you are not happy with the result.

Now, with BeanHub CLI’s new AI-powered bh inbox extract feature, we can process those email files and output the results into a CSV file. The extract feature relies on an Ollama server, typically running locally, so make sure the Ollama API server is accessible from your machine. You can download the Ollama application here. If it’s not running on localhost or the standard port, remember to set the OLLAMA_HOST environment variable to point to the right host, like this:

export OLLAMA_HOST=192.168.123.25:11434
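If you want to see which address a given OLLAMA_HOST value points at, the following Python sketch turns it into a base URL. It is a simplified illustration rather than the logic Ollama or BeanHub CLI actually uses: `ollama_base_url` is a hypothetical helper, it falls back to Ollama’s standard 127.0.0.1:11434 when the variable is unset, and it does not handle IPv6 addresses.

```python
import os
from urllib.parse import urlsplit

DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 11434  # Ollama's standard port


def ollama_base_url(environ=None):
    """Build a base URL from OLLAMA_HOST, falling back to the defaults."""
    environ = os.environ if environ is None else environ
    value = environ.get("OLLAMA_HOST", "").strip()
    if not value:
        return f"http://{DEFAULT_HOST}:{DEFAULT_PORT}"
    # Accept both "host:port" and "scheme://host:port" forms.
    parts = urlsplit(value if "://" in value else f"http://{value}")
    host = parts.hostname or DEFAULT_HOST
    port = parts.port or DEFAULT_PORT
    return f"{parts.scheme}://{host}:{port}"
```

With the export shown above, this resolves to http://192.168.123.25:11434.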

Please note that different LLMs have different hardware requirements, and some require a tremendous amount of memory to run. By default, we use the Phi-4 model. Most modern computers should be able to run it; the main difference will be speed.

As mentioned previously, a default inbox config will be used if you don’t provide one. The default input rules and import rules look like this:

inputs:
# read email files for the import pipeline
- match: inbox-data/default/*.eml

imports:
# extract values from email files defined in input into CSV file at "import-data/inbox/default.csv"
- actions:
  - extract:
      output_csv: import-data/inbox/default.csv

The input rules determine which email files to process, and the import rules match the email files selected by the input rules and perform actions, such as extraction, on them. With the default inbox config and Ollama running, you can run the following to see the magic of AI extracting values from email files for you.

bh inbox extract

Since running an LLM is computationally intensive, the extract command checks the output CSV file and only runs the LLM for an email if no row with the same ID (currently the filename of the .eml file) exists. So, if you rerun it, it will skip the emails you processed previously. Of course, you can delete the CSV file, or a particular row in it, to make the LLM run again. To use a model other than Phi-4, pass it with --model, or -m for short. We have tried a few smaller models that work well, like DeepCoder and QwQ. Here’s an example of running an extract with DeepCoder:

bh inbox extract -m deepcoder

Please check Ollama’s model list to learn more about the available models. We recommend smaller ones: a parameter count of around 14B to 32B is good enough for our simple value extraction purpose.
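The check that lets the extract command skip emails it has already processed can be sketched as follows. This is an illustration under stated assumptions, not the BeanHub CLI source: `pending_emails` is a hypothetical helper, the ID column is assumed to be named id, and the ID is assumed to be the file name including the .eml extension.

```python
import csv
from pathlib import Path


def pending_emails(email_files, output_csv, id_column="id"):
    """Return the email files that have no row in the output CSV yet."""
    done = set()
    csv_path = Path(output_csv)
    if csv_path.exists():
        with csv_path.open(newline="") as csv_file:
            # Assumption: each row carries the email's ID in `id_column`.
            done = {row.get(id_column) for row in csv.DictReader(csv_file)}
    # Assumption: the ID is the file name of the .eml file.
    return [path for path in email_files if Path(path).name not in done]
```

Deleting a row from the CSV file removes its ID from the `done` set, so that email becomes pending again on the next run.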

Currently, the extract command has a fixed list of columns and the corresponding prompts. Here are the definitions:

- description: True if this email is for a transaction such as an invoice or receipt, otherwise False
  name: valid
  required: true
  type: bool
- description: The summary of the transaction in a short sentence
  name: desc
  required: false
  type: str
- description: Name of the merchant who sent the email if available
  name: merchant
  required: false
  type: str
- description: Transaction amount as a decimal string value, do not include dollar sign and please follow the regex format
  name: amount
  required: false
  type: decimal
- description: Tax amount as a decimal string value, do not include dollar sign and please follow the regex format
  name: tax
  required: false
  type: decimal
- description: Id of transaction, such as invoice number or receipt number
  name: txn_id
  required: false
  type: str
- description: The date of transaction if available, in YYYY-MM-DD format
  name: txn_date
  required: false
  type: date

We will open this list up for customization in the future. Please note that extraction is currently only supported locally through BeanHub CLI. Server-side processing upon receiving a new email will come shortly.
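If you post-process the output CSV yourself, the column definitions above map naturally onto Python types. The sketch below shows one way to convert a row of strings. It is hypothetical code for illustration: `parse_row` is not part of BeanHub CLI, and it assumes that booleans are written as True or False, that missing optional values are empty strings, and that dates and decimals use the formats named in the definitions.

```python
from datetime import date
from decimal import Decimal, InvalidOperation

# (name, type, required), mirroring the column definitions above.
COLUMNS = [
    ("valid", "bool", True),
    ("desc", "str", False),
    ("merchant", "str", False),
    ("amount", "decimal", False),
    ("tax", "decimal", False),
    ("txn_id", "str", False),
    ("txn_date", "date", False),
]


def parse_bool(value: str) -> bool:
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise ValueError(f"not a boolean: {value!r}")


PARSERS = {
    "bool": parse_bool,
    "str": str,
    "decimal": Decimal,
    "date": date.fromisoformat,  # expects YYYY-MM-DD
}


def parse_row(raw: dict) -> dict:
    """Convert one CSV row of strings into typed Python values."""
    parsed = {}
    for name, type_name, required in COLUMNS:
        value = raw.get(name) or ""
        if value == "":
            if required:
                raise ValueError(f"missing required column: {name}")
            parsed[name] = None
            continue
        try:
            parsed[name] = PARSERS[type_name](value)
        except (ValueError, InvalidOperation) as exc:
            raise ValueError(f"bad {type_name} in {name}: {value!r}") from exc
    return parsed
```

Using `Decimal` rather than `float` for amount and tax keeps the values exact, which matters once they end up in an accounting book.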

Organize Your Emails with Tags And Forwarding Rules

As we mentioned, BeanHub Inbox makes email organization effortless with tags. For example, modify your email address to janedoe+myrepo+CircleCI@inbox.beanhub.io to tag an email with CircleCI. You can add multiple tags to categorize emails from different vendors. We recommend setting up automatic forwarding rules in your email provider to send invoices and receipts to tagged addresses, making them easier to manage later. Here’s the example with Proton Mail:

Screenshot of Proton Mail for creating forwarding rule to BeanHub Inbox

More AI Superpower Is Coming Soon!

This is just the beginning of the AI superpowers coming to BeanHub. Here’s what’s on the horizon:

Attachment Archiving: We’re working on rule-based or prompt-based archiving of email attachments to specific paths.
OCR and Attachment Extraction: Soon, we’ll support extracting values from PDF attachments, including OCR for image-based documents.
MCP Service: We’re planning an open-source MCP (Model Context Protocol) service for BeanHub and Beancount to make it possible to ask AI agents questions about your book or even let them improve it.

Our development pace is relentless. Inbox had been on our product roadmap for a long time; a user emailed us a few weeks ago asking for it, and we’ve already shipped the first iteration. We can’t wait for your feedback after trying it out. Email us at support@beanhub.io to help shape BeanHub’s future.

Upcoming Price Change

To reflect the growing value of BeanHub’s features, we’re adjusting our pricing. Starting May 15, 2025, the Pro plan will increase from $15/month (or $11/month billed annually) to $20/month (or $15/month billed annually). Current subscribers will retain their existing rates and enjoy BeanHub Inbox and future features at no extra cost. We provide a one-month risk-free trial. Sign up now, and don’t miss the chance to lock in the current price before the change!

We’re Also Considering Raising Funds to Accelerate

BeanHub has been bootstrapped so far, but we see immense potential in using AI to revolutionize personal and business financial management. We’re exploring funding options but will only partner with investors who share our vision of an open, AI-powered financial platform. We’re also considering raising funds directly from our users. If you’re interested or have ideas, reach out to us at fangpen@launchplatform.com.

Frequently Asked Questions

Q: Do I need a Git repository to use BeanHub Inbox and bh inbox dump?
A: No, a Git repository is optional. You can create a Connect-only repository and use the dedicated email address <your-username>+<repo-name>@inbox.beanhub.io to pull emails locally with bh inbox dump.

Q: I don’t trust BeanHub with my financial data but want a dedicated email address for forwarding. Is that possible?
A: Yes! We’re exploring options to encrypt incoming emails with a user-provided GPG or age public key, so only you can decrypt them locally with your private key. Let us know your interest at support@beanhub.io.

Q: What about emails with end-to-end GPG encryption?
A: We can’t process GPG-encrypted email content server-side, but we’re considering updating beanhub-inbox to decrypt emails locally with your private key during extraction. Open a GitHub issue at beanhub-inbox or email support@beanhub.io to request this feature.

Q: Do I need a BeanHub account to use the AI-powered extraction?
A: No. Our beanhub-inbox and beanhub-cli are open-source. Only features such as the dedicated email address and the dump command need an account. As long as you have .eml files in a local folder and Ollama running locally, extraction should work regardless of whether you have a BeanHub account.

Q: Is there a quota limit for emails or AI computation?
A: We currently have an undisclosed internal limit to prevent abuse. We’ll establish clear quotas once we gather more usage data, ensuring they’re generous for typical use cases. Stay tuned for updates.

Q: What if someone sends unsolicited emails to my BeanHub Inbox address?
A: You can disable email reception in the repository settings page to reject all incoming emails. We’re also considering a whitelist feature to restrict senders. Share your thoughts at support@beanhub.io.

Q: Can emails be sorted by their original sent date instead of the forwarded timestamp?
A: Yes, we’re working on this. If you’d like to see it prioritized, email support@beanhub.io.

Q: Can I extract and archive attachments to a specific path?
A: Not yet, but we’re developing this feature for beanhub-inbox. Let us know your preferences at support@beanhub.io.

Q: Can I extract values from attachments or perform OCR on image-based PDFs?
A: This is in development. We’ll soon support extracting values from PDF attachments and OCR for image-based documents. Stay tuned!

Q: Can I choose my preferred AI model?
A: Locally, you can use any Ollama-compatible model. Server-side extraction (coming soon) will likely use Phi-4 for its efficiency and open-source nature. We’re open to integrating with OpenAI, Anthropic, or Grok APIs if users demand it—email support@beanhub.io to vote.

Q: Can I delete my email data from BeanHub?
A: Yes, delete emails directly from the dashboard. Some cached files may remain on our servers for a few days but will be removed afterward.

Q: Can I stop BeanHub from committing emails to Git?
A: Not yet, but we’re planning to add a flag in the inbox config or dashboard to control this. Let us know if you need it at support@beanhub.io.

Q: Can I write custom extraction prompts for specific values?
A: Not yet, but we’re planning to support custom extraction rules in the inbox config file.

Q: Can I integrate BeanHub Inbox with beanhub-import for Beancount transactions?
A: Absolutely! You can create an extractor in beanhub-extract to generate Beancount transactions from CSV outputs. We plan to streamline this integration in the future.

Q: Do I need to use Beancount or plaintext accounting to benefit from BeanHub Inbox?
A: No, we designed and built the BeanHub Inbox feature to run independently without any involvement from Beancount or plaintext accounting books. You can sign up for BeanHub solely to archive your invoices and receipts, which is perfectly fine. In the future, we will introduce additional options in BeanHub Inbox, allowing you to customize beyond just processing invoices and receipts. Ideally, you should be able to process any email using the same open-source system.

Q: Will you build a specialized LLM for email extraction and Beancount processing?
A: We’re considering post-training our own models and potentially open-sourcing them, budget permitting. Email data and Beancount books are ideal for synthetic training, and we’d love to explore this with sufficient funding.

Fang-Pen Lin
