blog

BeanHub Import - One small step closer to fully automating transaction importing

May 27, 2024
beancount
import
csv
bank
new-feature

We all love plain-text accounting. It’s based on open-source tools, so you never need to worry about vendor lock-in. Beancount is among the most popular options, along with others, such as ledger or hledger. You can visit Plain Text Accounting to learn more.

Unlike plain-text accounting, proprietary software users always worry about whether the company will shut down and the tool they use will no longer be supported. For example, the popular budgeting app Mint was shut down, and millions of its users were forced to migrate to Credit Karma instead.

Importing data from bank or credit card accounts is the most painful, time-consuming, and error-prone part of using plain-text accounting. After I started using Beancount to manage my financial data, I always dreamt of making it fully automated. I envision that every day, I wake up, sip tea, and open my Beancount app, and it would already have all the transactions imported without me doing anything. Software should serve us, not the other way around.

Although we are not there yet, we are one small step closer to that dream today. We are glad to announce that the new BeanHub Import feature is available. Also, we are finally leaving the beta stage.

In addition to the awesome new import feature, to make BeanHub more accessible, we added a private repository for free users with a 1,000-entry limit for that repository! Here’s the updated version of the pricing:

New pricing shows free users now also have one private repository with 1,000 entries limit

BeanHub Import

There are already countless open-sourced Beancount import tools out there. But none of them meet all of our requirements. The problems of existing import tools are:

  • Extract logic coupled with transaction generation, making it unreusable
  • Need to write code in Python or other programming languages
  • Duplication due to unawareness of previous imported transactions
  • Cannot merge data from different file sources

To solve the problems, we build our own import tool. Instead of making one import tool extract transactions and generate Beancount transactions simultaneously, we built extractors solely for extracting CSV files into a standardized structure. We built another tool to import transactions extracted from the extractors based on pre-defined rules. To get an idea of how our import system works, here’s the flow diagram:

Flow diagram of how beanHub-import works

In a nutshell, we are adopting a data pipeline approach for processing and importing transactions from bank account CSV files into Beancount transactions. We also adopted our previous open-source tools, beancount-parser, and beancount-black, for parsing and formatting the Beancount transactions. Therefore, our importing tool is fully aware of existing transactions in your Beancount files and knows where to update, remove, or insert them. We also introduce a new import-id metadata key for each generated transaction to make deduplication possible. With that, we can easily find and match the transactions added in the past and update them accordingly.

While this might sound like a stupid business decision, we always try our best to ensure as little vendor lock-in as possible for users who adopt BeanHub. We hope that even if BeanHub is no longer around one day, people who rely on it won’t be abandoned. That’s why we previously open-sourced our BeanHub Forms feature. Likewise, we also open-sourced BeanHub Inport’s core libraries: beanhub-extract for extracting transactions and beanhub-import for running the rule-based import. These two libraries aren’t meant for end-users to use them directly. You can install our beanhub-cli tool to run the import locally.

BeanHub import rules are defined in a YAML file at .beanhub/imports.yaml. Here’s an example of it:

# the `context` defines global variables to be referenced in the Jinja2 template for
# generating transactions
context:
  routine_expenses:
    "Amazon Web Services":
      account: Expenses:Engineering:Servers:AWS
    Netlify:
      account: Expenses:Engineering:ServiceSubscription
    Mailchimp:
      account: Expenses:Marketing:ServiceSubscription
    Circleci:
      account: Expenses:Engineering:ServiceSubscription
    Adobe:
      account: Expenses:Design:ServiceSubscription
    Digital Ocean:
      account: Expenses:Engineering:ServiceSubscription
    Microsoft:
      account: Expenses:Office:Supplies:SoftwareAsService
      narration: "Microsoft 365 Apps for Business Subscription"
    Mercury IO Cashback:
      account: Expenses:CreditCardCashback
      narration: "Mercury IO Cashback"
    WeWork:
      account: Expenses:Office
      narration: "Virtual mailing address service fee from WeWork"

# the `inputs` defines which files to import, what type of beanhub-extract extractor to use,
# and other configurations, such as `prepend_postings` or default values for generating
# a transaction
inputs:
  - match: "import-data/mercury/*.csv"
    config:
      # use `mercury` extractor for extracting transactions from the input file
      extractor: mercury
      # the default output file to use
      default_file: "books/{{ date.year }}.bean"
      # postings to prepend for all transactions generated from this input file
      prepend_postings:
        - account: Assets:Bank:US:Mercury
          amount:
            number: "{{ amount }}"
            currency: "{{ currency | default('USD', true) }}"

# the `imports` defines the rules to match transactions extracted from the input files and
# how to generate the transaction
imports:
  - name: Routine expenses
    match:
      extractor:
        equals: "mercury"
      desc:
        one_of:
          - Amazon Web Services
          - Netlify
          - Mailchimp
          - Circleci
          - WeWork
          - Adobe
          - Digital Ocean
          - Microsoft
          - Mercury IO Cashback
    actions:
      # generate a transaction into the beancount file
      - file: "books/{{ date.year }}.bean"
        txn:
          narration: "{{ routine_expenses[desc].narration | default(desc, true) | default(bank_desc, true) }}"
          postings:
            - account: "{{ routine_expenses[desc].account }}"
              amount:
                number: "{{ -amount }}"
                currency: "{{ currency | default('USD', true) }}"

  - name: Receive payments from contracting client
    match:
      extractor:
        equals: "mercury"
      desc:
        equals: Evil Corp
    actions:
      - txn:
          narration: "Receive payment from Evil Corp"
          postings:
            - account: "Assets:AccountsReceivable:EvilCorpContracting"
              amount:
                number: "{{ -amount / 300 }}"
                currency: "EVIL.WORK_HOUR"
              price:
                number: "300.0"
                currency: "USD"

  - name: Ignore unused entries
    match:
      extractor:
        equals: "mercury"
      desc:
        one_of:
        - Mercury Credit
        - Mercury Checking xx1234
    actions:
      # ignore action is a special type of import rule action to tell the importer to ignore the
      # transaction so that it won't show up in the "unprocessed" section in the import result
      - type: ignore

To learn more about the import rule and its schema, please visit our beanhub-import GitHub repository.

At this moment, beanhub-extract only supports CSV files from a few banks. We are working hard to expand the supported banks. Since it’s an open-source project, please feel free to open a pull request or file an issue in the GitHub repository if the extractor for your bank is not supported yet.

Upload import file

To import a CSV file with BeanHub, you can use the newly added import upload file feature. You can find the Import button in the left-hand-side menu. You can then click Upload Files to see the uploaded files.

Screenshot of import upload files

Next, click the Upload File button on the upper-right corner to upload a file. The upload file feature will compare the uploaded file with the previous CSV input files in the repository.

For example, you have an input CSV file at import-data/mercury/2024.csv for your Mercury transactions for the whole 2024 year. Now, you export the CSV file for 2024 again, and it contains more transactions. You can simply upload it, and BeanHub will find out which previous CSV file you are trying to update and set the file path for you to override it. You can review the path it found for you before running the import:

Screenshot of review import upload file

Once you click the Import button, the uploaded CSV file will be written to the given path, and beanhub-import will run to generate new transactions or update existing ones for you. And here you go, all the new transactions from the uploaded CSV file were added in a new commit for you like this:

Screenshot of git diff after import

What about those transactions that do not match my import rules? Currently, we only add new transactions generated from import rules, but we will provide a new feature for you to easily input the ones that are not matched by rules.

Run import locally

From time to time, you may want to try out your BeanHub Import rules before you push them to the BeanHub. Or, you prefer to run the import locally and commit the changes yourself, then push it to BeanHub. Either way, you can use the BeanHub Import powered by BeanHub CLI.

You need to install it first. It’s a Python package, so you need to have Python installed. Then you can run this to install it:

pip install beanhub-cli

Once you have it installed, you can then run the following command in your BeanHub repository folder:

bh import

Easy! To try it out yourself, you can clone our beanhub-import-demo repository to experiment.

New repository template

One more thing. We also updated our project template beanhub-beancount-cookiecutter to include all the necessary data and configurations so you can try out BeanHub’s new feature. We also added a README that shows you what you can do with the repository.

Screenshot the README document generated from the new repository template

Since a private repository is now available to free users, we highly recommend you create a new private BeanHub repository with the sample project option and follow README’s introduction to try it out yourself.

More to come

The new import feature makes importing a CSV exported from your bank account easier than ever. However, you still need to export the CSV files and upload them to BeanHub manually. What if I tell you that in the near future, we will provide automatic periodic transactions importing features from almost all the banks? Yes! We are currently working on it and will release it soon. We will raise our plan price accordingly after we release the new feature. Subscribe now while the price is still low.

Finally, as always, please don’t hesitate to contact us at support@beanhub.io if you have any questions or feedback!