How Shippo ensures accurate label costs with carrier invoice reconciliation
At Shippo, we strive to capture label costs accurately for our customers. A label’s cost is determined not only at purchase time when a customer gets a quote for their label, but also at invoice time after the label is scanned and the package is delivered to its final destination.
What is carrier invoice reconciliation?
Carrier invoice reconciliation is a financial process that closes the gap between the quoted cost and the final cost of the delivered label. The final cost of a label can differ from the quoted cost for several reasons, such as discrepancies in the weight and dimensions of a package, service level changes, zone changes, carrier-imposed fees (e.g., Saturday delivery, residential delivery), and customs and taxes.
The table below shows a sample label with a quoted cost of $29.30 and a final cost of $34.09. For this label, the $4.79 difference comes from two factors: an increased fuel surcharge from the carrier and a package pickup charge of $4.00:
Reconciliation system as a black box
The reconciliation system takes two inputs and generates two outputs. The inputs are the label information from the carrier quote and the carrier invoice for a given label. The first output is a billing adjustment: if the quoted cost and the invoiced cost differ, the reconciliation system bills the customer for that difference. The second output is a report containing salient information on why there was a difference. These data points help shippers understand how they can optimize their shipping costs. Equally important, CFO organizations use the data to cross-check accounting balances and statements.
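To make the black-box view concrete, here is a minimal sketch of the two inputs and two outputs for a single label. It is an illustration, not Shippo's implementation: the `reconcile` function, the `ReconciliationResult` type, and the charge names are hypothetical, and the per-charge breakdown is assumed so that the totals match the sample label above ($29.30 quoted, $34.09 invoiced).

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ReconciliationResult:
    adjustment: Decimal          # output 1: amount to bill the customer
    reasons: dict[str, Decimal]  # output 2: per-charge differences for the report


def reconcile(quote: dict[str, Decimal], invoice: dict[str, Decimal]) -> ReconciliationResult:
    """Compare quoted and invoiced charges for one label."""
    zero = Decimal("0")
    reasons = {}
    for charge_type in sorted(set(quote) | set(invoice)):
        diff = invoice.get(charge_type, zero) - quote.get(charge_type, zero)
        if diff != 0:
            reasons[charge_type] = diff
    return ReconciliationResult(adjustment=sum(reasons.values(), zero), reasons=reasons)


# Hypothetical breakdown; only the totals and the $4.00 pickup charge come from the text.
quote = {"base": Decimal("25.00"), "fuel_surcharge": Decimal("4.30")}
invoice = {
    "base": Decimal("25.00"),
    "fuel_surcharge": Decimal("5.09"),
    "pickup": Decimal("4.00"),
}

result = reconcile(quote, invoice)
print(result.adjustment)  # 4.79
print(result.reasons)
```

Using `Decimal` rather than floats keeps the cent-level arithmetic exact, which matters when the output feeds billing.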
Quote processing
Reconciliation begins at label purchase. When a customer purchases a label through Shippo, it triggers a label-created event that contains not just the carrier-generated label cost and surcharges but also customer-reported package weight and dimensions, origin and destination, and selected service level. This extra information about the package is necessary to understand why a label cost changed.
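As a rough illustration of what such an event might carry, the sketch below models a label-created event as a frozen dataclass and serializes it to JSON. Every field name, unit, and value here is an assumption made for the example; the actual event schema is not described in this post.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LabelCreatedEvent:
    # All field names and units are hypothetical.
    tracking_number: str
    quoted_cost_cents: int   # carrier-generated label cost
    surcharges_cents: dict   # carrier-generated surcharges, by name
    weight_oz: float         # customer-reported weight
    dimensions_in: tuple     # customer-reported (length, width, height)
    origin_zip: str
    destination_zip: str
    service_level: str


event = LabelCreatedEvent(
    tracking_number="TRACK123",
    quoted_cost_cents=2930,
    surcharges_cents={"fuel_surcharge": 430},
    weight_oz=48.0,
    dimensions_in=(12, 8, 6),
    origin_zip="94107",
    destination_zip="10001",
    service_level="ground",
)

# Serialize for publishing; a consumer would parse it back into a dict.
payload = json.dumps(asdict(event))
decoded = json.loads(payload)
print(decoded["quoted_cost_cents"], decoded["service_level"])
```

Storing the customer-reported weight and dimensions next to the quoted cost is what later lets a cost difference be attributed to, say, a weight discrepancy rather than a new fee.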
Invoice processing
Invoice processing is the highlight of a reconciliation system. We divided invoice processing into eight steps:
This breakdown was born out of necessity: reconciliation is a complicated process, and splitting it into steps helps us manage that complexity. Once the steps were defined and the interfaces between steps codified, it was simpler to write clear, coherent code for each step. Unit testing for each step was much easier and more thorough as well.
The tech stack that ties all these steps together is Polars embedded in a FastAPI microservice backed by a PostgreSQL database cluster. Events such as label-created events and billing events are published and consumed via hosted Apache Kafka. Polars (https://pola.rs/) is an open-source dataframe manipulation library. In our benchmark test of reconciliation, Polars used 67% less CPU and 88% less peak memory than PySpark. It also helped that carrier invoice files are small enough to fit in the memory of a single machine, removing the need for a highly scalable distributed system like Apache Spark.
How the system satisfies important requirements
A complex system naturally comes with various requirements from different stakeholders, but if we were to list the three most important requirements for our reconciliation system, they would be accuracy of costs, idempotent billing, and charge traceability for financial audits.
Let’s dive into how we’re meeting these three requirements with our system.
Accuracy of costs
To guarantee the accuracy of label costs, our reconciliation system incorporates two well-known patterns: single source of truth and immutable records. The source of truth for label costs is the carriers themselves. Carriers generate the quote-time label cost and the invoice files after package delivery. These carrier-generated costs are stored in our central reconciliation database without modification, following the single source of truth pattern. Once the data is stored, it can never be updated, following the immutable records pattern. Any “updates” to the cost, such as when an invoice updates the original quoted cost, are stored as more immutable records containing updated costs. This way, we keep an accurate history of records for any changes (or no changes) in label costs.
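The immutable records pattern can be sketched with an append-only ledger. This is a simplified in-memory stand-in for the central reconciliation database, with hypothetical class and field names: records are frozen, and an "update" is just another appended record.

```python
from dataclasses import FrozenInstanceError, dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CostRecord:
    tracking_number: str
    source: str          # "quote" or "invoice"
    total_cost: Decimal  # stored exactly as the carrier reported it


class CostLedger:
    """Append-only store: records are added, never updated or deleted."""

    def __init__(self) -> None:
        self._records: list[CostRecord] = []

    def append(self, record: CostRecord) -> None:
        self._records.append(record)

    def history(self, tracking_number: str) -> list[CostRecord]:
        return [r for r in self._records if r.tracking_number == tracking_number]

    def latest_cost(self, tracking_number: str) -> Decimal:
        return self.history(tracking_number)[-1].total_cost


ledger = CostLedger()
ledger.append(CostRecord("TRACK123", "quote", Decimal("29.30")))
# The invoice does not overwrite the quote; it is stored as a new record.
ledger.append(CostRecord("TRACK123", "invoice", Decimal("34.09")))

try:
    ledger.history("TRACK123")[0].total_cost = Decimal("0")
    mutated = True
except FrozenInstanceError:
    mutated = False  # frozen dataclasses reject in-place changes

print(ledger.latest_cost("TRACK123"), len(ledger.history("TRACK123")), mutated)
```

In a relational database, the same idea would typically be enforced by only ever inserting rows and deriving the current cost from the most recent one.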
Idempotent billing
Idempotent billing is supported by three techniques: hashing invoice file content, removing duplicate charges, and ensuring the idempotency of billing events generated by the system. For every carrier invoice file processed by the reconciliation system, we store a hash of the file content so that duplicate invoice files are detected and removed even if they have different names. For every label in an invoice file, we remove duplicate entries by looking at the tracking number and total cost or invoice ID, depending on the carrier. If a label charge has already been reconciled in the past, it is removed from future reconciliation via duplicate detection in the Resolve step. For every charge within a label, we assign a globally unique identifier in the Normalize step. This identifier is passed on to billing events and is used as an idempotency key so that duplicate billing events do not result in duplicate charges for customers.
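The three techniques can be sketched in a few lines of standard-library Python. The assumptions here are ours, not details from the production system: SHA-256 as the content hash, a `(tracking number, total cost)` deduplication key, a UUID as the globally unique identifier, and the hypothetical `BillingConsumer` class standing in for whatever consumes billing events.

```python
import hashlib
import uuid


def file_fingerprint(content: bytes) -> str:
    """Hash the file content so renamed copies of an invoice match."""
    return hashlib.sha256(content).hexdigest()


def dedupe_entries(entries: list[dict]) -> list[dict]:
    """Keep the first entry per (tracking number, total cost) pair."""
    seen, unique = set(), []
    for entry in entries:
        key = (entry["tracking_number"], entry["total_cost_cents"])
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


class BillingConsumer:
    """Applies each charge at most once, keyed by its idempotency key."""

    def __init__(self) -> None:
        self._applied: dict[str, int] = {}

    def handle(self, charge_id: str, amount_cents: int) -> bool:
        if charge_id in self._applied:
            return False  # duplicate event: already billed, do nothing
        self._applied[charge_id] = amount_cents
        return True

    def total_billed_cents(self) -> int:
        return sum(self._applied.values())


# The same invoice content hashes identically regardless of file name.
seen_files = {file_fingerprint(b"TRACK123,479\n")}
is_duplicate_file = file_fingerprint(b"TRACK123,479\n") in seen_files

entries = dedupe_entries([
    {"tracking_number": "TRACK123", "total_cost_cents": 479},
    {"tracking_number": "TRACK123", "total_cost_cents": 479},
])

consumer = BillingConsumer()
charge_id = str(uuid.uuid4())  # hypothetical ID format for the charge
first = consumer.handle(charge_id, 479)
second = consumer.handle(charge_id, 479)  # redelivered event is ignored
print(is_duplicate_file, len(entries), consumer.total_billed_cents())
```

The run prints `True 1 479`: the second file is flagged as a duplicate, the repeated entry collapses to one, and the customer is billed once even though the billing event arrived twice.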
Charge traceability
Charge traceability is important for both financial audits and customers, who need to track where a particular charge came from. Our reconciliation system guarantees end-to-end traceability from the line number of a carrier invoice file to a specific charge posted to customers. The globally unique identifier for each charge is used both for billing and for storing the charge in the central data store. Each charge is stored in the data store along with the carrier invoice file from which it originated and the line number within that file.
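A minimal sketch of this traceability, assuming a CSV invoice with one charge per line: each parsed charge gets a unique identifier and keeps the source file name and line number, so a billed charge can be looked up by its identifier. The column names, file name, and `normalize` function are hypothetical, and real carrier invoice formats vary.

```python
import csv
import io
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class TracedCharge:
    charge_id: str     # also used as the billing idempotency key
    invoice_file: str  # carrier invoice file the charge came from
    line_number: int   # 1-based line within that file
    tracking_number: str
    charge_type: str
    amount: str


def normalize(invoice_file: str, content: str) -> list[TracedCharge]:
    """Parse invoice rows, tagging each charge with its origin."""
    reader = csv.DictReader(io.StringIO(content))
    charges = []
    for row in reader:
        charges.append(TracedCharge(
            charge_id=str(uuid.uuid4()),
            invoice_file=invoice_file,
            # line_num counts physical lines read so far, header included;
            # this assumes no quoted field spans multiple lines.
            line_number=reader.line_num,
            tracking_number=row["tracking_number"],
            charge_type=row["charge_type"],
            amount=row["amount"],
        ))
    return charges


content = (
    "tracking_number,charge_type,amount\n"
    "TRACK123,fuel_surcharge,0.79\n"
    "TRACK123,pickup,4.00\n"
)
charges = normalize("invoice_2024_03.csv", content)

# Index by charge ID: given a billed charge, find its exact origin.
by_id = {c.charge_id: c for c in charges}
origin = by_id[charges[1].charge_id]
print(origin.invoice_file, origin.line_number, origin.charge_type)
```

Here the pickup charge traces back to line 3 of `invoice_2024_03.csv`, which is the kind of answer an auditor or a customer asking about a charge needs.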
Result
For the seven months ending March 2024, around 1,100 carrier invoice files and 774,000 labels were reconciled using this system. These numbers represent only a small portion of the reconciliation done at Shippo, as we are still migrating customers and carriers from a legacy system.
Future
Lots of interesting new projects are on the horizon for invoice reconciliation. In the short term, we’re continuing to migrate more carriers to this system and build better dashboards for our finance team. In the long term, we want to enable customers and our finance team to handle disputes with carriers seamlessly.