Invoice Processing with Amazon Bedrock

Managing invoices is a critical yet often cumbersome task for businesses of all sizes. The sheer volume of data, coupled with the need for accuracy and efficiency, can make invoice processing a significant challenge. This repository provides a solution that uses a Streamlit application and Anthropic models on Amazon Bedrock to streamline and automate the process.

This project demonstrates how to process PDF invoices stored in an Amazon S3 bucket using Amazon Bedrock. Amazon Bedrock is a fully managed service for building generative AI applications that gives access to a range of LLMs. In this project, we extract the invoice data, summarize each invoice, and store the results in a JSON file. Alternatively, you can store this JSON and its key-value pairs in your operational databases as required.

GenAI-powered invoice processing & review app

This application uses the Amazon Bedrock Knowledge Base Chat with document feature with the Anthropic Claude Sonnet LLM to extract information from PDF invoices, and provides a Streamlit app that displays the invoices and the extracted information side by side for easier review.

Prerequisites

Python 3.7 or later on your local machine
AWS CLI installed and configured with appropriate credentials
- Set the region to where you would like to run this invoice processor by following the Set up AWS Credentials and Region for Development documentation.
Note: The region must have Amazon Bedrock and the Anthropic Claude 3 Sonnet model available. You can check it here.
Access to the Anthropic Claude 3 Sonnet foundation model on Amazon Bedrock in the chosen region
Invoices that you want to process

Install dependencies and clone repo

Clone the GitHub repository:

git clone https://github.com/aws-samples/genai-invoice-processor.git

Navigate to the project directory:

cd </path/to/your/folder>/genai-invoice-processor

Upgrade pip
```
python3 -m pip install --upgrade pip
```
(Optionally) create a virtual environment to isolate dependencies:
```
python3 -m venv venv
```
Activate the virtual environment:

Mac/Linux: source venv/bin/activate

Windows: venv\Scripts\activate

Install the necessary Python packages:
```
pip install -r requirements.txt
```
Update the region in the config.yaml file to the same region set for your AWS CLI, where Bedrock and the Anthropic Claude 3 Sonnet model are available.

Create an S3 bucket to store the invoices via console or AWS CLI

Create the bucket:
```
aws s3 mb s3://<your-bucket-name> --region <your-region>
```
- Replace your-bucket-name with the desired name of your S3 bucket.
- Replace your-region with the AWS region set for your AWS CLI and in config.yaml, such as us-east-1.
Using the AWS CLI commands below, copy your invoices from your local computer to the S3 bucket created in the step above. The first command uploads to the bucket root; to upload into a folder within the bucket instead, use the second command.
```
aws s3 cp </path/to/your/local/folder/with/invoices> s3://<your-bucket-name>/ --recursive
```
```
aws s3 cp </path/to/your/local/folder/with/invoices> s3://<your-bucket-name>/<folder>/ --recursive
```
Validate the upload:
```
aws s3 ls s3://<your-bucket-name>/ 
```

Configuration

This project uses a config.yaml file for configuration. Before running the application, ensure you've reviewed and updated this file as needed:

The file contains settings for AWS region and the Bedrock model ID.
The default model is set to Claude 3 Sonnet; you can find the model IDs at https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
It also specifies the output file path and local download folder for invoices.
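
As a rough illustration of the settings described above, the sketch below loads the file and reports any missing entries. The key names (`region`, `model_id`, `output_file`, `local_download_folder`) are assumptions made for this example, not taken from the repository, so check config.yaml for the real ones. It also assumes PyYAML is installed.

```python
from typing import Any, Dict, List

# Hypothetical key names; the actual names are defined in config.yaml.
REQUIRED_KEYS = ("region", "model_id", "output_file", "local_download_folder")


def missing_keys(config: Dict[str, Any]) -> List[str]:
    """Return the required keys that are absent or empty in the config."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def load_config(path: str = "config.yaml") -> Dict[str, Any]:
    """Load the YAML file and fail early if a required setting is missing."""
    import yaml  # PyYAML, imported lazily; assumed to be installed

    with open(path, encoding="utf-8") as handle:
        config = yaml.safe_load(handle) or {}
    missing = missing_keys(config)
    if missing:
        raise ValueError(f"config.yaml is missing: {', '.join(missing)}")
    return config
```

Failing early on a missing setting avoids discovering the problem only after invoices have been downloaded.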

Steps to Run

Processing invoices stored in an S3 bucket and reviewing the results takes two steps.

Step 1: Process invoices

In this step we process the invoices in the S3 bucket and store the model output in the processed_invoice_output.json file. Processing each invoice involves three steps:

Extracting data from the invoice in key-value format.
Extracting only the key information from the invoice required by our stakeholders.
Summarizing the invoice.

You can check the prompts used in the invoices_processor.py file. You can also use different LLMs for these three steps.
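
The three steps can be sketched as three prompts sent against the same document. This is a minimal sketch, not the repository's code: it assumes the Chat with document feature is reached through the `retrieve_and_generate` call of the `bedrock-agent-runtime` client with an `EXTERNAL_SOURCES` configuration. The prompt texts, the `PROMPTS` dictionary, and the function names are hypothetical; the real prompts are in invoices_processor.py.

```python
from typing import Dict

# Hypothetical prompts, one per processing step.
PROMPTS: Dict[str, str] = {
    "full": "Extract all data from this invoice as key-value pairs.",
    "structured": "Extract only the vendor name, invoice date, and total amount.",
    "summary": "Summarize this invoice in two or three sentences.",
}


def build_request(prompt: str, s3_uri: str, model_arn: str) -> dict:
    """Build the keyword arguments for a chat-with-document request."""
    return {
        "input": {"text": prompt},
        "retrieveAndGenerateConfiguration": {
            "type": "EXTERNAL_SOURCES",
            "externalSourcesConfiguration": {
                "modelArn": model_arn,
                "sources": [{"sourceType": "S3", "s3Location": {"uri": s3_uri}}],
            },
        },
    }


def process_invoice(s3_uri: str, model_arn: str, region: str) -> Dict[str, str]:
    """Run each prompt against one invoice and collect the text answers."""
    import boto3  # imported lazily so build_request works without AWS access

    client = boto3.client("bedrock-agent-runtime", region_name=region)
    results = {}
    for step, prompt in PROMPTS.items():
        response = client.retrieve_and_generate(
            **build_request(prompt, s3_uri, model_arn)
        )
        results[step] = response["output"]["text"]
    return results
```

Because `model_arn` is an argument of `build_request`, each step could be given its own model by passing a different ARN per prompt.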

python invoices_processor.py --bucket_name='<your-bucket-name>' --prefix='<your-folder>'

Note: The --prefix argument is optional. If omitted, the script will process all PDFs in the bucket.

Examples (S3 bucket names cannot contain underscores, so the sample name uses hyphens):

python invoices_processor.py --bucket_name='gen-ai-demo-bucket'

python invoices_processor.py --bucket_name='gen-ai-demo-bucket' --prefix='invoice'
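
To illustrate how an optional prefix can narrow the set of processed files, here is a minimal sketch that lists PDF keys with the boto3 `list_objects_v2` paginator. The function name `iter_pdf_keys` is hypothetical, and the script itself may select files differently.

```python
from typing import Iterator, Optional


def iter_pdf_keys(s3_client, bucket_name: str, prefix: Optional[str] = None) -> Iterator[str]:
    """Yield the keys of all PDF objects in the bucket, optionally under a prefix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    # Only pass Prefix when one was given, so omitting it scans the whole bucket.
    kwargs = {"Bucket": bucket_name}
    if prefix:
        kwargs["Prefix"] = prefix
    for page in paginator.paginate(**kwargs):
        # "Contents" is absent from a page when no objects match.
        for obj in page.get("Contents", []):
            if obj["Key"].lower().endswith(".pdf"):
                yield obj["Key"]
```

In real use the first argument would be `boto3.client("s3")`. Using the paginator means buckets with more than 1,000 objects are still listed completely.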

After the job completes successfully, you should see an invoices folder in your local file system containing all the invoices. You will also see a processed_invoice_output.json file with all the metadata extracted by Amazon Bedrock Knowledge Base using the Claude Sonnet model.

Step 2: Review invoice data extracted by Amazon Bedrock

To review the processed invoice data, you can run the Streamlit app with the following command:

streamlit run review-invoice-data.py

or

python -m streamlit run review-invoice-data.py

The Streamlit app will open in your default web browser, allowing you to view and interact with the processed invoice data.
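
A side-by-side review needs each local PDF matched with its extracted record. The sketch below shows one way to do that pairing. It assumes, purely for illustration, that the JSON output is a dictionary keyed by file name; the actual layout of processed_invoice_output.json may differ, and `pair_invoices_with_records` is a hypothetical helper, not part of the repository.

```python
import json
from pathlib import Path
from typing import List, Tuple


def pair_invoices_with_records(invoice_dir, output_file) -> List[Tuple[Path, dict]]:
    """Match each downloaded PDF with its extracted record, if one exists."""
    records = json.loads(Path(output_file).read_text(encoding="utf-8"))
    pairs = []
    for pdf in sorted(Path(invoice_dir).glob("*.pdf")):
        # Assumed layout: {"<file name>": {...extracted fields...}}
        pairs.append((pdf, records.get(pdf.name, {})))
    return pairs
```

In a Streamlit app, each pair could then be shown in two columns created with `st.columns(2)`, with the invoice on one side and the record on the other.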

Project Structure

invoices_processor.py: The main script for processing invoices stored in an S3 bucket.
review-invoice-data.py: The Streamlit app for reviewing the processed invoice data.
requirements.txt: List of required Python packages.
README.md: This file, containing project documentation.
config.yaml: Configuration for the AWS region, Bedrock model, and local folder/file structure used by both scripts.

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
