In today’s digital world, companies deal with large amounts of varied data every day. Extracting relevant information from sources such as PDFs is slow because users copy or re-type the data manually. Microsoft AI Builder, an intelligent document processing tool, simplifies this process dramatically.
In this article, we’ll explore how Microsoft AI Builder can help businesses extract data from PDFs and automate the process using Power Automate.
What is Microsoft AI Builder?
Microsoft AI Builder is an intelligent automation tool that allows businesses to create custom AI models with a simple and intuitive drag-and-drop interface. No technical specialist is needed: configuring AI Builder is a simple process, and it can automate data processing tasks like:
- Category Classification
- Form Processing / Document processing (we need this to extract data from PDFs)
- Key phrase extraction
- Object recognition in photos and images
- Identity document reader (e.g. driver’s license)
- Business card reader
How to extract data from PDFs using Microsoft AI Builder Document processing?
Document processing in AI Builder can recognize patterns in PDFs and extract data accurately and efficiently. It is a practical way to automate and scale the conversion of PDFs into structured data.
Creating an AI Model for PDF Data Extraction
To extract data from PDFs using Microsoft AI Builder, the first step is to create an AI model. Here are the steps to create an AI model for PDF data extraction:
Step 1: Create a new model
To create a new model, log in to the AI Builder portal, click on “AI Builder” in the left navigation, and select “Create a model.” Then, choose “Document Processing” as the model type, and select “PDF Processing.”
Step 2: Upload at least 5 sample PDFs
To train the AI model, you need to upload at least 5 sample PDFs with the same layout. After the PDFs are uploaded, the AI model analyzes the documents and recognizes patterns in the data.
Step 3: Define the data to extract
After the AI model has analyzed the PDFs, you can mark up the fields you want the model to extract by clicking on them (e.g. “Invoice Number”, “Customer Name”, “Invoice Date”).
Step 4: Train the AI model
Click on “Train” to train the model. This takes several minutes.
Step 5: Test the AI model
After training, test the model on another invoice to ensure it’s working correctly.
Step 6: Publish the AI model
Once the AI model has been published, it can be used in Power Automate and other applications.
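The test in Step 5 can be made repeatable with a small script. The following is a minimal Python sketch and not part of AI Builder itself: it assumes you have copied the values the model extracted from a test invoice into a dictionary, and it compares them with the values you read off the invoice by hand. The function name `compare_fields` and all sample values are hypothetical.

```python
def normalize(value: str) -> str:
    """Collapse whitespace and ignore case so cosmetic differences do not count as errors."""
    return " ".join(str(value).split()).casefold()


def compare_fields(expected: dict[str, str], extracted: dict[str, str]) -> dict[str, bool]:
    """Return, for every expected field, whether the extracted value matches."""
    return {
        name: normalize(extracted.get(name, "")) == normalize(want)
        for name, want in expected.items()
    }


# Hypothetical values for one test invoice (typed in by hand, not produced by a real model).
expected = {
    "Invoice Number": "INV-1001",
    "Customer Name": "Contoso Ltd",
    "Invoice Date": "2023-03-01",
}
extracted = {
    "Invoice Number": "INV-1001",
    "Customer Name": "contoso  ltd",
    "Invoice Date": "2023-03-07",
}

result = compare_fields(expected, extracted)
mismatches = [name for name, ok in result.items() if not ok]
print(mismatches)  # ['Invoice Date']
```

If fields show up as mismatches again and again, that is a hint to go back to Step 2 and Step 3, add more sample PDFs, and check the field markup before publishing.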
What is Power Automate?
Power Automate is a cloud-based service from Microsoft that automates workflows between different applications and services, similar to Zapier and IFTTT. Users can create custom workflows, called flows, that perform a wide range of automated tasks such as sending emails, creating tasks, updating databases and many more. Power Automate integrates with over 300 different services, including Microsoft services such as MS Teams, SharePoint, Microsoft Dynamics 365, Power BI, Outlook, Excel and OneNote.
How to extract data from PDFs using the AI Builder model in Power Automate?
Creating a new flow with the “AI Builder” action
Now we can use our trained AI Builder model from above in Power Automate to extract the data from PDFs and insert it as structured data into a table on SharePoint:
Step 1: Create a new flow
To create a new flow, log in to the Power Automate portal, click on “Create” in the left navigation, and select “Automated flow.” Then, choose a trigger for the flow, such as “When a new PDF is added to OneDrive.”
Step 2: Add the AI Builder action
Next, add the AI Builder action to the flow by clicking on “New Step” and selecting “Add an action.” Then, search for “AI Builder” and select “Extract data from PDF.”
Step 3: Configure the AI Builder action
After adding the AI Builder action, select the PDF file you want to extract data from and map the extracted fields to the appropriate fields in your destination (e.g. SharePoint, OneDrive, Excel).
Step 4: Test the flow
Save the flow and test it.
Step 5: Publish the flow
Now, finally, publish your new flow and use it by triggering it!
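The field mapping from Step 3 is configured in the Power Automate designer, but the logic behind it can be pictured in a few lines of code. The following Python sketch is only an illustration under stated assumptions: it assumes each extracted field comes with a value and a confidence score between 0 and 1, and that fields below a chosen threshold should go to manual review instead of the destination table. The column names, the `ExtractedField` type, the `to_row` function and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    value: str
    confidence: float  # assumed range: 0.0 (no confidence) to 1.0 (certain)


# Hypothetical mapping from model field names to destination column names.
FIELD_TO_COLUMN = {
    "Invoice Number": "InvoiceNumber",
    "Customer Name": "CustomerName",
    "Invoice Date": "InvoiceDate",
}


def to_row(
    fields: dict[str, ExtractedField], min_confidence: float = 0.8
) -> tuple[dict[str, str], list[str]]:
    """Build a destination row and list the fields that need manual review."""
    row: dict[str, str] = {}
    needs_review: list[str] = []
    for field_name, column in FIELD_TO_COLUMN.items():
        field = fields.get(field_name)
        # Missing or low-confidence fields are not written to the row.
        if field is None or field.confidence < min_confidence:
            needs_review.append(field_name)
            continue
        row[column] = field.value.strip()
    return row, needs_review


# Hypothetical extraction result for one PDF.
sample = {
    "Invoice Number": ExtractedField("INV-1001 ", 0.97),
    "Customer Name": ExtractedField("Contoso Ltd", 0.55),
}
row, needs_review = to_row(sample)
print(row)           # {'InvoiceNumber': 'INV-1001'}
print(needs_review)  # ['Customer Name', 'Invoice Date']
```

Routing uncertain fields to a person rather than writing them straight into the table is a design choice, not a requirement; it keeps wrong values out of the destination while the model is still new.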
Benefits of using Microsoft AI Builder in Power Automate
There are several benefits to using Microsoft AI Builder in Power Automate to extract data from PDFs:
- Saves time: Extracting data from PDFs manually is very time-consuming, as every accountant knows. Using AI Builder in Power Automate automates the process, saving time and reducing errors. Imagine that manually transferring the data of one PDF invoice to Excel takes about 60 seconds, while AI Builder does it in about 1 second. Now multiply that difference by your monthly number of PDFs!
- Increases accuracy: The accuracy is already very high, and because the model is AI-based, it can improve further over time.
- Highly individual: You can upload new types and layouts of PDFs and train new models. Thanks to the OCR capabilities, scanned PDFs are supported as well.
- No development costs: Custom software for individual PDF layouts is expensive and takes time to develop. Users can get started with AI Builder without software development skills and only pay for the monthly usage.
- Future potential: AI Builder document processing is currently in a Preview stage, which means we can expect more capabilities and a better price in the future.
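To make the time-saving argument concrete, here is a small back-of-the-envelope calculation in Python. It uses the rough figures from the list above (about 60 seconds per PDF by hand, about 1 second with AI Builder); the function name `monthly_hours_saved` and the volume of 1,000 PDFs per month are assumptions for illustration only.

```python
def monthly_hours_saved(
    pdfs_per_month: int,
    manual_seconds: float = 60.0,
    automated_seconds: float = 1.0,
) -> float:
    """Hours saved per month when each PDF takes automated_seconds instead of manual_seconds."""
    saved_seconds = pdfs_per_month * (manual_seconds - automated_seconds)
    return saved_seconds / 3600


# Hypothetical volume: 1,000 PDFs per month.
hours = monthly_hours_saved(1000)
print(f"{hours:.1f} hours saved per month")  # 16.4 hours saved per month
```

Your real numbers will differ, so plug in your own monthly volume and a measured time per document.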
5 use cases for Microsoft AI Builder in Power Automate to extract data from PDFs
Invoice processing
Many businesses receive invoices in PDF format. AI Builder in Power Automate can be used to extract data from these invoices, such as the vendor name, invoice number, and total amount. This data can then be automatically entered into an accounting system or database, saving time and reducing errors.
Loan application processing
Financial institutions often require borrowers to submit various documents in PDF format, such as bank statements and tax returns. AI Builder in Power Automate can be used to extract data from these documents, making it easier and faster to process loan applications.
Legal document processing
Law firms can extract names of parties involved, case numbers, and key dates.
Healthcare record processing
Healthcare providers can extract data from the patient records, such as patient names, medical conditions, and medication lists. This can improve the patient care workflow.
Human resources for job applications
HR departments mostly receive job applications in PDF format. Here, too, AI Builder in Power Automate can be used to extract data from these documents, such as candidate names, contact information, and work experience.
AI Builder is part of Power Automate, so you need a Power Automate license to use AI Builder document processing. See the Power Automate pricing.
AI Builder uses credits for different processing purposes. Document processing costs around 5-10 euro cents per PDF page, with a minimum of 500 euros per month in tier 1. Check out the pricing calculator for AI Builder.
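For a first rough budget, the figures above can be turned into a simple estimate. The Python sketch below is a simplification, not Microsoft’s actual credit model: it assumes a flat price of 5 to 10 euro cents per page and treats the 500 euros per month as a floor that you pay even at low volume. The function name `estimate_monthly_cost_eur` and the page volumes are hypothetical; use the official pricing calculator for real numbers.

```python
def estimate_monthly_cost_eur(
    pages_per_month: int,
    cents_per_page_low: float = 5.0,
    cents_per_page_high: float = 10.0,
    monthly_minimum_eur: float = 500.0,
) -> tuple[float, float]:
    """Return a (low, high) monthly cost estimate in euros, never below the assumed minimum."""
    low = max(monthly_minimum_eur, pages_per_month * cents_per_page_low / 100)
    high = max(monthly_minimum_eur, pages_per_month * cents_per_page_high / 100)
    return low, high


# Hypothetical volumes.
print(estimate_monthly_cost_eur(2_000))   # (500.0, 500.0)
print(estimate_monthly_cost_eur(20_000))  # (1000.0, 2000.0)
```

Under these assumptions, the monthly minimum dominates until you process somewhere between 5,000 and 10,000 pages per month, which is why the cost weighs more heavily on small businesses.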
In conclusion, Microsoft AI Builder’s ability to extract data from PDF files in Power Automate offers a brilliant solution for businesses looking to streamline their workflows. For small businesses it’s quite a big expense right now, but the price may come down as the technology matures. No doubt, it’s a powerful solution for saving time and increasing productivity.
Hi, I am a software developer. Do you need help with PDF automation?
I am a software engineer with over 9 years of professional development experience in desktop applications. So, you are in the right place to ask questions:
- I have consulted and implemented solutions for hundreds of clients to increase productivity by automating workflows.
- I create add-ins for Outlook, Excel, PowerPoint and Word to automate your monotonous work processes and save you time.
Example 1: Excel, prevent PDF publishing if certain criteria in the Excel file are not met.
Example 2: Outlook, force users to classify emails before sending them.
Example 3: Excel, match source files like PDFs with Excel rows and sync data.
- For any questions, contact me now, and I will always find a solution!
Patrick Gruber is homeless because he made his dream of being a digital nomad real.
He started as a developer, ventured into Amazon FBA business, invested in the market, founded a Cardano Stake Pool, and started his blog in 2022.
His blog shares his insight into the LIMITLESS possibilities of life.
If you're looking to change your world and gain practical knowledge, you're in the right place. Keep reading to learn more.