How to extract text from a PDF image

How To Extract Text from Image Using Azure Cognitive Services

In this Azure tutorial, we will discuss how to extract text from a PDF image.

Here, we will walk through a practical example: how to extract bill details from a PDF image using Azure Cognitive Services.

In real-world scenarios, we often need to read bill details from an image instead of reading the bill manually, and perhaps auto-populate those details in a form. Automating this step removes repetitive manual work.

Microsoft provides the Azure Cognitive Services OCR feature to deal with this scenario. OCR stands for Optical Character Recognition.

Read Text From Images

With Azure Cognitive Services, a developer can implement AI features without any expertise in the AI and ML areas, simply by calling the Azure Cognitive Services API. Very little effort is involved, and it saves a lot of development time.

In this example, we will try to extract the date and the bill details from the image.

Before starting the actual development, we should know the prerequisites needed for it.

Prerequisites

  • You must have a valid Azure subscription or a valid Azure account. If you don’t have one yet, create an Azure free account now.
  • Visual Studio 2019 needs to be installed on your machine. If you don’t have it on your local or dev machine, install Visual Studio 2019 now.

Create Azure Cognitive Service using Azure Portal

Follow the steps below to create an Azure Cognitive Service using the Azure Portal.

1. Log in to the Azure Portal (https://portal.azure.com/)

2. Once you are logged in to the Azure Portal, search for Cognitive Services, and then click on the search result.

Alternatively, from the left navigation, click on + Create a resource, search for Cognitive Services, and then click on the search result.

3. Next, click on the Create button.

4. Provide the below details on the Create Cognitive Services window

  • Subscription: Choose the valid subscription you want to use.
  • Resource Group: Choose your existing resource group if you have one. If you don’t, create a new resource group by clicking on the Create new link.
  • Region: Choose the region for your Azure Cognitive Services resource.
  • Name: Provide a unique name for your Azure Cognitive Services resource.
  • Pricing tier: Select the pricing tier based on your business needs. You can click on the View full pricing details link to check the pricing details.
  • Select the two checkboxes to accept the terms and conditions.

Finally, click on the Review + Create button.

Azure now validates the information you entered. Once all the details are found to be correct, it shows Validation Passed and the Create button is enabled. Click on the Create button to create the Cognitive Service.

Once the deployment has completed successfully, click on the Go to Resource button to navigate to the Azure Cognitive Service you have created.

You can see the new Cognitive service once you click the Go to Resource button.

On the Cognitive Service page, click on the Keys and Endpoint option in the left navigation. Now you can see the Key1 and ENDPOINT values. Use the copy button next to each value to copy both and keep them with you, as we will use them in our code in the next steps.
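
Since Key1 is a secret, one option is to keep it out of the source code and read it from an environment variable instead. The following is a minimal sketch of that idea, not part of the original tutorial. The class name OcrSettings and the variable names COGNITIVE_SERVICE_KEY and COGNITIVE_SERVICE_ENDPOINT are hypothetical; pick whatever names suit your environment.

```csharp
using System;

static class OcrSettings
{
    // Hypothetical variable names; they are not defined by Azure.
    public const string KeyVariable = "COGNITIVE_SERVICE_KEY";
    public const string EndpointVariable = "COGNITIVE_SERVICE_ENDPOINT";

    // Returns the trimmed value of a required environment variable,
    // or throws a readable error when it is missing or blank.
    public static string GetRequired(string name)
    {
        string value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{name}' is not set.");
        }
        return value.Trim();
    }
}
```

With this in place, the key could be loaded with OcrSettings.GetRequired(OcrSettings.KeyVariable) and the endpoint with OcrSettings.GetRequired(OcrSettings.EndpointVariable), assuming both variables were set before the app starts.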

Creating Console App (.NET Core) Visual Studio 2019

The next step is creating the Console App (.NET Core) using Visual Studio 2019, assuming that you have already installed it.

Open Visual Studio 2019 and click on the Create a new project button.

Select the Console App (.NET Core) as the project template and click Next.

On the Configure your new project window, provide a name for the project and the location where you want to save it, and then click the Create button.

Visual Studio now creates the Console App (.NET Core) project. Once the project is created, the next step is adding a NuGet package.

Right-click on the project name -> click on the Manage NuGet Packages option.

Now, search for the Microsoft.Azure.CognitiveServices.Vision.ComputerVision NuGet package, select the search result, click on the Install button, and then click on the I Accept button to install the package.

Now, if you check your project's .csproj file, it should look like below

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp3.1</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.Azure.CognitiveServices.Vision.ComputerVision" Version="5.0.0" />
  </ItemGroup>

</Project>

Here is my bill, a JPG image that I have saved on my local machine at C:\Users\Bijay\Desktop\VM\Bill\mybill2.jpg.

Now, we need to change the Program.cs file. Add the code below to your Program.cs file.

using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using System;
using System.IO;
using System.Threading.Tasks;

namespace mydemoapp
{
    class Program
    {
        // Put your Cognitive Service Key1 value here
        static string Key = "<your-Key1-value>";
        // Put your Cognitive Service endpoint URL here
        static string url = "https://mynewcognitiveservices91.cognitiveservices.azure.com/";

        static async Task Main(string[] args)
        {
            var client = new ComputerVisionClient(new ApiKeyServiceClientCredentials(Key));
            client.Endpoint = url;

            // Open the bill image; the using block closes the stream when we are done
            using (var myfile = File.OpenRead(@"C:\Users\Bijay\Desktop\VM\Bill\mybill2.jpg"))
            {
                // false = do not try to detect the text orientation
                var rst = await client.RecognizePrintedTextInStreamAsync(false, myfile);

                // The result is organized as regions -> lines -> words
                foreach (var r in rst.Regions)
                {
                    foreach (var t in r.Lines)
                    {
                        foreach (var w in t.Words)
                        {
                            // BoundingBox is a "left,top,width,height" string
                            var coordinates = w.BoundingBox.Split(',');
                            Console.WriteLine($"Found word {w.Text} at" +
                                $" X {coordinates[0]} and Y {coordinates[1]}");
                        }
                    }
                }
            }
        }
    }
}
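
The code above splits the BoundingBox string by hand and uses the pieces as text. If you need the position as numbers, for example to sort words from top to bottom, a small helper can parse it. This is only a sketch: WordBox is a hypothetical type that is not part of the Computer Vision SDK, and it assumes the value has the form "left,top,width,height" with integer parts.

```csharp
using System;
using System.Globalization;

// Hypothetical helper type; it is not part of the Computer Vision SDK.
readonly struct WordBox
{
    public WordBox(int left, int top, int width, int height)
    {
        Left = left;
        Top = top;
        Width = width;
        Height = height;
    }

    public int Left { get; }
    public int Top { get; }
    public int Width { get; }
    public int Height { get; }
    public int Right => Left + Width;
    public int Bottom => Top + Height;

    // Parses "left,top,width,height"; returns false when the text is malformed.
    public static bool TryParse(string boundingBox, out WordBox box)
    {
        box = default;
        if (string.IsNullOrWhiteSpace(boundingBox))
        {
            return false;
        }

        string[] parts = boundingBox.Split(',');
        if (parts.Length != 4)
        {
            return false;
        }

        var values = new int[4];
        for (int i = 0; i < parts.Length; i++)
        {
            if (!int.TryParse(parts[i].Trim(), NumberStyles.Integer,
                    CultureInfo.InvariantCulture, out values[i]))
            {
                return false;
            }
        }

        box = new WordBox(values[0], values[1], values[2], values[3]);
        return true;
    }
}
```

Inside the innermost loop, a call such as WordBox.TryParse(w.BoundingBox, out var box) would then give you box.Left and box.Top as integers, and a false return value tells you the string was not in the expected form.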

After you have added the code to the Program.cs file, replace the key and URL values with the Key1 and endpoint URL values of the Azure Cognitive Service that you created in the Azure Portal and copied in the steps above.

Now, let’s run the project. Press F5 to run it, and the console should print one line for each recognized word along with its X and Y coordinates.
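
The program prints individual words, so getting to the bill date still needs a little post-processing. The sketch below shows one possible approach: join the words of each line back into a string and look for a date with a regular expression. BillParser is a hypothetical helper, and the dd/MM/yyyy format is an assumption about how the bill prints its date; adjust the pattern to match your own bill.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper; it is not part of the Computer Vision SDK.
static class BillParser
{
    // Assumed layout: dates printed as dd/MM/yyyy, for example 25/03/2021.
    private static readonly Regex DatePattern =
        new Regex(@"\b\d{2}/\d{2}/\d{4}\b", RegexOptions.Compiled);

    // Joins the words of one OCR line back into a single string.
    public static string JoinWords(IEnumerable<string> words)
    {
        return string.Join(" ", words);
    }

    // Returns the first valid date found in the lines, or null if none matches.
    public static DateTime? FindBillDate(IEnumerable<string> lines)
    {
        foreach (string line in lines)
        {
            foreach (Match match in DatePattern.Matches(line))
            {
                // The regex only checks the shape; TryParseExact rejects
                // impossible dates such as 31/02/2021.
                if (DateTime.TryParseExact(match.Value, "dd/MM/yyyy",
                        CultureInfo.InvariantCulture, DateTimeStyles.None,
                        out DateTime parsed))
                {
                    return parsed;
                }
            }
        }
        return null;
    }
}
```

In the tutorial's loop, each line could be rebuilt by passing the Text of every word in t.Words to BillParser.JoinWords, collecting the resulting strings in a list, and then calling BillParser.FindBillDate on that list.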

So this is how to extract text from an image using Azure Cognitive Services.

Wrapping Up

In this article, we discussed how to extract text from PDF images. I hope you have enjoyed this article!