How to do zone OCR in C# using Dynamic .NET TWAIN

Optical character recognition (OCR) is an important part of document management workflows. OCR converts the scanned image of a document into text, which can be stored more compactly than the image and makes the document searchable.

Sometimes you may not want to OCR a whole page because it takes too long. Dynamic .NET TWAIN version 4.3 added support for zone OCR: you can select a specific area of an image and OCR only that part. In this article I will show you how to do zone OCR in C# using the .NET OCR add-on of Dynamic .NET TWAIN.

Background

Dynamic .NET TWAIN is a document scanning SDK based on the Microsoft .NET Framework that enables you to acquire images from TWAIN scanners and webcams. It also lets you edit images and save them to a local disk, or upload them to a server or database. It comes with an OCR SDK which you can use to convert scanned documents or webcam captures to searchable PDFs or text files.

If you don’t have Dynamic .NET TWAIN on your machine, download the 30-day free trial and install it first. The installer comes with an English language package for OCR by default. If your documents are in a different language, you can download the corresponding language package separately.

Embed Dynamic .NET TWAIN in your WinForms App

First, create a new C# Windows Forms Application or open an existing one in Visual Studio. We need to add DynamicDotNetTWAIN.dll, DynamicOCR.dll and the corresponding language package. Start with the component: click Tools -> Choose Toolbox Items, and on the .NET Framework Components tab click the Browse… button and locate DynamicDotNetTWAIN.dll at “..\Program Files (x86)\Dynamsoft\Dynamic .NET TWAIN 4.3 Trial\Bin\v4.0” or v2.0 (depending on the .NET Framework version you are using). Click OK. The DynamicDotNetTwain component then appears in the Toolbox window (opened from the View menu), as shown in the following image.

Figure: Add Dynamic .NET TWAIN .NET Component

In Solution Explorer, right-click the project and click Add -> Existing Item… Then choose All Files in the file type drop-down list. Navigate to “..\Program Files (x86)\Dynamsoft\Dynamic .NET TWAIN 4.3 Trial\Bin\OCRResources” and add the items in the folder to the project.

Then we can drag and drop the .NET TWAIN component onto the form.

This is the code for a LoadImage button click:

        private void button1_Click(object sender, EventArgs e)
        {
            // Let the user choose an image from the local disk.
            using (OpenFileDialog filedlg = new OpenFileDialog())
            {
                if (filedlg.ShowDialog() == DialogResult.OK)
                {
                    // Load the chosen image into Dynamic .NET TWAIN.
                    dynamicDotNetTwain1.LoadImage(filedlg.FileName);
                }
            }
        }
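
The dialog above accepts any file. As a minimal sketch, a helper like the following could restrict the dialog to image files and double-check the extension before calling LoadImage. The class ImageFileFilter, its method names and the extension list are hypothetical and not part of Dynamic .NET TWAIN; which formats LoadImage actually accepts is not stated here, so adjust the list to match the SDK documentation.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of Dynamic .NET TWAIN.
public static class ImageFileFilter
{
    // Assumed list of image extensions; adjust to the formats your SDK version supports.
    private static readonly string[] Extensions =
        { ".bmp", ".jpg", ".jpeg", ".png", ".tif", ".tiff" };

    // Builds a filter string in the "description|patterns" form used by OpenFileDialog.Filter.
    public static string BuildDialogFilter()
    {
        string patterns = "*" + string.Join(";*", Extensions);
        return "Image Files|" + patterns + "|All Files|*.*";
    }

    // Returns true when the path ends with one of the listed extensions (case-insensitive).
    public static bool HasImageExtension(string path)
    {
        string ext = Path.GetExtension(path);
        return Array.Exists(Extensions, delegate(string e)
        {
            return string.Equals(e, ext, StringComparison.OrdinalIgnoreCase);
        });
    }
}
```

In the click handler, this could be used as filedlg.Filter = ImageFileFilter.BuildDialogFilter(); before ShowDialog, with HasImageExtension(filedlg.FileName) checked before loading.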

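Now we can OCR a selected area of the loaded image and save the recognized text as a .txt file. The handler below responds to the component's OnImageAreaSelected event and receives the image index along with the left, top, right and bottom coordinates of the selected area: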

        // Requires "using System.IO;" at the top of the file.
        private void dynamicDotNetTwain1_OnImageAreaSelected(short sImageIndex, int left, int top, int right, int bottom)
        {
            dynamicDotNetTwain1.OCRTessDataPath = "../../"; // the path of the language package (tessdata)
            dynamicDotNetTwain1.OCRLanguage = "eng"; // the OCR language
            dynamicDotNetTwain1.OCRDllPath = "../../"; // the relative path of the OCR DLL file
            dynamicDotNetTwain1.OCRResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text; // return plain text

            // OCR only the selected area of the current image.
            byte[] sbytes = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom);
            if (sbytes == null)
            {
                // OCR failed: show the error reported by the component.
                MessageBox.Show(dynamicDotNetTwain1.ErrorString);
                return;
            }

            using (SaveFileDialog filedlg = new SaveFileDialog())
            {
                filedlg.Filter = "Text File(*.txt)|*.txt";
                if (filedlg.ShowDialog() == DialogResult.OK)
                {
                    // Save the OCR result as a text file. WriteAllBytes replaces any
                    // existing file, so no stale bytes are left behind.
                    File.WriteAllBytes(filedlg.FileName, sbytes);
                }
            }

            MessageBox.Show("OCR successful");
        }
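
The handler above passes the selection coordinates to OCR unchanged. Whether the component already guarantees that left is less than right, or that the selection stays inside the image, is not stated here, so the following is only a defensive sketch under that assumption. Zone and ZoneHelper.NormalizeZone are hypothetical names, not part of Dynamic .NET TWAIN, and imageWidth and imageHeight are assumed to be known to the caller.

```csharp
using System;

// Hypothetical value type describing a rectangular OCR zone in pixels.
public struct Zone
{
    public readonly int Left, Top, Right, Bottom;

    public Zone(int left, int top, int right, int bottom)
    {
        Left = left; Top = top; Right = right; Bottom = bottom;
    }

    // True when the zone has no area and is not worth sending to OCR.
    public bool IsEmpty
    {
        get { return Right <= Left || Bottom <= Top; }
    }
}

// Hypothetical helper, not part of Dynamic .NET TWAIN.
public static class ZoneHelper
{
    private static int Clamp(int value, int min, int max)
    {
        return Math.Max(min, Math.Min(max, value));
    }

    // Orders the edges so Left <= Right and Top <= Bottom, then clamps
    // the zone to the image bounds [0, imageWidth] x [0, imageHeight].
    public static Zone NormalizeZone(int left, int top, int right, int bottom,
                                     int imageWidth, int imageHeight)
    {
        int x1 = Clamp(Math.Min(left, right), 0, imageWidth);
        int x2 = Clamp(Math.Max(left, right), 0, imageWidth);
        int y1 = Clamp(Math.Min(top, bottom), 0, imageHeight);
        int y2 = Clamp(Math.Max(top, bottom), 0, imageHeight);
        return new Zone(x1, y1, x2, y2);
    }
}
```

A handler could call NormalizeZone first, skip the OCR call when IsEmpty is true, and otherwise pass the normalized Left, Top, Right and Bottom values to OCR.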

This is how the application looks.

Figure: Demo App of Zone OCR using Dynamic .NET TWAIN OCR SDK


Let us know in the comments section below about your experience using zone-based OCR in your WinForms application.
