How to extract data from a document?

This article covers the ways to extract information from a document.

Written by Ilia Zelenkin
Updated over a week ago

By automating document data extraction you can save precious time on manual data entry.

Documents come in different forms, and there are several ways to extract information from them. You can run several workflows on the same document, depending on what you need to extract.


Using Templates


Standard documents typically share the same structure. Bitskout constantly adds new templates for standard documents such as invoices or utility bills. For those, you don't need to set anything up: just choose the document you need and that's it.

Let's take the invoice example.

  1. Go to Templates. The list of templates will appear:

  2. You can search through the list or choose the Use Case in the menu. Once you have found the template you need, click on Use Template.


  3. You will be taken to the output configuration screen. First, click on "Write Data to Fields" to see the options. Then click on "Select application" to open the application setup screen.

  4. Once you've added a project/board, you will see the list of fields. Drag and drop each extracted value from the left-hand side onto the matching field on the right-hand side.



  5. Once you have finished configuring the output, press Next, then give your model a name and a description on the next screen.

The plugin is now ready to be used.
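Conceptually, the drag-and-drop step above builds a mapping from each extracted value to a field in your project/board. The sketch below only illustrates that idea. It is not Bitskout code or a Bitskout API, and every name and sample value in it (`extracted`, `field_mapping`, `apply_mapping`, the invoice data) is hypothetical.

```python
# Hypothetical illustration only: none of these names belong to a Bitskout API.

# Values a template might extract from an invoice (the left-hand side).
# The sample values are made up for this sketch.
extracted = {
    "invoice_number": "INV-001",
    "total_amount": "1250.00",
    "due_date": "2024-03-31",
}

# The mapping chosen by drag and drop:
# extracted value name -> field in the project/board (the right-hand side).
field_mapping = {
    "invoice_number": "Reference",
    "total_amount": "Amount",
    "due_date": "Deadline",
}


def apply_mapping(extracted, field_mapping):
    """Return the board fields filled with the mapped extracted values."""
    return {
        target: extracted[source]
        for source, target in field_mapping.items()
        if source in extracted  # skip values the extraction did not return
    }


board_item = apply_mapping(extracted, field_mapping)
print(board_item)
```

Extracted values that are not mapped to any field are simply left out of the result, which mirrors leaving a value undragged on the configuration screen.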


Your Documents

If there is no template, you can always create a new plugin.

In this guide, we will learn how to use Bitskout's new feature to extract data from any document. This feature lets you extract information from different types of forms by providing examples.

Video Instruction

Log in to your Bitskout account and click Create Plugin.

The next step is to click on the Extract button and then choose "From File":

Now we need to load an example. Bitskout needs examples from you to learn what you'd like to extract. We will use SEC filing reports as an example.

Add examples to guide the data extraction process. Once you've loaded the file, enter the fields that you'd like to extract. Give each field a name, and take its value from the loaded example. See below:

You can add more examples to improve the accuracy of the data extraction. Let's add another form with a different layout.

Once the file is loaded, you'll need to add the required values from this example:

Once you're done with examples, press Create and the plugin will be created. Next, depending on what you'd like to do, choose a tool where you want to use the plugin.
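The examples entered in the steps above are essentially pairs of field names and values, each tied to one loaded document. As a rough sketch of that structure, and of why a value should come from the loaded example itself, the snippet below checks that every value appears in the document text. The `Example` class, the sample texts and the exact-match check are all assumptions made for illustration. They do not describe how Bitskout stores or validates examples.

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    """Hypothetical model of one loaded document and the values marked in it."""
    document_text: str
    fields: dict = field(default_factory=dict)

    def missing_values(self):
        """Return the field names whose value does not appear in the text."""
        return [
            name
            for name, value in self.fields.items()
            if value not in self.document_text
        ]


# Two made-up examples with different layouts.
examples = [
    Example(
        document_text="Form 10-Q. Registrant: Example Corp. Period ended June 30, 2023.",
        fields={"company": "Example Corp", "period_end": "June 30, 2023"},
    ),
    Example(
        document_text="QUARTERLY REPORT | Sample Inc | for the quarter ended March 31, 2023",
        # "March 31 2023" is missing the comma, so it is not found in the text.
        fields={"company": "Sample Inc", "period_end": "March 31 2023"},
    ),
]

for number, example in enumerate(examples, start=1):
    print(number, example.missing_values())
```

In this sketch the second example is flagged because its `period_end` value was retyped rather than copied from the document, which is the kind of unclear value worth avoiding when you fill in examples.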

Verify the accuracy of the data extraction by testing it on various forms. For example, try using an "Apple 10-Q form":

Conclusion

Now your plugin is ready to be used. This way you can extract data from the documents with just a few examples.

We recommend adding 2-3 examples that have varying layouts. This helps Bitskout understand the variety of documents it will have to analyze.

You can load documents in any language. What matters most is that you add clear values from each example for the fields you want to extract.
