Short description:

Invoices contain business-relevant information for every company. Extracting this data from the documents by hand is time-consuming, so the extraction is to be automated using AI models.


The Hugging Face Transformers library is used, a Python library for natural language processing that provides pre-trained models. The pre-trained models are run in Google Colab; the model used here is LayoutXLM. The task also includes setting up a pipeline that reads the data from the documents, preprocesses it, passes it to the ML model, and postprocesses the model output. In addition, the model must be fine-tuned for the specific task on suitable data sets.
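The four pipeline stages can be sketched roughly as follows. This is a minimal sketch in plain Python, not the project's implementation: the OCR input format, the names `Word`, `run_pipeline` and `dummy_model`, and the label `TOTAL` are all hypothetical, and `dummy_model` only stands in for the fine-tuned model. The rescaling of bounding boxes to the range 0 to 1000 follows the convention of the LayoutLM model family.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)


@dataclass
class Word:
    text: str
    box: Box


def read_document(ocr_words: List[Tuple[str, Box]]) -> List[Word]:
    """Stage 1: wrap raw OCR output (hypothetical format) in Word objects."""
    return [Word(text, box) for text, box in ocr_words]


def preprocess(words: List[Word], page_width: int, page_height: int) -> List[Word]:
    """Stage 2: drop empty words and rescale pixel boxes to the 0-1000 range."""
    def scale(box: Box) -> Box:
        x0, y0, x1, y1 = box
        return (
            int(1000 * x0 / page_width),
            int(1000 * y0 / page_height),
            int(1000 * x1 / page_width),
            int(1000 * y1 / page_height),
        )
    return [Word(w.text.strip(), scale(w.box)) for w in words if w.text.strip()]


def postprocess(words: List[Word], labels: List[str]) -> Dict[str, str]:
    """Stage 4: collect the words of each predicted field, ignoring 'O'."""
    fields: Dict[str, List[str]] = {}
    for word, label in zip(words, labels):
        if label != "O":
            fields.setdefault(label, []).append(word.text)
    return {name: " ".join(parts) for name, parts in fields.items()}


def run_pipeline(
    ocr_words: List[Tuple[str, Box]],
    page_size: Tuple[int, int],
    model: Callable[[List[Word]], List[str]],
) -> Dict[str, str]:
    words = preprocess(read_document(ocr_words), *page_size)
    labels = model(words)  # Stage 3: the ML model returns one label per word
    return postprocess(words, labels)


def dummy_model(words: List[Word]) -> List[str]:
    """Placeholder for the fine-tuned model: tags any word with a digit as TOTAL."""
    return ["TOTAL" if any(c.isdigit() for c in w.text) else "O" for w in words]


ocr = [
    ("Total:", (50, 700, 120, 720)),
    (" ", (0, 0, 1, 1)),
    ("119.00", (400, 700, 480, 720)),
]
result = run_pipeline(ocr, (800, 1000), dummy_model)
print(result)
```

Because the model is passed in as a plain callable, the placeholder could later be swapped for a wrapper around the fine-tuned LayoutXLM model without changing the other stages.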

Technical description:

The model is to extract data from documents such as invoices. Because the documents from different customers have heterogeneous layouts, an AI model must be used for pattern recognition. So-called transformers are used for this. These are machine learning models that, roughly speaking, consist of two blocks: an encoder, whose task is to build a representation of the input text, and a decoder, which generates new text from that representation. The data extraction task here requires only the encoder, since the values are picked out of the existing text rather than newly generated.
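To illustrate why an encoder alone can be enough: extraction can be framed as assigning one label to each input token, so no text has to be generated. The following is a minimal sketch under that assumption. The label set, the BIO tagging scheme (B- starts a field, I- continues it, O is outside any field) and the hand-written scores are hypothetical; in practice the scores would come from the encoder's classification head.

```python
from typing import Dict, List, Optional

# Hypothetical label set in BIO format.
LABELS = ["O", "B-INVOICE_NO", "I-INVOICE_NO", "B-TOTAL", "I-TOTAL"]


def argmax(scores: List[float]) -> int:
    """Index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def decode(tokens: List[str], token_scores: List[List[float]]) -> Dict[str, str]:
    """Turn one score vector per token into extracted fields.

    If a field starts more than once, the later span overwrites the earlier one.
    """
    fields: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for token, scores in zip(tokens, token_scores):
        label = LABELS[argmax(scores)]
        if label.startswith("B-"):
            current = label[2:]
            fields[current] = [token]
        elif label.startswith("I-") and current == label[2:]:
            fields[current].append(token)
        else:
            current = None  # 'O' or an inconsistent I- tag ends the open field
    return {name: " ".join(parts) for name, parts in fields.items()}


tokens = ["Invoice", "No.", "RE", "2024-17", "Total", "119.00"]
token_scores = [
    [0.90, 0.05, 0.00, 0.05, 0.00],  # Invoice -> O
    [0.80, 0.10, 0.10, 0.00, 0.00],  # No.     -> O
    [0.10, 0.80, 0.10, 0.00, 0.00],  # RE      -> B-INVOICE_NO
    [0.10, 0.10, 0.70, 0.05, 0.05],  # 2024-17 -> I-INVOICE_NO
    [0.90, 0.00, 0.00, 0.10, 0.00],  # Total   -> O
    [0.10, 0.00, 0.00, 0.85, 0.05],  # 119.00  -> B-TOTAL
]
extracted = decode(tokens, token_scores)
print(extracted)
```

The output has exactly as many labels as there are input tokens, which is the kind of task an encoder-only model with a classification head handles without a decoder.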