Classification of business addresses via deep learning
Project duration: 1 year, 2 months
Brief description
As part of the optimization of delivery processes, the proportion of shipments delivered to business addresses is being examined. To this end, PTA analyzes the current state of the art and identifies suitable methods from the field of deep learning and artificial intelligence (AI). PTA then develops and implements an AI model based on neural networks that uses delivery information, such as recipient details and delivery address, to automatically determine whether a shipment was likely delivered to a private or a business address.
Supplement
PTA develops sequence-to-sequence (Seq2Seq) language models using the Python framework PyTorch and validates them in a benchmark against previously generated test data. Seq2Seq models essentially consist of two components: an encoder and a decoder. The encoder analyzes the input text and captures its semantics, while the decoder decides whether the respective delivery address is a private or a business address. Since the address data reaches the customer from its clients via various input channels and interfaces, it is not uncommon for individual address components (e.g., title, first name, last name, company name, street, house number, house number suffix, postal code, city, or district) to be missing or filled in inconsistently. The model must therefore tolerate such inconsistencies in order to achieve a sufficiently high classification accuracy.
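To illustrate how incomplete address records can still be turned into a single input sequence for an encoder, the following is a minimal sketch using only the Python standard library. The field names, the tag format, and the character-level encoding are assumptions made for this example; the text does not describe the actual preprocessing used in the project.

```python
from typing import List, Mapping, Optional

# Hypothetical field order; the real input schema is not given in the text.
FIELDS = (
    "title", "first_name", "last_name", "company_name", "street",
    "house_number", "house_number_suffix", "postal_code", "city", "district",
)


def serialize_address(record: Mapping[str, Optional[str]]) -> str:
    """Join the address components into one tagged, lower-cased string.

    Missing or empty components are skipped instead of raising an error,
    so incomplete records still yield a usable input sequence.
    """
    parts = []
    for field in FIELDS:
        value = (record.get(field) or "").strip()
        if value:
            # Collapse repeated whitespace and normalize case.
            normalized = " ".join(value.split()).lower()
            parts.append(f"<{field}> {normalized}")
    return " ".join(parts)


def encode_chars(text: str, max_len: int = 128) -> List[int]:
    """Map characters to integer ids and pad or truncate to max_len.

    Id 0 is reserved for padding; code points above 255 share one id (256).
    """
    ids = [min(ord(ch), 255) + 1 for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


# Example record with a missing postal code and stray whitespace.
example = {
    "company_name": "Muster GmbH",
    "street": "Hauptstr.",
    "house_number": "5",
    "postal_code": None,
    "city": " Berlin ",
}
sequence = serialize_address(example)
token_ids = encode_chars(sequence, max_len=96)
```

In a setup like this, the resulting list of integer ids could be fed to an embedding layer in front of the encoder. Tagging each component with its field name is one possible way to keep the input interpretable when components are absent.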
Subject description
The model, which was previously trained on a synthetic test data set, is evaluated using a representative sample. Based on the proportion of business addresses that the model determines in this sample, the customer can estimate the corresponding proportion in the overall population precisely and cost-effectively. This enables the customer to make informed assessments of various optimization measures in the delivery process and to evaluate their potential in a targeted manner.
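The step from a sample proportion to a statement about the population can be sketched with a confidence interval. The example below uses the Wilson score interval, which is a standard choice for proportions but is an assumption here, since the text does not say which estimation method is used. The counts in the usage lines are hypothetical numbers for illustration only, and the sketch does not correct for misclassifications made by the model.

```python
import math
from typing import Tuple


def wilson_interval(business: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion.

    business: number of sampled shipments classified as business addresses
    n:        sample size
    z:        normal quantile (1.96 corresponds to roughly 95 % confidence)
    """
    if n <= 0 or not 0 <= business <= n:
        raise ValueError("require n > 0 and 0 <= business <= n")
    p = business / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical sample: 300 of 1000 shipments classified as business addresses.
lower, upper = wilson_interval(300, 1000)
print(f"estimated business share: 30.0 % (interval {lower:.3f} to {upper:.3f})")
```

A larger sample narrows the interval, which is the trade-off between precision and cost that the sample-based approach has to balance.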