Short description:

Because shipment volumes are expected to increase, shipment data is to be generated as part of a project to simulate a logistics network. The shipment data serves as input for a tool developed in-house that simulates all logistically relevant processes of the company, and thus forms the basis for comparing different future scenarios.


PTA supports the provision of actual data, which serves as reference data (the reference distribution). SQL scripts deliver this data through previously defined interfaces so that the downstream process can generate consistent artificial shipment data. The generation process itself is implemented in Python. PTA advises the customer on the architecture of this process and on performance-relevant aspects. The main challenge is generating consistent mass data (several million data records) within a few hours, which is addressed with caching techniques. Statistical methods such as sampling and imputation ensure that the key statistical indicators of the generated data match those of the reference distribution.
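The combination of sampling and caching described above can be illustrated with a minimal sketch. It is not the customer's implementation: the reference records, the field layout (origin, destination, weight class), and the function names are hypothetical and chosen only for illustration. The sketch draws origins according to their empirical frequency in the reference data and then draws a destination conditional on the origin, so that only origin/destination pairs that occur in the reference data are produced. The conditional lookup table is built once per origin and cached, which is one simple way to avoid recomputation when millions of records are generated.

```python
import random
from collections import Counter
from functools import lru_cache
from itertools import accumulate

# Hypothetical reference records: (origin, destination, weight class).
# In the project, such data would come from the SQL interfaces.
REFERENCE = [
    ("HAM", "MUC", "S"),
    ("HAM", "MUC", "S"),
    ("HAM", "FRA", "M"),
    ("FRA", "MUC", "L"),
]


@lru_cache(maxsize=None)
def destination_table(origin):
    """Build the destinations and cumulative counts for one origin.

    The result is cached, so the scan over REFERENCE happens only once
    per distinct origin, no matter how many records are generated.
    """
    counts = Counter(dest for orig, dest, _ in REFERENCE if orig == origin)
    destinations = sorted(counts)
    cumulative = list(accumulate(counts[d] for d in destinations))
    return tuple(destinations), tuple(cumulative)


def sample_destination(origin, rng):
    """Draw a destination with the empirical frequency seen for this origin."""
    destinations, cumulative = destination_table(origin)
    return rng.choices(destinations, cum_weights=cumulative, k=1)[0]


def generate_shipments(n, seed=0):
    """Generate n artificial (origin, destination) pairs."""
    rng = random.Random(seed)
    # Choosing uniformly from this list reproduces the origin frequencies.
    origins = [orig for orig, _, _ in REFERENCE]
    shipments = []
    for _ in range(n):
        origin = rng.choice(origins)
        shipments.append((origin, sample_destination(origin, rng)))
    return shipments


if __name__ == "__main__":
    data = generate_shipments(10_000)
    share_ham = sum(1 for origin, _ in data if origin == "HAM") / len(data)
    print(f"share of HAM origins: {share_ham:.3f} (reference: 0.750)")
    print(destination_table.cache_info())
```

Sampling the destination conditional on the origin, rather than sampling both fields independently, is what keeps the generated records consistent with the reference data. A production process would likely use vectorised sampling and handle missing values through imputation, both of which are omitted here.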

Technical description:

Because of the expected volume growth, possible effects on the customer's logistics network are to be identified in advance. Shipment data is generated for the customer's potential future scenarios and serves as the basis for simulating the logistics network. Simulating the relevant logistical processes with this data reveals weak points, such as capacity bottlenecks, and allows different countermeasures to be compared. This makes it possible both to identify the levers for an optimal network design and to test the effects of changes before they are implemented.