Large-Scale Test Data Set for Location Problems

Matej Cebecauer (a) and Lubos Buzna (b, c)

(a) Department of Transport Science, KTH Royal Institute of Technology, Teknikringen 10, SE‑100 44 Stockholm, Sweden

(b) Department of Mathematical Methods and Operations Research, University of Zilina, Univerzitna 8215/1, SK‑010 26 Zilina, Slovakia

(c) ERA Chair for Intelligent Transport Systems, University of Zilina, Univerzitna 8215/1, SK‑010 26 Zilina, Slovakia

Contact email: matejc@kth.se

Abstract

Designers of location algorithms share test data sets (benchmarks) so that they can compare the performance of newly developed algorithms. In previous decades, the availability of locational data was limited. Big data has revolutionised the amount and detail of information available about human activities and the environment. It is expected that the integration of big data into location analysis will increase the resolution and precision of input data. Consequently, the size of the problems to be solved will grow significantly, increasing the demand for algorithms that are able to solve them. The availability of realistic large-scale test data sets with more than 100 000 demand points is very limited. The presented data set covers the entire area of Slovakia and consists of the graph of the road network and almost 700 000 connected demand points. The population of 5.5 million inhabitants is allocated to the locations of the demand points using the residential population grid to estimate the size of the demand. The resolution of the demand point locations is 100 metres. With this article, the test data is made publicly available to enable other researchers to investigate their algorithms. The second area of its utilisation is the design of methods to eliminate the aggregation errors that are usually present when considering location problems of such size. The data set is related to two research articles: A Versatile Adaptive Aggregation Framework for Spatially Large Discrete Location-Allocation Problem [1] and Effects of demand estimates on the evaluation and optimality of service centre locations [2].


Specifications Table

Subject area

applied mathematics, operations research, discrete optimization

More specific subject area

location analysis, geographic information systems

Type of data

graph of the road network, weighted demand points derived from GIS data and residential population grid

How data was acquired

The data set was created by combining publicly available data sets such as OpenStreetMap and the residential population grid.

Data format

csv text files, shapefiles

Data source location

Slovakia (Longitude 17.001 - 22.110, Latitude 47.732 - 49.586)

Data accessibility

Data is published together with the article. Moreover, data is published on the professional web page of one of the co-authors:
http://frdsa.uniza.sk/~buzna/page5/page5.html


Value of the Data


1 Data


The central component of the Slovakia benchmark is a graph consisting of 1 956 067 georeferenced nodes connected by 2 080 694 edges, which represent the road sections covering the entire area of Slovakia. Of these nodes, 663 203 identify the potential population demand distribution derived from the residential population density. In the literature, it is common to refer to these points as demand points (DPs). In populated areas, a potential demand is located approximately every 100 metres and connected to the road network (see Figure 1 for illustration).
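To illustrate how a graph of this kind could be used, the following minimal Python sketch loads a toy node table and edge table, selects the demand points, and computes network distances from one node. It is only a sketch under stated assumptions: the column names (node_id, lon, lat, population, from_id, to_id, length_m), the tiny inline tables, and the treatment of edges as undirected are hypothetical and are not taken from the published files, whose actual layout should be checked before use.

```python
import csv
import heapq
import io
from collections import defaultdict

# Hypothetical stand-ins for the node and edge files; the real column
# names, units, and values in the published data set may differ.
NODES_CSV = """node_id,lon,lat,population
1,17.10,48.15,0
2,17.11,48.15,120
3,17.12,48.16,45
4,17.13,48.16,0
"""
EDGES_CSV = """from_id,to_id,length_m
1,2,100
2,3,150
3,4,100
1,4,500
"""


def load_graph(nodes_file, edges_file):
    """Read node populations and build an adjacency list from edge rows."""
    population = {}
    for row in csv.DictReader(nodes_file):
        population[int(row["node_id"])] = float(row["population"])
    adjacency = defaultdict(list)
    for row in csv.DictReader(edges_file):
        u, v = int(row["from_id"]), int(row["to_id"])
        w = float(row["length_m"])
        # Assumption: road sections are traversable in both directions.
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))
    return population, adjacency


def shortest_distances(adjacency, source):
    """Dijkstra's algorithm: network distance from source to reachable nodes."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adjacency[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


population, adjacency = load_graph(io.StringIO(NODES_CSV), io.StringIO(EDGES_CSV))

# Demand points are the nodes that carry a positive population weight.
demand_points = {n: p for n, p in population.items() if p > 0}

# Population-weighted distance from a candidate facility at node 1.
dist = shortest_distances(adjacency, source=1)
weighted_distance = sum(p * dist[n] for n, p in demand_points.items())
```

With the real files, the two io.StringIO objects would be replaced by opened file handles. For a graph of almost two million nodes, a plain-Python Dijkstra run per candidate location is likely to be slow, so a dedicated graph library may be preferable.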