Constructing a Visual Dataset to Study the Effects of Spatial Apartheid in South Africa

Part of Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks 2021) round2

Bibtex Paper Reviews And Public Comment » Supplemental


Raesetje Sefala, Timnit Gebru, Nyalleng Moorosi, Luzango Mfupe, Richard Klein


Aerial images of neighborhoods in South Africa show the clear legacy of Apartheid, a former policy of political and economic discrimination against non-European groups, with completely segregated neighborhoods of townships next to gated wealthy areas. This paper introduces the first publicly available dataset to study the evolution of spatial apartheid, using 6,768 high resolution satellite images of 9 provinces in South Africa. Our dataset was created using polygons demarcating land use, geographically labelled coordinates of buildings in South Africa, and high resolution satellite imagery covering the country from 2006-2017. We describe our iterative process to create this dataset, which includes pixel wise labels for 4 classes of neighborhoods: wealthy areas, non wealthy areas, non residential neighborhoods and vacant land. While datasets 7 times smaller than ours have cost over 1M to annotate, our dataset was created with highly constrained resources. We finally show examples of applications examining the evolution of neighborhoods in South Africa using our dataset.