Afshin Sadeghi, Hirra Malik, Diego Collarana, Jens Lehmann
Knowledge graphs (KGs) encode facts about the world in a graph data structure where entities, represented as nodes, connect via relationships, acting as edges. KGs are widely used in Machine Learning, e.g., to solve Natural Language Processing based tasks. Despite all the advancements in KGs, they plummet when it comes to completeness. Link Prediction based on KG embeddings targets the sparsity and incompleteness of KGs. Available datasets for Link Prediction do not consider different graph patterns, making it difficult to measure the performance of link prediction models on different KG settings. This paper presents a diverse set of pragmatic datasets to facilitate flexible and problem-tailored Link Prediction and Knowledge Graph Embeddings research. We define graph relational patterns, from being entirely inductive in one set to being transductive in the other. For each dataset, we provide uniform evaluation metrics. We analyze the models over our datasets to compare the model’s capabilities on a specific dataset type. Our analysis of datasets over state-of-the-art models provides a better insight into the suitable parameters for each situation, optimizing the KG-embedding-based systems.