Knowledge Graph Embedding Space Identification with Hyperparameter Optimization

This thesis examines how model hyperparameters, such as the dimensionality and curvature of the embedding space, affect embedding quality, and assesses the applicability of different hyperparameter optimization methods.

Completed Bachelor Thesis

Knowledge graph embedding (KGE) is a successful method for learning compact representations of the entities and relations in knowledge graphs. Recent work has demonstrated that KGE performance is highly sensitive to the choice of embedding space [1]. Determining a suitable embedding space (e.g. Euclidean, hyperbolic, spherical or mixed) and its hyperparameter configuration for a KGE task is still an open problem.

Recent work has used the neighborhood growth rate [2] and Ricci graph curvature [3] to identify the embedding space, but these methods focus only on homogeneous graphs and ignore the heterogeneity and complexity of knowledge graphs (KGs). Moreover, they consider only the curvature of the geometric space and assume that its dimension is fixed, ignoring the impact of the embedding dimension on the embedding space.

In this thesis, you will explore hyperparameter optimization [4] to discover a suitable embedding space for KGE tasks, i.e. a geometric space such as a Euclidean, hyperbolic, spherical or mixed space. You will experimentally evaluate how the embedding space and its properties influence embedding performance on KGE tasks. Optionally, you can explore hyperparameter optimization for mixed-curvature spaces, which is a more challenging problem.
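To make the candidate geometries concrete, the following is a minimal sketch of the distance functions in the three constant-curvature spaces named above. It assumes the Poincaré ball model for hyperbolic space and a curvature magnitude c > 0; the function names and the choice of model are illustrative assumptions and are not prescribed by the thesis description.

```python
import numpy as np

def euclidean_distance(x, y):
    """Distance in flat (zero-curvature) space."""
    return float(np.linalg.norm(x - y))

def hyperbolic_distance(x, y, c=1.0):
    """Poincare-ball distance for curvature -c (c > 0).

    Assumes both points lie inside the ball, i.e. c * ||x||^2 < 1.
    """
    sq_dist = np.sum((x - y) ** 2)
    denom = (1.0 - c * np.sum(x ** 2)) * (1.0 - c * np.sum(y ** 2))
    return float(np.arccosh(1.0 + 2.0 * c * sq_dist / denom) / np.sqrt(c))

def spherical_distance(x, y, c=1.0):
    """Great-circle distance on the sphere of curvature c (radius 1/sqrt(c)).

    Assumes both points lie on that sphere, i.e. ||x|| = ||y|| = 1/sqrt(c).
    """
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    cos_angle = np.clip(c * np.dot(x, y), -1.0, 1.0)
    return float(np.arccos(cos_angle) / np.sqrt(c))
```

In this view, the type of space, the curvature magnitude c and the embedding dimension are exactly the quantities that a hyperparameter search would vary.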

Potential work packages are:

  • Formulating the embedding space identification as a hyperparameter optimization problem.
  • Conducting hyperparameter optimization to find a suitable embedding space for the KGE task. To eliminate the impact of correlations between hyperparameters, an orthogonal design for model-free hyperparameter optimization [5] is a potential direction.
  • Evaluating the impact of embedding space and its dimension on the KGE task.
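The first two work packages can be sketched as follows, assuming a search space of four factors with three levels each. The factor names, the level values and the use of a 9-run orthogonal array are illustrative assumptions, not the configuration used in the thesis. The array is built so that every pair of factors covers all nine level combinations exactly once, which is what allows the effect of each factor to be averaged independently of the others.

```python
from itertools import product

# Hypothetical search space: factor names and levels are assumptions.
SEARCH_SPACE = {
    "space": ["euclidean", "hyperbolic", "spherical"],
    "dim": [32, 64, 128],
    "curvature": [0.5, 1.0, 2.0],  # magnitude; ignored for the Euclidean space
    "lr": [1e-3, 1e-2, 1e-1],
}

def l9_orthogonal_array():
    """Return a 9-run, 4-factor, 3-level orthogonal array of level indices."""
    return [
        (a, b, (a + b) % 3, (a + 2 * b) % 3)
        for a, b in product(range(3), repeat=2)
    ]

def configurations(space=SEARCH_SPACE):
    """Map each row of the array to a concrete hyperparameter configuration."""
    names = list(space)
    return [
        {name: space[name][level] for name, level in zip(names, row)}
        for row in l9_orthogonal_array()
    ]

def main_effects(configs, scores):
    """Average the score over all runs that share a given factor level."""
    grouped = {}
    for config, score in zip(configs, scores):
        for name, value in config.items():
            grouped.setdefault(name, {}).setdefault(value, []).append(score)
    return {
        name: {value: sum(vals) / len(vals) for value, vals in levels.items()}
        for name, levels in grouped.items()
    }
```

Each of the nine configurations would be passed to a training and evaluation routine (not shown here), and `main_effects` then indicates which level of each factor performs best on average. A full grid over the same space would need 81 runs.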

You will evaluate the method on link prediction tasks using standard benchmarks such as WN18RR [6], FB15k-237 [7] and YAGO3-10 [8]. These datasets exhibit natural hierarchical structure, which makes them suitable for evaluating KGE with different underlying embedding spaces.
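Link prediction is commonly scored with mean reciprocal rank (MRR) and Hits@k. The description above does not name the metrics, so their use here is an assumption, as are the function names. A minimal sketch, assuming that a higher model score means a more plausible candidate:

```python
import numpy as np

def rank_of_true(scores, true_index):
    """Rank (1 = best) of the true candidate among all scored candidates."""
    scores = np.asarray(scores, dtype=float)
    # Count candidates that score strictly higher than the true one.
    return int(1 + np.sum(scores > scores[true_index]))

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of test triples whose true candidate is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

These functions would be applied to the ranks collected over all test triples of a benchmark, once per candidate embedding space and dimension.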

 

[1] Chami, I., Wolf, A., Juan, D.C., Sala, F., Ravi, S. and Ré, C., 2020. Low-Dimensional Hyperbolic Knowledge Graph Embeddings. arXiv preprint arXiv:2005.00545.

[2] Weber, M., 2020, June. Neighborhood Growth Determines Geometric Priors for Relational Representation Learning. In International Conference on Artificial Intelligence and Statistics (pp. 266-276). PMLR.

[3] Prokhorenkova, L., Samosvat, E. and van der Hoorn, P., 2020, September. Global Graph Curvature. In International Workshop on Algorithms and Models for the Web-Graph (pp. 16-35). Springer, Cham.

[4] Yu, T. and Zhu, H., 2020. Hyper-Parameter Optimization: A Review of Algorithms and Applications. arXiv preprint arXiv:2003.05689.

[5] Zhang, X., Chen, X., Yao, L., Ge, C. and Dong, M., 2019, December. Deep neural network hyperparameter optimization with orthogonal array tuning. In International Conference on Neural Information Processing (pp. 287-295). Springer, Cham.

[6] Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J. and Yakhnenko, O., 2013. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems (pp. 2787-2795).

[7] Bollacker, K., Evans, C., Paritosh, P., Sturge, T. and Taylor, J., 2008, June. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data (pp. 1247-1250).

[8] Mahdisoltani, F., Biega, J. and Suchanek, F.M., 2013, January. Yago3: A knowledge base from multilingual wikipedias.

