Pan, X., Hernández, D., Seifer, P., Lämmel, R., & Staab, S. (2024). eSPARQL: Representing and Reconciling Agnostic and Atheistic Beliefs in RDF-star Knowledge Graphs. In G. Demartini, K. Hose, M. Acosta, M. Palmonari, G. Cheng, H. Skaf-Molli, N. Ferranti, D. Hernández, & A. Hogan (Eds.),
The Semantic Web - ISWC 2024 - 23rd International Semantic Web Conference, Baltimore, MD, USA, November 11-15, 2024, Proceedings, Part II (Vol. 15232, pp. 155–172). Springer.
https://doi.org/10.1007/978-3-031-77850-6_9
Abstract
Over the past few years, we have seen the emergence of large knowledge graphs combining information from multiple sources. Sometimes, this information is provided in the form of assertions about other assertions, defining contexts where assertions are valid. A recent extension to RDF which admits statements over statements, called RDF-star, is in revision to become a W3C standard. However, there is no proposal for a semantics of these RDF-star statements nor a built-in facility to operate over them. In this paper, we propose a query language for epistemic RDF-star metadata based on a four-valued logic, called eSPARQL. Our proposed query language extends SPARQL-star, the query language for RDF-star, with a new type of FROM clause to facilitate operating with multiple and sometimes conflicting beliefs. We show that the proposed query language can express four use case queries, including the following features: (i) querying the beliefs of an individual, (ii) aggregating beliefs, (iii) querying who is in conflict with whom, and (iv) beliefs about beliefs (i.e., nesting of beliefs).
He, Y., Hernandez, D., Nayyeri, M., Xiong, B., Zhu, Y., Kharlamov, E., & Staab, S. (2024). Generating SROI⁻ Ontologies via Knowledge Graph Query Embedding Learning. In U. Endriss, F. S. Melo, K. Bach, A. J. Bugarín Diz, J. M. Alonso-Moral, S. Barro, & F. Heintz (Eds.),
ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain - Including 13th Conference on Prestigious Applications of Intelligent Systems (PAIS 2024) (Vol. 392, pp. 4279–4286). IOS Press.
https://doi.org/10.3233/FAIA241002
Abstract
Query embedding approaches answer complex logical queries over incomplete knowledge graphs (KGs) by computing and operating on low-dimensional vector representations of entities, relations, and queries. However, current query embedding models heavily rely on excessively parameterized neural networks and cannot explain the knowledge learned from the graph. We propose a novel query embedding method, AConE, which explains the knowledge learned from the graph in the form of SROI− description logic axioms while being more parameter-efficient than most existing approaches. AConE associates queries with SROI− description logic concepts. Every SROI− concept is embedded as a cone in complex vector space, and each SROI− relation is embedded as a transformation that rotates and scales cones. We show theoretically that AConE can learn SROI− axioms and that it defines an algebra whose operations correspond one-to-one to SROI− description logic concept constructs. Our empirical study on multiple query datasets shows that AConE achieves superior results over previous baselines with fewer parameters. Notably, on the WN18RR dataset, AConE achieves significant improvement over baseline models. We provide comprehensive analyses showing that the capability to represent axioms positively impacts the results of query answering.
Navigli, R., Lo Pinto, M., Silvestri, P., Rotondi, D., Ciciliano, S., & Scirè, A. (2024). NounAtlas: Filling the Gap in Nominal Semantic Role Labeling. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.),
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 16245–16258). Association for Computational Linguistics.
https://aclanthology.org/2024.acl-long.857
Abstract
Despite significant advances in Semantic Role Labeling (SRL), much work in this field has been carried out with a focus on verbal predicates, with the research on nominal SRL lagging behind. In many contexts, however, nominal predicates are often as informative as verbal ones, thus needing proper treatment. In this paper we aim to fill this gap and make nominal SRL a first-class citizen. We introduce a novel approach to create the first large-scale, high-quality inventory of nominal predicates and organize them into semantically-coherent frames. Although automatically created, NounAtlas -- our frame inventory -- is subsequently fully validated. We then put forward a technique to generate silver training data for nominal SRL and show that a state-of-the-art SRL model can achieve good performance. Interestingly, thanks to our design choices which enable seamless integration of our predicate inventory with its verbal counterpart, we can mix verbal and nominal data and perform robust SRL on both types of predicates.
Seifer, P., Hernández, D., Lämmel, R., & Staab, S. (2024). Inferring SHACL Constraints for Results of Composable Graph Queries (Extended Abstract). In L. Giordano, J. C. Jung, & A. Ozaki (Eds.),
Proceedings of the 37th International Workshop on Description Logics (DL 2024), Bergen, Norway, June 18-21, 2024 (Vol. 3739). CEUR-WS.org.
https://ceur-ws.org/Vol-3739/abstract-23.pdf
Abstract
SPARQL CONSTRUCT queries allow for the specification of data processing pipelines that transform given input graphs into new output graphs. Input graphs are now commonly constrained through SHACL shapes allowing for both their validation and aiding users (as well as tools) in understanding their structure. However, it becomes challenging to understand what graph data can be expected at the end of a data processing pipeline without knowing the particular input data: Shape constraints on the input graph may affect the output graph, but may no longer apply literally, and new shapes may be imposed by the query itself. In our recent work, From Shapes to Shapes: Inferring SHACL Shapes for Results of SPARQL CONSTRUCT Queries, we studied the derivation of shape constraints that hold on all possible output graphs of a given SPARQL CONSTRUCT query by axiomatizing the query and the shapes with the ALCHOI description logic. This extended abstract summarizes our previous work.
Asma, Z., Hernández, D., Galárraga, L., Flouris, G., Fundulaki, I., & Hose, K. (2024, May). NPCS: Native Provenance Computation for SPARQL.
Proceedings of the ACM Web Conference 2024 (WWW ’24), May 13–17, 2024, Singapore, Singapore.
https://doi.org/10.1145/3589334.3645557
Abstract
The popularity of Knowledge Graphs (KGs) both in industry and academia owes credit to their flexible data model, suitable for data integration from multiple sources. Several KG-based applications such as trust assessment or view maintenance on dynamic data rely on the ability to compute provenance explanations for query results. The how-provenance of a query result is an expression that encodes the records (triples or facts) that explain its inclusion in the result set. This article proposes NPCS, a Native Provenance Computation approach for SPARQL queries. NPCS annotates query results with their how-provenance. By building upon spm-provenance semirings, NPCS supports both monotonic and non-monotonic SPARQL queries. Thanks to its reliance on query rewriting techniques, the approach is directly applicable to already deployed SPARQL engines using different reification schemes -- including RDF-star. Our experimental evaluation on two popular SPARQL engines (GraphDB and Stardog) shows that our novel query rewriting brings a significant runtime improvement over existing query rewriting solutions, scaling to RDF graphs with billions of triples.
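The how-provenance idea the abstract describes can be illustrated with a small toy annotator. The Python sketch below is only an illustration of provenance semirings in general, not the NPCS query rewriting; the relation names, attributes, and triple IDs (t1–t4) are invented for the example. Union of alternative derivations sums provenance polynomials (+), and join multiplies them (*).

```python
from itertools import product

def join(lhs, rhs, key):
    """Join two annotated relations on `key`; the provenance of a
    joined row is the product of its contributing rows' provenances."""
    out = []
    for (row1, p1), (row2, p2) in product(lhs, rhs):
        if row1[key] == row2[key]:
            out.append(({**row1, **row2}, f"({p1} * {p2})"))
    return out

def union(lhs, rhs):
    """Union two annotated relations; a row derivable in several ways
    gets the sum of the provenances of its derivations."""
    combined = {}
    for row, p in lhs + rhs:
        k = tuple(sorted(row.items()))
        combined[k] = f"({combined[k]} + {p})" if k in combined else p
    return [(dict(k), p) for k, p in combined.items()]

# Annotated facts: IDs t1..t4 stand for hypothetical input triples.
born = [({"person": "ada", "year": "1815"}, "t1")]
lives = [({"person": "ada", "city": "london"}, "t2"),
         ({"person": "ada", "city": "paris"}, "t3")]

joined = join(born, lives, "person")
# joined rows carry provenances "(t1 * t2)" and "(t1 * t3)"

dup = union([({"city": "london"}, "t2")],
            [({"city": "london"}, "t4")])
# the duplicated row carries "(t2 + t4)": two ways to derive it
```

Symbolic polynomials like these are what a provenance-aware engine attaches to each answer, so downstream applications can tell exactly which input triples justify it.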
Blomqvist, E., García-Castro, R., Hernández, D., Hitzler, P., Lindecrantz, M., & Poveda-Villalón, M. (2024).
Proceedings of the 2nd International Workshop on Knowledge Graphs for Sustainability (KG4S 2024) colocated with the 21st Extended Semantic Web Conference (ESWC 2024) (Vol. 3753). CEUR-WS.org.
https://ceur-ws.org/Vol-3753/
BibTeX
Elenter, J., Chamon, L. F. O., & Ribeiro, A. (2024, May). Near-Optimal Solutions of Constrained Learning Problems.
Proceedings of the International Conference on Learning Representations (ICLR 2024), May 7–11, 2024, Austria.
https://doi.org/10.48550/arXiv.2403.11844
Abstract
With the widespread adoption of machine learning systems, the need to curtail their behavior has become increasingly apparent. This is evidenced by recent advancements towards developing models that satisfy robustness, safety, and fairness requirements. These requirements can be imposed (with generalization guarantees) by formulating constrained learning problems that can then be tackled by dual ascent algorithms. Yet, though these algorithms converge in objective value, even in non-convex settings, they cannot guarantee that their outcome is feasible. Doing so requires randomizing over all iterates, which is impractical in virtually any modern application. Still, final iterates have been observed to perform well in practice. In this work, we address this gap between theory and practice by characterizing the constraint violation of Lagrangian minimizers associated with optimal dual variables, despite lack of convexity. To do this, we leverage the fact that non-convex, finite-dimensional constrained learning problems can be seen as parametrizations of convex, functional problems. Our results show that rich parametrizations effectively mitigate the issue of feasibility in dual methods, shedding light on prior empirical successes of dual learning. We illustrate our findings in fair learning tasks.
Elshani, D., Dervishaj, A., Hernández, D., Gudmundsson, K., Staab, S., & Wortmann, T. (2024). An Ontology for the Reuse and Tracking of Prefabricated Building Components.
Proceedings of the 2nd International Workshop on Knowledge Graphs for Sustainability (KG4S 2024) Colocated with the 21st Extended Semantic Web Conference (ESWC 2024),
3753, 53–64.
https://ceur-ws.org/Vol-3753/paper5.pdf
BibTeX
Errica, F., & Niepert, M. (2024, May). Tractable Probabilistic Graph Representation Learning with Graph-Induced Sum-Product Networks.
Proceedings of the International Conference on Learning Representations (ICLR 2024), May 7–11, 2024, Austria.
https://doi.org/10.48550/arXiv.2305.10544
Abstract
We introduce Graph-Induced Sum-Product Networks (GSPNs), a new probabilistic framework for graph representation learning that can tractably answer probabilistic queries. Inspired by the computational trees induced by vertices in the context of message-passing neural networks, we build hierarchies of sum-product networks (SPNs) where the parameters of a parent SPN are learnable transformations of the a-posteriori mixing probabilities of its children's sum units. Due to weight sharing and the tree-shaped computation graphs of GSPNs, we obtain the efficiency and efficacy of deep graph networks with the additional advantages of a probabilistic model. We show the model's competitiveness on scarce supervision scenarios, under missing data, and for graph classification in comparison to popular neural models. We complement the experiments with qualitative analyses on hyper-parameters and the model's ability to answer probabilistic queries.
Hagnberger, J., Kalimuthu, M., Musekamp, D., & Niepert, M. (2024, May). Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent PDEs. Proceedings of the AI4DifferentialEquations in Science Workshop at ICLR 2024, May 7-11, 2024, Austria.
Abstract
Neural Operators are a recent class of data-driven models for learning solutions to Partial Differential Equations (PDEs). Traditionally, these models are trained in an autoregressive fashion using data collected at discrete time points in the evolution of the PDE. This setup gives rise to two problems: (i) poor temporal generalization due to error accumulation and (ii) poor zero-shot super-resolution capabilities. To address these issues, we propose Vectorized Conditional Neural Fields (VCNeF), a general framework that utilizes transformers and implicit neural representations to efficiently solve time-dependent PDEs of varying coefficients. A comprehensive evaluation of VCNeF on the challenging 1D and 2D PDEs from PDEBench demonstrates the superiority of our model over four state-of-the-art baselines. Furthermore, our proposed model achieves faster inference and generalizes better to unseen PDE parameters than the compared models.
Liu, A., Niepert, M., & den Broeck, G. V. (2024, May). Image Inpainting via Tractable Steering of Diffusion Models.
Proceedings of the International Conference on Learning Representations (ICLR 2024), May 7–11, 2024, Austria.
https://doi.org/10.48550/arXiv.2401.03349
Abstract
Diffusion models are the current state of the art for generating photorealistic images. However, controlling the sampling process for constrained image generation tasks such as inpainting remains challenging since exact conditioning on such constraints is intractable. While existing methods use various techniques to approximate the constrained posterior, this paper proposes to exploit the ability of Tractable Probabilistic Models (TPMs) to exactly and efficiently compute the constrained posterior, and to leverage this signal to steer the denoising process of diffusion models. Specifically, this paper adopts a class of expressive TPMs termed Probabilistic Circuits (PCs). Building upon prior advances, we further scale up PCs and make them capable of guiding the image generation process of diffusion models. Empirical results suggest that our approach can consistently improve the overall quality and semantic coherence of inpainted images across three natural image datasets (i.e., CelebA-HQ, ImageNet, and LSUN) with only ~10% additional computational overhead brought by the TPM.
Qian, C., Manolache, A., Ahmed, K., Zeng, Z., den Broeck, G. V., Niepert, M., & Morris, C. (2024, May). Probabilistically Rewired Message-Passing Neural Networks.
Proceedings of the International Conference on Learning Representations (ICLR 2024), May 7–11, 2024, Austria.
https://doi.org/10.48550/arXiv.2310.02156
Abstract
Message-passing graph neural networks (MPNNs) emerged as powerful tools for processing graph-structured input. However, they operate on a fixed input graph structure, ignoring potential noise and missing information. Furthermore, their local aggregation mechanism can lead to problems such as over-squashing and limited expressive power in capturing relevant graph structures. Existing solutions to these challenges have primarily relied on heuristic methods, often disregarding the underlying data distribution. Hence, devising principled approaches for learning to infer graph structures relevant to the given prediction task remains an open challenge. In this work, leveraging recent progress in exact and differentiable k-subset sampling, we devise probabilistically rewired MPNNs (PR-MPNNs), which learn to add relevant edges while omitting less beneficial ones. For the first time, our theoretical analysis explores how PR-MPNNs enhance expressive power, and we identify precise conditions under which they outperform purely randomized approaches. Empirically, we demonstrate that our approach effectively mitigates issues like over-squashing and under-reaching. In addition, on established real-world datasets, our method exhibits competitive or superior predictive performance compared to traditional MPNN models and recent graph transformer architectures.
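The k-subset sampling that PR-MPNNs build on can be sketched with the Gumbel-top-k trick: perturb each candidate edge's score with Gumbel noise and keep the k largest. This is only a plain sampler under assumed, invented edge names and scores; the actual method uses exact and differentiable relaxations rather than this toy.

```python
import math
import random

def gumbel_top_k(scores, k, rng):
    """Sample k items without replacement, with inclusion probability
    growing in exp(score): add Gumbel(0, 1) noise to each score and
    keep the k largest perturbed values."""
    perturbed = {
        item: s - math.log(-math.log(rng.random()))
        for item, s in scores.items()
    }
    return sorted(perturbed, key=perturbed.get, reverse=True)[:k]

rng = random.Random(0)
# Hypothetical candidate edges with learned priority scores
# (higher score = the edge is more useful for the prediction task).
edge_scores = {("a", "b"): 2.0, ("a", "c"): 0.1,
               ("b", "c"): 1.5, ("c", "d"): -1.0}
kept = gumbel_top_k(edge_scores, k=2, rng=rng)  # a random 2-subset
```

High-scoring edges like ("a", "b") are kept most of the time, but the noise means every edge has some chance of selection, which is what makes the rewiring probabilistic rather than a fixed heuristic.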
Seifer, P., Hernández, D., Lämmel, R., & Staab, S. (2024, May). From Shapes to Shapes: Inferring SHACL Shapes for Results of SPARQL CONSTRUCT Queries.
Proceedings of the ACM Web Conference 2024 (WWW ’24), May 13–17, 2024, Singapore, Singapore.
https://doi.org/10.1145/3589334.3645550
Abstract
SPARQL CONSTRUCT queries allow for the specification of data processing pipelines that transform given input graphs into new output graphs. It is now common to constrain graphs through SHACL shapes, allowing users to understand which data they can expect and which they cannot. However, it becomes challenging to understand what graph data can be expected at the end of a data processing pipeline without knowing the particular input data: Shape constraints on the input graph may affect the output graph, but may no longer apply literally, and new shapes may be imposed by the query template. In this paper, we study the derivation of shape constraints that hold on all possible output graphs of a given SPARQL CONSTRUCT query. We assume that the SPARQL CONSTRUCT query is fixed, e.g., being part of a program, whereas the input graphs adhere to input shape constraints but may otherwise vary over time and, thus, are mostly unknown. We study a fragment of SPARQL CONSTRUCT queries (SCCQ) and a fragment of SHACL (Simple SHACL). We formally define the problem of deriving the most restrictive set of Simple SHACL shapes that constrain the results from evaluating a SCCQ over any input graph restricted by a given set of Simple SHACL shapes. We propose and implement an algorithm that statically analyses input SHACL shapes and CONSTRUCT queries and prove its soundness and complexity.
Tran, H.-C., Nguyen, D. M. H., Nguyen, M.-D., Le, N. H., & T. Nguyen, B. (2024, May). Energy Minimizing-based Token Merging for Accelerating Transformers. Proceedings of Practical ML for Low Resource Settings in Science Workshop at ICLR 2024, May 7-11, 2024, Austria.
Abstract
Model compression is an active research field aimed at reducing the size and complexity of models. In a recent noteworthy study, ToMe and its variants utilize the Bipartite Soft Matching (BSM) algorithm, in which tokens representing patches in an image are split into two sets, and the top-k most similar tokens from one set are merged. This approach utilizes pre-trained weights, enhances speed, and reduces memory usage. However, these algorithms have some drawbacks. First, the choice of a token-splitting strategy significantly influences algorithm performance, since tokens in one set can only perceive tokens in the other set, leading to mis-merging issues. Furthermore, although ToMe is effective in the initial layers, it becomes increasingly problematic in deeper layers as the number of tokens diminishes, because informative tokens are damaged. To address these limitations, rather than relying on specific splitting strategies like BSM, we propose a new algorithm called PiToMe. Specifically, we prioritize the protection of informative tokens using an additional factor called the energy score. In experiments, PiToMe achieved up to a 50% memory reduction while exhibiting superior off-the-shelf performance on image classification (keeping a 1.71% average performance drop compared to 2.6% for ToMe) and image-text retrieval (1.35% average performance drop compared to 6.89% for ToMe) compared to previous BSM-based approaches that depend solely on token similarity.
Hosseini, A. S., & Staab, S. (2024). Disambiguating Emotional Connotations of Words Using Contextualized Word Representations.
BibTeX
Asma, Z., Hernandez, D., Galárraga, L., Flouris, G., Fundulaki, I., & Hose, K. (2024).
Code and benchmark for NPCS, a Native Provenance Computation for SPARQL.
https://doi.org/10.18419/darus-3973
Abstract
Code for the implementation and benchmark of NPCS, a Native Provenance Computation for SPARQL. The code in this dataset includes the implementation of the NPCS system, a middleware for SPARQL endpoints that rewrites queries into queries that annotate answers with provenance polynomials (i.e., how-provenance data). The translation rules implemented for the query rewriting can be found in the paper. The dataset also contains scripts and services to automate query execution. We use GraphDB (version 10.2.0) and Stardog (version 9.1.0) for the SPARQL endpoints. Because of license restrictions, these software products cannot be included in this dataset and must be downloaded from the respective vendors. The data must also be loaded using the respective bulk loaders of GraphDB and Stardog. The datasets used in the experiments can be generated with the synthetic dataset generator of the WatDiv benchmark. The Wikidata dataset corresponds to the full RDF dump from May 22, 2023. Do not hesitate to contact the authors for any inquiries.
Abstract
Knowledge Graphs (KGs) are fundamental for organizing and representing large amounts of information, but they often suffer from incompleteness. Link prediction using Knowledge Graph Embedding (KGE) methods has emerged as a solution to this problem. Many different methods have been proposed to perform link prediction, some of which are a combination of different methods. However, existing approaches that combine different methods typically train models on the entire graph, lacking the diversity seen in machine learning ensembles such as bagging and random forests. In this thesis, we present the novel ensemble approaches UnifEnt and UnifFeat, that divide the KG into sub-graphs by taking advantage of the core principles of bagging and random forests. We evaluated our approach on common KG datasets and showed the benefits of using our method by comparing it to common KGE baseline methods, as well as related work in the area of ensemble methods for link prediction.
Chamon, L. F. O., Karimi, M. R., & Korba, A. (2024). Constrained Sampling with Primal-Dual Langevin Monte Carlo.
In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
https://doi.org/10.48550/arXiv.2411.00568
Abstract
This work considers the problem of sampling from a probability distribution known up to a normalization constant while satisfying a set of statistical constraints specified by the expected values of general nonlinear functions. This problem finds applications in, e.g., Bayesian inference, where it can constrain moments to evaluate counterfactual scenarios or enforce desiderata such as prediction fairness. Methods developed to handle support constraints, such as those based on mirror maps, barriers, and penalties, are not suited for this task. This work therefore relies on gradient descent-ascent dynamics in Wasserstein space to put forward a discrete-time primal-dual Langevin Monte Carlo algorithm (PD-LMC) that simultaneously constrains the target distribution and samples from it. We analyze the convergence of PD-LMC under standard assumptions on the target distribution and constraints, namely (strong) convexity and log-Sobolev inequalities. To do so, we bring classical optimization arguments for saddle-point algorithms to the geometry of Wasserstein space. We illustrate the relevance and effectiveness of PD-LMC in several applications.
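The primal-dual interplay the abstract describes can be illustrated in one dimension: Langevin steps sample from the current Lagrangian while projected dual ascent adjusts the constraint multiplier. The sketch below is a minimal illustration under assumed toy choices (a standard Gaussian target, a mean constraint, and arbitrary step sizes), not the paper's algorithm or analysis.

```python
import math
import random

def pd_lmc(steps=30000, eta=0.01, seed=0):
    """Toy primal-dual Langevin Monte Carlo: sample from the target
    U(x) = x^2/2 (a standard Gaussian) subject to E[g(X)] <= 0 with
    g(x) = 1 - x, i.e. the constrained mean must be at least 1."""
    rng = random.Random(seed)
    x, lam = 0.0, 0.0                  # primal state, dual variable
    samples = []
    for _ in range(steps):
        grad = x - lam                 # d/dx [x^2/2 + lam * (1 - x)]
        x = x - eta * grad + math.sqrt(2 * eta) * rng.gauss(0.0, 1.0)
        lam = max(0.0, lam + eta * (1.0 - x))  # projected dual ascent
        samples.append(x)
    return samples, lam

samples, lam = pd_lmc()
# The sampler's long-run mean should drift from 0 toward the
# constraint boundary at 1 as the multiplier lam grows.
constrained_mean = sum(samples[-10000:]) / 10000.0
```

Each Langevin step targets the tilted density exp(-x²/2 + λx), whose mean is λ; dual ascent raises λ until the constraint E[1 - X] ≤ 0 is met, so both converge together.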
Crum, E., Santis, A. D., Ovide, M., Pan, J., Pisu, A., Lazzari, N., & Rudolph, S. (2024). Enriching Ontologies with Disjointness Axioms using Large Language Models.
International Semantic Web Conference 2024.
https://doi.org/10.48550/arXiv.2410.03235
BibTeX
Das, A., Fathallah, N., & Obretincheva, N. (2024). Navigating Nulls, Numbers and Numerous Entities: Robust Knowledge Base Construction from Large Language Models. In Knowledge Base Construction from Pre-trained Language Models Challenge Workshop, ISWC 2024.
BibTeX
Diaz Ochoa, J. G., Mustafa, F. E., Weil, F., Wang, Y., Kama, K., & Knott, M. (2024). The aluminum standard: using generative Artificial Intelligence tools to synthesize and annotate non-structured patient data.
BMC Medical Informatics and Decision Making,
24, Article 1.
https://doi.org/10.1186/s12911-024-02825-4
Abstract
Medical narratives are fundamental to the correct identification of a patient’s health condition, not only because they describe the patient’s situation but also because they contain relevant information about the patient’s context and the evolution of their health state. Narratives are usually vague and cannot be categorized easily. On the other hand, once the patient’s situation is correctly identified from a narrative, it can be mapped into precise, machine-readable classification schemas and ontologies. To this end, language models can be trained to read and extract elements from these narratives. However, the main problem is the lack of data for model identification and training in languages other than English. First, gold-standard annotations are usually not available due to the high level of protection for patient data. Second, gold-standard annotations, if available, are difficult to access. Alternative available data, like MIMIC (Sci Data 3:1, 2016), are written in English and cover specific patient conditions like intensive care. Thus, when model training is required for other types of patients, such as oncology rather than intensive care, this could lead to bias. To facilitate clinical narrative model training, a method for creating high-quality synthetic narratives is needed.
Ding, Z., Cai, H., Wu, J., Ma, Y., Liao, R., Xiong, B., & Tresp, V. (2024). zrLLM: Zero-Shot Relational Learning on Temporal Knowledge Graphs with Large Language Models.
Annual Conference of the North American Chapter of the Association for Computational Linguistics.
https://arxiv.org/abs/2311.10112
BibTeX
Ding, Z., Wu, J., Wu, J., Xia, Y., Xiong, B., & Tresp, V. (2024). Temporal Fact Reasoning over Hyper-Relational Knowledge Graphs. In Y. Al-Onaizan, M. Bansal, & Y.-N. Chen (Eds.),
Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, Florida, USA, November 12-16, 2024 (pp. 355–373). Association for Computational Linguistics.
https://aclanthology.org/2024.findings-emnlp.20
BibTeX
Fathallah, N., Bhole, M., & Staab, S. (2024). Empowering the Deaf and Hard of Hearing Community: Improving Video Captions with Large Language Models. In Proceedings of the 11th International Conference on Software Development and Technologies for Enhancing Accessibility and Fighting Info-exclusion.
BibTeX
Fathallah, N., Das, A., De Giorgis, G., Poltronieri, A., Haase, P., & Kovriguina, L. (2024). NeOn-GPT: A Large Language Model-Powered Pipeline for Ontology Learning.
Special Track on Large Language Models for Knowledge Engineering, Extended Semantic Web Conference 2024 (ESWC 2024).
https://doi.org/10.5281/ZENODO.11221930
BibTeX
Fathallah, N., Staab, S., & Algergawy, A. (2024). LLMs4Life: Large Language Models for Ontology Learning in Life Sciences. In Proceedings of the ELMKE Workshop on Evaluation of Language Models in Knowledge Engineering co-located with EKAW-24 (24th International Conference on Knowledge Engineering and Knowledge Management).
BibTeX
Hagnberger, J., Kalimuthu, M., Musekamp, D., & Niepert, M. (2024). Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations.
In Proceedings of the 41st International Conference on Machine Learning (ICML 2024).
https://arxiv.org/abs/2406.03919
Abstract
Transformer models are increasingly used for solving Partial Differential Equations (PDEs). Several adaptations have been proposed, all of which suffer from the typical problems of Transformers, such as quadratic memory and time complexity. Furthermore, all prevalent architectures for PDE solving lack at least one of several desirable properties of an ideal surrogate model, such as (i) generalization to PDE parameters not seen during training, (ii) spatial and temporal zero-shot super-resolution, (iii) continuous temporal extrapolation, (iv) support for 1D, 2D, and 3D PDEs, and (v) efficient inference for longer temporal rollouts. To address these limitations, we propose Vectorized Conditional Neural Fields (VCNeFs), which represent the solution of time-dependent PDEs as neural fields. Contrary to prior methods, however, VCNeFs compute, for a set of multiple spatio-temporal query points, their solutions in parallel and model their dependencies through attention mechanisms. Moreover, VCNeFs can condition the neural field on both the initial conditions and the parameters of the PDEs. An extensive set of experiments demonstrates that VCNeFs are competitive with and often outperform existing ML-based surrogate models.
Abstract
This CNVVE Dataset contains raw audio samples encompassing six distinct classes of voice expressions, namely “Uh-huh” or “mm-hmm”, “Uh-uh” or “mm-mm”, “Hush” or “Shh”, “Psst”, “Ahem”, and continuous humming, e.g., “hmmm”. Audio samples of each class are found in the respective folders. The samples were recorded through a dedicated website for data collection that defines the purpose and type of voice data by providing example recordings to participants as well as the expressions’ written equivalent, e.g., “Uh-huh”. Audio recordings were automatically saved in the .wav format and kept anonymous, with a sampling rate of 48 kHz and a bit depth of 32 bits. This dataset contains a raw version of the samples. A cleaned version of these samples can be found at https://doi.org/10.18419/darus-3898. For more info, please check the paper or feel free to contact the authors for any inquiries.
Abstract
This CNVVE Dataset contains clean audio samples encompassing six distinct classes of voice expressions, namely “Uh-huh” or “mm-hmm”, “Uh-uh” or “mm-mm”, “Hush” or “Shh”, “Psst”, “Ahem”, and continuous humming, e.g., “hmmm”. Audio samples of each class are found in the respective folders. These audio samples have undergone a thorough cleaning process. The raw samples are published at https://doi.org/10.18419/darus-3897. Initially, we applied the Google WebRTC voice activity detection (VAD) algorithm to the given audio files to remove noise or silence from the collected voice signals. The intensity was set to “2”, from a possible range of “1” to “3”. However, because of variations in the data, some files required additional manual cleaning. These outliers, characterized by sharp click sounds (such as those occurring at the end of recordings), were addressed. The samples were recorded through a dedicated website for data collection that defines the purpose and type of voice data by providing example recordings to participants as well as the expressions’ written equivalent, e.g., “Uh-huh”. Audio recordings were automatically saved in the .wav format and kept anonymous, with a sampling rate of 48 kHz and a bit depth of 32 bits. For more info, please check the paper or feel free to contact the authors for any inquiries.
Jalali Farahani, F., Hanke, S., Dima, C., Heiberger, R. H., & Staab, S. (2024). Who is targeted? Detecting social group mentions in online political discussions.
Companion Publication of the 16th ACM Web Science Conference, 24–25.
https://doi.org/10.1145/3630744.3658412
Abstract
Social groups are central to political discussions. However, detecting social groups in text often relies on pre-determined socio-demographic categories or supervised learning methods that require extensive hand-labeled datasets. In this paper, we propose a methodology designed to leverage the potential of Large Language Models (LLMs) for the identification and annotation of social groups in text. The experiments show that open LLMs like Llama-2-70B-Chat and Mixtral-8-7B can reliably be used to annotate social groups in a few-shot scenario without the need for supervised learning. The automatically obtained annotations largely match human annotations on random samples from the Reddit Politosphere, resulting in micro-F1 scores of 0.71 and 0.83, respectively.BibTeX
Liu, X., Liu, A., den Broeck, G. V., & Liang, Y. (2024). A Tractable Inference Perspective of Offline RL.
In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
https://doi.org/10.48550/arXiv.2311.00094
Abstract
A popular paradigm for offline Reinforcement Learning (RL) tasks is to first fit the offline trajectories to a sequence model, and then prompt the model for actions that lead to high expected return. In addition to obtaining accurate sequence models, this paper highlights that tractability, the ability to exactly and efficiently answer various probabilistic queries, plays an important role in offline RL. Specifically, due to the fundamental stochasticity from the offline data-collection policies and the environment dynamics, highly non-trivial conditional/constrained generation is required to elicit rewarding actions. While it is still possible to approximate such queries, we observe that such crude estimates significantly undermine the benefits brought by expressive sequence models. To overcome this problem, this paper proposes Trifle (Tractable Inference for Offline RL), which leverages modern Tractable Probabilistic Models (TPMs) to bridge the gap between good sequence models and high expected returns at evaluation time. Empirically, Trifle achieves state-of-the-art scores on 9 Gym-MuJoCo benchmarks against strong baselines. Further, owing to its tractability, Trifle significantly outperforms prior approaches in stochastic environments and safe RL tasks (e.g., with action constraints) with minimal algorithmic modifications.BibTeX
Abstract
In this work, we propose a simple transformer-based baseline for multimodal molecular representation learning, integrating three distinct modalities: SMILES strings, 2D graph representations, and 3D conformers of molecules. A key aspect of our approach is the aggregation of 3D conformers, allowing the model to account for the fact that molecules can adopt multiple conformations, an important factor for accurate molecular representation. The tokens for each modality are extracted using modality-specific encoders: a transformer for SMILES strings, a message-passing neural network for 2D graphs, and an equivariant neural network for 3D conformers. The flexibility and modularity of this framework enable easy adaptation and replacement of these encoders, making the model highly versatile for different molecular tasks. The extracted tokens are then combined into a unified multimodal sequence, which is processed by a downstream transformer for prediction tasks. To efficiently scale our model for large multimodal datasets, we utilize Flash Attention 2 and bfloat16 precision. Despite its simplicity, our approach achieves state-of-the-art results across multiple datasets, demonstrating its effectiveness as a strong baseline for multimodal molecular representation learning.BibTeX
Abstract
Liberal political philosophy advocates for the policy of equal treatment as blindness, which seeks to achieve fairness by treating individuals without considering their protected characteristics directly. However, this policy has faced longstanding criticism for perpetuating existing inequalities. In machine learning, this policy can be translated into the concept of fairness as unawareness, and be measured using disparate treatment metrics such as Demographic Parity (a.k.a. Statistical Parity). Our analysis reveals that Demographic Parity does not faithfully measure whether individuals are being treated independently of the protected attribute by the model. We introduce the Explanation Disparity metric to measure fairness under the equal treatment as blindness policy. Our metric evaluates the fairness of predictive models by analyzing the extent to which the protected attribute can be inferred from the distribution of explanation values, specifically using Shapley values. The proposed metric tests for statistical independence of the explanation distributions over populations with different protected characteristics. We show the theoretical properties of "Explanation Disparity" and devise an equal treatment inspector based on the AUC of a Classifier Two-Sample Test. We experiment with synthetic and natural data to demonstrate and compare the notion with related ones.BibTeX
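The core intuition above can be illustrated numerically: if explanation values for the two protected groups are distinguishable, a discriminator scores an AUC above 0.5. The sketch below uses a trivial one-dimensional discriminator (the explanation value itself) scored with the rank-based AUC (Mann-Whitney statistic); the paper's inspector instead trains a full classifier on multi-dimensional Shapley vectors, so this is a deliberate simplification.

```python
# AUC of distinguishing two samples of (scalar) explanation values via
# the Mann-Whitney statistic. AUC near 0.5 suggests the explanations
# carry no information about the protected attribute; AUC near 1.0
# suggests disparate treatment. Simplified: the paper uses a trained
# classifier on Shapley-value vectors, not raw 1-D values.

def auc(group_a, group_b):
    """Probability that a random b-score exceeds a random a-score."""
    wins = sum((b > a) + 0.5 * (b == a) for a in group_a for b in group_b)
    return wins / (len(group_a) * len(group_b))
```

With identical explanation distributions the AUC is 0.5 (independence); with fully separated distributions it is 1.0.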
Musekamp, D., Kalimuthu, M., Holzmüller, D., Takamoto, M., & Niepert, M. (2024). Active Learning for Neural PDE Solvers.
NeurIPS 2024 Workshop on Data-Driven and Differentiable Simulations, Surrogates, and Solvers.
https://openreview.net/forum?id=LD63WlGRQQ
Abstract
Solving partial differential equations (PDEs) is a fundamental problem in engineering and science. While neural PDE solvers can be more efficient than established numerical solvers, they often require large amounts of training data that is costly to obtain. Active Learning (AL) could help surrogate models reach the same accuracy with smaller training sets by querying classical solvers with more informative initial conditions and PDE parameters. While AL is more common in other domains, it has yet to be studied extensively for neural PDE solvers. To bridge this gap, we introduce AL4PDE, a modular and extensible active learning benchmark. It provides multiple parametric PDEs and state-of-the-art surrogate models for the solver-in-the-loop setting, enabling the evaluation of existing and the development of new AL methods for PDE solving. We use the benchmark to evaluate batch active learning algorithms such as uncertainty- and feature-based methods. We show that AL reduces the average error by up to 71% compared to random sampling and significantly reduces worst-case errors. Moreover, AL generates similar datasets across repeated runs, with consistent distributions over the PDE parameters and initial conditions. The acquired datasets are reusable, providing benefits for surrogate models not involved in the data generation.BibTeX
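The uncertainty-based batch selection evaluated in the benchmark can be sketched abstractly: query the classical solver at the k candidate inputs on which an ensemble of surrogates disagrees most. The function names below are illustrative, not the AL4PDE API; the ensemble is represented as a list of scalar prediction functions.

```python
# Sketch of uncertainty-based batch active learning: score each
# candidate (e.g., a PDE parameter or initial condition) by the
# disagreement of an ensemble of surrogate models, then pick the top k
# for expensive classical-solver evaluation. Names are illustrative.

def ensemble_variance(predictions):
    """Population variance of a list of scalar ensemble predictions."""
    mean = sum(predictions) / len(predictions)
    return sum((p - mean) ** 2 for p in predictions) / len(predictions)

def select_batch(candidates, ensemble, k):
    """Return the k candidates with the highest ensemble disagreement."""
    scored = [(ensemble_variance([m(c) for m in ensemble]), c)
              for c in candidates]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [c for _, c in scored[:k]]
```

Real implementations select in batches with diversity constraints so the k queries do not all probe the same region of parameter space.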
Nguyen, D. M. H., Le, A. T., Nguyen, T. Q., Diep, N. T., Nguyen, T., Duong-Tran, D., Peters, J., Shen, L., Niepert, M., & Sonntag, D. (2024). Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model.
Proceedings of Machine Learning Research.
https://arxiv.org/abs/2407.04489
Abstract
Prompt learning methods are gaining increasing attention due to their ability to customize large vision-language models to new domains using pre-trained contextual knowledge and minimal training data. However, existing works typically rely on optimizing unified prompt inputs, often struggling with fine-grained classification tasks due to insufficient discriminative attributes. To tackle this, we consider a new framework based on a dual context of both domain-shared and class-specific contexts, where the latter is generated by Large Language Models (LLMs) such as GPTs. Such dual prompt methods enhance the model's feature representation by joining implicit and explicit factors encoded in LLM knowledge. Moreover, we formulate the Unbalanced Optimal Transport (UOT) theory to quantify the relationships between constructed prompts and visual tokens. Through partial matching, UOT can properly align discrete sets of visual tokens and prompt embeddings under different mass distributions, which is particularly valuable for handling irrelevant or noisy elements, ensuring that the preservation of mass does not restrict transport solutions. Furthermore, UOT's characteristics integrate seamlessly with image augmentation, expanding the training sample pool while maintaining a reasonable distance between perturbed images and prompt inputs. Extensive experiments across few-shot classification and adapter settings substantiate the superiority of our model over current state-of-the-art baselines.BibTeX
Nguyen, D. M. H., Lukashina, N., Nguyen, T., Le, A. T., Nguyen, T., Ho, N., Peters, J., Sonntag, D., Zaverkin, V., & Niepert, M. (2024). Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks.
In Proceedings of the 41st International Conference on Machine Learning (ICML 2024).
https://arxiv.org/abs/2402.01975
Abstract
A molecule's 2D representation consists of its atoms, their attributes, and the molecule's covalent bonds. A 3D (geometric) representation of a molecule is called a conformer and consists of its atom types and Cartesian coordinates. Every conformer has a potential energy, and the lower this energy, the more likely it occurs in nature. Most existing machine learning methods for molecular property prediction consider either 2D molecular graphs or 3D conformer structure representations in isolation. Inspired by recent work on using ensembles of conformers in conjunction with 2D graph representations, we propose E(3)-invariant molecular conformer aggregation networks. The method integrates a molecule's 2D representation with that of multiple of its conformers. Contrary to prior work, we propose a novel 2D-3D aggregation mechanism based on a differentiable solver for the Fused Gromov-Wasserstein Barycenter problem and the use of an efficient conformer generation method based on distance geometry. We show that the proposed aggregation mechanism is E(3) invariant and propose an efficient GPU implementation. Moreover, we demonstrate that the aggregation mechanism helps to significantly outperform state-of-the-art molecule property prediction methods on established datasets.BibTeX
Pan, J., Nayyeri, M., Li, Y., & Staab, S. (2024). HGE: Embedding Temporal Knowledge Graphs in a Product Space of Heterogeneous Geometric Subspaces. Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI 2024), Vancouver, Canada, February 22–25, 2024.
Abstract
Temporal knowledge graphs represent temporal facts (s,p,o,τ) relating a subject s and an object o via a relation label p at time τ, where τ could be a time point or time interval. Temporal knowledge graphs may exhibit static temporal patterns at distinct points in time and dynamic temporal patterns between different timestamps. In order to learn a rich set of static and dynamic temporal patterns and apply them for inference, several embedding approaches have been suggested in the literature. However, as most of them resort to single underlying embedding spaces, their capability to model all kinds of temporal patterns was severely limited by having to adhere to the geometric property of their one embedding space. We lift this limitation by an embedding approach that maps temporal facts into a product space of several heterogeneous geometric subspaces with distinct geometric properties, i.e., Complex, Dual, and Split-complex spaces. In addition, we propose a temporal-geometric attention mechanism to integrate information from different geometric subspaces conveniently according to the captured relational and temporal information. Experimental results on standard temporal benchmark datasets favorably evaluate our approach against state-of-the-art models.BibTeX
Pan, X. (2024). eSPARQL: Design and implementation of a query language for epistemic queries on knowledge graphs.
Department of Analytical Computing.
https://doi.org/10.18419/OPUS-15474
Abstract
In recent years, large-scale knowledge graphs have emerged, integrating data from various sources. Often, this data includes assertions about other assertions, establishing contexts in which these assertions hold. A recent enhancement to RDF, known as RDF-star, allows for statements about statements and is currently under consideration as a W3C standard. However, RDF-star lacks a defined semantics for such statements and intrinsic mechanisms to operate on them. This thesis describes and implements a novel query language, termed eSPARQL, tailored for epistemic RDF-star metadata and grounded in four-valued logic. Our language builds on SPARQL-star, the query language for RDF-star, by incorporating an expanded FROM clause, called FROM BELIEF, designed to manage multiple, and occasionally conflicting, beliefs. eSPARQL’s capabilities are demonstrated through four example queries, showcasing its ability to (i) retrieve individual beliefs, (ii) aggregate beliefs, (iii) identify conflicts between individuals, and (iv) handle nested beliefs (beliefs about beliefs). The implementation of eSPARQL developed in this thesis is built on top of an existing SPARQL-star query engine. In this implementation, the execution of an eSPARQL query consists of two phases. First, the expression in the FROM BELIEF clause, called the belief query, is translated into a SPARQL-star CONSTRUCT query that generates an intermediary graph containing the beliefs of the subjects described in the belief query. In the second phase, this intermediary graph is processed with the graph pattern of the eSPARQL query by translating it into a graph pattern that can be processed by a standard SPARQL-star engine. In this last phase, the implementation translates eSPARQL operations to SPARQL-star and checks whether the pattern contains nested eSPARQL queries to be processed recursively. We study two research questions: (RQ1) Does the eSPARQL implementation scale? And (RQ2) How do the execution times of the eSPARQL implementation compare with those of manually written SPARQL-star queries? To answer these research questions, we use the four example eSPARQL queries that showcase the abilities of eSPARQL and create a synthetic dataset generator that produces graphs of multiple sizes. Additionally, for research question RQ2, we manually write SPARQL-star queries that are equivalent to the example eSPARQL queries. Regarding RQ1, our results show that eSPARQL has an execution time that is proportional to the data size. Regarding RQ2, except for one query, the manually written SPARQL-star queries are clearly faster than our implementation. Although the implementation proved slower than the manually written SPARQL-star queries, the eSPARQL queries are shorter and easier to understand. This positive aspect of eSPARQL can motivate further studies on how to optimize the eSPARQL implementation.BibTeX
Peng, K., Wen, D., Yang, K., Luo, A., Chen, Y., Fu, J., Sarfraz, M. S., Roitberg, A., & Stiefelhagen, R. (2024). Advancing Open-Set Domain Generalization Using Evidential Bi-Level Hardest Domain Scheduler.
In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
https://doi.org/10.48550/arXiv.2409.17555
Abstract
In Open-Set Domain Generalization (OSDG), the model is exposed to both new variations of data appearance (domains) and open-set conditions, where both known and novel categories are present at test time. The challenges of this task arise from the dual need to generalize across diverse domains and accurately quantify category novelty, which is critical for applications in dynamic environments. Recently, meta-learning techniques have demonstrated superior results in OSDG, effectively orchestrating the meta-train and -test tasks by employing varied random categories and predefined domain partition strategies. These approaches prioritize a well-designed training schedule over traditional methods that focus primarily on data augmentation and the enhancement of discriminative feature learning. The prevailing meta-learning models in OSDG typically utilize a predefined sequential domain scheduler to structure data partitions. However, a crucial aspect that remains inadequately explored is the influence brought by strategies of domain schedulers during training. In this paper, we observe that an adaptive domain scheduler benefits more in OSDG compared with prefixed sequential and random domain schedulers. We propose the Evidential Bi-Level Hardest Domain Scheduler (EBiL-HaDS) to achieve an adaptive domain scheduler. This method strategically sequences domains by assessing their reliabilities in utilizing a follower network, trained with confidence scores learned in an evidential manner, regularized by max rebiasing discrepancy, and optimized in a bi-level manner. The results show that our method substantially improves OSDG performance and achieves more discriminative embeddings for both the seen and unseen categories.BibTeX
Peng, K., Yin, C., Zheng, J., Liu, R., Schneider, D., Zhang, J., Yang, K., Sarfraz, M. S., Stiefelhagen, R., & Roitberg, A. (2024). Navigating Open Set Scenarios for Skeleton-based Action Recognition.
The 38th Annual AAAI Conference on Artificial Intelligence.
https://arxiv.org/abs/2312.06330
Potyka, N., Zhu, Y., He, Y., Kharlamov, E., & Staab, S. (2024). Robust Knowledge Extraction from Large Language Models using Social Choice Theory.
In Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems.
https://arxiv.org/abs/2312.14877
Abstract
Large language models (LLMs) have the potential to support a wide range of applications like conversational agents, creative writing, text improvement, and general query answering. However, they are ill-suited for query answering in high-stake domains like medicine because they generate answers at random and their answers are typically not robust - even the same query can result in different answers when prompted multiple times. In order to improve the robustness of LLM queries, we propose posing ranking queries repeatedly and aggregating the query results using methods from social choice theory. We study ranking queries in diagnostic settings like medical and fault diagnosis and discuss how the Partial Borda Choice function from the literature can be applied to merge multiple query results. We discuss some additional interesting properties in our setting and evaluate the robustness of our approach empirically.BibTeX
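The repeat-and-aggregate idea can be sketched with a plain Borda count over repeated LLM rankings. This simplification assumes every returned ranking is complete; the Partial Borda Choice function used in the paper additionally handles partial rankings, which this sketch does not.

```python
# Aggregate repeated ranking-query results with a Borda count: an item
# in position p of an n-item ranking earns n - 1 - p points, and items
# are ordered by total points. Simplified: assumes complete rankings,
# unlike the Partial Borda Choice function discussed in the paper.

def borda_aggregate(rankings):
    """Each ranking is a list, best first; returns items by Borda score."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - 1 - position)
    return sorted(scores, key=lambda item: -scores[item])
```

Aggregating, say, three noisy diagnostic rankings yields a single consensus ranking that is more stable than any individual query result.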
Qian, C., Manolache, A., Morris, C., & Niepert, M. (2024). Probabilistic Graph Rewiring via Virtual Nodes.
In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
https://doi.org/10.48550/arXiv.2405.17311
Abstract
Message-passing graph neural networks (MPNNs) have emerged as a powerful paradigm for graph-based machine learning. Despite their effectiveness, MPNNs face challenges such as under-reaching and over-squashing, where limited receptive fields and structural bottlenecks hinder information flow in the graph. While graph transformers hold promise in addressing these issues, their scalability is limited due to quadratic complexity regarding the number of nodes, rendering them impractical for larger graphs. Here, we propose implicitly rewired message-passing neural networks (IPR-MPNNs), a novel approach that integrates implicit probabilistic graph rewiring into MPNNs. By introducing a small number of virtual nodes, i.e., adding additional nodes to a given graph and connecting them to existing nodes, in a differentiable, end-to-end manner, IPR-MPNNs enable long-distance message propagation, circumventing quadratic complexity. Theoretically, we demonstrate that IPR-MPNNs surpass the expressiveness of traditional MPNNs. Empirically, we validate our approach by showcasing its ability to mitigate under-reaching and over-squashing effects, achieving state-of-the-art performance across multiple graph datasets. Notably, IPR-MPNNs outperform graph transformers while maintaining significantly faster computational efficiency.BibTeX
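The structural idea behind virtual nodes can be shown in isolation: augment the graph with an extra node connected to existing nodes, giving any pair of nodes a two-hop path through it. This sketch connects one virtual node to every node; IPR-MPNNs instead learn which connections to make, probabilistically and end-to-end, which this static adjacency-list example does not capture.

```python
# Static illustration of graph augmentation with a virtual node: the
# new node (index n) is connected to every existing node, so any two
# nodes are at most two hops apart, mitigating under-reaching.
# IPR-MPNNs learn these connections; here they are fixed for clarity.

def add_virtual_node(adjacency):
    """Connect a new virtual node to every node (undirected adj. lists)."""
    n = len(adjacency)
    for neighbours in adjacency:
        neighbours.append(n)          # edge: existing node -> virtual
    adjacency.append(list(range(n)))  # edge: virtual -> every node
    return adjacency
```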
Schwindt, S., Meisinger, L., Negreiros, B., Schneider, T., & Nowak, W. (2024). Transfer learning achieves high recall for object classification in fluvial environments with limited data.
Geomorphology,
455, 109185.
https://doi.org/10.1016/j.geomorph.2024.109185
Abstract
Field surveys to collect data from fluvial ecosystems traditionally focus on specific phenomena related to geomorphology or hydrology. Low-cost unmanned aerial vehicles (UAVs) additionally empower the fast and massive collection of airborne photogrammetry, providing geospatially explicit information. This remote sensing data complements field surveys by offering contextual information on geomorphological conditions, including digital terrain models. AI-based image recognition can augment contextual information to extrapolate archetypal object classes through name labels, such as “gravel”, “sand”, “plant”, or “large wood”. However, obtaining sufficient ground truth data for these classifications, particularly in morphodynamic fluvial environments, is challenging and induces high costs. This study introduces a transfer learning approach to address the challenge of low data availability, enabling AI-based mapping of complex objects in fluvial landscapes. We leverage the learned general structure of a deep convolutional neural network (CNN) pre-trained on a broad range of images. The fixed latent features of the pre-trained CNN stem from GoogLeNet. A fixed feature extractor serves to classify objects with limited data amounts. Satisfactory performance is measured with a recall rate, expressing the ability of a model to find all occurrences of a class on an image. High spatial heterogeneity in the locations of measurements on the x-y plane improves model performance. With a minimum of 400 labeled instances, the model achieves a satisfactory 93.75 % recall for a “large wood” target class, providing evidence of the effectiveness of transfer learning in remote sensing for geomorphological studies. This ability to detect large wood in river environments is critical to restoration efforts as it helps create fish habitat, which is essential to supporting biodiversity.BibTeX
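Recall, the headline metric above, measures the share of true instances of a class that the model actually finds; a minimal per-class computation over paired label lists:

```python
# Per-class recall: of all instances whose true label is `target`, the
# fraction the model also predicted as `target`. This is the metric the
# study reports (93.75 % for the "large wood" class).

def recall(y_true, y_pred, target):
    """Fraction of true `target` instances predicted as `target`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == target]
    if not relevant:
        return 0.0
    return sum(p == target for p in relevant) / len(relevant)
```

High recall matters here because a missed large-wood instance (a false negative) is costlier to restoration planning than an extra candidate to verify.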
Abstract
This dataset contains the implementation code for an algorithm to infer SHACL shapes that the graph returned by a SPARQL CONSTRUCT query must satisfy if the input satisfies a given set of SHACL shapes. This dataset also includes an evaluation of the algorithm. The algorithm implemented in this dataset is proposed in the paper From Shapes to Shapes: Inferring SHACL Shapes for Results of SPARQL CONSTRUCT Queries. To execute the code, follow the instructions in the README.md file. For more info, please check the paper, and please have no hesitation to contact the authors for any inquiries.BibTeX
Abstract
Graph Neural Networks (GNNs) are a popular class of machine learning models. Inspired by the learning to explain (L2X) paradigm, we propose L2XGNN, a framework for explainable GNNs that provides faithful explanations by design. L2XGNN learns a mechanism for selecting explanatory subgraphs (motifs), which are exclusively used in the GNN message-passing operations. L2XGNN can select, for each input graph, a subgraph with specific properties, such as being sparse and connected. Imposing such constraints on the motifs often leads to more interpretable and effective explanations. Experiments on several datasets suggest that L2XGNN achieves the same classification accuracy as baseline methods using the entire input graph while ensuring that only the provided explanations are used to make predictions. Moreover, we show that L2XGNN can identify motifs responsible for the graph's properties it is intended to predict.BibTeX
Tan, Y., Lv, H., Zhou, Z., Guo, W., Xiong, B., Liu, W., Chen, C., Wang, S., & Yang, C. (2024). Logical Relation Modeling and Mining in Hyperbolic Space for Recommendation.
The 40th IEEE International Conference on Data Engineering.
http://www.cs.emory.edu/~jyang71/files/logirec.pdf
Torres, E., & Niepert, M. (2024). Survey: Adaptive Physics-informed Neural Networks.
Neurips 2024 Workshop Foundation Models for Science: Progress, Opportunities, and Challenges.
https://openreview.net/forum?id=bYP6YB84Pq
Abstract
Physics-informed neural networks (PINNs) have emerged as a promising approach for solving partial differential equations (PDEs) using neural networks, particularly in data-scarce scenarios due to their unsupervised training capability. However, a key limitation is the need for re-optimization with each change in PDE parameters, similar to the challenge in traditional numerical methods where each system of equations corresponds to a specific PDE instance. This characteristic poses a barrier to the widespread adoption of PINNs across scientific and engineering applications. This survey explores research addressing this limitation through transfer learning and meta-learning, synthesizing insights to establish a foundation for efficient data generation strategies tailored to PINNs. These methods can potentially improve PINNs' training efficiency, enabling quicker adaptation to new PDEs with fewer data and computational demands. While numerical methods directly solve systems of equations to derive solutions, neural networks implicitly learn solutions by adjusting their parameters. One notable advantage of neural networks lies in their capacity to abstract away from specific problem domains, enabling them to retain, discard, or adapt learned representations to efficiently address similar problems. By understanding how these techniques can be applied to PINNs, this survey seeks to identify promising directions for future research to enable the widespread adoption of PINNs across a wide range of scientific and engineering applications.BibTeX
Tran, H.-C., Nguyen, D. M. H., Nguyen, D. M., Nguyen, T.-T., Le, N., Xie, P., Sonntag, D., Zou, J. Y., Nguyen, B. T., & Niepert, M. (2024). Accelerating Transformers with Spectrum-Preserving Token Merging.
In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
https://doi.org/10.48550/arXiv.2405.16148
Abstract
Increasing the throughput of the Transformer architecture, a foundational component used in numerous state-of-the-art models for vision and language tasks (e.g., GPT, LLaVa), is an important problem in machine learning. One recent and effective strategy is to merge token representations within Transformer models, aiming to reduce computational and memory requirements while maintaining accuracy. Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top k similar tokens. However, these methods have significant drawbacks, such as sensitivity to token-splitting strategies and damage to informative tokens in later layers. This paper presents a novel paradigm called PiToMe, which prioritizes the preservation of informative tokens using an additional metric termed the energy score. This score identifies large clusters of similar tokens as high-energy, indicating potential candidates for merging, while smaller (unique and isolated) clusters are considered as low-energy and preserved. Experimental findings demonstrate that PiToMe saved from 40-60% FLOPs of the base models while exhibiting superior off-the-shelf performance on image classification (0.5% average performance drop of ViT-MAE-H compared to 2.6% as baselines), image-text retrieval (0.3% average performance drop of CLIP on Flickr30k compared to 4.5% as others), and analogously in visual questions answering with LLaVa-7B. Furthermore, PiToMe is theoretically shown to preserve intrinsic spectral properties of the original token space under mild conditions.BibTeX
Wang, Z., Cai, S., Mu, Z., Lin, H., Zhang, C., Liu, X., Li, Q., Liu, A., Ma, X., & Liang, Y. (2024). OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents.
In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
https://doi.org/10.48550/arXiv.2407.00114
Abstract
This paper presents OmniJARVIS, a novel Vision-Language-Action (VLA) model for open-world instruction-following agents in Minecraft. Compared to prior works that either emit textual goals to separate controllers or produce the control command directly, OmniJARVIS seeks a different path to ensure both strong reasoning and efficient decision-making capabilities via unified tokenization of multimodal interaction data. First, we introduce a self-supervised approach to learn a behavior encoder that produces discretized tokens for behavior trajectories and an imitation learning policy decoder conditioned on these tokens. These additional behavior tokens will be augmented to the vocabulary of pretrained Multimodal Language Models. With this encoder, we then pack long-term multimodal interactions involving task instructions, memories, thoughts, observations, textual responses, behavior trajectories, etc into unified token sequences and model them with autoregressive transformers. Thanks to the semantically meaningful behavior tokens, the resulting VLA model, OmniJARVIS, can reason (by producing chain-of-thoughts), plan, answer questions, and act (by producing behavior tokens for the imitation learning policy decoder). OmniJARVIS demonstrates excellent performances on a comprehensive collection of atomic, programmatic, and open-ended tasks in open-world Minecraft. Our analysis further unveils the crucial design principles in interaction data formation, unified tokenization, and its scaling potentials.BibTeX
Xiong, B., Nayyeri, M., Cochez, M., & Staab, S. (2024).
Code for Hyperbolic Embedding Inference for Structured Multi-Label Prediction [DaRUS].
https://doi.org/10.18419/DARUS-3988
Abstract
This is a PyTorch implementation of the paper Hyperbolic Embedding Inference for Structured Multi-Label Prediction published in NeurIPS 2022. The code provides the Python scripts to reproduce the experiments in the paper, as well as a proof-of-concept example of the method. To execute the code, follow the instructions in the README.md file. For more info, please check the paper. Please have no hesitation to contact the authors for any inquiries.BibTeX
Xiong, B., Nayyeri, M., Luo, L., Wang, Z., Pan, S., & Staab, S. (2024). NestE: Modeling Nested Relational Structures for Knowledge Graph Reasoning.
The 38th Annual AAAI Conference on Artificial Intelligence.
https://arxiv.org/abs/2312.09219
Xiong, B., Nayyeri, M., Luo, L., Wang, Z., Pan, S., & Staab, S. (2024).
Replication Data for NestE: Modeling Nested Relational Structures for Knowledge Graph Reasoning (AAAI′24) [DaRUS].
https://doi.org/10.18419/DARUS-3978
Abstract
This code is a PyTorch implementation of the paper "NestE: Modeling Nested Relational Structures for Knowledge Graph Reasoning (AAAI'24)". NestE is a knowledge graph embedding method that can encode nested facts represented by quoted triples (h,r,t) in which the subject and object are triples themselves, e.g., ((BarackObama, holds_position, President), succeed_by, (DonaldTrump, holds_position, President)). We implement six variant models of NestE based on different hypercomplex number systems: NestE_Q.py for NestE with quaternions, NestE_H.py for NestE with hyperbolic quaternions, and NestE_D.py for NestE with split quaternions; NestE_B.py, NestE_HB.py, and NestE_DB.py are the respective versions with a translation component. This code is used to reproduce the experiments of the paper. To execute the code, follow the instructions in the README.md file.BibTeX
Abstract
This is a PyTorch implementation of the paper Shrinking Embeddings for Hyper-relational Knowledge Graphs published in ACL'23. This code is used to reproduce the experiments of the method ShrinkE, a geometric embedding approach for hyper-relational knowledge graphs. The code is implemented with Python 3 and PyTorch. The code is tested on public datasets which can be downloaded from StarE. To execute the code, follow the instructions in the README.md file. For more info, please check the paper or feel free to contact the authors for any inquiries.BibTeX
Xiong, B., Zhu, S., Nayyeri, M., Xu, C., Pan, S., & Staab, S. (2024).
Code for Ultrahyperbolic Knowledge Graph Embeddings [DaRUS].
https://doi.org/10.18419/DARUS-4342
Abstract
This is a PyTorch implementation of the paper Ultrahyperbolic Knowledge Graph Embeddings, published at KDD 2022. This code is used to reproduce the experiments of the method UltraE, a geometric embedding approach for knowledge graph embeddings. The code is tested on public datasets, which can be downloaded from KGEmb. To execute the code, follow the instructions in the README.md file. For more info, please check the paper or feel free to contact the authors with any inquiries.
Zaverkin, V., Alesiani, F., Maruyama, T., Errica, F., Christiansen, H., Takamoto, M., Weber, N., & Niepert, M. (2024). Higher-Rank Irreducible Cartesian Tensors for Equivariant Message Passing.
In Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
https://doi.org/10.48550/arXiv.2405.14253
Abstract
The ability to perform fast and accurate atomistic simulations is crucial for advancing the chemical sciences. By learning from high-quality data, machine-learned interatomic potentials achieve accuracy on par with ab initio and first-principles methods at a fraction of their computational cost. The success of machine-learned interatomic potentials arises from integrating inductive biases such as equivariance to group actions on an atomic system, e.g., equivariance to rotations and reflections. In particular, the field has notably advanced with the emergence of equivariant message passing. Most of these models represent an atomic system using spherical tensors, tensor products of which require complicated numerical coefficients and can be computationally demanding. Cartesian tensors offer a promising alternative, though state-of-the-art methods lack flexibility in message-passing mechanisms, restricting their architectures and expressive power. This work explores higher-rank irreducible Cartesian tensors to address these limitations. We integrate irreducible Cartesian tensor products into message-passing neural networks and prove the equivariance and traceless property of the resulting layers. Through empirical evaluations on various benchmark data sets, we consistently observe on-par or better performance than that of state-of-the-art spherical and Cartesian models.
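To make the notion of irreducible Cartesian tensors concrete, here is an illustrative sketch (mine, not the paper's code) of the familiar rank-2 case: a 3x3 tensor decomposes into an isotropic (trace) part, an antisymmetric part, and a symmetric traceless part, the last being the rank-2 irreducible Cartesian tensor.

```python
# Rank-2 irreducible Cartesian decomposition of a 3x3 tensor T:
# T = iso + anti + sym_traceless, with tr(sym_traceless) = 0.

def decompose_rank2(T):
    trace = sum(T[i][i] for i in range(3))
    # Isotropic part: (tr T / 3) * identity
    iso = [[(trace / 3.0) if i == j else 0.0 for j in range(3)] for i in range(3)]
    # Antisymmetric part: (T - T^T) / 2
    anti = [[(T[i][j] - T[j][i]) / 2.0 for j in range(3)] for i in range(3)]
    # Symmetric traceless part: (T + T^T) / 2 - iso
    sym_traceless = [[(T[i][j] + T[j][i]) / 2.0 - iso[i][j] for j in range(3)]
                     for i in range(3)]
    return iso, anti, sym_traceless

T = [[1.0, 2.0, 0.0], [0.0, 3.0, 1.0], [4.0, 0.0, 5.0]]
iso, anti, sym_traceless = decompose_rank2(T)

# The symmetric traceless part has zero trace ...
assert abs(sum(sym_traceless[i][i] for i in range(3))) < 1e-12
# ... and the three parts sum back to T.
for i in range(3):
    for j in range(3):
        assert abs(iso[i][j] + anti[i][j] + sym_traceless[i][j] - T[i][j]) < 1e-12
```

The paper works with higher ranks, where the projection onto the traceless part is correspondingly more involved; this rank-2 case only illustrates the idea.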
Zaverkin, V., Holzmüller, D., Christiansen, H., Errica, F., Alesiani, F., Takamoto, M., Niepert, M., & Kästner, J. (2024). Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials.
npj Comput. Mater.,
10, Article 1.
https://doi.org/10.1038/s41524-024-01254-1
Abstract
Efficiently creating a concise but comprehensive data set for training machine-learned interatomic potentials (MLIPs) is an under-explored problem. Active learning, which uses biased or unbiased molecular dynamics (MD) to generate candidate pools, aims to address this objective. Existing biased and unbiased MD-simulation methods, however, are prone to miss either rare events or extrapolative regions—areas of the configurational space where unreliable predictions are made. This work demonstrates that MD, when biased by the MLIP’s energy uncertainty, simultaneously captures extrapolative regions and rare events, which is crucial for developing uniformly accurate MLIPs. Furthermore, exploiting automatic differentiation, we enhance bias-forces-driven MD with the concept of bias stress. We employ calibrated gradient-based uncertainties to yield MLIPs with similar or, sometimes, better accuracy than ensemble-based methods at a lower computational cost. Finally, we apply uncertainty-biased MD to alanine dipeptide and MIL-53(Al), generating MLIPs that represent both configurational spaces more accurately than models trained with conventional MD.
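A toy sketch of the biasing idea (my illustration, not the paper's implementation): an ensemble-style uncertainty sigma(x) is the standard deviation of several hypothetical one-dimensional potentials, and the bias contribution to the force points toward regions of high uncertainty. The paper uses automatic differentiation; here a central finite difference stands in for it, and the bias strength tau is a made-up parameter.

```python
import math

def energies(x):
    # Three hypothetical single-particle potentials that agree near x = 0
    # and disagree away from it, mimicking an extrapolative region.
    return [0.5 * x * x, 0.5 * x * x + 0.1 * x ** 3, 0.5 * x * x - 0.1 * x ** 3]

def sigma(x):
    # Ensemble uncertainty: standard deviation of the predicted energies.
    e = energies(x)
    mean = sum(e) / len(e)
    return math.sqrt(sum((v - mean) ** 2 for v in e) / len(e))

def bias_force(x, tau=1.0, h=1e-5):
    # Biasing the potential as E - tau * sigma lowers the energy where the
    # uncertainty is high, so the bias force is +tau * d(sigma)/dx,
    # approximated here by a central finite difference.
    dsigma = (sigma(x + h) - sigma(x - h)) / (2.0 * h)
    return tau * dsigma

# The bias force points away from the well-sampled region near x = 0,
# i.e., toward higher uncertainty on either side.
assert bias_force(1.0) > 0
assert bias_force(-1.0) < 0
```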
Zhu, Y., Potyka, N., Nayyeri, M., Xiong, B., He, Y., Kharlamov, E., & Staab, S. (2024). Predictive Multiplicity of Knowledge Graph Embeddings in Link Prediction.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024).
https://arxiv.org/pdf/2408.08226
Abstract
Knowledge graph embedding (KGE) models are often used to predict missing links for knowledge graphs (KGs). However, multiple KG embeddings can perform almost equally well for link prediction yet give conflicting predictions for unseen queries. This phenomenon is termed predictive multiplicity in the literature. It poses substantial risks for KGE-based applications in high-stakes domains but has been overlooked in KGE research. We define predictive multiplicity in link prediction, introduce evaluation metrics and measure predictive multiplicity for representative KGE methods on commonly used benchmark datasets. Our empirical study reveals significant predictive multiplicity in link prediction, with 8% to 39% testing queries exhibiting conflicting predictions. We address this issue by leveraging voting methods from social choice theory, significantly mitigating conflicts by 66% to 78% in our experiments.
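As a hedged sketch of the voting idea (my illustration, not the paper's code): conflicting link predictions from several KGE models can be reconciled with Borda count, one classical social-choice rule. Each model ranks candidate tail entities for the same query (h, r, ?); the entity names below are made up.

```python
from collections import defaultdict

def borda(rankings):
    """Aggregate ranked candidate lists (best first, one per model) by
    Borda count: a candidate at position p in a list of n earns n-1-p
    points; the candidate with the most points wins (ties broken by name)."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for pos, cand in enumerate(ranking):
            scores[cand] += n - 1 - pos
    return max(scores, key=lambda c: (scores[c], c))

# Three near-equally-accurate models that disagree on the top candidate
# for the same query.
model_rankings = [
    ["Paris", "Lyon", "Nice"],
    ["Lyon", "Paris", "Nice"],
    ["Paris", "Nice", "Lyon"],
]
assert borda(model_rankings) == "Paris"
```

Borda count is just one of the rules studied in social choice theory; the paper evaluates which aggregation schemes best reduce the observed conflicts.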