Bazo, M. W. (2021).
Learning quantitative argumentation frameworks using sparse neural networks and swarm intelligence algorithms. Department of Analytical Computing.
https://doi.org/10.18419/OPUS-11903
Abstract
Argumentation frameworks are an approach to formalizing arguments and their interrelations in a graph structure, which can then be used to draw conclusions from the modelled knowledge. Since argumentation is an important part of human reasoning, these graph structures can be considered easily interpretable, which makes the technology an interesting explainable artificial intelligence method. Although it is not their main purpose, quantitative argumentation frameworks can also be used to solve classification problems by constructing them from multilayer perceptrons (MLPs), following the work of Potyka. This thesis constructs quantitative argumentation frameworks from sparse MLPs. A swarm intelligence algorithm, namely Particle Swarm Optimization (PSO), was developed to search for sparse MLP models with specific characteristics relating to the performance and topology of the graphical structures. Models were implemented, tested, and evaluated on three different datasets. The implementation covers preprocessing of the datasets, parameter learning of the MLPs via backpropagation, and structure learning of the MLP graphical structures. The evaluation includes fully connected MLPs and decision trees built for comparison. The resulting models achieved high performance with low structural complexity, and the PSO algorithm proved efficient at solving the structure learning problem.
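The following is a minimal sketch, not the thesis implementation, of how a binary PSO variant can search over sparse MLP connection masks. The fitness function here is a hypothetical stand-in (the thesis combines model performance and topology criteria), and all dimensions and coefficients are illustrative assumptions.

```python
# Binary PSO over MLP connection masks: each particle is a 0/1 vector
# marking which candidate connections are kept. Sigmoid-transfer binary
# PSO; `fitness` is a toy surrogate, not the thesis objective.
import numpy as np

rng = np.random.default_rng(0)
N_CONN = 64        # number of candidate MLP connections (assumed)
N_PARTICLES = 20
N_ITER = 50

def fitness(mask: np.ndarray) -> float:
    """Hypothetical objective: in the thesis this would train and
    evaluate the masked MLP and penalize dense structures."""
    accuracy_proxy = -np.abs(mask.sum() - 16) / N_CONN  # pretend 16 edges is ideal
    sparsity_penalty = 0.01 * mask.sum()
    return accuracy_proxy - sparsity_penalty

pos = (rng.random((N_PARTICLES, N_CONN)) < 0.5).astype(float)  # binary masks
vel = rng.normal(0.0, 0.1, (N_PARTICLES, N_CONN))
pbest = pos.copy()
pbest_val = np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmax()].copy()

for _ in range(N_ITER):
    r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    prob = 1.0 / (1.0 + np.exp(-vel))                   # sigmoid transfer
    pos = (rng.random(vel.shape) < prob).astype(float)  # stochastic bit flip
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()].copy()

print("best mask density:", gbest.mean())
```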
Abstract
The extraction and segmentation of references from scientific articles is a core task of modern digital libraries. Once references are extracted and segmented, the bibliographic information can be made publicly available and linked, enabling efficient literature study. However, references vary widely in structure and content, which makes their extraction and segmentation a challenging but valuable task. The purpose of this thesis is to investigate whether Bidirectional Encoder Representations from Transformers (BERT) is suitable for the extraction and segmentation of bibliographic references. We follow a deep learning approach to extracting and segmenting references from PDF documents, using a neural network architecture based on BERT, a deep language representation model that has significantly increased performance on many natural language processing tasks. On top of the BERT output, we place a linear-chain Conditional Random Field. We experiment with different BERT models and input formats and examine two approaches for reference extraction and segmentation. The experiments are evaluated on a challenging dataset containing both English and German social science publications with highly varying references. Our results show that the best-performing BERT models were pre-trained on data similar to the data we used for fine-tuning on the tasks of reference extraction and reference segmentation. Moreover, our findings show that long, context-based input sequences yield the best results. The extraction model identifies and extracts references with an average F1-score of 81.9%; references are segmented with an average F1-score of 93.6%. Our models also compare well to a previously published approach. We conclude that BERT is a suitable choice for reference extraction and reference segmentation.
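Below is a minimal sketch of the architecture described above: BERT token embeddings feed a linear layer whose emissions are decoded by a linear-chain CRF. It uses the `transformers` and `pytorch-crf` packages; the model name and tag count are illustrative assumptions, not the thesis's exact configuration.

```python
# BERT encoder + linear emission layer + linear-chain CRF for token
# tagging (e.g. BIO tags marking reference spans and their fields).
import torch.nn as nn
from transformers import AutoModel
from torchcrf import CRF

class BertCrfTagger(nn.Module):
    def __init__(self, model_name="bert-base-cased", num_tags=5):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        self.emission = nn.Linear(self.bert.config.hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        scores = self.emission(hidden)
        mask = attention_mask.bool()
        if tags is not None:                        # training: negative log-likelihood
            return -self.crf(scores, tags, mask=mask, reduction="mean")
        return self.crf.decode(scores, mask=mask)   # inference: best tag sequences
```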
Pan, X. (2024). eSPARQL: design and implementation of a query language for epistemic queries on knowledge graphs.
Department of Analytical Computing.
https://doi.org/10.18419/OPUS-15474
Abstract
In recent years, large-scale knowledge graphs have emerged that integrate data from various sources. Often, this data includes assertions about other assertions, establishing contexts in which those assertions hold. A recent enhancement to RDF, known as RDF-star, allows for statements about statements and is currently under consideration as a W3C standard. However, RDF-star lacks a defined semantics for such statements and intrinsic mechanisms to operate on them. This thesis describes and implements a novel query language, termed eSPARQL, tailored for epistemic RDF-star metadata and grounded in four-valued logic. Our language builds on SPARQL-star, the query language for RDF-star, by incorporating an expanded FROM clause, called FROM BELIEF, designed to manage multiple, and occasionally conflicting, beliefs. eSPARQL's capabilities are demonstrated through four example queries, showcasing its ability to (i) retrieve individual beliefs, (ii) aggregate beliefs, (iii) identify conflicts between individuals, and (iv) handle nested beliefs (beliefs about beliefs). The implementation of eSPARQL developed in this thesis is built on top of an existing SPARQL-star query engine and executes an eSPARQL query in two phases. First, the expression in the FROM BELIEF clause, called the belief query, is translated into a SPARQL-star CONSTRUCT query that generates an intermediary graph containing the beliefs of the subjects described in the belief query. In the second phase, this intermediary graph is processed with the graph pattern of the eSPARQL query by translating it into a graph pattern that a standard SPARQL-star engine can process; in this phase, the implementation translates eSPARQL operations to SPARQL-star and checks whether the pattern contains nested eSPARQL queries to be processed recursively. We study two research questions: (RQ1) Does the eSPARQL implementation scale? and (RQ2) How do the execution times of the eSPARQL implementation compare with those of manually written SPARQL-star queries? To answer these questions, we use the four example eSPARQL queries and create a synthetic dataset generator that produces graphs of multiple sizes; for RQ2, we additionally write SPARQL-star queries by hand that are equivalent to the example eSPARQL queries. Regarding RQ1, our results show that eSPARQL's execution time is proportional to the data size. Regarding RQ2, except for one query, the manually written SPARQL-star queries are clearly faster than our implementation. Although the implementation proved slower than the manually written queries, the eSPARQL queries are shorter and easier to understand, which can motivate further studies on how to optimize the eSPARQL implementation.
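As a hypothetical illustration of the two-phase execution described above (the concrete eSPARQL grammar, prefixes, and the `ex:believes` predicate are invented for this example, not taken from the thesis): an eSPARQL query with a FROM BELIEF clause, and the kind of SPARQL-star CONSTRUCT query that phase one could emit to materialize the intermediary belief graph.

```python
# Phase 1 translates the belief query into a SPARQL-star CONSTRUCT that
# extracts every statement the named subject believes into an intermediary
# graph; phase 2 then evaluates the original graph pattern against that
# graph on a standard SPARQL-star engine.
ESPARQL_QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?city
FROM BELIEF ex:alice
WHERE { ex:bob ex:livesIn ?city . }
"""

PHASE1_CONSTRUCT = """
PREFIX ex: <http://example.org/>
CONSTRUCT { ?s ?p ?o }
WHERE { ex:alice ex:believes << ?s ?p ?o >> . }
"""
```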
Abstract
Multistep prediction is the prediction of states from an initial state and a series of control inputs. This paper focuses on developing transformer models for multistep prediction of vehicle states and on testing different modifications of the transformer architecture, using the prediction of a ship simulation as an example. Research in NLP suggests that the transformer architecture offers advantages in training time and prediction accuracy over a state-of-the-art LSTM model. The author also investigates whether positional encodings are useful in this scenario and whether a transformer model can learn the order of the inputs without them.
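A minimal sketch, not the thesis model: a PyTorch transformer encoder that maps an initial state plus a sequence of control inputs to a predicted state trajectory. The `use_positional_encoding` flag mirrors the question studied above; all dimensions are illustrative assumptions.

```python
# Transformer encoder for multistep state prediction with an optional
# sinusoidal positional encoding and a causal attention mask.
import math
import torch
import torch.nn as nn

class StatePredictor(nn.Module):
    def __init__(self, state_dim=6, control_dim=2, d_model=64,
                 use_positional_encoding=True):
        super().__init__()
        self.use_pe = use_positional_encoding
        self.embed = nn.Linear(state_dim + control_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=3)
        self.head = nn.Linear(d_model, state_dim)

    def forward(self, x):                 # x: (batch, steps, state+control)
        h = self.embed(x)
        if self.use_pe:                   # standard sinusoidal encoding
            steps, d = x.size(1), h.size(-1)
            t = torch.arange(steps).unsqueeze(1)
            div = torch.exp(torch.arange(0, d, 2) * (-math.log(10000.0) / d))
            pe = torch.zeros(steps, d)
            pe[:, 0::2], pe[:, 1::2] = torch.sin(t * div), torch.cos(t * div)
            h = h + pe
        # causal mask: step t may only attend to steps <= t
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        return self.head(self.encoder(h, mask=mask))
```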
Holeczek, C. (2021).
Entwicklung eines neuronalen Netzwerks zur Optimierung der Datenübertragungsqualität von Kleinsatellitenplattformen [Master’s Thesis].
https://doi.org/10.18419/OPUS-11904
Abstract
Small satellites are an increasingly important way for researchers to bring experiments and payloads into Earth orbit. These satellite systems have a growing need for data transmission to a ground station, while at the same time spaceflight imposes various constraints on the construction and operation of radio communication systems. Today's radio systems mostly use static transceiver configurations, and the transmission rates correspond to a worst-case assessment of the respective scenario. Recently, adaptive approaches for operating radio links between satellite and ground station have been described, in which the transceiver configuration is changed during operation, controlled by a learning algorithm, and adapted to the prevailing external conditions. The algorithms used are based on the reinforcement learning model. The goal of this work was to further develop an algorithm designed for space radio communication and to test it within a simulation environment in order to assess the effects of the changes. The modified algorithm was run in a software simulation environment that modelled the digital signal processing of the transmitter and receiver as well as the changing transmission conditions during a satellite pass. The modification was compared with the original algorithm with respect to the achievable data transmission. Among other things, the number of neural networks used for learning was varied; in addition, various hyperparameters were varied and their effects on data transmission examined. Tuning the hyperparameters led to the transmission of 57.8% more data than with the baseline implementation. Both the change in architecture and the reduction of the number of neural networks executed in parallel led to slight performance losses of 4.0% and 3.3%, respectively. However, the results of the software simulations performed here cannot be transferred to hardware simulation environments or to actual radio links. The implementation of the different algorithm variants makes it possible to test them with hardware transmission links in the future and, later, to trial such a system on a small satellite.
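A minimal sketch of the underlying idea, under simplifying assumptions: tabular Q-learning choosing a discrete transceiver configuration per link-quality state. The states, configurations, and reward model are invented stand-ins, and the thesis uses neural networks (deep RL) rather than a table.

```python
# Toy Q-learning over transceiver configurations during a satellite pass:
# the agent learns which data rate to select for each discretized SNR level.
import numpy as np

rng = np.random.default_rng(1)
N_SNR_LEVELS = 10        # discretized link quality during the pass
CONFIGS = [1, 2, 4, 8]   # hypothetical data rates (higher = less robust)
Q = np.zeros((N_SNR_LEVELS, len(CONFIGS)))
alpha, gamma, eps = 0.1, 0.9, 0.2

def reward(snr: int, cfg_idx: int) -> float:
    """Toy channel model: high rates only pay off when the link is good."""
    rate = CONFIGS[cfg_idx]
    success = rng.random() < min(1.0, (snr + 1) / (2 * rate))
    return rate if success else 0.0

for episode in range(2000):
    snr = rng.integers(N_SNR_LEVELS)                # sampled link quality
    a = rng.integers(len(CONFIGS)) if rng.random() < eps else int(Q[snr].argmax())
    r = reward(snr, a)
    next_snr = rng.integers(N_SNR_LEVELS)
    Q[snr, a] += alpha * (r + gamma * Q[next_snr].max() - Q[snr, a])

print("preferred config per SNR level:", Q.argmax(axis=1))
```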
Abstract
Playing computer games is one of the most common and enjoyable activities of the modern lifestyle. However, people with motor impairments who cannot use a mouse and keyboard are unable to interact with computing devices in the way the game design requires. Novel interaction techniques can enable hands-free control, using gaze and voice as input modalities, to assist people with motor and speech impairments. This is the first evaluation of a video game control method combining eye tracking with non-verbal voice commands (e.g. humming), applied in a 2D jump-and-run game environment that involves essential spatio-temporal interactions. To assess the feasibility of the interaction, the evaluation study used both qualitative and quantitative measures, and feasibility was additionally validated with a target user group of people with motor impairments. Overall, the results indicate lower but competitive performance while increasing the fun factor.
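A hypothetical sketch of such an input mapping (the thresholds, coordinate conventions, and action names are invented, not taken from the study): gaze position steers the avatar horizontally, and a hum, detected as sustained microphone energy, triggers the jump. Real eye-tracker and microphone streams are replaced by stub values.

```python
# Map one sample of gaze and audio input to game actions: gaze offset from
# screen center controls movement, microphone RMS above a threshold jumps.
import numpy as np

HUM_RMS_THRESHOLD = 0.05   # assumed energy level separating humming from silence
SCREEN_CENTER_X = 0.5      # normalized gaze coordinates in [0, 1]

def control_signal(gaze_x: float, mic_frame: np.ndarray) -> dict:
    rms = float(np.sqrt(np.mean(mic_frame ** 2)))
    return {
        "move": gaze_x - SCREEN_CENTER_X,   # left of center = move left
        "jump": rms > HUM_RMS_THRESHOLD,    # humming triggers a jump
    }

# Stub inputs: gaze slightly right of center, one quiet and one humming frame.
quiet = np.zeros(512)
humming = 0.1 * np.sin(np.linspace(0, 60 * np.pi, 512))
print(control_signal(0.7, quiet))    # {'move': 0.2, 'jump': False}
print(control_signal(0.7, humming))  # {'move': 0.2, 'jump': True}
```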
Abstract
A knowledge graph is a data structure capable of storing knowledge, and several methods use knowledge graphs to derive further information. These methods need to be validated with example knowledge graphs; however, real data might not be available or might lack the desired properties. There are therefore use cases that benefit from the generation of synthetic knowledge graphs. To define a synthetic knowledge graph, a characterization is needed that expresses how the synthetic data should look. In this thesis, I use Horn clauses for this characterization because of their good balance of expressiveness and complexity, their use in the field of rule mining, and their foundational role in the logical language Datalog. As clauses are usually not represented perfectly in real data, the goal of this thesis is to generate a knowledge graph that fulfils given Horn clauses not perfectly, but to a desired degree. I developed and implemented two modifiable algorithms to generate knowledge graphs. On the one hand, I adapted the general hill climbing technique to knowledge graph generation. On the other hand, I implemented a greedy algorithm that orders a given set of Horn clauses using logical subsumptions between their bodies and then adds edges to fulfil one Horn clause after the other, in the computed order. Both algorithms generate synthetic knowledge graphs according to given Horn clauses, each with a degree of fulfilment. The degree of fulfilment of a Horn clause is characterized by its body support, the number of times the premise of the clause is fulfilled, and its support, the number of times the premise and the conclusion are both fulfilled; the confidence is the ratio of support to body support, i.e., the percentage of cases in which the Horn clause is fulfilled (a sketch of these measures follows below). All code is published so that anyone can try it. During the evaluation, random sets of Horn clauses were produced and the implementations generated corresponding knowledge graphs, which were compared by considering the difference between the expected and the actual degree of fulfilment for each Horn clause. The best results were obtained by the hill climbing variant whose initial state is set to the result of the greedy algorithm with the subsumption-based rule order. Furthermore, the difficulty of generating a good knowledge graph increases with the overlapping degree of the input set of Horn clauses, where the overlapping degree reflects how many relation names occur in how many Horn clauses of the set. Lastly, the state-of-the-art mining tool AMIE found many Horn clauses in the generated graphs that were not intended by the set given to the generator.
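A minimal sketch of the fulfilment measures defined above, for a two-atom Horn clause body. The clause, relation names, and the example graph are illustrative: body support counts groundings of the premise, support counts groundings where the conclusion also holds, and confidence is their ratio.

```python
# Compute body support, support, and confidence of the Horn clause
#   livesIn(x, y) AND cityOf(y, z) -> citizenOf(x, z)
# over a knowledge graph given as (subject, relation, object) triples.
from itertools import product

graph = {
    ("anna", "livesIn", "berlin"), ("berlin", "cityOf", "germany"),
    ("anna", "citizenOf", "germany"),
    ("bob", "livesIn", "paris"), ("paris", "cityOf", "france"),
}

def clause_stats(graph):
    body_support = support = 0
    lives = [(s, o) for s, r, o in graph if r == "livesIn"]
    city = [(s, o) for s, r, o in graph if r == "cityOf"]
    for (x, y1), (y2, z) in product(lives, city):
        if y1 == y2:                              # premise grounding found
            body_support += 1
            if (x, "citizenOf", z) in graph:      # conclusion also holds
                support += 1
    confidence = support / body_support if body_support else 0.0
    return body_support, support, confidence

print(clause_stats(graph))  # (2, 1, 0.5): only Anna's citizenship is present
```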
Abstract
The modern world is highly influenced by user-contributed content such as user comments, one of the most popular forms of communication on social media. Comments help build a connection between content creators and content consumers, as well as between users of a social platform, which makes them highly relevant for community interaction. However, researchers have not treated users' comment reading strategies in much detail. Existing accounts are limited to explicit comment reading approaches, which can only track active interaction, e.g. measuring attention by counting clicks. Such techniques cannot fully estimate the attention drawn, as 73% of people do not actively interact with comments. Eye tracking plays a vital role in analysing implicit attention, as it allows estimating such crucial characteristics as which comment features drew the most attention, how much attention was given, and the order in which specific comments were seen. This research uses eye-tracking-based attention analysis to investigate users' reading behaviour on the real YouTube interface. The present master's thesis concentrates on analysing users' attention mechanisms and reading behaviour and on finding correlations with comment features such as length, language, sparseness, position, number of likes, presence of replies, presence of the video creator's like, and the authorised user label, which cannot be derived from a purely textual point of view. This work shows that the number of likes and the presence of replies contribute the most to the attention drawn by comments. The analysis revealed that the category of a video strongly influences the emotion and length of the popular comments: when people are looking for useful information, as with educational videos, they tend to pay attention to neutral, long comments, whereas when looking for entertainment, they are more likely to notice short, positive comments. Two important findings from the gender analysis are that women tend to read longer comments but skip a higher percentage of comments than men. It is hoped that this research will contribute to a deeper understanding of the features that draw the most attention, which can be exploited in content generation strategies and in developing new ranking algorithms.
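A minimal sketch, under assumed data formats, of the kind of attention analysis described above: fixations (x, y, duration) are mapped to comment bounding boxes (areas of interest) to accumulate per-comment dwell time and the order in which comments were first seen. The coordinates and comment layout are invented.

```python
# Map eye-tracker fixations to comment areas of interest (AOIs) and
# aggregate dwell time and first-seen order per comment.
from collections import defaultdict

# Comment AOIs on the page: id -> (left, top, right, bottom) in pixels.
comments = {"c1": (100, 200, 900, 300), "c2": (100, 320, 900, 420)}
# Fixations from the eye tracker: (x, y, duration_ms), in temporal order.
fixations = [(400, 250, 180), (420, 260, 220), (300, 380, 150)]

dwell = defaultdict(int)
first_seen = {}
for order, (x, y, dur) in enumerate(fixations):
    for cid, (l, t, r, b) in comments.items():
        if l <= x <= r and t <= y <= b:   # fixation lands inside the AOI
            dwell[cid] += dur
            first_seen.setdefault(cid, order)

print(dict(dwell))   # {'c1': 400, 'c2': 150}
print(first_seen)    # {'c1': 0, 'c2': 2}
```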