Research

Active threads, papers, and ongoing questions.

Active research threads

Compositional generalisation in hybrid neuro-symbolic pipelines

How can systems that combine neural and symbolic components generalise compositionally, that is, apply known rules to novel combinations of elements, where purely neural systems struggle? This thread explores whether event-role structured representations, used as an interface between neural and symbolic modules, can support systematic generalisation on relation extraction tasks in which the entity types or argument structures are unseen at training time.

Knowledge extraction under distributional shift

Extraction pipelines trained on one domain frequently degrade when deployed on text from a related but distinct distribution. The problem is acute in high-stakes professional domains, where labelled data is expensive to produce and domain boundaries are not crisp. This thread investigates staged adaptation strategies: a small amount of in-domain annotated data is used to align a general-purpose extraction model to the target distribution, with the FST-based schema alignment layer acting as a regulariser.

Publications & preprints

FOLD-RM for Knowledge Extraction: Inductive Rule Learning over Noisy Event Graphs

May 2024 · in progress

We investigate applying FOLD-RM, a scalable algorithm for inducing answer set programs, to rule learning over knowledge graphs extracted from noisy, domain-specific corpora. We report results on legal and biomedical graph benchmarks.

Code

Event-Role Semantics for Grounded LLM Reasoning via FST Scaffolding

arXiv preprint · Feb 2024

We propose a hybrid architecture in which finite-state transducers extract event-role structured representations from unstructured text, providing a symbolic scaffold for downstream LLM reasoning. We evaluate the architecture on temporal reasoning and relation extraction benchmarks.

arXiv · Code