Active threads, papers, and ongoing questions.
Compositional generalisation in hybrid neuro-symbolic pipelines
How can systems that combine neural and symbolic components generalise compositionally, applying known rules to novel combinations of elements, where purely neural systems struggle? This thread explores whether event-role structured representations, used as an interface between neural and symbolic modules, can support systematic generalisation on relation extraction tasks where the entity types or argument structures are unseen at training time.
Knowledge extraction under distributional shift
Extraction pipelines trained on one domain frequently degrade when deployed on text from a related but distinct distribution. The problem is acute in high-stakes professional domains, where labelled data is expensive to produce and domain boundaries are not crisp. This thread investigates staged adaptation strategies: using a small amount of in-domain annotated data to align a general-purpose extraction model to the target distribution, with the FST-based schema alignment layer acting as a regulariser.
May 2024 in-progress
An investigation into applying FOLD-RM, a scalable algorithm for inducing answer set programs, to rule learning over knowledge graphs extracted from noisy, domain-specific corpora. We report results on legal and biomedical graph benchmarks.
arXiv preprint · Feb 2024 preprint
We propose a hybrid architecture in which finite-state transducers extract event-role structured representations from unstructured text, providing a symbolic scaffold for downstream LLM reasoning. Evaluated on temporal reasoning and relation extraction benchmarks.
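To make the idea of a transducer producing an event-role frame concrete, here is a minimal sketch in Python. It is illustrative only and is not the architecture from the preprint: the toy lexicon, the state names, the role labels (AGENT, EVENT, PATIENT, TIME), and the helper names `transduce` and `render_scaffold` are all assumptions made for this example.

```python
# Toy finite-state transducer: reads tokens, emits event-role labels.
# Everything here (lexicon, states, roles) is hypothetical and illustrative.

# Hypothetical lexicon mapping known words to coarse input classes.
LEXICON = {
    "acquired": "VERB",
    "sued": "VERB",
    "in": "PREP",
    "on": "PREP",
}


def token_class(token):
    """Map a token to the input symbol the transducer consumes."""
    lowered = token.lower()
    if lowered in LEXICON:
        return LEXICON[lowered]
    if token.isdigit():
        return "NUM"
    return "NAME"  # fallback: treat unknown tokens as part of a name


# (state, input class) -> (next state, output role or None for no output)
TRANSITIONS = {
    ("START", "NAME"): ("SUBJ", "AGENT"),
    ("SUBJ", "NAME"): ("SUBJ", "AGENT"),
    ("SUBJ", "VERB"): ("PRED", "EVENT"),
    ("PRED", "NAME"): ("OBJ", "PATIENT"),
    ("OBJ", "NAME"): ("OBJ", "PATIENT"),
    ("OBJ", "PREP"): ("TIME_PREP", None),
    ("TIME_PREP", "NUM"): ("TIME", "TIME"),
}
ACCEPTING = {"OBJ", "TIME"}


def transduce(tokens):
    """Run the transducer; return a role -> text frame, or None on rejection."""
    state = "START"
    frame = {}
    for token in tokens:
        key = (state, token_class(token))
        if key not in TRANSITIONS:
            return None  # no transition defined: input is rejected
        state, role = TRANSITIONS[key]
        if role is not None:
            frame.setdefault(role, []).append(token)
    if state not in ACCEPTING:
        return None
    return {role: " ".join(parts) for role, parts in frame.items()}


def render_scaffold(frame):
    """Serialise a frame as plain text that could be prepended to an LLM prompt."""
    return "\n".join(f"{role}: {text}" for role, text in frame.items())


frame = transduce("Acme Corp acquired Widget Ltd in 2021".split())
print(render_scaffold(frame))
```

A sentence that does not match the pattern, such as one with no verb, is rejected with `None` rather than given a partial frame. In a sketch like this, that explicit failure mode is what lets a downstream component tell a structured input from an unparsed one.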