alphabits
A Prolog-based knowledge base using event-role semantics for structured temporal reasoning. Integrates FOLD-RM for inductive logic programming and exposes the knowledge base through an MCP server for LLM orchestration.
Prolog · Python · FOLD-RM · MCP · Neo4j
Overview
alphabits is the central research system underpinning work on hybrid neuro-symbolic knowledge extraction. At its core, it is a Prolog knowledge base structured around event-role semantics — a representational scheme that encodes entities, events, and their semantic relationships as typed logical facts. The system is designed to support temporal reasoning: facts can be time-stamped, updated, and queried across intervals.
The name is a deliberate portmanteau. “Alpha” refers to the first-order logical foundations; “bits” to the discrete, symbolic character of the representation. Together they point to a middle ground between the continuous, learned representations of modern neural networks and the crisp symbolic structures of classical AI.
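The time-stamped event-role facts described in this overview can be pictured with a small Python model. This is only a sketch under assumptions: the `RoleFact` class, its field names, the half-open interval convention, and the `update` and `query` helpers are hypothetical illustrations, not the schema alphabits actually uses.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RoleFact:
    """One typed fact: `filler` plays `role` in `event` over [start, end)."""
    event: str
    role: str
    filler: str
    start: int                 # inclusive; a year here, for simplicity
    end: Optional[int] = None  # exclusive; None means "still valid"

    def holds_during(self, lo: int, hi: int) -> bool:
        """True if the validity interval overlaps the query interval [lo, hi)."""
        return self.start < hi and (self.end is None or self.end > lo)


def update(facts, event, role, new_filler, at):
    """Close the open fact for (event, role) at time `at` and add its successor."""
    updated = []
    for fact in facts:
        if fact.event == event and fact.role == role and fact.end is None:
            fact = replace(fact, end=at)  # frozen dataclass, so build a copy
        updated.append(fact)
    updated.append(RoleFact(event, role, new_filler, start=at))
    return updated


def query(facts, role, lo, hi):
    """Return the fillers of `role` that were valid at some point in [lo, hi)."""
    return [f.filler for f in facts if f.role == role and f.holds_during(lo, hi)]


facts = [
    RoleFact("employment_1", "agent", "alice", start=2015),
    RoleFact("employment_1", "employer", "acme", start=2015),
]
facts = update(facts, "employment_1", "employer", "globex", at=2020)

print(query(facts, "employer", 2016, 2018))  # ['acme']
print(query(facts, "employer", 2019, 2021))  # ['acme', 'globex']
```

Closing the old interval instead of overwriting the fact keeps the history queryable, which is what makes interval queries meaningful after an update.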
Architecture
The architecture follows a three-layer design. The base layer is a Prolog knowledge base maintained via SWI-Prolog, with predicates organised around the event-role schema. Above this sits a Python orchestration layer that handles ingestion from upstream NLP pipelines, schema validation, and translation between Python data structures and Prolog terms.
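The validation and translation step in the orchestration layer might look roughly like the following. It is a minimal sketch, assuming a hypothetical `role/3` fact shape and a toy `SCHEMA`; the real predicates and schema are not given in this description, and the atom quoting covers only the common cases.

```python
import re

# Hypothetical schema: role name -> Python type required of its filler.
SCHEMA = {"agent": str, "patient": str, "year": int}

# Atoms that Prolog accepts unquoted: lowercase letter, then letters/digits/underscores.
_PLAIN_ATOM = re.compile(r"[a-z][A-Za-z0-9_]*")


def to_atom(text):
    """Render a string as a Prolog atom, quoting it when necessary."""
    if _PLAIN_ATOM.fullmatch(text):
        return text
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def to_term(value):
    """Translate a Python scalar into Prolog term text."""
    if isinstance(value, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        return to_atom(value)
    raise TypeError("unsupported value type: " + type(value).__name__)


def validate(record):
    """Reject records whose roles or filler types fall outside SCHEMA."""
    for role, filler in record["roles"].items():
        expected = SCHEMA.get(role)
        if expected is None:
            raise ValueError("unknown role: " + role)
        if isinstance(filler, bool) or not isinstance(filler, expected):
            raise ValueError("bad filler type for role: " + role)


def to_facts(record):
    """Validate a record, then emit one Prolog fact per role."""
    validate(record)
    event = to_atom(record["event"])
    return [
        "role({}, {}, {}).".format(event, to_atom(role), to_term(filler))
        for role, filler in record["roles"].items()
    ]


record = {"event": "hire_42",
          "roles": {"agent": "Acme Corp", "patient": "alice", "year": 2020}}
for line in to_facts(record):
    print(line)
# role(hire_42, agent, 'Acme Corp').
# role(hire_42, patient, alice).
# role(hire_42, year, 2020).
```

Validating before serialising means a malformed record fails in Python with a readable error instead of surfacing later as a Prolog syntax error or a silently wrong fact.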
The third layer is the interface: an MCP (Model Context Protocol) server that exposes the knowledge base to LLM agents. This allows language models to issue structured queries against the symbolic store, grounding their reasoning in verified, discretised facts rather than relying entirely on parametric knowledge. The result is a hybrid system in which learned representations handle open-vocabulary understanding while the symbolic layer enforces consistency and supports exact lookup.
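To make the idea of a structured query with exact lookup concrete, here is a sketch of the kind of handler such a server could wrap as a tool. Everything in it is an assumption for illustration: the JSON request shape, the `handle_query` name, and the in-memory `FACTS` list standing in for the Prolog store. Registration with an actual MCP SDK is deliberately left out.

```python
import json

# Hypothetical stand-in for the Prolog store: (event, role, filler) triples.
FACTS = [
    ("hire_42", "agent", "acme"),
    ("hire_42", "patient", "alice"),
    ("hire_43", "agent", "globex"),
]

FIELDS = ("event", "role", "filler")


def handle_query(request_json):
    """Answer one structured query; omitted fields act like unbound variables."""
    request = json.loads(request_json)
    unknown = sorted(set(request) - set(FIELDS))
    if unknown:
        # Reject malformed queries explicitly rather than guessing intent.
        return json.dumps({"ok": False, "error": "unknown fields", "fields": unknown})
    pattern = tuple(request.get(name) for name in FIELDS)
    matches = [
        dict(zip(FIELDS, fact))
        for fact in FACTS
        if all(want is None or want == have for want, have in zip(pattern, fact))
    ]
    return json.dumps({"ok": True, "matches": matches})


print(handle_query('{"role": "agent"}'))
print(handle_query('{"filler": "bob"}'))
```

An empty `matches` list is a meaningful answer here: it tells the calling model that the store holds no such fact, so it has no grounds to assert one.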
Key Findings
Integration of FOLD-RM for inductive logic programming has yielded promising results on relation induction tasks. Given a seed set of positive and negative examples, FOLD-RM can induce new Prolog rules that generalise correctly in most cases, reducing the manual rule-engineering burden substantially. The most persistent challenge remains handling distributional shift between source documents and the event-role schema — a problem being addressed through a staged extraction pipeline with explicit schema alignment.
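One way to check whether an induced rule generalises is to score it against labelled examples that were held out from induction. The sketch below does not call FOLD-RM; it assumes a rule has already been induced and hand-translates a hypothetical one into a Python predicate. The example features, the rule, and the `evaluate` helper are all illustrative assumptions.

```python
def induced_rule(example):
    """Hypothetical induced rule, written as a Python predicate.

    Roughly: employs(Org, Person) holds if a hire event has both roles filled.
    """
    return (example["event_type"] == "hire"
            and example["has_agent"]
            and example["has_patient"])


def evaluate(rule, positives, negatives):
    """Count how a rule covers labelled examples and derive precision/recall."""
    true_pos = sum(1 for ex in positives if rule(ex))
    false_pos = sum(1 for ex in negatives if rule(ex))
    covered = true_pos + false_pos
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": len(positives) - true_pos,
        "precision": true_pos / covered if covered else 0.0,
        "recall": true_pos / len(positives) if positives else 0.0,
    }


# Made-up held-out examples, for illustration only.
positives = [
    {"event_type": "hire", "has_agent": True, "has_patient": True},
    {"event_type": "hire", "has_agent": True, "has_patient": True},
    {"event_type": "appoint", "has_agent": True, "has_patient": True},
]
negatives = [
    {"event_type": "hire", "has_agent": True, "has_patient": False},
    {"event_type": "visit", "has_agent": True, "has_patient": True},
]

report = evaluate(induced_rule, positives, negatives)
print(report["true_pos"], report["false_pos"], report["false_neg"])  # 2 0 1
```

In this toy data the missed positive is an "appoint" event, the sort of mismatch between source wording and schema that explicit schema alignment is meant to catch before induction.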