SYNAPSE: Neuro-Symbolic Visual Thought-to-Text Decoding via Topological Semantic Denoising

Recent advancements in large language models (LLMs) have accelerated research into open-vocabulary electroencephalography (EEG)-to-text decoding, a task that translates non-invasive neural activity recorded during visual perception into coherent natural language descriptions of visual stimuli. However, existing systems remain highly susceptible to biological noise. Corrupted neural projections often cause frozen language models to produce hallucinated or semantically unstable outputs. To address this, we propose SYNAPSE (Symbolic Neural Alignment for Precise Semantic Extraction), a lightweight neuro-symbolic framework that stabilizes neural text generation through symbolic regularization at inference time. By leveraging a commonsense knowledge graph structure and latent sample purification to refine semantic candidates derived from EEG signals, SYNAPSE enhances semantic stability without requiring end-to-end fine-tuning of LLMs. Experiments on popular EEG decoding benchmarks across multiple frozen LLM backends demonstrate consistent improvements over unconstrained prompting baselines. The framework exhibits robustness under object label ablation and achieves performance comparable to resource-intensive fine-tuned systems. Furthermore, by strictly confining raw EEG processing within the encoder stack, SYNAPSE effectively preserves biometric privacy.

1. Main Contributions

Graph Purification Mechanism: Developed a mechanism that removes disconnected semantic noise while retaining high-confidence neural intentions.
Latent Sample Retrieval Strategy: Introduced a strategy that injects syntactic templates to stabilize the downstream generation process under noisy biological conditions.
Robust Decoding Performance: Demonstrated robust and competitive decoding performance across multiple EEG corpora, frozen LLMs, and various ablation settings.

2. Methodology
SYNAPSE comprises three core modules: Topological Graph Purification, Relational Fact Extraction, and Latent Sample Retrieval. The process unfolds in three steps: (1) retrieving candidate keywords from EEG signals; (2) applying topological purification via the ConceptNet commonsense knowledge graph (removing isolated noise words and retaining high-confidence words) while extracting relational facts; and (3) retrieving historical samples as syntactic templates. Finally, the purified bag-of-words, relational facts, and syntactic templates are integrated into a prompt for a frozen LLM to generate the final text.
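The final assembly step can be illustrated with a minimal sketch. The prompt wording, the section labels, and the function name `build_prompt` are hypothetical and are not taken from the paper; only the four ingredients (bag-of-words, object label, facts, samples) come from the text and the ablation configurations.

```python
from typing import Optional, Sequence


def build_prompt(
    bag_of_words: Sequence[str],
    facts: Sequence[str] = (),
    templates: Sequence[str] = (),
    object_label: Optional[str] = None,
) -> str:
    """Assemble a text prompt for a frozen LLM (hypothetical wording)."""
    lines = ["Describe the image the person is viewing in one sentence."]
    if object_label:
        lines.append(f"Object label: {object_label}")
    lines.append("Keywords: " + ", ".join(bag_of_words))
    if facts:
        lines.append("Commonsense facts:")
        lines.extend(f"- {fact}" for fact in facts)
    if templates:
        lines.append("Example descriptions (style reference only):")
        lines.extend(f"- {caption}" for caption in templates)
    return "\n".join(lines)


# Toy inputs, invented for illustration.
full_prompt = build_prompt(
    ["dog", "leash", "park"],
    facts=["leash is used for walking"],
    templates=["a dog running on grass"],
    object_label="dog",
)
minimal_prompt = build_prompt(["dog", "leash", "park"], object_label="dog")
```

Making each component optional mirrors the ablation settings: omitting `facts`, `templates`, or `object_label` yields the corresponding reduced configuration without changing the rest of the prompt.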

Topological Graph Purification: This module utilizes a frozen SENSE frontend to encode raw EEG signals into 512-dimensional latent vectors. It retrieves the top-15 candidate keywords via cosine similarity with a frozen vocabulary matrix. These candidates are then projected onto the ConceptNet commonsense graph to construct an induced subgraph and calculate the normalized degree centrality for each node. A hybrid pruning rule is applied: nodes with a degree centrality greater than zero (indicating connections within the graph) or those belonging to a top-5 high-confidence priority protection set are retained. This effectively removes disconnected semantic noise while preserving high-confidence neural intentions.
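The hybrid pruning rule above can be sketched in plain Python. This is an illustrative reading of the description, not the authors' implementation: the function name `purify_candidates`, the edge-list input format, and the toy data are assumptions, and candidates are assumed to be unique and already ranked by cosine similarity.

```python
from typing import Dict, Iterable, List, Set, Tuple


def purify_candidates(
    ranked_candidates: List[str],
    kg_edges: Iterable[Tuple[str, str]],
    protect_top_m: int = 5,
) -> Tuple[List[str], Dict[str, float]]:
    """Keep candidates that are connected in the induced subgraph
    or that fall inside the top-m priority protection set."""
    nodes = set(ranked_candidates)
    neighbors: Dict[str, Set[str]] = {word: set() for word in ranked_candidates}
    # Induced subgraph: keep only edges whose endpoints are both candidates.
    for head, tail in kg_edges:
        if head in nodes and tail in nodes and head != tail:
            neighbors[head].add(tail)
            neighbors[tail].add(head)
    # Normalized degree centrality: degree / (n - 1).
    denom = max(len(nodes) - 1, 1)
    centrality = {w: len(neighbors[w]) / denom for w in ranked_candidates}
    protected = set(ranked_candidates[:protect_top_m])
    kept = [w for w in ranked_candidates if centrality[w] > 0 or w in protected]
    return kept, centrality


# Toy example (invented data): 7 candidates, top-2 protected.
candidates = ["dog", "leash", "park", "grass", "bone", "piano", "tax"]
edges = [("dog", "leash"), ("dog", "park"), ("park", "grass"),
         ("dog", "bone"), ("piano", "music")]
kept, centrality = purify_candidates(candidates, edges, protect_top_m=2)
```

In the toy run, "piano" and "tax" have no edges to other candidates and sit outside the protection set, so they are dropped; an isolated word ranked inside the protection set would be kept.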
Relational Fact Extraction: Building on the constructed subgraph, this module traverses all edges and filters for visual-related relationship types (e.g., LocatedNear, UsedFor, HasProperty, CapableOf, PartOf). For each qualifying directed edge, a rule-mapping function translates the abstract graph triplet (head node, relation type, tail node) into natural language strings, forming semantic assertions. The top five facts are cached as context, providing explicit, structurally validated commonsense constraints to help the LLM achieve more accurate semantic grounding.
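A rule-mapping function of this kind might look like the sketch below. The sentence templates, the name `extract_facts`, and the toy triplets are assumptions made for illustration. The text does not say how the "top five" facts are ranked, so this sketch simply keeps the first five qualifying edges in traversal order.

```python
from typing import Iterable, List, Tuple

# Hypothetical templates for the relation types named in the text.
VISUAL_RELATIONS = {
    "LocatedNear": "{head} is located near {tail}",
    "UsedFor": "{head} is used for {tail}",
    "HasProperty": "{head} is {tail}",
    "CapableOf": "{head} can {tail}",
    "PartOf": "{head} is part of {tail}",
}


def extract_facts(
    triplets: Iterable[Tuple[str, str, str]],
    max_facts: int = 5,
) -> List[str]:
    """Map (head, relation, tail) edges to sentences, skipping other relations."""
    facts: List[str] = []
    for head, relation, tail in triplets:
        template = VISUAL_RELATIONS.get(relation)
        if template is None:
            continue  # not a visual-related relation type
        sentence = template.format(head=head, tail=tail)
        if sentence not in facts:  # avoid duplicate assertions
            facts.append(sentence)
        if len(facts) == max_facts:
            break
    return facts


# Toy subgraph edges (invented data).
subgraph_edges = [
    ("dog", "CapableOf", "bark"),
    ("dog", "RelatedTo", "cat"),
    ("leash", "UsedFor", "walking"),
    ("grass", "HasProperty", "green"),
]
facts = extract_facts(subgraph_edges)
```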
Latent Sample Retrieval: This module stores the ChannelNet embeddings of all historical training EEG samples as a frozen matrix. During inference, the current unrefined EEG feature vector serves as a query to compute parallel cosine similarity with the historical embedding matrix via a single tensor operation, retrieving the top-2 nearest neighbor training samples. The natural language descriptions corresponding to these two samples are extracted from local storage to construct a syntactic sample set. These syntactic templates provide structural references for the frozen LLM, enabling it to generate grammatically compliant descriptions without fine-tuning.
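The retrieval step amounts to a normalized matrix-vector product followed by a top-k selection. The NumPy sketch below assumes a small in-memory embedding bank and caption list; the name `retrieve_templates`, the 3-dimensional toy vectors, and the captions are invented for illustration and do not reflect the actual embedding size or data.

```python
import numpy as np


def retrieve_templates(query, bank, captions, k=2):
    """Return the captions of the k stored samples most cosine-similar to the query."""
    eps = 1e-12  # guards against division by zero for all-zero vectors
    q = query / (np.linalg.norm(query) + eps)
    b = bank / (np.linalg.norm(bank, axis=1, keepdims=True) + eps)
    sims = b @ q  # one matrix-vector product scores every stored sample
    top = np.argsort(-sims)[:k]  # indices of the k highest similarities
    return [captions[i] for i in top], sims[top]


# Toy embedding bank (invented data): one row per historical training sample.
bank = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
])
captions = [
    "a dog running on grass",
    "a piano in a concert hall",
    "a puppy playing in a park",
    "a red car on a road",
]
query = np.array([2.0, 0.0, 0.0])  # scale does not matter after normalization
templates, scores = retrieve_templates(query, bank, captions, k=2)
```

Because the bank is frozen, its row norms could be precomputed once so that inference only pays for the single product and the sort.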

3. Results

Table 1: Ablation tracking and comparative framework performance on the ImageNet-EEG Benchmark. Higher values indicate better performance. Configurations: A1: Full framework; A2: m=0; A3: Full framework without object labels; A4: Full framework without samples; A5: Full framework without facts; A6: minimal baseline (bag-of-words + object labels only).
Table 2: Macro-topological pruning and error-correction sensitivity statistics under varying priority protection constraints on ImageNet-EEG (N = 1,987) and THINGS EEG2 (N = 1,654) datasets. The minimum number of discarded tokens in a single trial is 0 across all evaluations.
Table 3: Ablation tracking performance on the THINGS EEG2 cross-sample evaluation suite. Higher values indicate better performance. Configurations: B1: Full framework (bag-of-words + object labels + facts + samples); B2: Full framework without samples.
Table 4: Qualitative comparison of generated descriptions between SYNAPSE and the Thought2Text baseline. The "Pruned Bag-of-Words" column details the pruned token set obtained after the topological graph refinement phase. (Note: Descriptions were generated using Qwen2.5-7B).

4. Conclusion
This study introduces SYNAPSE, a lightweight neuro-symbolic framework for EEG-to-text decoding. Rather than relying on resource-intensive end-to-end fine-tuning, the framework stabilizes frozen language models through symbolic regularization at inference time. By integrating topological graph purification, relational grounding, and latent sample retrieval, SYNAPSE mitigates semantic drift and attention dispersion caused by noisy neural projections. Experiments across multiple EEG benchmarks and heterogeneous LLM backends demonstrate consistent performance improvements over unconstrained prompting baselines, strong robustness under object label ablation, and performance parity with significantly larger fine-tuned systems. More broadly, the findings suggest that brain-to-text decoding can benefit from a retrieval-augmented paradigm shift, utilizing symbolic structures and external semantic memory to stabilize generation without modifying model parameters. By externalizing semantic correction from autoregressive weights to structured inference-time retrieval, SYNAPSE offers a scalable, privacy-preserving, and computationally efficient direction for next-generation neuro-symbolic brain-computer interfaces.

5. Limitations
Despite the enhanced robustness, SYNAPSE has several limitations. Non-invasive EEG recordings inherently suffer from low spatial resolution and remain highly susceptible to physiological noise, posing fundamental limits on the fidelity of recovered semantic intentions. While topological graph purification significantly reduces semantic drift and attention dispersion, the framework assumes that graph connectivity implies contextual validity; thus, semantically incorrect but densely co-activated word clusters may evade pruning and propagate residual hallucinations downstream. Additionally, the effectiveness of symbolic regularization depends on the coverage and relational completeness of the external commonsense graph, which may omit rare or weakly connected visual concepts. The sample retrieval module is similarly constrained by static nearest-neighbor matching within a dense latent neural embedding space, where semantically similar concepts may overlap in large-scale open-vocabulary settings. More broadly, SYNAPSE inherits common limitations of retrieval-augmented paradigms: downstream generation quality is fundamentally constrained by retrieval quality, and in this setup, retrieval is performed on noisy neural semantic projections rather than clean text queries. Finally, although raw EEG processing and symbolic purification are fully localized to protect biometric privacy, the evaluated decoder backends rely on externally hosted LLM APIs, meaning end-to-end privacy ultimately depends on deployment with fully localized language models and on-device inference infrastructure.
