[ICASSP 2025] BP-GPT: Auditory Neural Decoding Using fMRI-prompted LLM
Computer Science - Artificial Intelligence - LLM-based fMRI decoding

Paper: BP-GPT: Auditory Neural Decoding Using fMRI-prompted LLM (arXiv:2502.15172)
Code: https://github.com/1994cxy/BP-GPT
The English is typed entirely by hand! It is a summarizing and paraphrasing of the original paper. Some spelling and grammar mistakes are unavoidable; if you spot any, feel free to point them out in the comments! This post leans toward personal notes, so read with that in mind.
1. Takeaways
(1) Sorry xd for digging this up to read so soon after release; I just happened to see it, so consider it free publicity, and a few more GitHub stars wouldn't hurt
(2) It's only four pages, a quick and pleasant read
(3) One paper a day, may my hair stay
2. Section-by-Section Reading
2.1. Abstract
①Existing problem: current LLM-based methods are not end-to-end when extracting semantics from fMRI????? That feels like an overgeneralization; I don't think it is a very good limitation
②They propose Brain Prompt GPT (BP-GPT) to decode fMRI by aligning fMRI prompts with text prompts
2.2. Introduction
①I respect this: opening with a famous quote. Only the young write like this, a genuine storybook rather than formulaic boilerplate.
“The limits of my language mean the limits of my world” - Ludwig Wittgenstein.
If the authors believe language is what brings understanding, that carries a hint of being unable to progress. In reality, coining new words happens all the time and our vocabulary keeps updating, but AI doesn't seem able to update itself automatically.
②The rate of spoken words does not match the BOLD response: natural speech runs at roughly 2 words per second, while one fMRI frame (TR ≈ 2 s in this dataset) is far slower, so several words fall within a single frame
③Challenge: decoding multiple words within one repetition time (TR) (isn't this existing problem more reasonable than that end-to-end thing above???)
④Framework of BP-GPT:

(this figure could still be polished a bit more....)
2.3. Method
2.3.1. fMRI to Text Decoding
①Encode the fMRI into an fMRI prompt:

$$P_f = \mathcal{E}_f(F)$$

where $\mathcal{E}_f$ denotes the fMRI encoder and $F$ denotes the fMRI signal.
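A minimal sketch of what such an fMRI-to-prompt encoder could look like, assuming a Transformer with the shape given in Sec. 2.4.2 below (8 layers, 8 heads, 512-dim); the class name, the learnable pooling queries, and `n_prompt` are my own illustration, not the authors' code:

```python
import torch
import torch.nn as nn

class FMRIEncoder(nn.Module):
    """Map an fMRI window (T time points x V voxels) to a prompt P_f.
    A sketch only; shapes follow the implementation details in Sec. 2.4.2."""

    def __init__(self, n_voxels: int, d_model: int = 512,
                 n_layers: int = 8, n_heads: int = 8, n_prompt: int = 10):
        super().__init__()
        self.proj = nn.Linear(n_voxels, d_model)  # voxels -> embeddings
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # learnable queries that pool the sequence into a fixed-length prompt
        self.queries = nn.Parameter(torch.randn(n_prompt, d_model))

    def forward(self, fmri: torch.Tensor) -> torch.Tensor:
        # fmri: (batch, T, n_voxels) -> P_f: (batch, n_prompt, d_model)
        h = self.encoder(self.proj(fmri))
        attn = torch.softmax(self.queries @ h.transpose(1, 2), dim=-1)
        return attn @ h  # attention-pool into n_prompt tokens
```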
②BCE loss of the fMRI encoder (essentially the autoregressive cross-entropy over the decoded tokens, conditioned on the fMRI prompt):

$$\mathcal{L}_{BCE} = -\sum_{i=1}^{M} \log p\left(w_i \mid w_{<i}, P_f\right)$$
③The similarity between the positive pair of fMRI prompt and text prompt (from the same sample):

$$s\left(P_f^i, P_t^i\right) = \exp\left(\frac{\operatorname{sim}\left(P_f^i, P_t^i\right)}{\tau}\right)$$

where $\tau$ is the temperature hyperparameter
④Negative pairs come from different samples; their similarity is calculated the same way:

$$s\left(P_f^i, P_t^j\right) = \exp\left(\frac{\operatorname{sim}\left(P_f^i, P_t^j\right)}{\tau}\right), \quad i \neq j$$
⑤The contrastive loss over a batch of $N$ samples:

$$\mathcal{L}_{CL} = -\frac{1}{N} \sum_{i=1}^{N} \log \frac{s\left(P_f^i, P_t^i\right)}{\sum_{j=1}^{N} s\left(P_f^i, P_t^j\right)}$$
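The three formulas above amount to a standard InfoNCE objective; here is a self-contained PyTorch sketch, assuming the prompts are mean-pooled over their tokens and compared with cosine similarity (the pooling choice is my assumption):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(p_f: torch.Tensor, p_t: torch.Tensor,
                     tau: float = 0.07) -> torch.Tensor:
    """InfoNCE over a batch: diagonal (same-sample) pairs are positives,
    off-diagonal (different-sample) pairs are negatives."""
    # mean-pool the prompt tokens, then L2-normalise so that the
    # dot products below are cosine similarities
    f = F.normalize(p_f.mean(dim=1), dim=-1)  # (N, d)
    t = F.normalize(p_t.mean(dim=1), dim=-1)  # (N, d)
    logits = f @ t.T / tau                    # (N, N) similarity matrix
    labels = torch.arange(f.size(0), device=f.device)  # positive: i == j
    return F.cross_entropy(logits, labels)
```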
2.3.2. Training
①The BCE loss is used when training with the text prompt, and the decoder is trained with the combined objective:

$$\mathcal{L} = \mathcal{L}_{BCE} + \lambda \mathcal{L}_{CL}$$

where $\lambda$ balances the contrastive term
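A hedged sketch of one decoder-training step under this objective, assuming a HuggingFace GPT-2-style LLM where the fMRI prompt is prepended to the token embeddings; `contrastive_loss` is the sketch above, and the weight `lam` is my placeholder for $\lambda$:

```python
import torch

def training_step(encoder, llm, fmri, input_ids, p_t, lam: float = 1.0):
    """Cross-entropy for autoregressive decoding plus the contrastive
    alignment term; a sketch, not the released BP-GPT code."""
    p_f = encoder(fmri)                              # fMRI prompt (B, P, d)
    tok = llm.get_input_embeddings()(input_ids)      # token embeddings
    inputs = torch.cat([p_f, tok], dim=1)            # prepend the prompt
    # -100 masks the prompt positions out of the language-model loss
    pad = torch.full(p_f.shape[:2], -100, dtype=torch.long,
                     device=input_ids.device)
    labels = torch.cat([pad, input_ids], dim=1)
    ce = llm(inputs_embeds=inputs, labels=labels).loss
    cl = contrastive_loss(p_f, p_t)                  # align fMRI/text prompts
    return ce + lam * cl
```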
2.3.3. Inference
①The length of a sentence differs from the fMRI window. "The current solution in recent work utilizes a word-rate model to predict the number of words the participant perceived; text generation stops once the generated text reaches the word count predicted by the word-rate model. Although this approach can solve the problem, it does not fully exploit the characteristics of the LLM."
②So they insert a special stop token `$` into the real text, one per TR: the reference text of every TR ends with `$`, and at inference the LLM's own `$` outputs determine when to stop, based on the number of TRs in the window
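My reading of this trick as a small sketch; the helper names and exact marker placement are assumptions, but the idea is that the LLM itself emits one `$` per TR, so stopping no longer needs an external word-rate model:

```python
def insert_stop_tokens(tr_texts: list[str]) -> str:
    """Training side: append a '$' marker to the text of each TR so the
    LLM learns where TR boundaries fall."""
    return " ".join(text + " $" for text in tr_texts)

def truncate_by_tr(generated: str, n_trs: int) -> str:
    """Inference side: keep the generated text up to the n_trs-th '$'
    (one marker per TR in the fMRI window), then drop the markers."""
    segments = generated.split("$")[:n_trs]
    return " ".join(seg.strip() for seg in segments)

# e.g. insert_stop_tokens(["i looked up", "and saw the smoke"])
#      -> "i looked up $ and saw the smoke $"
```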
2.4. Experiment
2.4.1. Dataset
①Dataset:
A. LeBel, L. Wagner, S. Jain, A. Adhikari-Desai, B. Gupta, A. Morgenthal, J. Tang, L. Xu, and A. G. Huth, "A natural language fMRI dataset for voxelwise encoding models," Scientific Data, vol. 10, no. 1, p. 555, 2023.
②Subjects: they use 3 of the 8 subjects
③Setting: participants passively listened to naturally spoken English stories, such as The Moth Radio Hour and the New York Times Modern Love podcast
2.4.2. Implementation Details
①Time-series window for the fMRI sequence and the corresponding text: 20 s, with no gap
②Length of the prompt:
③Input dimension of BERT: 512
④Transformer: 8 layers with 8 attention heads
⑤Optimizer: AdamW
⑥Batch size: 32
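For quick reference, the stated details collected into one config sketch (only the values the notes give; everything else is omitted):

```python
config = {
    "window_sec": 20,        # fMRI/text time-series window, no gap
    "bert_input_dim": 512,   # input dimension of BERT
    "transformer_layers": 8,
    "attention_heads": 8,
    "optimizer": "AdamW",
    "batch_size": 32,
}
```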
2.4.3. Baseline and Evaluation Metrics
①Test set: story “Where There's Smoke”
2.4.4. Evaluation of the Text Prompt
①Performance:

2.4.5. Evaluation of fMRI to Text Decoding
①Performance table:

2.4.6. Ablation Study
①Contrastive module ablation:

②Fine tune ablation:

2.5. Conclusion
~
3. Reference
@article{chen2025bp,
title={BP-GPT: Auditory Neural Decoding Using fMRI-prompted LLM},
author={Chen, Xiaoyu and Du, Changde and Liu, Che and Wang, Yizhe and He, Huiguang},
journal={arXiv preprint arXiv:2502.15172},
year={2025}
}