Generalizing neural signal-to-text brain-computer interfaces

Janaki Sheth; Ariel Tankus; Michelle Tran; Nader Pouratian; Itzhak Fried; William Speier

doi:10.1088/2057-1976/abf6ab

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Objective: Brain-Computer Interfaces (BCI) may help patients with faltering communication abilities due to neurodegenerative diseases produce text or speech by direct neural processing. However, their practical realization has proven difficult due to limitations in speed, accuracy, and generalizability of existing interfaces. The goal of this study is to evaluate the BCI performance of a robust speech decoding system that translates neural signals evoked by speech to a textual output. While previous studies have approached this problem by using neural signals to choose from a limited set of possible words, we employ a more general model that can type any word from a large corpus of English text.

Approach: In this study, we create an end-to-end BCI that translates neural signals associated with overt speech into text output. Our decoding system first isolates frequency bands in the input depth electrode signal encapsulating differential information regarding production of various phonemic classes. These bands form a feature set that then feeds into a Long Short-Term Memory (LSTM) model which discerns at each time point probability distributions across all phonemes uttered by a subject. Finally, a particle filtering algorithm temporally smooths these probabilities by incorporating prior knowledge of the English language to output text corresponding to the decoded word. The generalizability of our decoder is driven by the lack of a vocabulary constraint on this output word.

Main result: This method was evaluated using a dataset of 6 neurosurgical patients implanted with intra-cranial depth electrodes to identify seizure foci for potential surgical treatment of epilepsy. We averaged 32% word accuracy and on the phoneme-level obtained 46% precision, 51% recall and 73.32% average phoneme error rate while also achieving significant increases in speed when compared to several other BCI approaches.

Significance: Our study employs a more general neural signal-to-text model which could facilitate communication by patients in everyday environments.
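
The final decoding stage described under Approach (temporally smoothing per-time-point phoneme probabilities with a language prior) can be illustrated with a toy bootstrap particle filter. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `particle_filter_decode`, the bigram-style prior, the three-phoneme alphabet, and the hand-written emission probabilities are all hypothetical stand-ins for the LSTM output and the English language model mentioned in the abstract.

```python
import random
from collections import Counter


def particle_filter_decode(emissions, bigram_prior, n_particles=200, seed=0):
    """Toy bootstrap particle filter over phoneme sequences (illustrative only).

    emissions: list of dicts, one per time step, mapping phoneme -> probability.
               Hypothetical stand-in for the per-time-point classifier output.
    bigram_prior: dict mapping the previous phoneme (None at the start) to a
                  dict of next-phoneme probabilities. Hypothetical stand-in
                  for prior knowledge of the language.
    """
    rng = random.Random(seed)
    particles = [[] for _ in range(n_particles)]
    for frame in emissions:
        weights = []
        for seq in particles:
            prev = seq[-1] if seq else None
            options = bigram_prior[prev]
            # Propose the next phoneme from the language prior.
            nxt = rng.choices(list(options), weights=list(options.values()))[0]
            seq.append(nxt)
            # Weight the particle by the classifier probability of the proposal.
            weights.append(frame.get(nxt, 0.0))
        if sum(weights) == 0:
            # Degenerate frame: no proposal has support, fall back to uniform.
            weights = [1.0] * n_particles
        # Resample in proportion to the weights; copy so particles do not alias.
        particles = [
            list(p) for p in rng.choices(particles, weights=weights, k=n_particles)
        ]
    # Report the most frequent surviving sequence.
    best, _ = Counter(tuple(p) for p in particles).most_common(1)[0]
    return list(best)


# Hypothetical example: a uniform prior and sharply peaked emissions.
phonemes = ["k", "ae", "t"]
uniform = {p: 1 / 3 for p in phonemes}
prior = {None: uniform, "k": uniform, "ae": uniform, "t": uniform}
emissions = [{"k": 1.0}, {"ae": 1.0}, {"t": 1.0}]
decoded = particle_filter_decode(emissions, prior)
```

With a non-uniform prior, sequences that are unlikely in the language are proposed less often, so noisy frames tend to be pulled toward plausible phoneme transitions; that is the smoothing effect the abstract refers to.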

Original language	English (US)
Article number	035023
Journal	Biomedical Physics and Engineering Express
Volume	7
Issue number	3
DOIs	https://doi.org/10.1088/2057-1976/abf6ab
State	Published - May 2021
Externally published	Yes

Keywords

Brain-computer interfaces
Intra-cranial depth electrodes
Neural speech recognition

ASJC Scopus subject areas

Biophysics
Bioengineering
Biomaterials
Physiology
Biomedical Engineering
Radiology, Nuclear Medicine and Imaging
Computer Science Applications
Health Informatics

Cite this

@article{389b7732e8804387b1857472ad535858,

title = "Generalizing neural signal-to-text brain-computer interfaces",

abstract = "Objective: Brain-Computer Interfaces (BCI) may help patients with faltering communication abilities due to neurodegenerative diseases produce text or speech by direct neural processing. However, their practical realization has proven difficult due to limitations in speed, accuracy, and generalizability of existing interfaces. The goal of this study is to evaluate the BCI performance of a robust speech decoding system that translates neural signals evoked by speech to a textual output. While previous studies have approached this problem by using neural signals to choose from a limited set of possible words, we employ a more general model that can type any word from a large corpus of English text. Approach: In this study, we create an end-to-end BCI that translates neural signals associated with overt speech into text output. Our decoding system first isolates frequency bands in the input depth electrode signal encapsulating differential information regarding production of various phonemic classes. These bands form a feature set that then feeds into a Long Short-Term Memory (LSTM) model which discerns at each time point probability distributions across all phonemes uttered by a subject. Finally, a particle filtering algorithm temporally smooths these probabilities by incorporating prior knowledge of the English language to output text corresponding to the decoded word. The generalizability of our decoder is driven by the lack of a vocabulary constraint on this output word. Main result: This method was evaluated using a dataset of 6 neurosurgical patients implanted with intra-cranial depth electrodes to identify seizure foci for potential surgical treatment of epilepsy. We averaged 32% word accuracy and on the phoneme-level obtained 46% precision, 51% recall and 73.32% average phoneme error rate while also achieving significant increases in speed when compared to several other BCI approaches.
Significance: Our study employs a more general neural signal-to-text model which could facilitate communication by patients in everyday environments.",

keywords = "Brain-computer interfaces, Intra-cranial depth electrodes, Neural speech recognition",

author = "Janaki Sheth and Ariel Tankus and Michelle Tran and Nader Pouratian and Itzhak Fried and William Speier",

note = "Publisher Copyright: {\textcopyright} 2021 IOP Publishing Ltd.",

year = "2021",

month = may,

doi = "10.1088/2057-1976/abf6ab",

language = "English (US)",

volume = "7",

journal = "Biomedical Physics and Engineering Express",

issn = "2057-1976",

publisher = "IOP Publishing Ltd.",

number = "3",

}

TY - JOUR

T1 - Generalizing neural signal-to-text brain-computer interfaces

AU - Sheth, Janaki

AU - Tankus, Ariel

AU - Tran, Michelle

AU - Pouratian, Nader

AU - Fried, Itzhak

AU - Speier, William

PY - 2021/5

Y1 - 2021/5

N2 - Objective: Brain-Computer Interfaces (BCI) may help patients with faltering communication abilities due to neurodegenerative diseases produce text or speech by direct neural processing. However, their practical realization has proven difficult due to limitations in speed, accuracy, and generalizability of existing interfaces. The goal of this study is to evaluate the BCI performance of a robust speech decoding system that translates neural signals evoked by speech to a textual output. While previous studies have approached this problem by using neural signals to choose from a limited set of possible words, we employ a more general model that can type any word from a large corpus of English text. Approach: In this study, we create an end-to-end BCI that translates neural signals associated with overt speech into text output. Our decoding system first isolates frequency bands in the input depth electrode signal encapsulating differential information regarding production of various phonemic classes. These bands form a feature set that then feeds into a Long Short-Term Memory (LSTM) model which discerns at each time point probability distributions across all phonemes uttered by a subject. Finally, a particle filtering algorithm temporally smooths these probabilities by incorporating prior knowledge of the English language to output text corresponding to the decoded word. The generalizability of our decoder is driven by the lack of a vocabulary constraint on this output word. Main result: This method was evaluated using a dataset of 6 neurosurgical patients implanted with intra-cranial depth electrodes to identify seizure foci for potential surgical treatment of epilepsy. We averaged 32% word accuracy and on the phoneme-level obtained 46% precision, 51% recall and 73.32% average phoneme error rate while also achieving significant increases in speed when compared to several other BCI approaches.
Significance: Our study employs a more general neural signal-to-text model which could facilitate communication by patients in everyday environments.

AB - Objective: Brain-Computer Interfaces (BCI) may help patients with faltering communication abilities due to neurodegenerative diseases produce text or speech by direct neural processing. However, their practical realization has proven difficult due to limitations in speed, accuracy, and generalizability of existing interfaces. The goal of this study is to evaluate the BCI performance of a robust speech decoding system that translates neural signals evoked by speech to a textual output. While previous studies have approached this problem by using neural signals to choose from a limited set of possible words, we employ a more general model that can type any word from a large corpus of English text. Approach: In this study, we create an end-to-end BCI that translates neural signals associated with overt speech into text output. Our decoding system first isolates frequency bands in the input depth electrode signal encapsulating differential information regarding production of various phonemic classes. These bands form a feature set that then feeds into a Long Short-Term Memory (LSTM) model which discerns at each time point probability distributions across all phonemes uttered by a subject. Finally, a particle filtering algorithm temporally smooths these probabilities by incorporating prior knowledge of the English language to output text corresponding to the decoded word. The generalizability of our decoder is driven by the lack of a vocabulary constraint on this output word. Main result: This method was evaluated using a dataset of 6 neurosurgical patients implanted with intra-cranial depth electrodes to identify seizure foci for potential surgical treatment of epilepsy. We averaged 32% word accuracy and on the phoneme-level obtained 46% precision, 51% recall and 73.32% average phoneme error rate while also achieving significant increases in speed when compared to several other BCI approaches.
Significance: Our study employs a more general neural signal-to-text model which could facilitate communication by patients in everyday environments.

KW - Brain-computer interfaces

KW - Intra-cranial depth electrodes

KW - Neural speech recognition

UR - http://www.scopus.com/inward/record.url?scp=85105684847&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85105684847&partnerID=8YFLogxK

U2 - 10.1088/2057-1976/abf6ab

DO - 10.1088/2057-1976/abf6ab

M3 - Article

C2 - 33836507

AN - SCOPUS:85105684847

SN - 2057-1976

VL - 7

JO - Biomedical Physics and Engineering Express

JF - Biomedical Physics and Engineering Express

IS - 3

M1 - 035023

ER -
