Spectral Convolutional Transformer: Harmonizing Real vs. Complex Multi-View Spectral Operators for Vision Transformer

arXiv:2403.18063v1. Transformers used in vision have been investigated through diverse architectures - ViT, PVT, and Swin. These have worked to improve the attention mechanism and make it more efficient. Separately, the need to include local information led to incorporating convolutions in transformers such as CPVT and CvT. Global information is captured using a complex Fourier basis to achieve global token mixing through various methods, such as AFNO, GFNet, and Spectformer. We advocate combining three diverse views of data - local, global, and long-range dependence. We also investigate the simplest global representation using only the real domain spectral representation - obtained through the Hartley transform. We use a convolutional operator in the initial layers to capture local information. Through these two contributions, we are able to optimize and obtain a spectral convolution transformer (SCT) that provides improved performance over the state-of-the-art methods while reducing the number of parameters. Through extensive experiments, we show that SCT-C-small gives state-of-the-art performance on the ImageNet dataset and reaches 84.5% top-1 accuracy, while SCT-C-Large reaches 85.9% and SCT-C-Huge reaches 86.4%. We evaluate SCT on transfer learning on datasets such as CIFAR-10, CIFAR-100, Oxford Flower, and Stanford Car. We also evaluate SCT on a downstream task, instance segmentation on the MSCOCO dataset. The project page is available at https://github.com/badripatro/sct
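The real-domain spectral representation mentioned in the abstract can be illustrated with a small sketch. The code below is not taken from the SCT repository; the function name `hartley_2d` and the toy input are assumptions for illustration. It computes a 2D discrete Hartley transform from NumPy's FFT using the identity H = Re(F) - Im(F), and checks that the transform is real-valued and is its own inverse up to a scale factor.

```python
import numpy as np

def hartley_2d(x):
    """2D discrete Hartley transform over the last two axes.

    Uses the identity H = Re(F) - Im(F), where F is the 2D DFT of a
    real input. The result is real, unlike the complex Fourier spectrum.
    """
    f = np.fft.fft2(x)
    return f.real - f.imag

# Toy "feature map" standing in for one channel of patch tokens (hypothetical size).
rng = np.random.default_rng(0)
tokens = rng.standard_normal((8, 8))

spectrum = hartley_2d(tokens)

# The Hartley transform is an involution up to scaling by the number of elements,
# so applying it twice and dividing by the size recovers the input.
recovered = hartley_2d(spectrum) / tokens.size
```

Because the spectrum stays real, any learned filter applied in this domain can also be real-valued, which is one plausible reading of how a Hartley-based layer could reduce parameters relative to a complex Fourier one.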

Transformative skeletal motion analysis: optimization of exercise training and injury prevention through graph neural networks

Introduction

Exercise is pivotal for maintaining physical health in contemporary society. However, improper postures and movements during exercise can result in sports injuries, underscoring the significance of skeletal motion analysis. This research aims to leverage advanced technologies such as Transformer, Graph Neural Networks (GNNs), and Generative Adversarial Networks (GANs) to optimize sports training and mitigate the risk of injuries.

Methods

The study begins by employing a Transformer network to model skeletal motion sequences, facilitating the capture of global correlation information. Subsequently, a Graph Neural Network is utilized to delve into local motion features, enabling a deeper understanding of joint relationships. To enhance the model's robustness and adaptability, a Generative Adversarial Network is introduced, utilizing adversarial training to generate more realistic and diverse motion sequences.

Results

In the experimental phase, skeletal motion datasets from various cohorts, including professional athletes and fitness enthusiasts, are utilized for validation. Comparative analysis against traditional methods demonstrates significant enhancements in specificity, accuracy, recall, and F1-score. Notably, specificity increases by ~5%, accuracy reaches around 90%, recall improves to around 91%, and the F1-score exceeds 89%.
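For reference, the four reported metrics can all be derived from binary confusion-matrix counts. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not data from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (binary case)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)            # sensitivity / true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts chosen only to illustrate the calculation.
metrics = classification_metrics(tp=90, fp=10, tn=90, fn=10)
```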

Discussion

The proposed skeletal motion analysis method, leveraging Transformer and Graph Neural Networks, proves successful in optimizing exercise training and preventing injuries. By effectively amalgamating global and local information and integrating Generative Adversarial Networks, the method excels in capturing motion features and enhancing precision and adaptability. Future research endeavors will focus on further advancing this methodology to provide more robust technological support for healthy exercise practices.

Graph neural network based on brain inspired forward-forward mechanism for motor imagery classification in brain-computer interfaces

Introduction

Within the development of brain-computer interface (BCI) systems, it is crucial to consider the impact of brain network dynamics and neural signal transmission mechanisms on electroencephalogram-based motor imagery (MI-EEG) tasks. However, conventional deep learning (DL) methods cannot reflect the topological relationship among electrodes, thereby hindering the effective decoding of brain activity.

Methods

Inspired by the concept of the brain's neuronal forward-forward (F-F) mechanism, a novel DL framework that combines a Graph Neural Network with the forward-forward mechanism (F-FGCN) is presented. The F-FGCN framework aims to enhance EEG signal decoding performance by applying functional topological relationships and a signal propagation mechanism. The fusion process involves converting the multi-channel EEG into a sequence of signals and constructing a network grounded on the Pearson correlation coefficient, effectively representing the associations between channels. Our model initially pre-trains the Graph Convolutional Network (GCN), and fine-tunes the output layer to obtain the feature vector. Moreover, the F-F model is used for advanced feature extraction and classification.
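The graph-construction step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the use of absolute correlation as edge weight, the symmetric degree normalization common in GCNs, the function names, and the random placeholder signal are all assumptions.

```python
import numpy as np

def pearson_adjacency(eeg):
    """Channel-by-channel adjacency from Pearson correlation.

    eeg: array of shape (channels, samples). np.corrcoef treats each row
    as one variable, so the result is (channels, channels).
    Assumption: edge weight is the absolute correlation.
    """
    return np.abs(np.corrcoef(eeg))

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, as commonly used in GCNs.

    The diagonal of a correlation matrix is 1, so it acts as the self-loop.
    """
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Placeholder signal: 64 channels x 640 samples of random noise (not real EEG).
rng = np.random.default_rng(0)
eeg = rng.standard_normal((64, 640))

adj = pearson_adjacency(eeg)
adj_norm = normalize_adjacency(adj)
```

The normalized matrix `adj_norm` is what a graph convolution layer would multiply node features by; whether F-FGCN thresholds or signs the correlations is not stated in the abstract.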

Results and discussion

The performance of F-FGCN is assessed on the PhysioNet dataset for four-class classification and compared with various classical and state-of-the-art models. The learned features of the F-FGCN substantially improve the performance of downstream classifiers, achieving the highest accuracy of 96.11% and 82.37% at the subject and group levels, respectively. Experimental results affirm the potency of F-FGCN in enhancing EEG decoding performance, thus paving the way for BCI applications.

Attentional state-synchronous peripheral electrical stimulation during action observation induced distinct modulation of corticospinal plasticity after stroke

Introduction

Brain computer interface-based action observation (BCI-AO) is a promising technique in detecting the user's cortical state of visual attention and providing feedback to assist rehabilitation. Peripheral nerve electrical stimulation (PES) is a conventional method used to enhance outcomes in upper extremity function by increasing activation in the motor cortex. In this study, we examined the effects of different pairings of peripheral nerve electrical stimulation (PES) during BCI-AO tasks and their impact on corticospinal plasticity.

Materials and methods

Our innovative BCI-AO interventions decoded the user's attentive watching during task completion. This process involved providing rewarding visual cues while simultaneously activating afferent pathways through PES. Fifteen stroke patients were included in the analysis. All patients underwent a 15 min BCI-AO program under four different experimental conditions: BCI-AO without PES, BCI-AO with continuous PES, BCI-AO with triggered PES, and BCI-AO with reverse PES application. PES was applied at the ulnar nerve of the wrist at an intensity equivalent to 120% of the sensory threshold and a frequency of 50 Hz. The conditions were administered in random order, at least 3 days apart. To assess corticospinal and peripheral nerve excitability, we compared pre- and post-task (post 0, post 20 min) parameters of motor evoked potential and F waves under the four conditions in the muscle of the affected hand.

Results

The findings indicated that corticospinal excitability in the affected hemisphere was higher when PES was synchronously applied with AO training, using BCI during a state of attentive watching. In contrast, there was no effect on corticospinal activation when PES was applied continuously or in the reverse manner. This paradigm promoted corticospinal plasticity for up to 20 min after task completion. Importantly, the effect was more evident in patients over 65 years of age.

Conclusion

The results showed that task-driven corticospinal plasticity was higher when PES was applied synchronously with a highly attentive brain state during the action observation task, compared to continuous or asynchronous application. This study provides insight into how optimized BCI technologies dependent on brain state used in conjunction with other rehabilitation training could enhance treatment-induced neural plasticity.

Optimizing event-based neural networks on digital neuromorphic architecture: a comprehensive design space exploration

Neuromorphic processors promise low-latency and energy-efficient processing by adopting novel brain-inspired design methodologies. Yet, current neuromorphic solutions still struggle to rival conventional deep learning accelerators' performance and area efficiency in practical applications. Event-driven data-flow processing and near/in-memory computing are the two dominant design trends of neuromorphic processors. However, there remain challenges in reducing the overhead of event-driven processing and increasing the mapping efficiency of near/in-memory computing, which directly impact the performance and area efficiency. In this work, we discuss these challenges and present our exploration of optimizing event-based neural network inference on SENECA, a scalable and flexible neuromorphic architecture. To address the overhead of event-driven processing, we perform comprehensive design space exploration and propose spike-grouping to reduce the total energy and latency. Furthermore, we introduce the event-driven depth-first convolution to improve area efficiency and latency in convolutional neural networks (CNNs) on the neuromorphic processor. We benchmarked our optimized solution on keyword spotting, sensor fusion, digit recognition and high resolution object detection tasks. Compared with other state-of-the-art large-scale neuromorphic processors, our proposed optimizations result in a 6× to 300× improvement in energy efficiency, a 3× to 15× improvement in latency, and a 3× to 100× improvement in area efficiency. Our optimizations for event-based neural networks can be potentially generalized to a wide range of event-based neuromorphic processors.

An inhibitory acetylcholine receptor gates context dependent mechanosensory processing in C. elegans

An animal's current behavior influences its response to sensory stimuli, but the molecular and circuit-level mechanisms of this context-dependent decision-making are not well understood. In the nematode C. elegans, inhibitory feedback from turning-associated neurons alters downstream mechanosensory processing to gate the animal's response to stimuli depending on whether the animal is turning or moving forward. Until now, the specific neurons and receptors that mediate this inhibitory feedback were not known. We use genetic manipulations, single-cell rescue experiments and high-throughput closed-loop optogenetic perturbations during behavior to reveal the specific neuron and receptor responsible for receiving inhibition and altering sensorimotor processing. An inhibitory acetylcholine-gated chloride channel comprised of lgc-47 and acc-1 expressed in neuron RIM receives inhibitory signals from turning neurons and performs the gating that disrupts the worm's mechanosensory-evoked reversal response.

State-dependent Online Reactivations for Different Learning Strategies in Foraging

Reactivation of neural responses associated with navigation is thought to facilitate learning. We wondered whether reactivation is subject to contextual control, meaning that different types of learning promote different reactivation patterns. We trained macaques to forage in a first-person virtual maze and identified two distinct learning states prioritizing reward and information using unsupervised ethogramming based on low-level features. In orbitofrontal (OFC) and retrosplenial (RSC) cortices, representations of the goal, the path towards it, and recently traveled paths were strongly reactivated - online - during reward-prioritizing choices. During learning, reactivation of optimal paths increased in RSC after reward-prioritizing choices, and reactivation of uninformative paths decreased in RSC and OFC after information-prioritizing choices. Reactivation in OFC selectively covaried with ongoing RSC activity when prioritizing information; vice versa during prioritizing reward. These results highlight that cognitive states can drive learning and reactivation patterns can be tailored to the needs of the moment.

Decomposed frontal corticostriatal ensemble activity changes across trials, revealing distinct features relevant to outcome-based decision making

The frontal cortex-striatum circuit plays a pivotal role in adaptive goal-directed behaviours. However, the mediation of decision-related signals through cross-regional transmission between the medial frontal cortex and the striatum by neuronal ensembles remains unclear. We analysed neuronal ensemble activity obtained through simultaneous multiunit recordings in the secondary motor cortex (M2) and dorsal striatum (DS) while the rats performed an outcome-based choice task. Tensor component analysis (TCA), an unsupervised dimensionality reduction approach at the single-trial level, was adopted for concatenated ensembles of M2 and DS neurons. We identified three distinct spatiotemporal neural dynamics (TCA components) at the single-trial level specific to task-relevant variables. Choice-position selective neural dynamics were correlated with the trial-to-trial fluctuation of behavioural variables. This analytical approach unveiled choice-pattern selective neural dynamics distinguishing whether the incoming choice was a repetition of or a switch from the previous choice. Other neural dynamics were selective to outcome. Choice-pattern selective within-trial activity increased before response choice, whereas outcome selective within-trial activity increased following response. These results suggest that the concatenated ensembles of M2 and DS process distinct features of decision-related signals at various points in time. The M2 and DS may collaboratively monitor action outcomes and determine the subsequent choice, whether to repeat or switch, for coordinated action selection.
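For readers unfamiliar with tensor component analysis, it is usually formulated as a canonical polyadic (CP) decomposition of a neurons × time × trials tensor. The notation below is an assumption for illustration and is not taken from the paper: x_{n,t,k} is the activity of neuron n (from the concatenated M2 and DS population) at within-trial time t on trial k, and R is the number of components (three in the result described above).

```latex
x_{n,t,k} \;\approx\; \sum_{r=1}^{R} w^{(r)}_{n}\, b^{(r)}_{t}\, a^{(r)}_{k}
```

Here w^{(r)} is the neuron factor (which cells participate), b^{(r)} is the within-trial temporal factor, and a^{(r)} is the across-trial factor. Selectivity to choice position, choice pattern, or outcome would be read from how a^{(r)} varies with the trial's task variables.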

Distinct roles of medial prefrontal cortex subregions in the consolidation and recall of remote spatial memories

It is a common belief that memories become progressively independent of the hippocampus over time and are gradually stored in cortical areas. This view is mainly based on evidence demonstrating an impairing effect of prefrontal cortex (PFC) manipulations on the retrieval of remote memories, paralleled by a lack of effect of hippocampal inhibition. What is more controversial is whether activity in the mPFC is required immediately after learning to initiate the consolidation process. A further question is whether there are functional differences among the subregions of the PFC in the formation and storage of remote memories. To address these issues, we directly contrasted the effects of loss-of-function manipulations of the anterior cingulate cortex (aCC) and the ventro-medial prefrontal cortex (vmPFC), which includes the infralimbic and the prelimbic cortices, before testing and immediately after training, on the ability of CD1 mice to recall the location of the hidden platform in the Morris water maze. To this aim, we injected an AAV carrying the hM4Di receptor into the vmPFC or the aCC. Interestingly, pre-test administrations of clozapine-N-oxide (CNO) revealed that the aCC, but not the vmPFC, is necessary to recall remote spatial information. Furthermore, systemic post-training administration of CNO (3 mg/kg) impaired memory recall at remote time points but not recent time points in both experimental groups. Overall, these findings revealed a functional dissociation between the two prefrontal areas, demonstrating that they are both involved in the early consolidation of remote spatial memories, but that only the aCC is engaged in their recall.