Lero: a learning-to-rank query optimizer

Abstract

In recent studies, machine learning techniques have been employed to support or enhance cost-based query optimizers in DBMS. Although these approaches have shown superiority on certain benchmarks, they also suffer from drawbacks such as unstable performance, high training cost, and slow model updating, which can be attributed to the inherent difficulty of predicting the cost or latency of execution plans with machine learning models. In this paper, we introduce a learning-to-rank query optimizer, called Lero, which builds on top of the native query optimizer and continuously learns to improve query optimization. The key observation is that the relative order, or rank, of plans, rather than their exact cost or latency, is sufficient for query optimization. Lero employs a pairwise approach to train a classifier that compares any two plans and tells which one is better. Such a binary classification task is much easier than the regression task of predicting cost or latency, in terms of both model efficiency and effectiveness. Rather than building a learned optimizer from scratch, Lero is designed to leverage decades of database wisdom and improve the native optimizer. With its non-intrusive design, Lero can be implemented on top of any existing DBMS with minimal integration effort. We implement Lero and demonstrate its outstanding performance on PostgreSQL and Spark SQL. In our experiments, Lero achieves near-optimal performance on several benchmarks. In single-machine environments, it reduces execution time by up to \(70\%\) compared with the native PostgreSQL optimizer and by up to \(37\%\) compared with other learned query optimizers. In distributed environments, Lero improves the running time of the native Spark SQL optimizer by up to \(27\%\). Meanwhile, Lero continuously learns and automatically adapts to query workloads and changes in data.
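The pairwise idea in the abstract can be illustrated with a toy comparator. Everything concrete below is invented for illustration (the two plan "features", the perceptron-style trainer, and the linear scoring function); Lero's actual comparator is a learned model over plan trees. The workflow, however, is the same: train on plan pairs labeled by which one ran faster, then keep the plan the comparator ranks above all others.

```python
# Toy sketch of pairwise learning-to-rank for plan selection.
# Hypothetical setup: each plan is a feature vector (say, estimated rows
# and join depth) whose true latency is a hidden linear function that the
# comparator must recover from pairwise labels alone.

def score(w, f1, f2):
    """Positive score: the plan with features f1 is predicted faster than f2."""
    return sum(wi * (a - b) for wi, a, b in zip(w, f1, f2))

def train_pairwise(plans, latencies, epochs=400, lr=0.1):
    """Perceptron updates on every ordered pair of candidate plans."""
    w = [0.0] * len(plans[0])
    for _ in range(epochs):
        for i in range(len(plans)):
            for j in range(len(plans)):
                if i == j:
                    continue
                y = 1.0 if latencies[i] < latencies[j] else -1.0
                if y * score(w, plans[i], plans[j]) <= 0:  # misordered pair
                    for k in range(len(w)):
                        w[k] += lr * y * (plans[i][k] - plans[j][k])
    return w

def pick_best(w, plans):
    """Tournament: keep whichever plan the comparator ranks higher."""
    best = 0
    for cand in range(1, len(plans)):
        if score(w, plans[cand], plans[best]) > 0:
            best = cand
    return best

plans = [[i / 10, (3 * i % 10) / 10] for i in range(10)]
latencies = [2.0 * f0 + 0.5 * f1 for f0, f1 in plans]  # hidden ground truth
w = train_pairwise(plans, latencies)
```

Note that the comparator never sees an absolute latency at training time, only which member of each pair was faster; that is the sense in which ranking is an easier target than regression.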

Models optimized for real-world tasks reveal the necessity of precise temporal coding in hearing

Neurons encode information in the timing of their spikes in addition to their firing rates. Spike timing is particularly precise in the auditory nerve, where action potentials phase lock to sound with sub-millisecond precision, but its behavioral relevance is uncertain. To investigate the role of this temporal coding, we optimized machine learning models to perform real-world hearing tasks with simulated cochlear input. We asked how precise auditory nerve spike timing needed to be to reproduce human behavior. Models with high-fidelity phase locking exhibited more human-like sound localization and speech perception than models without, consistent with an essential role in human hearing. Degrading phase locking produced task-dependent effects, revealing how the use of fine-grained temporal information reflects both ecological task demands and neural implementation constraints. The results link neural coding to perception and clarify conditions in which prostheses that fail to restore high-fidelity temporal coding could in principle restore near-normal hearing.

A theory of temporal self-supervised learning in neocortical layers

The neocortex constructs an internal representation of the world, but the underlying circuitry and computational principles remain unclear. Inspired by self-supervised learning algorithms, we introduce a computational model wherein layer 2/3 (L2/3) learns to predict incoming sensory stimuli by comparing previous sensory inputs, relayed via layer 4, with current thalamic inputs arriving at layer 5 (L5). We demonstrate that our model accurately predicts sensory information in a contextual temporal task, and that its predictions are robust to noisy or partial sensory input. Additionally, our model generates layer-specific sparsity and latent representations, consistent with experimental observations. Next, using a sensorimotor task, we show that the model's L2/3 and L5 prediction errors mirror mismatch responses observed in awake, behaving mice. Finally, through manipulations, we offer testable predictions to unveil the computational roles of various cortical features. In summary, our findings suggest that the multi-layered neocortex empowers the brain with self-supervised learning capabilities.
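The predictive scheme described above can be caricatured in a few lines. The scalar stimulus stream, learning rule, and "layer" sizes here are all invented for illustration, not taken from the model in the paper; the point is only that a unit trained to map the previous sensory input onto the current one produces a residual that behaves like a mismatch signal.

```python
# Toy temporal self-supervision: predict the current input from the
# previous one; the residual plays the role of a mismatch response.

def train_predictor(seq, lr=0.1, epochs=2000):
    """Fit current = w * previous + b by stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for prev, cur in zip(seq, seq[1:]):
            err = (w * prev + b) - cur  # prediction error ("mismatch")
            w -= lr * err * prev
            b -= lr * err
    return w, b

# A predictable stimulus stream: each input is 0.8x the previous plus 0.2.
seq = [0.0]
for _ in range(20):
    seq.append(0.8 * seq[-1] + 0.2)

w, b = train_predictor(seq)
```

Once trained, the mismatch term is near zero for expected inputs and jumps for violations of the learned temporal structure, which is the signature the paper compares against mismatch responses in mice.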

Hierarchical cortical entrainment orchestrates the multisensory processing of biological motion

When observing others' behaviors, we continuously integrate their movements with the corresponding sounds to achieve efficient perception and develop adaptive responses. However, how human brains integrate these complex audiovisual cues based on their natural temporal correspondence remains unknown. Using electroencephalogram, we demonstrated that cortical oscillations entrained to hierarchical rhythmic structures in audiovisually congruent human walking movements and footstep sounds. Remarkably, the entrainment effects at different time scales exhibited distinct modes of multisensory integration, i.e., an additive integration effect at a basic-level integration window (step-cycle) and a super-additive multisensory enhancement at a higher-order temporal integration window (gait-cycle). Moreover, only the cortical tracking of higher-order rhythmic structures is specialized for the multisensory integration of human motion signals and correlates with individuals' autistic traits, suggesting its functional relevance to biological motion perception and social cognition. These findings unveil the multifaceted roles of entrained cortical activity in the multisensory perception of human motion, shedding light on how hierarchical cortical entrainment orchestrates the processing of complex, rhythmic stimuli in natural contexts.

The hippocampus pre-orders movements for skilled action sequences

Plasticity in the subcortical motor basal ganglia-thalamo-cerebellar network plays a key role in the acquisition and control of long-term memory for new procedural skills, from the formation of population trajectories controlling trained motor skills in the striatum to the adaptation of sensorimotor maps in the cerebellum. However, recent findings demonstrate the involvement of a wider cortical and subcortical brain network in the consolidation and control of well-trained actions, including an area traditionally associated with declarative memory - the hippocampus. Here, we probe what role these subcortical areas play in skilled motor sequence control, from the selection of sequence features during planning to their integration during sequence execution. An fMRI dataset collected after participants learnt to produce four finger sequences entirely from memory with high accuracy over several days was examined for both changes in BOLD activity and their informational content in subcortical regions of interest. Although there was a widespread activity increase in effector-related striatal, thalamic and cerebellar regions, the associated activity did not contain information on the motor sequence identity. In contrast, hippocampal activity increased during planning and predicted the order of the upcoming sequence of movements. Our findings show that the hippocampus pre-orders movements for skilled action sequences, thus contributing to the higher-order control of skilled movements. These findings challenge the traditional taxonomy of episodic and procedural memory and carry implications for the rehabilitation of individuals with neurodegenerative disorders.

BDCC, Vol. 8, Pages 44: Topic Modelling: Going beyond Token Outputs

Big Data and Cognitive Computing doi: 10.3390/bdcc8050044

Authors: Lowri Williams, Eirini Anthi, Laura Arman, Pete Burnap

Topic modelling is a text mining technique for identifying salient themes across a collection of documents. The output is commonly a set of topics, each consisting of isolated tokens that often co-occur in those documents. Interpreting a topic's meaning from such tokens typically requires manual effort; from a human's perspective, these outputs may not provide enough information to infer what a topic is about, so topics are often misinterpreted. Although several studies have attempted to automatically extend topic descriptions to make topic models easier to interpret, they rely on external language sources that may become unavailable, must be kept up to date to generate relevant results, and raise privacy issues when data are used for training or processing. This paper presents a novel approach to extending the output of traditional topic modelling methods beyond a list of isolated tokens. The approach removes the dependence on external sources by using the textual data themselves: it extracts high-scoring keywords and maps them to the topic model's token outputs. To benchmark the proposed method against the state of the art, a comparative analysis against results produced by Large Language Models (LLMs) is presented. The results show that the proposed method matches the thematic coverage found in LLMs and often surpasses such models by bridging the gap between broad thematic elements and granular details. In addition, to demonstrate the generalisability of the proposed method, the approach was further evaluated using two other topic modelling methods as the underlying models and on a heterogeneous, unseen dataset.
To measure the interpretability of the proposed outputs against those of the traditional topic modelling approach, independent annotators manually scored each output based on their quality and usefulness as well as the efficiency of the annotation task. The proposed approach demonstrated higher quality and usefulness, as well as higher efficiency in the annotation task, in comparison to the outputs of a traditional topic modelling method, demonstrating an increase in their interpretability.
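As a hypothetical sketch of the keyword-mapping idea (the paper's exact scoring and mapping procedure is not given in the abstract, so the TF-IDF scoring and sentence-level mapping below are assumptions), one can score terms with corpus-internal TF-IDF and return the in-corpus sentences that best cover a topic's tokens, using no external language resources:

```python
# Sketch: extend a topic's isolated tokens with in-corpus text.
import math
from collections import Counter

def tfidf(docs):
    """Corpus-internal TF-IDF score per term, pooled over all documents."""
    df = Counter()
    for d in docs:
        df.update(set(d.split()))
    n = len(docs)
    scores = Counter()
    for d in docs:
        for term, tf in Counter(d.split()).items():
            scores[term] += tf * math.log((1 + n) / (1 + df[term]))
    return scores

def extend_topic(topic_tokens, docs, top_k=3):
    """Map a topic's tokens to the corpus sentences that cover them best,
    ranked by the summed TF-IDF of the topic tokens each sentence contains."""
    scores = tfidf(docs)
    sents = [s.strip() for d in docs for s in d.split(".") if s.strip()]
    hits = [s for s in sents if any(t in s.split() for t in topic_tokens)]
    hits.sort(key=lambda s: -sum(scores[w] for w in set(s.split())
                                 if w in topic_tokens))
    return hits[:top_k]

docs = [
    "network security threats grow daily",
    "deep models detect network threats",
    "cats sleep all day",
]
descriptions = extend_topic(["threats", "network"], docs, top_k=2)
```

The returned sentences give annotators contiguous text rather than bare tokens, which is the interpretability gain the annotation study measures.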

A supervised contrastive learning-based model for image emotion classification

Abstract

Images play a vital role on social media platforms and can vividly reflect people's inner emotions and preferences, so visual sentiment analysis has become an important research topic. In this paper, we propose a Supervised Contrastive Learning-based model for image emotion classification, which consists of two modules, low-level feature extraction and deep emotional feature extraction, with feature fusion used to enhance the overall perception of image emotions. In the low-level feature extraction module, the LBP-U (Local Binary Patterns with Uniform Patterns) algorithm is employed to extract texture features from the images; these effectively capture texture information, which helps differentiate images belonging to different emotion categories. In the deep emotional feature extraction module, we introduce a Supervised Contrastive Learning approach that improves the extraction of deep emotional features by narrowing the intra-class distance among images of the same emotion category while expanding the inter-class distance between images of different emotion categories. By fusing the low-level and deep emotional features, our model comprehensively utilizes features at different levels, thereby enhancing overall emotion classification performance. To assess the classification performance and generalization capability of the proposed model, we conduct experiments on the public FI (Flickr and Instagram) emotion dataset. Comparative analysis of the experimental results demonstrates that our model performs well for image emotion classification. Additionally, we conduct ablation experiments to analyze the impact of different levels of features and various loss functions on the model's performance, further validating the proposed approach.
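The intra-class/inter-class objective described above is the standard supervised contrastive loss. A minimal sketch on pre-computed embeddings (the paper applies it inside a deep feature extractor; the temperature value and toy vectors here are assumptions) looks like:

```python
# Supervised contrastive loss: pull together normalised embeddings that
# share a label, push apart the rest.
import math

def supcon_loss(embeddings, labels, tau=0.1):
    # L2-normalise so the dot product is cosine similarity.
    zs = []
    for e in embeddings:
        norm = math.sqrt(sum(x * x for x in e))
        zs.append([x / norm for x in e])

    def sim(a, b):
        return sum(x * y for x, y in zip(a, b)) / tau

    total, count = 0.0, 0
    for i in range(len(zs)):
        pos = [p for p in range(len(zs)) if p != i and labels[p] == labels[i]]
        if not pos:  # anchors with no positive pair contribute nothing
            continue
        denom = sum(math.exp(sim(zs[i], zs[a])) for a in range(len(zs)) if a != i)
        total += -sum(math.log(math.exp(sim(zs[i], zs[p])) / denom)
                      for p in pos) / len(pos)
        count += 1
    return total / count if count else 0.0

labels = [0, 0, 1, 1]
tight = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # classes separated
mixed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # classes intermingled
```

Embeddings whose classes are well separated incur a lower loss than intermingled ones, which is exactly the gradient signal that shapes the deep emotional features during training.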

Anatomical circuits for flexible spatial mapping by single neurons in posterior parietal cortex

Primate lateral intraparietal area (LIP) is critical for cognitive processing. Its contribution to categorization and decision-making has been causally linked to neurons' spatial sensorimotor selectivity. We reveal the intrinsic anatomical circuits and neuronal responses within LIP that provide the substrate for this flexible generation of motor responses to sensory targets. Retrograde tracers delineate a loop between two distinct operational compartments, with a sensory-like, point-to-point projection from ventral to dorsal LIP and an asymmetric, more widespread projection in the reverse direction. Neurophysiological recordings demonstrate that ventral LIP neurons in particular exhibit motor response fields that are spatially distinct from their sensory receptive fields. The different associations of response and receptive fields in single neurons tile visual space. These anatomical circuits and neuronal responses provide the basis for the flexible allocation of attention and motor responses to salient or instructive visual input across the visual field.

Characterising time-on-task effects on oscillatory and aperiodic EEG components and their co-variation with visual task performance.

Fluctuations in oscillatory brain activity have been shown to co-occur with variations in task performance. More recently, part of these fluctuations has been attributed to long-term (>1 hr) monotonous trends in the power and frequency of alpha oscillations (8-13 Hz). Here we tested whether these time-on-task changes in EEG activity are limited to activity in the alpha band and whether they are linked to task performance. Thirty-six participants performed 900 trials of a two-alternative forced choice visual discrimination task with confidence ratings. Pre- and post-stimulus spectral power (1-40 Hz) and aperiodic (i.e., non-oscillatory) components were compared across blocks of the experimental session and tested for relationships with behavioural performance. We found that time-on-task effects on oscillatory EEG activity were primarily localised within the alpha band, with alpha power increasing and peak alpha frequency decreasing over time, even when controlling for aperiodic contributions. Aperiodic, broadband activity on the other hand did not show time-on-task effects in our data set. Importantly, time-on-task effects in alpha frequency and power explained variability in single-trial reaction times. Moreover, controlling for time-on-task effectively removed the relationships between alpha activity and reaction times. However, time-on-task effects did not affect other EEG signatures of behavioural performance, including post-stimulus predictors of single-trial decision confidence. Therefore, our results dissociate alpha-band brain-behaviour relationships that can be explained away by time-on-task from those that remain after accounting for it - thereby further specifying the potential functional roles of alpha in human visual perception.

Multiple hypergraph convolutional network social recommendation using dual contrastive learning

Abstract

Due to the strong representation capabilities of graph structures in social networks, social relationships are often used to improve recommendation quality. Most existing social recommendation models exploit pairwise relations to mine latent user preferences. However, since user interactions are relatively complex and may involve higher-order relationships, their performance in real-world applications is limited. Furthermore, user behavior data in many practical recommendation scenarios tend to be noisy and sparse, which may lead to suboptimal representations. To address these issues, we propose a dual-objective contrastive learning model with multiple hypergraph convolution for social recommendation (DCMHS). Specifically, our model first constructs hypergraphs from different social relationships. Then, we build hypergraph encoders that obtain higher-order user representations through hypergraph convolution. To avoid the aggregation loss caused by collapsing user embeddings from different views into one, we construct neighbor-identification and semantic-identification contrastive learning objectives that iteratively refine the user representation. In addition, we optimize the negative sampling process using the global embeddings of items. Experiments on real-world datasets demonstrate the effectiveness of the proposed DCMHS, and an ablation study validates the rationality of its components.