Identifying phase transitions in physical systems with neural networks: a neural architecture search perspective

arXiv:2404.15118v1 Abstract: The use of machine learning algorithms to investigate phase transitions in physical systems is a valuable way to better understand their characteristics. Neural networks have been used to extract information about phases and phase transitions directly from many-body configurations. However, one limitation of neural networks is that their architecture and hyperparameters must be defined before they can be applied, and this determination is itself a difficult problem. In this paper, we investigate for the first time the relationship between the accuracy of neural networks at extracting phase information and the network configuration (comprising the architecture and hyperparameters). We formulate the phase analysis as a regression task, address the question of generating data that reflects the different states of the physical system, and evaluate the performance of neural architecture search for this task. After obtaining the optimized architectures, we further implement smart data processing and analytics by means of neuron coverage metrics, assessing the capability of these metrics to estimate phase transitions. Our results identify the neuron coverage metric as promising for detecting phase transitions in physical systems.
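
To make the neuron coverage idea concrete, here is a minimal sketch of one common form of the metric: the fraction of neurons whose batch-normalised activation exceeds a threshold for at least one input. The function name `neuron_coverage`, the layer-wise activation format, and the threshold value are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def neuron_coverage(activations, threshold=0.5):
    """Fraction of neurons whose min-max-normalised activation exceeds
    `threshold` for at least one input in the batch (illustrative metric)."""
    covered, total = 0, 0
    for act in activations:               # one (batch, units) array per layer
        lo, hi = act.min(axis=0), act.max(axis=0)
        norm = (act - lo) / np.where(hi > lo, hi - lo, 1.0)
        covered += int((norm > threshold).any(axis=0).sum())
        total += act.shape[1]
    return covered / total

# Toy usage: compare coverage for configurations sampled on either side of a
# candidate transition; a jump in the metric would flag the transition.
rng = np.random.default_rng(0)
acts = [rng.random((64, 32)), rng.random((64, 16))]
print(neuron_coverage(acts))
```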

Effect of Synaptic Heterogeneity on Neuronal Coordination

arXiv:2308.00421v3 Abstract: Recent advancements in measurement techniques have resulted in an increasing amount of data on neural activities recorded in parallel, revealing largely heterogeneous correlation patterns across neurons. Yet, the mechanistic origin of this heterogeneity remains unknown, because existing theoretical approaches linking structure and dynamics in neural circuits are restricted to population-averaged connectivity and activity. Here we present a systematic inclusion of heterogeneity in network connectivity to derive quantitative predictions for neuron-resolved covariances and their statistics in spiking neural networks. Our study shows that the heterogeneity in covariances is not a result of variability in single-neuron firing statistics but stems from the ubiquitously observed sparsity and variability of connections in brain networks. Linear-response theory maps these features to the effective connectivity between neurons, which in turn determines neuronal covariances. Beyond-mean-field tools reveal that synaptic heterogeneity modulates the variability of covariances, and thus the complexity of neuronal coordination, across many orders of magnitude.
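
As a sketch of the linear-response step: for a linearly interacting network with effective connectivity matrix W and single-neuron noise variances D, the standard result is C = (I - W)^{-1} D (I - W)^{-T}, so heterogeneous covariances follow from a sparse, variable W alone. All parameter values below are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 200, 0.1                      # neurons, connection probability
J = 0.3 / np.sqrt(N)                 # synaptic weight scale

# Sparse random effective connectivity: heterogeneity enters only here.
W = J * (rng.random((N, N)) < p) * rng.lognormal(0.0, 0.5, (N, N))
np.fill_diagonal(W, 0.0)

# Linear-response prediction: C = (I - W)^{-1} D (I - W)^{-T},
# with D the diagonal matrix of single-neuron noise variances.
D = np.diag(np.ones(N))
B = np.linalg.inv(np.eye(N) - W)
C = B @ D @ B.T

# Statistics of pairwise covariances across the population.
off = C[~np.eye(N, dtype=bool)]
print(off.mean(), off.std())         # spread = heterogeneity of covariances
```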

Structure of activity in multiregion recurrent neural networks

arXiv:2402.12188v1 Abstract: Neural circuits are composed of multiple regions, each with rich dynamics and engaging in communication with other regions. The combination of local, within-region dynamics and global, network-level dynamics is thought to provide computational flexibility. However, the nature of such multiregion dynamics and the underlying synaptic connectivity patterns remain poorly understood. Here, we study the dynamics of recurrent neural networks with multiple interconnected regions. Within each region, neurons have a combination of random and structured recurrent connections. Motivated by experimental evidence of communication subspaces between cortical areas, these networks have low-rank connectivity between regions, enabling selective routing of activity. These networks exhibit two interacting forms of dynamics: high-dimensional fluctuations within regions and low-dimensional signal transmission between regions. To characterize this interaction, we develop a dynamical mean-field theory to analyze such networks in the limit where each region contains infinitely many neurons, with cross-region currents as key order parameters. Regions can act as both generators and transmitters of activity, roles that we show are in conflict. Specifically, taming the complexity of activity within a region is necessary for it to route signals to and from other regions. Unlike previous models of routing in neural circuits, which suppressed the activities of neuronal groups to control signal flow, routing in our model is achieved by exciting different high-dimensional activity patterns through a combination of connectivity structure and nonlinear recurrent dynamics. This theory provides insight into the interpretation of both multiregion neural data and trained neural networks.
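
A minimal simulation along these lines, assuming tanh rate units, dense random connectivity within each region, and rank-one inter-region couplings u v^T / N; the scalings and parameter values are our assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(2)
R, N, g = 3, 400, 1.5                    # regions, neurons/region, local gain
dt, T = 0.05, 400

# Within-region: dense random couplings; between-region: rank-one
# "communication subspace" couplings U_ab V_ab^T / N (illustrative scaling).
J_loc = [g * rng.standard_normal((N, N)) / np.sqrt(N) for _ in range(R)]
U = {(a, b): rng.standard_normal(N) for a in range(R) for b in range(R) if a != b}
V = {(a, b): rng.standard_normal(N) for a in range(R) for b in range(R) if a != b}

x = rng.standard_normal((R, N))
for _ in range(T):
    r = np.tanh(x)
    dx = -x.copy()
    for a in range(R):
        dx[a] += J_loc[a] @ r[a]                       # local high-dimensional dynamics
        for b in range(R):
            if b != a:                                  # low-dimensional routing
                dx[a] += U[(a, b)] * (V[(a, b)] @ r[b]) / N
    x += dt * dx

# Cross-region currents V_ab . r_b / N play the role of order parameters.
print([float(V[(0, b)] @ np.tanh(x[b]) / N) for b in range(R) if b != 0])
```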

Training Coupled Phase Oscillators as a Neuromorphic Platform using Equilibrium Propagation

Given the rapidly growing scale and resource requirements of machine learning applications, the idea of building more efficient learning machines that operate much closer to the laws of physics is an attractive proposition. One central question for identifying promising candidates for such neuromorphic platforms is whether not only inference but also training can exploit the physical dynamics. In this work, we show that it is possible to successfully train a system of coupled phase oscillators - one of the most widely investigated nonlinear dynamical systems, with a multitude of physical implementations including laser arrays, coupled mechanical limit cycles, superfluids, and exciton-polaritons. To this end, we apply the approach of equilibrium propagation, which makes it possible to extract training gradients via a physical realization of backpropagation, based only on local interactions. The complex energy landscape of the XY/Kuramoto model leads to multistability, and we show how to address this challenge. Our study identifies coupled phase oscillators as a new general-purpose neuromorphic platform and opens the door towards future experimental implementations.
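
A minimal sketch of equilibrium propagation on an XY energy E = -sum_{ij} J_ij cos(theta_i - theta_j): relax to a free equilibrium, relax again under a weak nudge beta toward the target, and update each coupling from the change in -dE/dJ_ij between the two states. Starting the nudged relaxation from the free fixed point is one simple way to stay in the same energy basin despite multistability; the gradient-descent relaxation and all parameters below are illustrative assumptions, not the paper's protocol.

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta, lr_J = 12, 0.2, 0.5
out = np.arange(n - 2, n)                       # two output oscillators
J = 0.1 * rng.standard_normal((n, n)); J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)

def relax(theta, target=None, beta=0.0, steps=600, lr=0.05):
    """Overdamped descent of E = -sum_{ij} J_ij cos(th_i - th_j), optionally
    nudged by beta * cost with cost = sum_out (1 - cos(th_o - t_o))."""
    th = theta.copy()
    for _ in range(steps):
        g = (J * np.sin(th[:, None] - th[None, :])).sum(axis=1)
        if beta > 0.0:
            g[out] += beta * np.sin(th[out] - target)
        th -= lr * g
    return th

theta0 = rng.uniform(0, 2 * np.pi, n)
target = np.array([0.0, np.pi / 2])

free = relax(theta0)                             # free equilibrium
nudged = relax(free, target=target, beta=beta)   # weakly nudged equilibrium

# Equilibrium propagation: dE/dJ_ij = -cos(th_i - th_j), so the coupling
# update is (1/beta) times the difference of cosines between the two states.
def cosdiff(th):
    return np.cos(th[:, None] - th[None, :])

J += lr_J / beta * (cosdiff(nudged) - cosdiff(free))
np.fill_diagonal(J, 0.0)
```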

X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Design

We report a mixture-of-experts strategy to create fine-tuned large language models using a deep, layer-wise, token-level approach based on low-rank adaptation (LoRA). Starting with a set of pre-trained LoRA adapters, we propose a gating strategy that uses the hidden states to dynamically mix adapted layers, allowing the resulting X-LoRA model to draw upon different capabilities and create never-before-used, deep layer-wise combinations of adaptations to solve specific tasks. The design is inspired by the biological principles of universality and diversity, where neural network building blocks are reused in different hierarchical manifestations. Hence, the X-LoRA model can be easily implemented for any existing large language model (LLM) without a need for modifications of the underlying structure. We develop a tailored X-LoRA model that offers scientific capabilities, including forward/inverse analysis tasks and enhanced reasoning, focused on biomaterial analysis, protein mechanics, and design. The impact of this work includes access to readily expandable, adaptable, and changeable models with strong domain knowledge and the capability to integrate across areas of knowledge. With the X-LoRA model featuring experts in biology, mathematics, reasoning, bio-inspired materials, mechanics and materials, chemistry, and protein mechanics, we conduct a series of physics-focused case studies. We examine knowledge recall, protein mechanics forward/inverse tasks, protein design, and adversarial agentic modeling including ontological knowledge graphs. The model is capable not only of making quantitative predictions of nanomechanical properties of proteins but also of reasoning over the results and correctly predicting likely mechanisms that explain distinct molecular behaviors.
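
A compact PyTorch sketch of the gating idea: token-wise softmax scalings, predicted here directly from the layer input, mix K LoRA updates on top of a frozen base layer. This simplifies X-LoRA, which obtains its scalings from a separate forward pass through the model; the class and parameter names are our own.

```python
import torch
import torch.nn as nn

class XLoRALinear(nn.Module):
    """Sketch: a frozen base linear layer plus K LoRA adapters, mixed
    token-wise by scalings predicted from the hidden state."""
    def __init__(self, base: nn.Linear, K: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base.requires_grad_(False)
        d_in, d_out = base.in_features, base.out_features
        self.A = nn.ParameterList(
            [nn.Parameter(torch.randn(r, d_in) * 0.01) for _ in range(K)])
        self.B = nn.ParameterList(
            [nn.Parameter(torch.zeros(d_out, r)) for _ in range(K)])
        self.gate = nn.Linear(d_in, K)         # scaling head on hidden states
        self.scale = alpha / r

    def forward(self, x):                       # x: (batch, seq, d_in)
        w = torch.softmax(self.gate(x), dim=-1)           # (batch, seq, K)
        y = self.base(x)
        for k in range(len(self.A)):
            delta = (x @ self.A[k].T) @ self.B[k].T       # low-rank update
            y = y + self.scale * w[..., k : k + 1] * delta
        return y

layer = XLoRALinear(nn.Linear(64, 64), K=4)
print(layer(torch.randn(2, 10, 64)).shape)      # torch.Size([2, 10, 64])
```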

Coexistence of asynchronous and clustered dynamics in noisy inhibitory neural networks

A regime of coexistence of asynchronous and clustered dynamics is analyzed for globally coupled homogeneous and heterogeneous inhibitory networks of quadratic integrate-and-fire (QIF) neurons subject to Gaussian noise. The analysis is based on accurate, extensive simulations and complemented by a mean-field description in terms of low-dimensional next-generation neural mass models for heterogeneously distributed synaptic couplings. The asynchronous regime is observable at low noise and becomes unstable via a sub-critical Hopf bifurcation at sufficiently large noise. This gives rise to a coexistence region between the asynchronous and the clustered regime. The clustered phase is characterized by population bursts in the γ-range (30-120 Hz), where neurons are split into two equally populated clusters firing in alternation. This clustering behaviour is quite peculiar: despite the global activity being essentially periodic, single neurons switch between the two clusters due to heterogeneity and/or noise.
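
For concreteness, a bare-bones Euler-Maruyama simulation of N inhibitory QIF neurons with a global exponential synapse; the peak/reset rule and all parameter values are illustrative assumptions rather than the paper's setup. The variance of the population rate distinguishes the asynchronous regime (flat rate) from the clustered one (γ-band oscillations).

```python
import numpy as np

rng = np.random.default_rng(4)
N, dt, T = 1000, 1e-4, 1.0
eta, J, sigma = 5.0, -10.0, 2.0          # mean drive, inhibition, noise amplitude
tau_s, v_peak = 5e-3, 100.0              # synaptic time constant, spike cutoff

V = rng.uniform(-10, 10, N)
s = 0.0
pop_rate = []
for _ in range(int(T / dt)):
    V += dt * (V**2 + eta + J * s) + sigma * np.sqrt(dt) * rng.standard_normal(N)
    spk = V >= v_peak
    V[spk] = -v_peak                                  # crude peak/reset rule
    s += -dt * s / tau_s + spk.sum() / (N * tau_s)    # exponential synapse
    pop_rate.append(spk.mean() / dt)

# Large fluctuations of the population rate signal clustered dynamics.
print(np.std(pop_rate[len(pop_rate) // 2:]))
```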

Nature-Inspired Local Propagation

The spectacular results achieved in machine learning, including the recent advances in generative AI, rely on large data collections. By contrast, intelligent processes in nature arise without the need for such collections, simply by online processing of environmental information. In particular, natural learning processes rely on mechanisms where data representation and learning are intertwined in such a way as to respect spatiotemporal locality. This paper shows that such a feature arises from a pre-algorithmic view of learning that is inspired by related studies in Theoretical Physics. We show that the algorithmic interpretation of the derived "laws of learning", which take the structure of Hamiltonian equations, reduces to Backpropagation when the speed of propagation goes to infinity. This opens the door to machine learning studies based on fully online information processing, in which Backpropagation is replaced by the proposed spatiotemporally local algorithm.

Optimal input reverberation and homeostatic self-organization towards the edge of synchronization

Transient or partial synchronization can be used to perform computations, although a fully synchronized network is frequently associated with epileptic seizures. Here, we propose a homeostatic mechanism that is capable of maintaining a neuronal network at the edge of a synchronization transition, thereby avoiding the harmful consequences of full synchronization. We model neurons by maps, since they are dynamically richer than integrate-and-fire models and more computationally efficient than conductance-based approaches. We first describe the synchronization phase transition of a dense network of neurons with different tonic spiking frequencies coupled by gap junctions. We show that, at the critical point of this transition, inputs reverberate optimally through the network via transient synchronization. Then, we introduce a local homeostatic dynamic in the synaptic coupling and show that it produces robust self-organization toward the edge of this phase transition. We discuss the potential biological consequences of this self-organization process, such as its relation to the Brain Criticality hypothesis, its input-processing capacity, and how its malfunction could lead to pathological synchronization.
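
One way to make the homeostatic idea concrete: simulate Rulkov-type map neurons coupled diffusively through a gap-junction term, measure synchrony over a window with a variance-based order parameter, and nudge the coupling toward a target synchrony level. We use a single global coupling and an ad hoc adjustment rule for brevity; the paper's homeostatic dynamic is local, so everything below is a simplified assumption.

```python
import numpy as np

rng = np.random.default_rng(5)
N, n_win, win = 200, 100, 500
alpha = 4.1                               # Rulkov map nonlinearity
sig = 0.05 + 0.1 * rng.random(N)          # heterogeneous drives -> frequencies
mu, g, eps, chi_t = 0.001, 0.2, 0.05, 0.3 # slow scale, coupling, rate, target

x = rng.uniform(-1.5, 1.5, N)
y = -2.8 + 0.1 * rng.random(N)
for _ in range(n_win):
    X = np.empty((win, N))
    for n in range(win):
        xb = x.mean()                     # mean field (gap-junction drive)
        x, y = (alpha / (1 + x**2) + y + g * (xb - x),
                y - mu * (x + 1) + mu * sig)
        X[n] = x
    # Synchrony: variance of the mean field / mean single-unit variance.
    chi = X.mean(axis=1).var() / X.var(axis=0).mean()
    g = max(0.0, g + eps * (chi_t - chi)) # homeostatic coupling adjustment
print(g, chi)
```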

What does self-attention learn from Masked Language Modelling?

Transformers are neural networks which revolutionised natural language processing and machine learning. They process sequences of inputs, like words, using a mechanism called self-attention, which is trained via masked language modelling (MLM). In MLM, a word is randomly masked in an input sequence, and the network is trained to predict the missing word. Despite the practical success of transformers, it remains unclear what type of data distribution self-attention can learn efficiently. Here, we show analytically that if one decouples the treatment of word positions and embeddings, a single layer of self-attention learns the conditionals of a generalised Potts model with interactions between sites and Potts colours. Moreover, we show that training this neural network is exactly equivalent to solving the inverse Potts problem by the so-called pseudo-likelihood method, well known in statistical physics. Using this mapping, we compute the generalisation error of self-attention in a model scenario analytically using the replica method.
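
To unpack the pseudo-likelihood connection: for a Potts model with couplings J_ij(a, b), the conditional probability of colour a at site i given the rest is a softmax over sum_{j != i} J_ij(a, w_j), which is exactly the conditional that MLM asks the network to learn. A toy gradient-ascent fit is sketched below; the random data, the absence of regularisation, and the unsymmetrised couplings are all illustrative simplifications.

```python
import numpy as np

rng = np.random.default_rng(6)
L, q, M = 8, 5, 2000                    # sites, Potts colours, samples
data = rng.integers(0, q, (M, L))       # toy sequences (random, for illustration)

# Couplings J[i, j, a, b]: colour a at site i interacting with colour b at j.
J = np.zeros((L, L, q, q))

def site_logits(J, seq, i):
    """Logits of p(w_i = a | w_{-i}) under the Potts conditional."""
    lg = np.zeros(q)
    for j in range(L):
        if j != i:
            lg += J[i, j, :, seq[j]]
    return lg

lr = 0.1
for _ in range(50):                     # plain gradient ascent on pseudo-likelihood
    grad = np.zeros_like(J)
    for seq in data[:200]:              # mini-batch for speed
        for i in range(L):
            p = np.exp(site_logits(J, seq, i))
            p /= p.sum()
            grad[i, :, :, :][np.arange(L) != i] += 0.0  # (no-op; kept for clarity)
            for j in range(L):
                if j != i:
                    grad[i, j, seq[i], seq[j]] += 1.0   # data term
                    grad[i, j, :, seq[j]] -= p          # model term
    J += lr * grad / 200
```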

Exact minimax entropy models of large-scale neuronal activity

In the brain, fine-scale correlations combine to produce macroscopic patterns of activity. However, as experiments record from larger and larger populations, we approach a fundamental bottleneck: the number of correlations one would like to include in a model grows larger than the available data. In this undersampled regime, one must focus on a sparse subset of correlations; the optimal choice contains the maximum information about patterns of activity or, equivalently, minimizes the entropy of the inferred maximum entropy model. Applying this "minimax entropy" principle is generally intractable, but here we present an exact and scalable solution for pairwise correlations that combine to form a tree (a network without loops). Applying our method to over one thousand neurons in the mouse hippocampus, we find that the optimal tree of correlations reduces our uncertainty about the population activity by 14% (over 50 times more than a random tree). Despite containing only 0.1% of all pairwise correlations, this minimax entropy model accurately predicts the observed large-scale synchrony in neural activity and becomes even more accurate as the population grows. The inferred Ising model is almost entirely ferromagnetic (with positive interactions) and exhibits signatures of thermodynamic criticality. These results suggest that a sparse backbone of excitatory interactions may play an important role in driving collective neuronal activity.
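
The tree-selection step has a classic counterpart: over trees, the entropy of the maximum entropy model is minimised by the tree that maximises the summed mutual information across its edges, i.e. a maximum spanning tree under mutual-information weights (the Chow-Liu construction). A toy sketch on synthetic binary activity follows; the data and sizes are placeholders, and the paper's exact estimator may differ.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(7)
N, T = 50, 5000
spikes = (rng.random((T, N)) < 0.1).astype(int)   # toy binary activity

def mutual_info(x, y):
    """Mutual information (nats) between two binary spike trains."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pab = np.mean((x == a) & (y == b))
            pa, pb = np.mean(x == a), np.mean(y == b)
            if pab > 0:
                mi += pab * np.log(pab / (pa * pb))
    return mi

MI = np.zeros((N, N))
for i in range(N):
    for j in range(i + 1, N):
        MI[i, j] = mutual_info(spikes[:, i], spikes[:, j])

# Maximum spanning tree under MI weights = minimum spanning tree of -MI.
tree = minimum_spanning_tree(-MI)
edges = np.transpose(tree.nonzero())
total_info = -tree.sum()                          # information captured by the tree
print(len(edges), total_info)                     # N-1 edges
```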