MERGE-RNA: a physics-based model to predict RNA secondary structure ensembles with chemical probing
Abstract
The function of RNA molecules is deeply related to their secondary structure, which determines which nucleobases are accessible for pairing. Most RNA molecules, however, function through dynamic and heterogeneous structural ensembles. Chemical probing methods (e.g., DMS probing) rely on selective chemical modification of accessible RNA nucleotides to infer base-pairing status, yet the resulting nucleotide-resolution data represent ensemble averages over dynamic RNA conformations. We present MERGE-RNA, a unified, physics-based framework that explicitly models the full experimental pipeline, from the thermodynamics of probe binding to the mutational profiling readout. By integrating measurements across probe concentrations and replicates, our model learns a small set of transferable and interpretable parameters together with minimal sequence-specific soft constraints. This enables the prediction of secondary structure ensembles that best explain the data and the detection of suboptimal structures involved in dynamic processes. We validate MERGE-RNA on diverse RNAs, showing that it achieves strong structural accuracy while preserving essential conformational heterogeneity. In a designed RNA for which we report new DMS data, MERGE-RNA detects transient intermediate states associated with strand displacement, dynamics that remain invisible to traditional methods.
Summary
This paper introduces MERGE-RNA, a physics-based computational framework for predicting RNA secondary structure ensembles from chemical probing data, specifically DMS probing. The core idea is to model the entire experimental pipeline, from probe-RNA binding to the final mutational profiling readout, using physically meaningful parameters. This allows the model to learn transferable parameters from multiple datasets simultaneously (different RNA sequences, probe concentrations, and experimental replicates), which makes the predicted structural ensembles more robust and accurate. MERGE-RNA uses a maximum entropy approach to refine an initial thermodynamic folding model (ViennaRNA) with sequence-specific soft constraints derived from the experimental data, keeping adjustments to the baseline model minimal. The key finding is that MERGE-RNA achieves strong structural accuracy while preserving essential conformational heterogeneity in the predicted ensembles. The model can deconvolve mixed structural states and even detects transient intermediate states associated with dynamic processes such as strand displacement, which traditional methods often miss. The authors validate MERGE-RNA on diverse RNAs and show that it captures complex structural rearrangements, such as the temperature-dependent changes in the cspA 5' UTR. For the field, this offers a more principled way to interpret chemical probing data and predict RNA structure, and thus to understand RNA function and dynamics. By explicitly modeling the experimental process with physical parameters, MERGE-RNA avoids a limitation of existing methods, which rely on heuristic conversions of reactivity data into pseudo-free energies.
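To make the idea of modeling the pipeline from ensemble to mutational readout concrete, here is a minimal toy sketch. It is not the authors' implementation or their equations. The three-structure ensemble, its free energies, the numeric parameter values, and the function names are all hypothetical. The readout formula (a background rate `m0` for unmodified positions and a rate `m1` for modified ones) and the treatment of A/C as binding only when unpaired are one possible reading of the parameter names in the summary, assumed here purely for illustration.

```python
import math

# Hypothetical toy data: a 9-nt sequence and three dot-bracket structures
# with made-up free energies in kcal/mol (not taken from the paper).
SEQ = "GGGAAACCC"
ENSEMBLE = {
    "(((...)))": -3.0,
    ".((...)).": -1.5,
    ".........": 0.0,
}
KT = 0.616  # approximate thermal energy in kcal/mol near 37 C


def boltzmann_weights(energies, kT=KT):
    """Return normalized Boltzmann weights for a dict of structure -> energy."""
    boltz = {s: math.exp(-e / kT) for s, e in energies.items()}
    z = sum(boltz.values())
    return {s: b / z for s, b in boltz.items()}


def unpaired_probability(weights, length):
    """Per-nucleotide probability of being unpaired, averaged over the ensemble."""
    return [
        sum(w for s, w in weights.items() if s[i] == ".")
        for i in range(length)
    ]


def expected_mutation_rate(p_unpaired, seq, p_bind_unpaired, p_bind_any, m0, m1):
    """Assumed readout: unmodified sites mutate at rate m0, modified sites at m1."""
    rates = []
    for base, pu in zip(seq, p_unpaired):
        if base in p_bind_unpaired:
            # Assumption: A and C are modified only when unpaired.
            p_mod = p_bind_unpaired[base] * pu
        else:
            # Assumption: G and U have a pairing-independent modification probability.
            p_mod = p_bind_any.get(base, 0.0)
        rates.append(m0 + (m1 - m0) * p_mod)
    return rates


weights = boltzmann_weights(ENSEMBLE)
p_unp = unpaired_probability(weights, len(SEQ))
rates = expected_mutation_rate(
    p_unp, SEQ,
    p_bind_unpaired={"A": 0.3, "C": 0.2},   # hypothetical values
    p_bind_any={"G": 0.01, "U": 0.01},      # hypothetical values
    m0=0.001, m1=0.5,                        # hypothetical values
)
for base, pu, r in zip(SEQ, p_unp, rates):
    print(f"{base}  p_unpaired={pu:.3f}  expected_mutation_rate={r:.4f}")
```

In this sketch the predicted mutation profile is an ensemble average, so a position that is unpaired in only some structures gets an intermediate rate. That is the property that lets a forward model of this kind be fitted against data from several probe concentrations at once.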
Key Insights
- MERGE-RNA introduces a physics-based model for predicting RNA secondary structure ensembles by explicitly modeling the chemical probing experimental pipeline.
- The model learns transferable physical parameters (μr, Δμpairing, pbind(A, unpaired), pbind(C, unpaired), pbind(G), pbind(U), m0/m1) from multiple datasets, improving robustness and generalizability.
- Using maximum entropy inference, MERGE-RNA infers sequence-specific soft constraints (λi) that minimally adjust the baseline thermodynamic model, which avoids over-constraining the predictions.
- The study found an optimal balance between structural accuracy and ensemble heterogeneity: enforcing the reference structure too strongly can degrade performance.
- MERGE-RNA accurately deconvolves mixed structural states, in both synthetic and experimental data, and detects transient intermediate states associated with strand displacement. For example, the loop co-occupancy analysis shows that MERGE-RNA yields concurrent formation in 42–57% of structures versus <0.2% in the baseline.
- The model is robust to the choice of the underlying thermodynamic model (Turner 2004 vs. Andronescu 2007).
- A limitation is that the underlying ViennaRNA framework does not account for pseudoknots. The model also assumes constant experimental conditions for parameter transferability.
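The maximum entropy step mentioned above can be illustrated with the generic textbook form, in which a prior ensemble P0(s) is reweighted as P(s) ∝ P0(s)·exp(−Σi λi·fi(s)). The sketch below is an illustration under stated assumptions, not the paper's code: the prior probabilities are made up, the choice of fi(s) as "nucleotide i is paired in structure s" and the sign convention for λi are assumptions, and `reweight` and `kl_divergence` are hypothetical helper names.

```python
import math


def reweight(prior, lambdas):
    """Generic max-entropy reweighting: P(s) ∝ P0(s) * exp(-sum_i lambda_i * paired_i(s)).

    Assumption: paired_i(s) is 1 when position i is paired in dot-bracket
    string s, so a positive lambda_i disfavors pairing at position i.
    """
    unnorm = {}
    for s, p0 in prior.items():
        penalty = sum(lam for lam, ch in zip(lambdas, s) if ch != ".")
        unnorm[s] = p0 * math.exp(-penalty)
    z = sum(unnorm.values())
    return {s: u / z for s, u in unnorm.items()}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) over a shared set of structures."""
    return sum(p[s] * math.log(p[s] / q[s]) for s in p if p[s] > 0)


# Hypothetical prior ensemble (probabilities are made up for illustration).
prior = {"(((...)))": 0.90, ".((...)).": 0.08, ".........": 0.02}

# Soft constraints that disfavor pairing of the two terminal nucleotides only.
lambdas = [1.5, 0, 0, 0, 0, 0, 0, 0, 1.5]

posterior = reweight(prior, lambdas)
for s in prior:
    print(f"{s}  prior={prior[s]:.3f}  posterior={posterior[s]:.3f}")
print(f"KL(posterior || prior) = {kl_divergence(posterior, prior):.3f}")
```

With all λi set to zero the prior is returned unchanged and the divergence is zero, which is the sense in which small λi correspond to a minimal adjustment of the baseline model. Nonzero constraints shift weight among existing structures instead of forcing a single structure, so heterogeneity in the ensemble is retained.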
Practical Implications
- MERGE-RNA can be used to predict RNA secondary structure ensembles and identify dynamic structural rearrangements, which is crucial for understanding RNA function in various biological processes.
- Researchers studying RNA structure and function, particularly those using chemical probing data, would benefit from using MERGE-RNA to obtain more accurate and informative structural predictions.
- Practitioners can use MERGE-RNA to analyze chemical probing data, predict RNA structural ensembles, and identify potential drug targets or design RNA-based therapeutics.
- The ability to deconvolve mixed structural states opens up new avenues for studying complex RNA dynamics and identifying potential regulatory mechanisms.
- Future research directions include extending the framework to incorporate data from multiple chemical probes (SHAPE, CMCT, etc.), modeling RNA-protein interactions, and accounting for experimental variations (temperature, buffer, etc.).