Oh no, when I think back to those experiments… In short, these experiments, which I created for mass chaos, were activated in the house of a lonely old woman and stayed to live with her as her "cats".
A purple heavy-set hamster-like experiment with a purple nose, designed to turn objects into ham or pork. She was activated along with the rest of Mrs. Hasagawa's "cats", as indicated by Gantu's experiment computer, but did not physically appear in the episode. She was referred to in "Remmy" when Pleakley said, "Is that the one that turns everything into ham?" Her one true place is as one of Mrs. Hasagawa's "cats".
Here is the complete list of Jumba's experiments, as found during the end credits of Leroy and Stitch. Because some of the names of the experiments are patently absurd, even for Lilo, the entire list must be taken with a grain of salt; Whatsits Galore does not guarantee its accuracy.
All Disney characters & images © Disney and are used for fan purposes only All other content © 2008-2024 Whatsits Galore
The power to turn any matter into meat. Technique of Meat Manipulation. Variation of Food Transmutation.
The user can turn any matter, organic or inorganic, into meat.
Introduction: Medical androgen deprivation therapy (ADT) options have expanded for patients with advanced prostate cancer (PC). Historically, ADT was primarily available in long-acting injectable formulations. In 2020, the first oral formulation was US Food and Drug Administration-approved for adults with advanced PC. This study's aim was to assess patient preferences for attributes of medical ADT, including mode of administration, side effects, impact on sexual interest, and out-of-pocket (OOP) costs, and to segment respondents into distinct groups based on their treatment choice patterns.
Methods: A cross-sectional survey was conducted among US residents aged > 40 years with PC, employing a discrete choice experiment to assess preferences for ADT attributes. For each choice task, respondents were asked to select the hypothetical treatment profile that they preferred out of two presented. Latent class analysis (LCA) was conducted to estimate attribute-level preference weights and calculate attribute relative importance for groups of respondents with similar treatment preferences.
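For intuition, the core of a discrete choice experiment analysis can be sketched in a few lines. The following is a generic conditional-logit illustration with made-up attribute weights, not the study's actual model or estimates:

```python
import math

def choice_probability(v_a, v_b):
    """Conditional-logit probability of choosing profile A over profile B,
    given total utilities v_a and v_b (sums of attribute-level weights)."""
    return math.exp(v_a) / (math.exp(v_a) + math.exp(v_b))

def relative_importance(weights_by_attribute):
    """Relative importance of each attribute: the range of its level
    weights divided by the sum of ranges across all attributes."""
    ranges = {a: max(w) - min(w) for a, w in weights_by_attribute.items()}
    total = sum(ranges.values())
    return {a: r / total for a, r in ranges.items()}

# Hypothetical preference weights for two attributes (illustrative only):
weights = {
    "mode": [0.6, -0.6],           # daily pill vs. injection
    "oop_cost": [0.9, 0.0, -0.9],  # three out-of-pocket cost levels
}
imp = relative_importance(weights)
```

With these made-up weights, the cost attribute has the larger range and therefore the higher relative importance; in the study, such importances were estimated per latent class.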
Results: A total of 304 respondents completed the survey (mean age 64.4 years). LCA identified four preference groups, named according to the attribute each group considered most important: Sexual interest, Cost-sensitive, Favors daily pill, and Favors injection. Most respondents in the Sexual interest group were < 65 years old, while the Cost-sensitive group was mostly ≥ 65 years old. The Favors daily pill group had the highest proportion of ADT-naïve individuals. On average, respondents in these three groups preferred an oral medication. The Favors injection group, which had the highest proportion of ADT-experienced individuals, preferred infrequent intramuscular injections, a lower chance of post-ADT testosterone recovery, and lower OOP costs.
Conclusion: Respondents differed in their preferences regarding ADT attributes, highlighting the need for patient involvement in their treatment decisions. Effective communication between healthcare providers and patients about the benefits and risks of available therapies should be encouraged to ensure that patients receive the PC treatment that best meets their needs.
Keywords: Advanced prostate cancer; Androgen deprivation therapy; Discrete choice experiment; Latent class analysis; Patient preference; Prostatic neoplasms.
Prostate cancers often depend on the male sex hormone, testosterone, to grow. Androgen deprivation therapy (ADT) is used to lower testosterone levels in patients with advanced prostate cancer. ADT options available to patients have different characteristics, including how they are taken (injection or pill), side effects, impact on sexual interest, and costs. Researchers wanted to understand which ADT characteristics were most important to groups of patients with similar preferences. To do this, they gave 304 patients a series of paired hypothetical (meaning not real) ADT options with different characteristics and asked them to choose the option from each pair that they preferred. Researchers found that patients could be separated into four different groups based on their preferences for ADT characteristics. One group preferred an ADT that had the least impact on their interest in sex. These patients were mainly younger than 65 years old. A second group preferred a lower-cost ADT. These patients were mainly 65 years or older. A third group preferred a pill that could be taken once a day by mouth. Most of these patients had not taken ADT in the past. A fourth group preferred an ADT that was given in a physician's office as an injection every 6 months. These patients mainly had taken ADT in the past. This study shows that patients have different preferences for ADT treatment characteristics. It is important for doctors to discuss the different ADT options with patients to find the treatment that best meets their needs.
Lilo and Stitch try to help Mrs. Hasagawa out by organizing her fruit stand and cleaning her yard. However, when she tries to offer them a bowl of apricots for their help, they discover that the bowl is actually full of experiment pods, and her house is soon full of stray experiments, which she believes are "cats". Lilo and Stitch try to gather all the experiments, but it turns out that Mrs. Hasagawa has taken a liking to them and tamed them herself, so the experiments won't need to be captured after all.
Current limitations in predicting mRNA translation with deep learning models
Genome Biology volume 25, Article number: 227 (2024)

The design of nucleotide sequences with defined properties is a long-standing problem in bioengineering. An important application is protein expression, be it in the context of research or the production of mRNA vaccines. The rate of protein synthesis depends on the 5′ untranslated region (5′UTR) of the mRNAs, and recently, deep learning models were proposed to predict the translation output of mRNAs from the 5′UTR sequence. At the same time, large data sets of endogenous and reporter mRNA translation have become available. In this study, we use complementary data obtained in two different cell types to assess the accuracy and generality of currently available models for predicting translational output. We find that while performing well on the data sets on which they were trained, deep learning models do not generalize well to other data sets, in particular of endogenous mRNAs, which differ in many properties from reporter constructs.

Conclusions: These differences limit the ability of deep learning models to uncover mechanisms of translation control and to predict the impact of genetic variation. We suggest directions that combine high-throughput measurements and machine learning to unravel mechanisms of translation control and improve construct design.

The translation of most mRNAs into proteins is initiated by the recruitment of the eIF4F complex at the 7-methylguanosine cap, followed by eIF3, the initiator tRNA and the 40S subunit of the ribosome [1]. The 40S subunit scans the mRNA's 5′ untranslated region (5′UTR) until it recognizes a start codon; then, the 60S subunit joins to complete the ribosome assembly and initiate protein synthesis. Initiation is the limiting step of translation, largely determining the rate of protein synthesis [2].
It is influenced by multiple features of the 5′UTR, from the structural accessibility of the cap-proximal region [3], to the strength of the Kozak sequence around the start codon (consensus gccRccAUGG; upper case: highly conserved bases, R = A or G [4]), and the number and properties of upstream open reading frames (uORFs) that can hinder ribosome scanning toward the main ORF (mORF), inhibiting its translation [5, 6, 7, 8]. These (and presumably other) factors lead to initiation rates that differ up to 100-fold between mRNAs [9] and a similarly wide range of protein relative to mRNA abundance [10]. Accurate prediction of protein output from the mRNA sequence is of great interest for protein engineering and increasingly relevant with the rise of RNA-based therapies. This has prompted the development of both experimental methods for the high-throughput measurement of protein output and computational models that can be trained on these data. An important development has been the introduction of ribosome footprinting (also known as ribosome profiling), a technique for capturing and sequencing the footprints of translating ribosomes (RPFs) on individual mRNAs [2]. The ratio of normalized RPFs to RNA-seq reads over the coding region is used as an estimate of "translation efficiency" (TE), which is considered a proxy for the synthesis rate of the encoded protein [2]. Ribosome footprinting has been applied to a variety of cells and organisms [11], yielding new mechanistic and regulatory insights (e.g., [12, 13]). An early study of yeast translation concluded that up to 58% of the variance in TE can be explained with 6 parameters, though the most predictive was the mRNA expression level of the gene, which is not a feature that can be derived from the sequence of the mRNA [6].
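As an illustration of how uORF-related features of this kind can be computed directly from sequence, the following minimal sketch counts upstream AUGs and classifies them by reading frame relative to the main ORF start (the function is our own illustrative construction, not code from the paper):

```python
def uaug_features(utr5):
    """Count upstream AUGs (uAUGs) in a 5'UTR and classify them as
    in-frame (IF) or out-of-frame (OOF) relative to the main ORF,
    which starts immediately after the 5'UTR. A uAUG at position i is
    in frame with the main ORF when (len(utr5) - i) % 3 == 0."""
    utr5 = utr5.upper().replace("U", "T")  # accept RNA or DNA alphabet
    in_frame = out_of_frame = 0
    for i in range(len(utr5) - 2):
        if utr5[i:i + 3] == "ATG":
            if (len(utr5) - i) % 3 == 0:
                in_frame += 1
            else:
                out_of_frame += 1
    return {"IF_uAUG": in_frame, "OOF_uAUG": out_of_frame}
```

Counts like these are exactly the kind of hand-crafted features used by the linear baseline model discussed later in the text.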
At the same time, massively parallel reporter assays (MPRA) were developed to measure translation for large libraries of reporter constructs, which were then used to train deep learning (DL) models. A convolutional neural network (CNN) [14] explained 93% of the variance in the mean ribosome load (MRL) of reporter constructs, but only 81% for 5′UTR fragments taken from endogenous mRNAs. The CNN also recovered some of the important regulatory elements, such as uORFs [14]. More recently, a novel experimental design was used to accurately measure the output of yeast reporters driven by natural 5′UTRs [15], while novel DL architectures and training approaches aimed to improve prediction accuracy [16, 17]. A potential limitation of DL models built from synthetic sequences is that it is a priori unclear whether the training set contains the regulatory elements that are relevant in vivo and whether the features extracted by the model generalize well across systems, such as cell types and readouts of the process of interest. These bottlenecks may limit not only the understanding of regulatory mechanisms but also the use of the derived models to predict the functional impact of sequence variants and to design constructs. To assess whether these issues affect current RNA sequence-based models of translation, we carried out a detailed comparison of model performance in a standardized setting that uses complementary data sets obtained in two distinct cell types. We trained and applied models to the prediction of translation output in yeast and human cells, addressing the following questions: (1) are models trained on synthetic sequences able to predict the translation output of endogenous mRNAs in the same cellular system? (2) do these models generalize between different cellular systems (different cell types, different species)? (3) what is their parameter-efficiency (fraction of explained variance per model parameter)?
(4) what are the conserved regulatory elements of translation that have so far been learned by DL models?

Experimental measurements of translation output

The current method of choice for measuring the translation output of endogenous mRNAs is ribosome footprinting, consisting of the purification and sequencing of mRNA fragments that are protected from RNase digestion by translating ribosomes [2]. The TE of an mRNA is then estimated as the ratio of ribosome-protected fragments (RPFs) obtained from the mRNA by ribosome footprinting to coding-region-mapped reads obtained by RNA-seq from the same sample [18]. Ribosome footprinting has been applied to many systems, including yeast cells [6, 19] and the human embryonic kidney cell line HEK 293 [20], for which a variety of omics measurements are available. Importantly, MPRA of translation were carried out in these cell types, giving us the opportunity to determine whether reporter-based models can predict translation of endogenous mRNAs in a given cell type. Figure 1 summarizes the main approaches used to measure translation in yeast and human cells, starting from the just-described ribosome footprinting technique (Fig. 1 A). The MPRA approach used by [14] to generate the Optimus50/100 MPRA data sets (Fig. 1 B) consists of transfecting in vitro-transcribed mRNAs with randomized 5′UTRs upstream of the eGFP coding region into HEK 293 cells, followed by sequencing of RNAs from polysome fractions. The MRL, i.e., the average number of ribosomes on individual mRNAs, is derived from the abundance profiles of individual mRNAs along polysome fractions. In another approach, called DART (direct analysis of ribosome targeting), Niederer and colleagues [15] synthesized in vitro translation-competent mRNAs consisting of natural 5′UTRs and a very short (24 nucleotides (nts)) coding sequence. A few mutations were introduced in the 5′UTRs, as necessary to unambiguously define the translation start.
After incubation with yeast extract, ribosome-associated mRNAs were isolated and sequenced, and a ribosome recruitment score (RRS) of an mRNA was calculated as the ratio of its abundance in the ribosome-bound fraction relative to the input mRNA pool (Fig. 1 C). A previously developed MPRA used plasmids containing randomized 5′UTRs placed upstream of the HIS3 gene to transform yeast cells lacking a native copy of the gene [21]. The amount of HIS3 protein generated from the reporters was assessed in a competitive growth assay, by culturing the yeast cells in media lacking histidine and using the enrichment of reporter constructs in the output vs. the input culture as a measure of HIS3 expression (Fig. 1 D).

Fig. 1: Experimental approaches to quantifying translation output. A Sequencing of total mRNA and ribosome-protected fragments of endogenous mRNAs is used to estimate the translation efficiency per mRNA. B Massively parallel reporter assays (MPRA) measure the output of constructs consisting of randomized 5′UTRs attached to the coding region of a reporter protein. Sequencing of polysome fractions enables the calculation of a mean ribosome load per construct, which is used as a measure of translation output. C DART follows a similar approach with endogenous 5′UTRs, once upstream AUGs (uAUGs) located in the 5′UTR are mutated to AGU to avoid ambiguity in translation start sites. D In an alternative MPRA in yeast, the enrichment of 5′UTRs driving expression of a protein required for growth served as a proxy for the translation output of the respective constructs. More details can be found in the Methods - Experimental methods section.

The reproducibility of experimental measurements sets an upper bound on the accuracy with which models can predict the different types of data. For MPRA data sets, the \(R^2\) of replicate measurements is usually very high, with values of 0.95 generally reported [14].
In contrast, the reproducibility of TE, which is a ratio of two variables (ribosome footprints and RNA-seq fragments mapping to a given mRNA), is generally lower. In the HEK 293 ribo-seq data set that we analyzed [20], the \(R^2\) for RPFs was in the range \(0.77-0.82\), while for mRNA-seq it was 0.96, leading to \(R^2\) of TE estimates of \(0.47-0.52\) (Additional file 1: Fig. S1). We further obtained an additional ribo-seq data set from another human cell line, HepG2, with the aim of exploring the limits of replicate reproducibility for this type of measurement and evaluating the conservation of TE between cell types, which is also important when applying a model trained in a particular cell type to predict data from another cell type (cf. Additional file 1: Fig. S2). The TE estimates from HepG2 cells were more reproducible, with \(R^2\) for replicates of \(0.68-0.8\). When comparing the TE estimates from HEK 293 and HepG2 cells, we obtained an \(R^2 = 0.31\), which would be an upper bound on the accuracy of a model trained on one of these data sets in predicting the TEs in the other cell line. The general reproducibility of translational efficiency, as well as coverage in RNA sequencing and ribosome footprinting, in yeast (data from [19]) appears to be of similar quality to the HepG2 data, as can be seen from Additional file 1: Fig. S3. To ensure comparability of our results with those of previous studies, we aimed to replicate their model training strategy, which generally involved setting aside the highest quality data (constructs with the largest number of mapped reads) for testing and using the rest for training [14, 16]. High expression is not the only determinant of measurement accuracy for endogenous mRNA data sets. For example, in yeast, the integrity of the sequenced RNAs was previously identified as a key source of noise for TE estimates [6].
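The TE estimate and its replicate reproducibility can be sketched in a few lines. This is a minimal illustration with hypothetical counts and simple counts-per-million normalization, not the authors' exact pipeline:

```python
import numpy as np

def translation_efficiency(rpf_counts, rna_counts, pseudocount=0.5):
    """log2 translation efficiency (TE) per transcript: library-size-
    normalized ribosome footprint (RPF) counts divided by the matching
    RNA-seq counts over the coding region."""
    rpf = np.asarray(rpf_counts, dtype=float) + pseudocount
    rna = np.asarray(rna_counts, dtype=float) + pseudocount
    return np.log2((rpf / rpf.sum()) / (rna / rna.sum()))

def replicate_r2(x, y):
    """Squared Pearson correlation between two replicate measurements."""
    r = np.corrcoef(x, y)[0, 1]
    return r * r

# Hypothetical counts for three transcripts in two replicates:
te1 = translation_efficiency([100, 10, 50], [50, 40, 50])
te2 = translation_efficiency([90, 12, 55], [48, 42, 52])
r2 = replicate_r2(te1, te2)
```

Because TE divides one noisy count by another, its replicate \(R^2\) is typically lower than that of either the RPF or the RNA-seq measurement alone, which is the effect described above.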
A proxy for RNA integrity is the transcript integrity score (TIN) [22], which quantifies the uniformity of coverage of the RNA by sequenced reads and ranges from 0 (3′ bias) to 100 (perfectly uniform coverage). As the TE reproducibility in HEK 293 cells increased with the TIN score (\(R^2 = 0.67-0.75\) for TIN \(> 70\) vs. \(0.47-0.52\) for all), for the human endogenous data we used the mRNAs with TIN \(>70\) (\(\approx\) 10% of mRNAs) for testing and all the others for training. In yeast, however, the reproducibility of TE does not depend on TIN (Additional file 1: Fig. S3), indicating that RNA degradation is much less dominant there than in human cells. As can be seen from Additional file 1: Fig. S4, selection of the test data set based on the TIN score does not introduce any bias in 5′UTR length, main ORF length, or TE in HEK 293 cells. The situation for yeast, however, is slightly different: transcripts with a higher TIN score have higher TE. Also, in the yeast data, the increase in inter-replicate reproducibility of TE when selecting the transcripts with TIN \(>70\) is negligible. Therefore, we used three different random splits for the endogenous yeast data set and report the average performance.

Models for predicting translation output from the mRNA sequence

To explain the translation efficiency estimated by ribosome footprinting in yeast, Weinberg et al. [6] proposed a simple, 6-parameter linear model with the following features: lengths of the CDS and 5′UTR, G/C content of the 5′UTR, number of uAUGs, free energy of folding of the 5′ cap-proximal region, and the mRNA abundance. This linear model was surprisingly accurate (\(R^2 = 0.58\)) in predicting the efficiency, though leaving out the mRNA level reduced the \(R^2\) to 0.39. Here, we use a similar model as a baseline to assess the parameter efficiency of DL models, i.e., the fraction of explained variance per model parameter.
The features of our linear model are as follows: we use the same length and G/C content measures, the 5′UTR folding free energy divided by the 5′UTR length, the number of out-of-frame upstream AUGs (OOF uAUGs), the number of in-frame upstream AUGs (IF uAUGs), and the number of exons in the mRNA [23]. A bias term adds an additional parameter. The first type of DL architecture trained on MPRA data was the Optimus 5-Prime CNN [14], operating on one-hot-encoded 5′UTR sequences. Optimus 5-Prime has 3 convolutional layers, each with 120 filters of 8 nucleotides (nts), followed by two dense layers separated by a dropout layer. The output of the last layer is the predicted translation output, i.e., the MRL for the HEK 293 cell line data set (Fig. 1 B, Fig. 2 A) and the relative growth rate for yeast cells [21] (Fig. 1 D). While reaching very good performance in predicting the MRL for synthetic sequences, Optimus 5-Prime could only make predictions for 5′UTRs of up to 100 nts, which account for only \(\sim\) 32% of annotated human 5′UTRs. Longer 5′UTRs could be accommodated by truncation to the upstream vicinity of the start codon. The Optimus 5-Prime model just described has 474,681 parameters.

Fig. 2: Architectures of different artificial neural networks used to predict the output of translation. Optimus 5-Prime [14] uses three convolutional layers and two dense layers (A); Framepool (B) is similar, but with a customized frame-wise pooling operation between the convolutional and dense layers [16]; and MTtrans (C) stacks a "task-specific" tower of two recurrent and one dense layers on top of a "shared encoder" of four convolutional layers. D represents an approach relying entirely on recurrent layers; it is built from two bidirectional LSTM layers, followed by a dropout and a fully connected layer.
TranslateLSTM (E) consists of three sub-networks: a two-layer bidirectional LSTM network for the 5′UTRs, another two-layer bidirectional LSTM network for the first 100 nts of the CDS, and non-sequential input features previously found to control translation. For further information, we refer to the Methods - Model architectures section.

The 5′UTR length limitation was overcome by Framepool [16], another CNN, containing 3 convolutional layers with 128 filters of 7 nts (Fig. 2 B). Importantly, Framepool slices the output of the third convolutional layer according to the frame relative to the start codon, then pools the data frame-wise, taking the maximum and average values for each pool. This allows both the processing of sequences of arbitrary length and the detection of motifs in specific reading frames. The frame-wise results are the input for the final two dense layers. For variable-length 5′UTRs, Framepool was shown to yield somewhat improved predictions relative to Optimus 5-Prime [16], with a smaller number of parameters, 282,625. MTtrans [17] is the most recently proposed DL model (Fig. 2 C). Its basic premise is that the elements controlling translation generalize across data sets generated with different experimental techniques. Each data set is viewed as a task. The model combines 4 convolutional layers with batch normalization, regularization, and dropout, with two bidirectional gated recurrent unit (GRU) layers and a final dense layer. GRUs are recurrent layers that process the input sequence token by token and map it to an internal space that allows contextual information to be preserved. Inputs of different lengths can naturally be processed by this architecture. The 4 convolutional layers, which the authors called the "shared encoder," are assumed to be universal among different prediction tasks, while the recurrent and dense layers (the "task-specific tower") are specific to each task.
The shared encoder is therefore trained on multiple data sets, while the task-specific tower is trained only on the respective task. In comparison to Optimus 5-Prime and Framepool, MTtrans provides an increase in prediction accuracy of \(0.015-0.06\) in \(R^2\), depending on the data set [17]. Interestingly, training MTtrans on multiple data sets at once, rather than in a sequential, task-specific manner, achieved almost the same effect. While we were able to obtain the code for MTtrans from [24], we were unable to run the code "out-of-the-box." Therefore, we set up MTtrans as described in the conference publication [17], though this left many details unclear, e.g., the exact layout of the "task-specific tower," its recurrent and dense layers, the number of training epochs, the exact training rate schedule, and the criteria for early stopping. It also led to a different number of parameters in our implementation, 776,097, compared to the \(\sim 2.1\) million reported by the authors. Consequently, we trained with a callback that automatically stops once overfitting is reached and restores the best weights. Although, in our experience, these details have only a minor impact on model performance, we note that our results differ to some extent from those reported in [17]. The use of GRUs in the task-specific tower allows MTtrans to predict output for any 5′UTR length. While DL models become increasingly parameter-rich, their performance improves only marginally, leading to a decrease in the gained accuracy per parameter. We were therefore interested in whether the parameter-efficiency of DL models can be improved, i.e., whether top performance can be achieved with smaller rather than larger models. To address this, we turned to long short-term memory networks (LSTMs), a variety of recurrent neural networks (RNNs) designed to detect and take advantage of long-range dependencies in sequences [25].
While such dependencies are expected in structured 5′UTRs, LSTMs have not yet been applied to the prediction of translation output. We therefore implemented two LSTM-based architectures here: one operating only on 5′UTR sequences, and a second one, TranslateLSTM, operating not only on 5′UTRs but also on the first 100 nts of the associated coding regions and on the non-sequential features of the linear model described above. The extended TranslateLSTM allows factors such as the secondary structure and codon bias in the vicinity of the start codon [5] to impact the translation output. One-hot-encoded sequences are fed into two bidirectional LSTM layers; the outputs of the second layers are concatenated and sent to a dense layer which predicts the output (Fig. 2 D). TranslateLSTM has 268,549-268,552 parameters, while the 5′UTR-only LSTM model has 134,273 parameters. We further note that, depending on the experimental design, not all data sets to which a given model is applied require the same number of parameters. For instance, a data set in which all sequences have the same length, like Optimus50, does not require the sequence length as a parameter in TranslateLSTM or the linear model. Similarly, as the first 100 nts of the CDS are the same in all MPRA data sets, the associated parameters are not needed in TranslateLSTM, which reduces the number of parameters to about 50% relative to the full model.

Available DL models do not generalize well across experimental systems

The results of our comprehensive tests of prediction accuracy of all models across multiple data sets are summarized in Fig. 3 A. The most salient result is that differences in performance between DL models applied to a particular data set are much smaller than differences between applications of the same model to distinct data sets.
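Among the architectural ideas described above, Framepool's frame-wise pooling is the one that most directly encodes translation-specific structure. A small NumPy sketch of the idea follows; this is our own illustrative reconstruction (the convention of counting frames from the start codon at the 3′ end of the 5′UTR is an assumption), not the published implementation:

```python
import numpy as np

def frame_pool(conv_out):
    """Frame-wise pooling in the spirit of Framepool: group positions by
    reading frame relative to the start codon (assumed to sit just after
    the last 5'UTR position) and take per-frame max and mean over each
    channel. conv_out: convolutional feature map of shape (length, channels).
    Returns a fixed-size vector of shape (6 * channels,) for any length."""
    length = conv_out.shape[0]
    pooled = []
    for frame in range(3):
        # positions whose distance to the start codon is ≡ frame (mod 3)
        idx = [i for i in range(length) if (length - i) % 3 == frame]
        frame_slice = conv_out[idx]
        pooled.append(frame_slice.max(axis=0))   # frame-wise max pool
        pooled.append(frame_slice.mean(axis=0))  # frame-wise average pool
    return np.concatenate(pooled)
```

Because the output size depends only on the number of channels, the downstream dense layers can accept 5′UTRs of arbitrary length, which is the property the text attributes to Framepool.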
In particular, DL models can be trained on synthetic constructs to predict the output of held-out constructs, but they cannot be trained well on TE data to predict the translation of endogenous mRNAs (compare lines 1, 2, 3 and 4, 5 in Fig. 3 A). To make sure the test data selection strategy is not the issue, we also tested three different random splits and stratified splits (enforcing similar distributions in test and train sets) for TIN and TE as selection strategies for the test data in HEK 293 cells for Optimus 5-Prime and TranslateLSTM. They showed comparable but mildly worse performance (difference in \(R^2\) around 0.02). The stratified split along the TE axis performed worst, whereas the random split and the stratified split along the TIN axis performed with \(R^2\) values only 0.01 smaller than test data selection based on the TIN, where only the transcripts with the highest TIN scores were used for testing. Figure 3 B shows scatter plots of the ribosome load predicted by each of the discussed DL architectures against their measured counterparts. It can be clearly seen that OOF uAUGs strongly inhibit translation. Moreover, the TranslateLSTM predictions are most uniformly spread around the diagonal, as measured by the sum of the differences between predicted and measured MRL for every model, where we find \(-360\) (TranslateLSTM) vs. \(-652\) (Optimus 5-Prime) vs. \(-762\) (Framepool) vs. \(-461\) (MTtrans). The size of the training data set is not strictly a limiting factor, because DL models can be trained to some extent on the relatively small DART data set of \(\sim\) 7000 natural yeast 5′UTRs (Fig. 3 A, line 6). Furthermore, models trained on synthetic 5′UTRs do not predict the TE of endogenous mRNAs measured in the same cell type (Fig. 3 A, lines 7, 8). This reduced performance was previously attributed to the different experimental readout and to the influence of the coding sequence, which is the same in MPRA but different in TE assays [16].
To test this, we applied the models trained on human MPRA data to the prediction of MPRA data from yeast and vice versa. This involved not only a very different readout of translation (MRL in human, growth-dependent enrichment of 5′UTRs in yeast) but also entirely different organisms. In both cases, the cross-system performance was substantially higher, \(R^2 = 0.41\) and 0.64 (cf. Fig. 3 A, lines 9, 10), compared to the performance of the model trained on synthetic data in predicting the TE data in the same cell type. Thus, the type of readout is not the main factor behind the reduced predictability of TE data. Another limiting factor, not discussed before, could be the accuracy of the experimental measurements. MPRA- and DART-based measurements are very reproducible, with \(R^2 \approx 0.95\), while the TE estimates are much less so (\(R^2 \approx 0.5\) for the HEK 293 data set, Additional file 1: Fig. S1). Thus, the TE data may be less predictable because it is also more noisy. However, measurement accuracy is not a factor in the highly reproducible DART experiments, yet models trained on synthetic construct data from yeast could not predict the RRS measured in the DART experiment, also done in yeast. Altogether, these results indicate that synthetic sequences differ substantially from natural, evolved 5′UTRs, leading to models trained on synthetic data being unable to adequately capture 5′UTR features that are relevant for the translation of endogenous mRNAs. We also applied a transfer-learning strategy to the human HEK 293 data, where we first trained the models on the Optimus100 data set, then re-trained the last layer on endogenous data, and finally trained the entire network on the endogenous data for a few epochs. For the models that did not specify a certain number of training epochs, training was terminated automatically by a callback function with a patience of 10 epochs.
Typically, this led to \(\sim 30\) epochs of pre-training, \(\sim 50\) epochs of re-training the last layer, and \(\sim 15\) epochs of fine-tuning the entire network. The results are displayed in Fig. 3 A, l. 13. Applying transfer learning indeed led to a small performance increase of 0.04 in \(R^2\) .

Fig. 3 A Performance of all evaluated models in different application scenarios, measured by the squared Pearson correlation coefficient \(R^2_{\text {Pearson}}\) between experimentally measured and predicted translation output in the test data. Different random splits of the DART data lead to variations in \(R^2\) of \(\lesssim 0.02\) , with the exception of the Framepool model, which had differences of up to 0.16 between splits. The average correlation of TE between different replicates for the endogenous HEK 293 and yeast data sets serves as a theoretical upper bound on the predictive power of the models, imposed by measurement reproducibility. Values for the DART and Optimus50 data sets were taken from the corresponding publications [ 14 , 15 ]. B Correlation of predicted and true ribosome load for four model architectures trained on the Optimus100 data set. OOF uAUGs clearly inhibit translation initiation. TranslateLSTM predicts the most even scattering pattern around the diagonal, as measured by the sum of the differences between predicted and measured ribosome load over all transcripts in the test set.

As seen above, the yeast-human cross-species prediction accuracy is substantial, indicating that the translation regulatory elements inferred from synthetic constructs in the two species are partially conserved. Given that the cell type in which a model is developed will generally differ from the cell type where model predictions are of interest, we asked whether the TEs of human mRNAs are largely similar across human cell lines.
We thus generated ribosome footprinting data from the human HepG2 liver cancer cell line and compared the TEs inferred from this cell line with those from the HEK 293 cell line data set. The TE estimates were more reproducible than those from the HEK 293 cells ( \(R^2 = 0.68\)–0.8 for replicate experiments). The TEs estimated from HEK 293 cells were moderately similar to those from HepG2 cells ( \(R^2 = 0.31\) ), especially when considering mRNAs with TIN \(> 70\) ( \(R^2 = 0.44\) ). This indicates that, within an organism, transcript-intrinsic properties contribute substantially to the variation in translation output relative to the cellular context. This is a good basis for developing models in model systems, provided that the protocol allows for highly accurate measurements of translation output (Additional file 1: Fig. S2). Although DL models are not generally benchmarked against simple models with more limited predictive power, such a test provides an assessment of parameter efficiency (gain in predictive power per parameter) as well as insights into model interpretation. Trained on synthetic construct data, the 8-parameter linear model described above could explain as much as 60% of the variance in the respective test sets, which is quite remarkable given the model's small size. In addition, this model could also be trained to some extent on TE measurements of endogenous mRNAs. Strikingly, the accuracy of cross-system predictions of synthetic construct-based DL models is similar to the accuracy of linear models inferred from the respective data sets. This indicates that the conserved mechanisms of translation control learned by the DL architectures from synthetic sequences are represented in a small number of features and that currently available DL architectures are heavily over-parameterized.
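A sketch of such an 8-parameter linear model (7 feature weights plus an intercept) using NumPy's least squares; the feature names follow the data-set descriptions later in the paper, and the toy data below stand in for real measurements:

```python
import numpy as np

# Feature order is our own choice; the set follows the non-sequential
# features described in the data-set sections of the paper.
FEATURES = ["log_orf_len", "utr_len", "gc_fraction", "n_exons",
            "n_oof_uaug", "n_if_uaug", "norm_fold_energy"]

def fit_linear_model(X, y):
    """Ordinary least squares with an intercept column:
    7 feature weights + 1 intercept = 8 parameters."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef  # coef[0] = intercept, coef[1:] = feature weights

def predict(coef, X):
    """Apply the fitted 8-parameter model to a feature matrix."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef
```

On noiseless toy data the fit recovers the generating coefficients exactly, which is a convenient sanity check of the implementation.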
Reporter sequences differ in translation-relevant properties from endogenous mRNAs

To further identify the most predictive and conserved features, we inspected the weights learned by the linear model from individual data sets (Additional file 1: Fig. S5A). We found that only the uAUGs, especially those located out-of-frame (OOF) with respect to the mORF, consistently contributed to the prediction of translation output across all systems. OOF uORFs/uAUGs are known to repress translation by hindering ribosome scanning towards the mORF [ 26 ] and by triggering nonsense-mediated mRNA decay [ 27 ]. uAUGs contribute much more to the translation output of human or yeast reporter constructs than to that of endogenous mRNAs, reflecting differences in sequence composition between synthetic and natural 5′UTRs (Additional file 1: Fig. S6). To gain further insight into the sequence features learned by the LSTMs, we visualized the contributions (Shapley values) of single nucleotides in test sequences to the output of the LSTM architecture using the SHAP package [ 28 ]. While the inhibitory effect of a uAUG is evident for representative sequences from the Optimus50 data set (Additional file 1: Fig. S5C), this is not the case for sequences from the HEK 293 data, where the individual nucleotides composing the AUG codon may even have contributions of opposite signs (Additional file 1: Fig. S5D). A superposition of 200 high-TIN sequences from the HEK 293 data set in Additional file 1: Fig. S5E shows position-dependent nucleotide biases that contribute to the translation output of endogenous sequences (with the caveat of the small predictive power in this setting). Specifically, C nucleotides contribute positively when located upstream of and close to the start codon, while G nucleotides contribute negatively, especially when located at the 5′ end, downstream of the cap.
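Such a superposition boils down to averaging per-position attribution scores by nucleotide identity; a sketch with NumPy, in which the attribution arrays (one score per position per sequence) are assumed to come from an explainer such as the SHAP package (function name is ours):

```python
import numpy as np

NUCLEOTIDES = "ACGT"

def positionwise_nucleotide_bias(seqs, attributions):
    """Superpose per-sequence attribution scores over equal-length sequences:
    for every position and nucleotide, average the attribution of that
    nucleotide whenever it occurs there. Returns an array of shape
    (4, seq_len), rows indexed by A, C, G, T; unseen combinations stay 0."""
    seq_len = len(seqs[0])
    totals = np.zeros((4, seq_len))
    counts = np.zeros((4, seq_len))
    for seq, attr in zip(seqs, attributions):
        for pos, nt in enumerate(seq.upper()):
            idx = NUCLEOTIDES.index(nt)
            totals[idx, pos] += attr[pos]
            counts[idx, pos] += 1
    # Divide by at least 1 so positions where a nucleotide never occurs
    # yield 0 rather than a division error.
    return totals / np.maximum(counts, 1.0)
```

A positive entry for, say, C near the 3′ end of the matrix would correspond to the positive contribution of C nucleotides close to the start codon described above.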
Thus, the test examples, the weights of the linear model, and the visualization of the effects of individual nucleotides on the LSTM predictions all suggest that models trained on synthetic sequences will incorrectly weigh the translation-relevant features they learned from these sequences when predicting the output of natural 5′UTRs, leading to reduced prediction accuracy. To illustrate this, we carried out a simulation using the Optimus50 data set: we set aside the 20,000 constructs with the highest coverage in mRNA sequencing for testing, as before, but trained the Optimus 5-Prime model only on the subset of remaining constructs that did not contain uAUGs. As shown in Additional file 1: Fig. S5B, the resulting model performs poorly on the test set, specifically on the subset of test sequences that do contain uAUGs. Conversely, the model trained on the entire spectrum of sequences, which could in principle learn all regulatory elements of translation, does not predict the translation output of the DART data set of natural yeast 5′UTRs lacking uAUGs (Fig. 3 A, l. 11). These results demonstrate that the similarity of the distributions of translation-relevant features between training and test sets is key to the ability of a DL model to generalize. Having undergone extensive selection under a variety of constraints, endogenous 5′UTRs have likely accumulated multiple elements that control their translation, elements that are probably not represented among synthetic 5′UTRs. This leads to large differences in performance when models trained on synthetic data are applied to other data sets. Previous studies reached different conclusions concerning the impact of IF uAUGs on translation [ 15 , 21 , 29 , 30 ]. To clarify this, we determined the relationship between the location of OOF and IF uAUGs in the 5′UTR and the translation output of the mRNAs, in both yeast and human sequences, synthetic or endogenous.
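The uAUG-exclusion simulation described above amounts to a filter over the training pool before fitting; a minimal sketch, assuming DNA-alphabet 5′UTR strings so that an uAUG is any `ATG` substring:

```python
def has_uaug(utr):
    """True if the 5'UTR contains any upstream AUG (DNA alphabet: ATG)."""
    return "ATG" in utr.upper()

def uaug_free_training_pool(train_seqs):
    """Mimic the simulation: keep only uAUG-free constructs for training,
    so the model never sees the repressive element that the uAUG-containing
    test sequences carry."""
    return [s for s in train_seqs if not has_uaug(s)]
```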
To avoid a superposition of effects from multiple uAUGs, we analyzed only constructs with a single uAUG in the 5′UTR. As shown in Additional file 1: Fig. S7A-F, the repressive effect of IF uAUGs increases with their distance from the mORF, while the repressive effect of OOF uAUGs on the translation of synthetic constructs depends only weakly on their position. The data for endogenous mRNAs were too noisy to verify or falsify the trend observed in synthetic data (Additional file 1: Fig. S7E, F). These results indicate that both the frame and the distance of uAUGs with respect to the mORF should be taken into account when predicting their impact on translation.

A more accurate and parameter-efficient DL model to predict the impact of 5′UTR sequence variation on translation

To provide a more accurate model of endogenous mRNA translation, accommodating different constraints on uAUGs and improving parameter efficiency, we turned to LSTM-based architectures. The two architectures that we implemented, LSTM and TranslateLSTM (see Fig. 2 ), performed similarly on the synthetic data sets and were more accurate than the other DL models tested. The largest performance gain was reached for RNAs with IF uAUGs, as may be expected from the model’s treatment of sequence context (Additional file 1: Fig. S8). The similar performance of LSTM and TranslateLSTM on synthetic data indicates that the LSTM can learn correlates of the non-sequential features represented in TranslateLSTM. However, these features were important for the performance of TranslateLSTM on the endogenous HEK 293 TE data (Fig. 3 A and Additional file 1: Fig. S5A). To demonstrate the relevance of DL models for interpreting the functional significance of single nucleotide polymorphisms (SNPs), Sample et al. [ 14 ] measured the MRL of constructs with 50-nt-long fragments of natural 5′UTRs as well as of variants with naturally occurring SNPs.
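The frame and distance conventions for a single uAUG used in the analysis above can be made explicit in code; a sketch assuming the mORF start codon begins immediately after the 5′UTR (DNA alphabet, function name is ours):

```python
def uaug_frame_and_distance(utr):
    """For a 5'UTR with exactly one uAUG, return ('IF' or 'OOF', distance),
    where distance is the number of nucleotides from the uAUG's A to the
    first nucleotide of the mORF start codon (assumed to sit right after
    the UTR). The uAUG is in frame when that distance is a multiple of 3."""
    utr = utr.upper()
    positions = [i for i in range(len(utr) - 2) if utr[i:i + 3] == "ATG"]
    if len(positions) != 1:
        raise ValueError("expected exactly one uAUG")
    distance = len(utr) - positions[0]
    frame = "IF" if distance % 3 == 0 else "OOF"
    return frame, distance
```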
TranslateLSTM predicted the measured MRLs of these variant constructs better than the Optimus 5-Prime model (Fig. 4 A, B). However, in this experiment, 5′UTR sequences were taken out of their endogenous context, which, as we have shown above, is important for the prediction of translation output and thereby of functional impact. Therefore, we sought to improve the prediction of SNP effects on translation by taking advantage of the insights provided by our analyses. We used transfer learning (TL) to extract information from both synthetic and endogenous 5′UTRs, and we applied the resulting model to all the 5′UTR-located SNPs from the ClinVar database [ 31 , 32 ], in their native 5′UTR context. 84,128 of the 2,300,005 SNPs were located in 5′UTRs, and of these, 7238 were located in mRNA isoforms (one per gene) expressed and with measured TE in HEK 293 cells. As shown in Fig. 3 A, l. 13, the TL strategy leads to better predictions than training on endogenous data alone and also than the predictions of other DL models trained by TL. The distribution of the log-ratio of predicted translation output between variant and wildtype sequences is shown in Fig. 4 C. One hundred ten of the 7238 variants are predicted to affect the TE by 10-fold or more, 34 increasing and 76 decreasing the TE compared to the wildtype sequence. Interestingly, despite the large predicted impact, none of these 110 SNPs creates or destroys an uAUG. However, overall, while the absolute number of uAUG changes is small (328 of 7238 variants), creation/destruction of an uAUG was associated with a predicted reduction/increase of translation output. Moreover, the pathogenic variants showed a small bias towards increased TE (Fig. 4 D).

Fig. 4 Effect of 5′UTR sequence variation on mRNA translation output. A Optimus 5-Prime was trained on a pool of randomized 50-nt-long sequences and applied to a pool of equally long known variants (see the “ Human genetic variants ” section). Yellow points indicate 5′UTRs with OOF uAUGs; purple points 5′UTRs without OOF uAUGs.
The same was done for the TranslateLSTM architecture in panel ( B ). C TranslateLSTM was used to predict the TE of known clinical variants of endogenous sequences from the ClinVar database [ 31 ], which were compared to the measured TEs of their wildtype counterparts (see the “ ClinVar data ” section) to obtain predicted log-fold changes of the translation efficiency (TE LFC). These follow a normal distribution, where a negative TE LFC can be associated with a propensity of the variant to create uAUGs (orange fraction of bars), while a positive TE LFC is associated with a propensity to break uAUGs (green fraction of bars). D Clinical variants annotated as pathogenic (clinical significance annotation: pathogenic, likely pathogenic, risk factor) are predicted to significantly increase the TE compared to variants with neutral phenotype annotation (clinical significance annotation: other, uncertain), whereas variants with benign phenotypes (clinical significance annotation: benign, likely benign, protective, drug response) do not significantly alter the distribution, as demonstrated by Kolmogorov-Smirnov tests.

The wider dynamic range of protein compared to mRNA expression suggested an important role of translation control in determining protein levels [ 10 ]. Initiation is the limiting step of translation [ 2 , 6 , 7 ], modulated by a variety of regulatory elements in the 5′UTRs [ 6 , 33 ], from uORFs to internal ribosome entry sites [ 34 , 35 , 36 ]. With the rise of mRNA-based therapies, interest in designing 5′UTRs that drive specific levels of protein expression has surged [ 14 ], prompting the development of DL models to predict the translation output of mRNAs from the 5′UTR sequence. To satisfy the large data needs of these models, a few groups have devised innovative approaches to measure the translation output of large numbers of constructs, containing either random 5′UTRs or fragments of endogenous sequences [ 14 , 15 , 21 ].
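The per-variant summary behind such a TE LFC histogram, a log-fold change together with a check for created or destroyed uAUGs, can be sketched as follows (helper names are ours; in the actual analysis the variant TE would come from the TranslateLSTM prediction rather than being passed in directly):

```python
import math

def count_uaugs(utr):
    """Count upstream AUGs (DNA alphabet: ATG) in a 5'UTR string."""
    u = utr.upper()
    return sum(1 for i in range(len(u) - 2) if u[i:i + 3] == "ATG")

def classify_variant(wt_utr, var_utr, te_wt, te_var):
    """Summarize a 5'UTR variant: log2 fold change of TE and whether the
    variant creates or destroys an uAUG relative to the wildtype."""
    delta = count_uaugs(var_utr) - count_uaugs(wt_utr)
    lfc = math.log2(te_var / te_wt)
    if delta > 0:
        effect = "creates uAUG"
    elif delta < 0:
        effect = "destroys uAUG"
    else:
        effect = "no uAUG change"
    return lfc, effect
```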
DL models trained on these data achieve impressive prediction performance on leave-out data and are used to identify sequence elements that modulate translation, the most predictive among these being uAUGs/uORFs. However, DL models trained on synthetic data do not predict the translation output of endogenous mRNAs well. In this study, we carried out an extensive comparison of models of translation across multiple data sets and settings, to understand the limits of their applicability and generality. We took advantage of two systems in which the translation output has been measured for both synthetic and endogenous 5′UTRs, namely yeast [ 6 , 21 ] and HEK 293 cells [ 14 , 20 ]. For yeast, an additional library of \(\sim 12,000\) endogenous 5′UTRs devoid of uAUGs was tested for the ability to recruit ribosomes [ 15 ]. We observed the best performance in the yeast-human cross-prediction of the translation output of synthetic constructs, even though the readouts of the assays were very different for the two organisms. This prediction relies on a small number of conserved determinants of translation output, in particular uAUGs, as underscored by the similar performance achieved with an 8-parameter linear model trained on the same data sets. However, models trained on synthetic constructs do not predict the translation output of endogenous mRNAs. Neither the coding region nor trans-acting factors explain this discrepancy, as demonstrated with the various yeast data sets, where these factors were controlled. Rather, endogenous sequences have been selected in evolution under multiple constraints, not limited to translation output, and have acquired a variety of regulatory elements that are not well represented in the randomized 5′UTRs; models trained on synthetic data therefore have no opportunity to learn such features.
We could most clearly demonstrate this with a simulation, in which a model trained on sequences lacking uAUGs performed poorly on a data set in which these elements are represented. While in this case the outcome may seem obvious, as uAUGs are important modulators of translation output, there are likely many other elements that are not well represented among synthetic sequences yet affect the translation output in various ways, e.g., by influencing mRNA stability. All of these factors ultimately contribute to the poor performance of models trained on synthetic 5′UTRs in predicting the translation output of endogenous mRNAs. The same issues likely compound the prediction of SNP effects. As the genetic variation of human populations is being mapped, DL models are increasingly used to predict various molecular phenotypes, including the translation output of mRNAs [ 14 , 16 ]. Genetic variation manifests in the native gene expression context, implying that predictions of models trained on synthetic sequences will not be reliable. Given that with TranslateLSTM we were able to explain more of the variance in TE than other DL models, we also sought to provide updated predictions of the potential impact of ClinVar variants on TE [ 31 ]. Surprisingly, variants classified as pathogenic are predicted to more often increase than decrease the TE of the respective mRNA, i.e., they tend to be gain-of-function variants. Interestingly, the increase is not generally explained by the removal of a repressive uAUG, as relatively few SNPs changed the number of uAUGs in the 5′UTRs. The 1000 SNPs predicted to most increase the TE came from genes involved in protein and organelle localization (Additional file 1: Tab. S1), predictions that could be tested in a future study.
That \(\sim\) 60% of the variance in MPRA data can be explained with models constructed from species as distant as yeast and human indicates that the models have learned deeply conserved mechanisms of controlling the translation output. That the simple 8-parameter linear model performs almost on par with the DL models in this setting indicates not only that these mechanisms are reflected in a small number of mRNA features but also that the DL models are heavily over-parameterized. Indeed, the cross-species prediction power comes largely from the OOF uAUGs, as demonstrated by the poor performance of the linear model lacking this feature. The 5′UTR G/C content and/or the free energy of folding appear to be additional conserved regulatory elements, with a more prominent role in explaining the translation output of evolved 5′UTRs. To the extent to which synthetic data sets and DL models are used to uncover molecular mechanisms, it is important to ponder whether the synthetic sequences cover these mechanisms and whether the model architecture allows for an appropriate representation of them. This is, of course, difficult to ensure a priori, when the mechanisms are unknown. However, an improved grasp of the parameter efficiency of models and of their interpretation should facilitate the discovery of regulatory mechanisms and help avoid false inferences. For example, CNN-type architectures may be able to encode correlates of RNA secondary structure sufficiently well to predict the translation of short, synthetic 5′UTRs. Yet, the sequence motifs learned by the CNN need not themselves represent a regulatory mechanism. Instead, they could reflect long-range secondary structure constraints, which might be captured more efficiently by a different type of representation than the CNN allows. A main application of DL models trained on synthetic sequences is the design of constructs with desired translation outputs [ 14 ].
While this has been demonstrated within the setting of the MPRA, where randomized 5′UTRs drive the expression of proteins such as eGFP or mCherry, whether the same accuracy can be achieved for endogenous mRNAs of interest remains to be determined. Sample et al. [ 14 ] tested the same 5′UTR library in the context of the eGFP and mCherry coding regions and found that the model trained on the eGFP constructs explains 77–78% of the variance in mCherry expression, in contrast to the 93% of the variance explained in eGFP expression. Interestingly, this difference has been attributed, in part, to differences in the polysome profiling protocol [ 14 ]. This points to the importance of establishing robust experimental protocols for generating reference data sets. However, the different coding regions likely contribute to the discrepancy in prediction accuracy as well, underscoring the importance of measuring the same library of constructs in different systems to identify the mechanisms responsible for a specific readout. In summary, our analysis suggests a few directions for the study of translation control and for applications to protein expression. First, to continue to uncover mechanisms that explain the expression of endogenous sequences, it will be important to include these sequences in high-throughput assays. The method of choice for measuring the translation output of endogenous mRNAs is ribosome footprinting, a method that, on its own, is very reproducible ( \(R^2 \gtrsim 0.8\) ). However, factoring in the mRNA-seq-based estimation of mRNA abundance to calculate the TE leads to an increased error in the TE estimate. Ensuring high accuracy of mRNA-seq and ribo-seq is therefore important for obtaining reference data sets of TE. An additional limitation of endogenous mRNA translation data is its size. Currently, the number of mRNAs whose TE is estimated in a typical experiment is \(\sim 20,000\) , which corresponds roughly to one isoform per gene.
Accurate estimation of the TE of individual isoforms could be an important direction of methodological development [ 37 , 38 ]. However, it is unlikely that many isoforms are simultaneously expressed at a high enough level to be accurately measured in a given cell type, or that sufficiently accurate data can currently be obtained from single cells [ 39 ] that express distinct isoforms. As a suboptimal alternative, TE measurements could be obtained in closely related cell types in which sufficient variation of transcription and thereby translation start sites occurs. In terms of training DL models on such data, an important consideration will be to ensure that training and test sets do not contain related sequences, to prevent models from achieving high prediction accuracy simply based on sequence similarity, without learning the regulatory grammar [ 40 ]. Second, towards predicting the impact of SNPs on translation, accurate models of endogenous mRNA expression are needed. As we have seen here, architectures beyond CNNs are desirable, and models used in natural language processing may provide a useful stepping stone. However, it will be interesting to develop architectures that can represent long-range dependencies of RNA secondary structures, perhaps also incorporating co-evolution constraints, as done for protein structure prediction [ 41 , 42 ]. Third, towards the goal of designing constructs with specified translation outputs, it will be important to first determine the range of variation afforded by randomized 5′UTR variants, by actually measuring the range of protein expression that these variants can cover. If this is sufficient, it will be important to determine the impact of unexplored parameters, such as the cellular context of construct expression and the coding region downstream of the randomized construct.
For the former, the same construct library can be tested in various cell types, especially those closest to the cell type in which the mRNAs will ultimately be expressed (e.g., muscle cells for mRNA vaccines) [ 43 ]. Regarding the coding region, it will be interesting to test at least a few that cover the range of endogenous expression, e.g., from mRNAs with different lifetimes and codon biases. To conclude, DL models can be trained to very high precision on synthetic data, irrespective of their architecture. However, so far, synthetic data do not appropriately cover the space of regulatory elements influencing translation initiation. To achieve a comprehensive and predictive model, as well as to understand translation, training on endogenous sequences is necessary. The main bottleneck at the moment is obtaining sufficient and highly reproducible data on the translation of endogenous mRNAs. Experiments in a single cell type such as a cell line may not yield sufficiently many reliably measured 5′UTRs to train models such as TranslateLSTM. Perhaps this limitation can be circumvented by collecting data from multiple cell types, as they may contain distinct isoforms, with distinct 5′UTRs and translation efficiencies. Such a model could then be used for a broad variety of tasks, such as predicting the effect of point mutations or the translation efficiency of synthetic constructs, and for deepening our mechanistic understanding of translational control.

Experimental methods

We outline the experimental procedure for RNA and ribosome footprint sequencing of HepG2 cells.

Cell culture

The HepG2 cell line was obtained from the laboratory of Dr. Salvatore Piscuoglio (DBM, Basel) and was cultured in Dulbecco’s Modified Eagle Medium (DMEM) containing 4.5 g/l glucose, 10% fetal calf serum, 4 mM L-glutamine, 1X NEAA, 50 U/ml penicillin, and 50 µg/ml streptomycin at 5% \(CO_2\) , \(37^{\circ }\textrm{C}\) . Cells were passaged every 3–4 days.
Cell lysis and collection

Cells were grown in 15-cm dishes to 70–80% confluency. Medium was replenished 3 h prior to cell lysis. Cycloheximide (CHX) was added to a final concentration of 100 µg/ml to arrest elongating ribosomes. Medium was immediately discarded and cells were washed once with ice-cold PBS containing 100 µg/ml CHX. Five hundred microliters of lysis buffer (20 mM Tris-HCl pH 7.5, 100 mM NaCl, 10 mM \(\textrm{MgCl}_2\) , 1% Triton X-100, 2 mM dithiothreitol (DTT), 100 µg/ml CHX, 0.8 U/µl RNasin Plus RNase inhibitor (Promega), 0.04 U/µl Turbo DNase (Invitrogen), and EDTA-free protease inhibitor cocktail (Roche)) was added directly to the cells on the Petri dish. Cells were scraped and collected into 1.5-ml tubes. Then, samples were incubated for 5 min at \(4^{\circ }\textrm{C}\) under continuous rotation (60 rpm), passed 10 times through a 23G needle, and again incubated for 5 min at \(4^{\circ }\textrm{C}\) under continuous rotation (60 rpm). Lysates were clarified by centrifugation at 3000×g for 3 min at \(4^{\circ }\textrm{C}\) . Supernatants were centrifuged again at 10,000×g for 5 min at \(4^{\circ }\textrm{C}\) .

Ribosome footprint sequencing

The ribosome footprint sequencing protocol was adapted from protocols described in Refs. [ 18 , 19 , 44 ]. An equivalent of 8 OD 260 of lysate was treated with 66 U RNase I (Invitrogen) for 45 min at \(22^{\circ }\textrm{C}\) in a thermomixer with mixing at 1000 rpm. Then, 200 U SUPERase·In RNase inhibitor (20 U/µl, Invitrogen) was added to each sample. Digested lysates were loaded onto 10–50% home-made sucrose density gradients in open-top polyclear centrifuge tubes (Seton Scientific). Tubes were centrifuged at 35,000 rpm (210,100×g) for 3 h at \(4^{\circ }\textrm{C}\) (SW-41 Ti rotor, Beckmann Coulter ultracentrifuge). Samples were fractionated using the Piston Gradient Fractionator (Biocomp Instruments) at 0.75 ml/min by monitoring A260 values.
Thirty fractions of 0.37 ml were collected in 1.5-ml tubes, flash frozen, and stored at \(-80^{\circ }\textrm{C}\) . The fractions (typically 3 or 4) corresponding to the digested monosome peak were pooled. RNA was extracted using the hot acid phenol/chloroform method. The ribosome-protected RNA fragments (28–32 nt) were selected by electrophoresis on 15% polyacrylamide urea TBE gels and visualized with SYBR Gold Nucleic Acid Gel Stain (ThermoFisher Scientific). Size-selected RNA was dephosphorylated with T4 PNK (NEB) for 1 h at \(37^{\circ }\textrm{C}\) . RNA was purified using the acid phenol/chloroform method. Depletion of rRNA was performed using the riboPOOL kit (siTOOLs Biotech) from 433 ng of RNA according to the manufacturer’s instructions. Libraries were prepared using the SMARTer smRNA-Seq Kit for Illumina (Takara) following the manufacturer’s instructions from 15 ng of RNA. Libraries were purified by electrophoresis on 8% polyacrylamide TBE gels and sequenced on the Illumina NextSeq 500 sequencer in the Genomics Facility Basel (Department of Biosystems Science and Engineering (D-BSSE), ETH Zürich).

RNA sequencing

RNA was extracted from 15 µl of cell lysate using the Direct-zol RNA Microprep Kit (Zymo Research) following the manufacturer’s instructions, including DNase treatment for 15 min at room temperature. Samples were eluted with 15 µl nuclease-free water. The RNA integrity numbers (RIN) of the samples were between 9.9 and 10.0, measured using High Sensitivity RNA ScreenTape (TapeStation system, Agilent). RNA was quantified using a Qubit Flex fluorometer (Thermo Fisher Scientific). Libraries were prepared using the SMART-seq Stranded for total RNA-seq kit (Takara) from 5 ng of RNA and sequenced on the Illumina NextSeq 500 sequencer in the Genomics Facility Basel (Department of Biosystems Science and Engineering (D-BSSE), ETH Zürich). The data sets used in this study are as follows.
Optimus50 data

Constructs consisted of 25 nts of identical sequence (for PCR amplification) followed by a 50-nt-long random 5′UTR sequence upstream of the GFP coding region. The sequences and associated mean ribosome load measurements were obtained from the GEO repository, accession number GSE114002 [ 45 ]. Non-sequential features were computed and annotated for each sequence with a python script. The normalized 5′UTR folding energy was determined with the RNAfold program from the ViennaRNA package [ 46 ]. The G/C fraction was calculated using the biopython package [ 47 ]. The numbers of OOF and IF uAUGs were calculated with standard python methods. ORF/UTR length and number of exons were identical across this data set and therefore uninformative. Following [ 14 ], we set aside the 20,000 5′UTRs with the highest coverage in mRNA-seq for testing and kept the rest for training.

Optimus100 data

Constructs were made from random sequences, human 5′UTRs of suitable size (25–100 nts), their single nucleotide polymorphism-containing variants, and 3′-terminal fragments of longer 5′UTRs. MRL measurements were done as for the Optimus50 data set. Sequences and associated MRL estimates were obtained from the GEO repository, accession number GSE114002 [ 45 ]. The non-sequential features were computed just as for Optimus50, with the UTR length being an additional degree of freedom. The 5000 5′UTRs with the highest coverage in mRNA-seq were held out for testing, just as in [ 14 ].

Human genetic variants

Sample et al. [ 14 ] extracted 3577 5′UTR SNPs from the ClinVar database [ 31 ] and constructed variant 5′UTRs containing these SNPs. These variants were transfected into HEK 293 cells, and the respective MRL was measured as described in the paragraph about Optimus50. We also appended non-sequential features as outlined there, with the UTR length as an additional variable. The sequences and MRL were downloaded from GEO repository GSE114002 [ 45 ].

Yeast MPRA data

Yeast colonies were grown in media without HIS3 .
Yeast cells were transduced with plasmids containing the HIS3 ORF attached to a random pool of \(\sim 500,000\) randomized 50-nt-long 5′UTRs. The growth rate is directly controlled by the amount of HIS3 protein, which in turn is controlled only by the 5′UTR sequence. The data were obtained from GEO, accession number GSE104252 [ 48 ]. The calculation of non-sequential features followed the exact same procedure as for Optimus50. The top 5% of 5′UTRs in terms of read coverage were used for testing.

DART data

We downloaded the training data from Suppl. Tab. S2 of [ 15 ]. Non-sequential features were calculated as for Optimus50. Since uAUGs are mutated in this data set to avoid ambiguity in the translation start site, we did not include the numbers of OOF or IF uAUGs in the list of non-sequential features to learn from. Also, DART uses a luciferase reporter that includes only the beginning of the coding sequence, so neither the number of exons nor the CDS length is meaningful; therefore, we did not include these features either. The initial segment of the CDS sequence is available as a separate column in their Suppl. Tab. S2. We used three different random splits of 10% of the data for testing.

Human mRNA sequences

The human transcript sequences were pulled from ENSEMBL [ 49 ] with pybiomart. We used the GRCh38.105 annotation and the GRCh38.dna_sm.primary_assembly.fa primary assembly file. The human transcriptome sequences were assembled with gffread version 0.12.7 [ 50 ].

Yeast mRNA sequences

We used the R64.1.1 yeast genome [ 51 ] with the R64.1.1.110 annotation from the Saccharomyces cerevisiae Genome Database (SGD). We enriched this annotation with the longest annotated transcript from TIF-seq, see [ 52 ], providing us with 5′UTR sequences. Gffread [ 50 ] yielded the yeast transcriptome.
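The non-sequential features recomputed for each of the data sets above can be assembled as in the following sketch, in which the 5′UTR folding energy is assumed to be computed externally (e.g., with RNAfold) and passed in, and the uAUG frame is taken relative to a mORF starting immediately after the UTR (function names are ours):

```python
import math

def gc_fraction(seq):
    """Fraction of G and C nucleotides in a DNA sequence."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)

def count_uaugs_by_frame(utr):
    """Counts of OOF and IF uAUGs, with frame taken relative to the mORF
    start codon assumed to begin right after the UTR."""
    u = utr.upper()
    oof = sum(1 for i in range(len(u) - 2)
              if u[i:i + 3] == "ATG" and (len(u) - i) % 3 != 0)
    inf = sum(1 for i in range(len(u) - 2)
              if u[i:i + 3] == "ATG" and (len(u) - i) % 3 == 0)
    return oof, inf

def non_sequential_features(utr, orf_len, n_exons, fold_energy):
    """Assemble the feature dictionary; fold_energy is assumed to be a
    pre-computed minimum free energy, normalized here by UTR length."""
    oof, inf = count_uaugs_by_frame(utr)
    return {
        "log_orf_len": math.log(orf_len),
        "utr_len": len(utr),
        "gc_fraction": gc_fraction(utr),
        "n_exons": n_exons,
        "n_oof_uaug": oof,
        "n_if_uaug": inf,
        "norm_fold_energy": fold_energy / len(utr),
    }
```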
Yeast TE data
We used ribosome footprinting (GSM2278862, GSM2278863, GSM2278864) and RNA sequencing data (GSM2278844, GSM2278845, GSM2278846) from the control experiments performed in [ 19 ], downloaded from the European Nucleotide Archive, accession PRJNA338918 [ 53 ]. The ribo-seq analysis was conducted as in [ 54 ]; the RNA-seq analysis was performed using zarp [ 55 ]. All non-sequential features (log ORF length, UTR length, G/C content of the UTR, number of exons, number of OOF uAUGs, number of IF uAUGs, normalized 5′UTR folding energy) were computed or extracted from the genome annotation. The 10% of transcripts with the highest TIN were used for testing.

HEK 293 TE data
Ribo-seq and mRNA-seq data were obtained from the European Nucleotide Archive, accession PRJNA591214 [ 56 ]. The ribo-seq analysis was conducted as in [ 54 ]; the RNA-seq analysis was performed as in [ 55 ]. For the calculation of the translation efficiency, we only took into account RNA-seq and ribo-seq reads mapping to the CDS, not to the entire transcript. For stringency in the attribution of reads to mRNAs, we calculated relative isoform abundances by running salmon [ 57 ] on the RNA-seq samples and selected the most abundant isoform as representative, to which we mapped the RNA-seq and ribo-seq reads. The 10% of transcripts with the highest TIN (squared average over the three replicates) were used for testing.

HepG2 TE data
We followed the procedure outlined in the experimental methods; the rest of the analysis was done as for the HEK 293 TE data. The data were deposited in the European Nucleotide Archive under accession PRJNA1045106 [ 58 ].

ClinVar data
We downloaded the ClinVar database VCF file ( vcf_GRCh38 [ 32 ]). With bedtools-intersect (v 2.30) [ 59 ], we identified ClinVar variants in annotated genes and kept only variants in annotated 5′UTRs. With a python script, we calculated the coordinates of the polymorphisms on all affected transcripts.
Then, we constructed the variant 5′UTRs in the human transcriptome (created with gffread [ 50 ] from the GRCh38.105 ENSEMBL annotation) and extracted the coding regions. This left us with 84,127 mutated transcripts. Next, we computed the non-sequential features as for Optimus50. We predicted the variant TE with the transfer-learning version of TranslateLSTM (trained on human endogenous HEK 293 data, pre-trained on the Optimus100 data set). Matching the transcript variants and predictions to transcripts for which we have TE measurements left us with 7238 transcripts.

Model architectures
We implemented previously published models that predict the translation output from 5′UTR sequences according to their description in the respective studies. We used tensorflow 2.10, along with cuda toolkit 11.8.0, on NVIDIA Titan X GPUs with 12 GB of graphics memory.

Optimus 5-Prime
Optimus 5-Prime was the first neural network trained to predict translation initiation efficiency [ 14 ]. It consists of three convolutional layers with 120 8-nt-long filters each. They all feature a relu activation function and are succeeded by two dense layers, one reducing the input dimensionality to 40 with another relu nonlinearity, and a last dense layer reducing to a single number. The last two layers are separated by a dropout layer that stochastically ignores 20% of the input signals during training. The configuration allowing predictions for 5′UTRs up to 100 nts in length has 714,681 parameters. Two different configurations of Optimus 5-Prime were proposed: one trained on a pool of \(\sim 280,000\) 5′UTR sequences of 50 nts and another trained on a pool of \(\sim 105,000\) 5′UTRs of 25–100 nts. Variable lengths were handled by anchoring the 5′UTRs at their 3′ ends (adjacent to the start codon) and padding the 5′ end with 0s in the one-hot encoded representation. Since endogenous 5′UTRs vary widely in length, we used the latter configuration and data set, considering it to be more realistic.
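The 3′-anchoring and 5′ zero-padding scheme just described can be sketched as follows (a minimal illustration with a helper name of our own, not the published code):

```python
import numpy as np

NT = {"A": 0, "C": 1, "G": 2, "T": 3}

def one_hot_pad(utr: str, max_len: int = 100) -> np.ndarray:
    """One-hot encode a 5'UTR anchored at its 3' end (adjacent to the
    start codon); shorter sequences get all-zero rows as 5' padding,
    longer ones are truncated to their 3'-terminal max_len nts."""
    x = np.zeros((max_len, 4))
    utr = utr.upper().replace("U", "T")[-max_len:]
    for j, nt in enumerate(utr, start=max_len - len(utr)):
        if nt in NT:  # ambiguous bases stay all-zero
            x[j, NT[nt]] = 1.0
    return x
```

Anchoring at the 3′ end keeps every nucleotide at a fixed distance from the start codon, which is what makes the reading-frame-dependent features learnable by position-sensitive filters.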
However, this configuration is also the larger of the two. To run the Optimus models on the local scientific computing infrastructure, the model training was re-implemented in a python script, rather than a jupyter notebook as in the git repository cited in [ 14 ].

Framepool [ 16 ] technically overcomes the limitation on 5′UTR length. While also relying on convolutional layers and a final two-layer perceptron, Framepool introduced an operation called “framewise pooling.” This was motivated by previous observations that out-of-frame uORFs have a strong impact on the translation output. Framewise pooling pools the output of the convolutional layers separately for the +0, +1, and +2 frames. The subsequent multi-layer perceptron (MLP) takes as input the average and the maximum of each of the three pools (per convolutional filter). This makes the input of the final MLP independent of the input length and thus technically allows for varying UTR lengths. Trained on the same data sets as Optimus 5-Prime, Framepool achieved better performance on data of varying UTR length. The number of parameters in Framepool, 282,625, is only about a third of what Optimus 5-Prime requires for UTR lengths \(\le 100\) nt. We pulled Framepool from the git repository referenced in [ 16 ]. A python script adapted the model to our format of input data.

While CNNs are generally not a natural choice when it comes to modeling sequences of variable length, recurrent neural networks (RNNs) were developed exactly for this purpose. Conventional RNNs suffer from the so-called vanishing gradient problem, whereby memory of distant context is lost. Moreover, they can only memorize the left-side context, since they process sequences from left to right. These problems are solved by long short-term memory (LSTM) units [ 25 ] and bidirectional layers.
However, as there is no correspondence between output cells and position in the sequence, the interpretability of this type of model is more challenging.

MTtrans [ 17 ] has been proposed as an attempt to get the best of both the CNN and LSTM worlds. It follows the general idea of detecting motifs, with four convolutional layers stacked on top of each other, and batch normalization, L2 regularization, and dropout layers in between to avoid over-fitting and ensure stability. This component of MTtrans is called the “shared encoder” and is followed by two bidirectional gated recurrent unit (GRU) [ 60 ] layers and two dense layers that make the final prediction. GRUs are quite similar to LSTMs, but they do not feature an output gate [ 25 , 60 ] and therefore have fewer weights to adjust than LSTM layers. This second component of MTtrans is called the “task-specific tower,” because it is re-trained for each data set (task), while the encoder is shared across all tasks. By training the encoder on data sets from different organisms and cells, the authors aim to capture general features of translation that apply to all of the studied systems. This is an example of transfer learning, hence the “trans” in the name MTtrans. MTtrans is considerably bigger than its two predecessors, with \(\sim 2,100,000\) parameters. A re-evaluation of the results in [ 17 ] was unfortunately not possible, since the code in the provided github repository was still work in progress. Therefore, we attempted to reconstruct MTtrans in our own ML framework, but will quote the numbers reported in [ 17 ] wherever available.

TranslateLSTM
The start of the coding region has a biased nucleotide composition that also plays a role in translation initiation (cf. [ 61 ]). Feeding the first 100 nts of the coding region into another bidirectional LSTM model therefore provides additional information about initiation likelihood.
These three inputs, a bidirectional LSTM for the 5′UTR, a bidirectional LSTM for the beginning of the ORF, and the non-sequential features, can now be concatenated into one big model. There is, of course, considerable redundancy in these inputs, as the folding energy of the 5′UTR is determined by its nucleotide sequence, G/C content, and length. One way to mitigate this redundancy is to use high dropout rates after the final bidirectional LSTM layer of both RNNs (5′UTR and ORF). For training from scratch, we used dropout rates of 0.8 for the 5′UTR model and 0.2 for the CDS model. After concatenating the numerical input data with the outputs of the two bidirectional LSTMs, a final dense layer computes a single number, the logarithm of the translation efficiency, scaled to a Gaussian distribution with unit variance and expectation value 0. The network was implemented in python, using the keras API with a tensorflow backend [ 62 , 63 ]. We used the adam algorithm [ 64 ] for training, with the default learning rate of 0.001, which proved superior even to more sophisticated learning rate schedules. Beyond dropout layers, overfitting is prevented by imposing an early stopping criterion. To this end, we used a keras callback object that monitors the validation loss and terminates training once it sees a consistent increase in the validation loss over a given number of epochs (the “patience” parameter). We set the patience to 10 epochs and restored the best weights within these 10 epochs after termination. Of the randomly reshuffled training data, 20% served for validation. To further improve the performance of the model, we pretrained it on the variable-length Optimus100 data set before training on the endogenous data. In that scenario, we used a slightly lower dropout rate of 0.5 for the 5′UTR LSTM.
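The early-stopping criterion amounts to the following stand-alone logic (a sketch of what the keras `EarlyStopping` callback with `restore_best_weights=True` does; the class here is our own illustration):

```python
class EarlyStopping:
    """Stop training once the validation loss has not improved for
    `patience` consecutive epochs, remembering the best weights."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.best_weights = None
        self.wait = 0
        self.stopped = False

    def update(self, val_loss, weights):
        """Call once per epoch; returns True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.best_weights = weights  # snapshot of the best epoch
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stopped = True
        return self.stopped
```

After termination, `best_weights` holds the weights to restore, matching the "restored the best weights" step described above.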
Linear models
As the translation initiation efficiency was reported to be explained, to a large extent, by a small number of mRNA features [ 6 ], we included in our study two variants of a small linear model. The features were as follows. First, upstream open reading frames (uORFs) were reported in many studies to reduce the initiation of translation at the main ORF [ 65 ]. The effect was found to be largely due to uORFs that are in a different frame than the main ORF, which we have referred to as “out-of-frame ORFs” or “out-of-frame AUGs,” because the presence and position of stop codons matching these ORFs is not generally considered. Thus, one of the linear models included the numbers of both out-of-frame and in-frame AUGs, while the other included only the former. The secondary structure of the 5′ cap-proximal region of the mRNA is known to interfere with the binding of the eIF4F cap-binding complex [ 5 ], and accordingly, a weak positive correlation has been observed between the folding free energy of the first 80 5′UTR nts and the translation initiation efficiency of yeast mRNAs [ 6 , 7 , 8 ]. A smaller impact on yeast translation has also been attributed to the ORF length (negative correlation with TIE) [ 6 , 66 ], the 5′UTR length, and the G/C content [ 6 ]. For human cells, the number of exon-exon junctions has also been linked to TIE [ 23 ]. Additional file 1: Fig. S6 shows density plots of these parameters, comparing the major data sets we used, i.e., the three MPRA data sets, DART, and the two endogenous ones. The linear models are of compelling simplicity: they have only as many parameters as the features they cover, plus a single global bias term. For instance, the linear model for the Optimus50 data set consists of weights multiplying the normalized 5′UTR folding energy, the G/C content, and the numbers of IF and OOF upstream AUGs, plus the bias term, for a total of 5 parameters.
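A model of this size can be fitted in closed form by ordinary least squares. A minimal sketch (assuming a feature matrix with one column per feature and log-scale efficiency targets; the function name is ours):

```python
import numpy as np

def fit_linear_tie(features: np.ndarray, log_tie: np.ndarray) -> np.ndarray:
    """Ordinary least squares for the small linear model: one weight
    per feature (e.g. normalized folding energy, G/C content, numbers
    of OOF/IF uAUGs) plus a global bias term, appended as a column of
    ones. Returns the weight vector with the bias as its last entry."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X, log_tie, rcond=None)
    return coef
```

With the four Optimus50 features this yields exactly the 5 parameters mentioned above.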
Availability of data and materials
Sequencing data from ribosome footprinting and RNA sequencing in the HepG2 cell line are available under BioProject ID PRJNA1045106 [ 58 ]. The evaluated clinical variants from the ClinVar database are provided in Additional file 1: Tab. S1. MPRA measurements in HEK 293 cells from [ 14 ] are publicly available from GEO repository GSE114002 [ 45 ]; MPRA measurements in yeast (see [ 21 ]) from GSE104252 [ 48 ]. DART measurements from yeast are available from Suppl. Tab. 2 of [ 15 ]. Yeast RNA sequencing and ribosome footprinting data from [ 19 ] were retrieved from the European Nucleotide Archive under accession number PRJNA338918 [ 53 ]. RNA sequencing and ribosome footprinting data from HEK 293 cells (cf. [ 20 ]) were downloaded from the European Nucleotide Archive under accession number PRJNA591214 [ 56 ].

Code availability
The code for TranslateLSTM and data preprocessing is publicly available under the MIT license in the github repository [ 67 ], with a Zenodo version at the time of publication under [ 68 ]. The scripts to create the plots are available under [ 69 ].

References
Galloway A, Cowling VH. mRNA cap regulation in mammalian cell function and fate. Biochim Biophys Acta Gene Regul Mech. 2019;1862(3):270–9. Ingolia NT, Ghaemmaghami S, Newman JR, Weissman JS. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 2009;324(5924):218–23. de Smit MH, van Duin J. Secondary structure of the ribosome binding site determines translational efficiency: a quantitative analysis. Proc Natl Acad Sci U S A. 1990;87(19):7668–72. Kozak M. An analysis of 5’-noncoding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res. 1987;15(20):8125–48. Kertesz M, Wan Y, Mazor E, Rinn JL, Nutter RC, Chang HY, et al. Genome-wide measurement of RNA secondary structure in yeast. Nature.
2010;467(7311):103–7. Weinberg DE, Shah P, Eichhorn SW, Hussmann JA, Plotkin JB, Bartel DP. Improved ribosome-footprint and mRNA measurements provide insights into dynamics and regulation of yeast translation. Cell Rep. 2016;14(7):1787–99. Shah P, Ding Y, Niemczyk M, Kudla G, Plotkin JB. Rate-limiting steps in yeast protein translation. Cell. 2013;153(7):1589–601. Godefroy-Colburn T, Ravelonandro M, Pinck L. Cap accessibility correlates with the initiation efficiency of alfalfa mosaic virus RNAs. Eur J Biochem. 1985;147(3):549–52. Loo LS, Cohen RE, Gleason KK. Chain mobility in the amorphous region of nylon 6 observed under active uniaxial deformation. Science. 2000;288(5463):116–9. Schwanhaeusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, et al. Global quantification of mammalian gene expression control. Nature. 2011;473(7347):337–42. Tierney J, Swirski M, Tjeldes H, Carancini G, Kiran A, Michel A, et al. RiboSeq.Org. 2016. https://riboseq.org . Accessed July 2024. Li GW, Burkhardt D, Gross C, Weissman JS. Quantifying absolute protein synthesis rates reveals principles underlying allocation of cellular resources. Cell. 2014;157(3):624–35. Bohlen J, Fenzl K, Kramer G, Bukau B, Teleman AA. Selective 40S footprinting reveals cap-tethered ribosome scanning in human cells. Mol Cell. 2020;79(4):561–74. Sample PJ, Wang B, Reid DW, Presnyak V, McFadyen IJ, Morris DR, et al. Human 5’ UTR design and variant effect prediction from a massively parallel translation assay. Nat Biotechnol. 2019;37(7):803–9. Niederer RO, Rojas-Duran MF, Zinshteyn B, Gilbert WV. Direct analysis of ribosome targeting illuminates thousand-fold regulation of translation initiation. Cell Syst. 2022;13(3):256–64. Karollus A, Avsec Z, Gagneur J. Predicting mean ribosome load for 5’UTR of any length using deep learning. PLoS Comput Biol. 2021;17(5):e1008982. Zheng W, Fong JHC, Wan YK, Chu AHY, Huang Y, Wong ASL, et al.
Translation rate prediction and regulatory motif discovery with multi-task learning. In: Tang H, editor. Research in Computational Molecular Biology. Cham: Springer Nature Switzerland; 2023. pp. 139–54. Ingolia NT, Brar GA, Rouskin S, McGeachy AM, Weissman JS. The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nat Protoc. 2012;7(8):1534–50. Mittal N, Guimaraes JC, Gross T, Schmidt A, Vina-Vilaseca A, Nedialkova DD, et al. The Gcn4 transcription factor reduces protein synthesis capacity and extends yeast lifespan. Nat Commun. 2017;8(1):457. Alexaki A, Kames J, Hettiarachchi GK, Athey JC, Katneni UK, Hunt RC, et al. Ribosome profiling of HEK293T cells overexpressing codon optimized coagulation factor IX. F1000Res. 2020;9:174. Cuperus JT, Groves B, Kuchina A, Rosenberg AB, Jojic N, Fields S, et al. Deep learning of the regulatory grammar of yeast 5’ untranslated regions from 500,000 random sequences. Genome Res. 2017;27(12):2015–24. Wang L, Nie J, Sicotte H, Li Y, Eckel-Passow JE, Dasari S, et al. Measure transcript integrity using RNA-seq data. BMC Bioinformatics. 2016;17:58. Nott A, Le Hir H, Moore MJ. Splicing enhances translation in mammalian cells: an additional function of the exon junction complex. Genes Dev. 2004;18(2):210–22. Ho JWK. MTtrans. Github. 2022. https://github.com/holab-hku/MTtrans . Accessed Aug 2023. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80. Li K, Kong J, Zhang S, Zhao T, Qian W. Distance-dependent inhibition of translation initiation by downstream out-of-frame AUGs is consistent with a Brownian ratchet process of ribosome scanning. Genome Biol. 2022;23(1):254. Russell PJ, Slivka JA, Boyle EP, Burghes AHM, Kearse MG. Translation reinitiation after uORFs does not fully protect mRNAs from nonsense-mediated decay. RNA. 2023;29(6):735–44. Lundberg SM, Lee SI. A unified approach to interpreting model predictions. 
In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, et al., editors. Advances in Neural Information Processing Systems. vol. 30. Red Hook: Curran Associates, Inc.; 2017. p. 4765–74. Nikolados EM, Wongprommoon A, Aodha OM, Cambray G, Oyarzún DA. Accuracy and data efficiency in deep learning models of protein expression. Nat Commun. 2022;13(1):7755. May GE, Akirtava C, Agar-Johnson M, Micic J, Woolford J, McManus J. Unraveling the influences of sequence and position on yeast uORF activity using massively parallel reporter systems and machine learning. Elife. 2023;12:e69611. Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018;46(D1):D1062–7. Landrum MJ, Chitipiralla S, Brown GR, Chen C, Gu B, Hart J, et al. ClinVar: improvements to accessing data. Nucleic Acids Res. 2020;48(D1):D835–44. Riba A, Di Nanni N, Mittal N, Arhné E, Schmidt A, Zavolan M. Protein synthesis rates and ribosome occupancies reveal determinants of translation elongation rates. Proc Natl Acad Sci U S A. 2019;116(30):15023–32. Hinnebusch AG. Translational regulation of yeast GCN4. A window on factors that control initiator-trna binding to the ribosome. J Biol Chem. 1997;272(35):21661–4. Jang SK, Kräusslich HG, Nicklin MJ, Duke GM, Palmenberg AC, Wimmer E. A segment of the 5’ nontranslated region of encephalomyocarditis virus RNA directs internal entry of ribosomes during in vitro translation. J Virol. 1988;62(8):2636–43. Pelletier J, Sonenberg N. Internal initiation of translation of eukaryotic mRNA directed by a sequence derived from poliovirus RNA. Nature. 1988;334(6180):320–5. Weber R, Ghoshdastider U, Spies D, Duré C, Valdivia-Francia F, Forny M, et al. Monitoring the 5’UTR landscape reveals isoform switches to drive translational efficiencies in cancer. Oncogene. 2023;42(9):638–50. Calviello L, Mukherjee N, Wyler E, Zauber H, Hirsekorn A, Selbach M, et al. 
Detecting actively translated open reading frames in ribosome profiling data. Nat Methods. 2016;13(2):165–70. VanInsberghe M, van den Berg J, Andersson-Rolf A, Clevers H, van Oudenaarden A. Single-cell Ribo-seq reveals cell cycle-dependent translational pausing. Nature. 2021;597(7877):561–5. Riba A, Emmenlauer M, Chen A, Sigoillot F, Cong F, Dehio C, et al. Explicit modeling of siRNA-dependent on- and off-target repression improves the interpretation of screening results. Cell Syst. 2017;4(2):182–93. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A, Kaiser L, Polosukhin I. Attention is all you need. 2017. http://arxiv.org/abs/1706.03762 . Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. Tharakan R, Ubaida-Mohien C, Piao Y, Gorospe M, Ferrucci L. Ribosome profiling analysis of human skeletal muscle identifies reduced translation of mitochondrial proteins with age. RNA Biol. 2021;18(11):1555–9. Hornstein N, Torres D, Das Sharma S, Tang G, Canoll P, Sims PA. Ligation-free ribosome profiling of cell type-specific translation in the brain. Genome Biol. 2016;17(1):149. Sample PJ, Wang B, Seelig G. Human 5’ UTR design and variant effect prediction from a massively parallel translation assay. GSM3130435, GSM4084997, GSM3130443. Gene Expression Omnibus; 2018. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE114002 . Accessed Sept 2021. Lorenz R, Bernhart SH, Zu Siederdissen CH, Tafer H, Flamm C, Stadler PF, et al. ViennaRNA Package 2.0. Algorithms Mol Biol. 2011;6:26. Cock PJ, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009;25(11):1422–3. Cuperus J, Groves B, Kuchina A, Rosenberg AB, Jojic N, Fields S, et al. Learning the regulatory grammar of yeast 5’ untranslated regions from a large library of random sequences. 
GSM2793751. Gene Expression Omnibus; 2017. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE104252 . Accessed Sept 2021. Martin FJ, Amode MR, Aneja A, Austine-Orimoloye O, Azov AG, Barnes I, et al. Ensembl 2023. Nucleic Acids Res. 2023;51(D1):D933–41. Pertea G, Pertea M. GFF utilities: GffRead and GffCompare. F1000Res. 2020;9. Liachko I, Youngblood RA, Keich U, Dunham MJ. High-resolution mapping, characterization, and optimization of autonomously replicating sequences in yeast. Genome Res. 2013;23(4):698–704. Pelechano V, Wei W, Steinmetz LM. Extensive transcriptional heterogeneity revealed by isoform profiling. Nature. 2013;497(7447):127–31. Mittal N. The Gcn4 transcription factor reduces protein synthesis capacity and extends yeast lifespan. GSM2278862, GSM2278863, GSM2278864, GSM2278844, GSM2278845, GSM2278846. Bioproject; 2016. https://www.ncbi.nlm.nih.gov/bioproject/?term=338918 . Accessed Apr 2024. Banerjee A, Ataman M, Smialek MJ, Mookherjee D, Rabl J, Mironov A, et al. Ribosomal protein RPL39L is an efficiency factor in the cotranslational folding of proteins with alpha helical domains. Nucleic Acids Res. 2024:gkae630. https://doi.org/10.1093/nar/gkae630 . Katsantoni M, Gypas F, Herrmann CJ, Burri D, Bak M, Iborra P, et al. ZARP: an automated workflow for processing of RNA-seq data. bioRxiv. 2021. https://doi.org/10.1101/2021.11.18.469017 . Alexaki A. Ribosome profiling of HEK-293T cells stably expressing wild-type and codon optimized coagulation factor IX. Bioproject; 2019. https://www.ncbi.nlm.nih.gov/bioproject/591214 . Accessed Sept 2023. Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14(4):417–9. Schlusser N, Gonzalez A, Pandey M, Zavolan M. Current limitations in predicting mRNA translation with deep learning models. Bioproject; 2023. https://www.ncbi.nlm.nih.gov/bioproject/?term=1045106 . Accessed July 2024. Quinlan AR, Hall IM.
BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2. Cho K, van Merrienboer B, Gülçehre Ç, Bougares F, Schwenk H, Bengio Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. CoRR. 2014;abs/1406.1078. http://arxiv.org/abs/1406.1078 . Accessed July 2024. Archer SK, Shirokikh NE, Beilharz TH, Preiss T. Dynamics of ribosome scanning and recycling revealed by translation complex profiling. Nature. 2016;535(7613):570–4. Chollet F, et al. Keras. 2015. https://keras.io . Accessed Aug 2021. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, et al. TensorFlow: large-scale machine learning on heterogeneous systems. 2015. Software available from tensorflow.org. https://www.tensorflow.org/ . Accessed Aug 2021. Kingma DP, Ba J. Adam: a method for stochastic optimization. 2017. https://arxiv.org/abs/1412.6980 . Accessed Aug 2021. Zur H, Tuller T. New universal rules of eukaryotic translation initiation fidelity. PLoS Comput Biol. 2013;9(7):e1003136. Arava Y, Wang Y, Storey JD, Liu CL, Brown PO, Herschlag D. Genome-wide analysis of mRNA translation profiles in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 2003;100(7):3889–94. Schlusser N. predicting-translation-initiation-efficiency. GitHub; 2024. https://git.scicore.unibas.ch/zavolan_group/data_analysis/predicting-translation-initiation-efficiency/-/tree/main?ref_type=heads . Schlusser N. Version of GitHub repository: predicting-translation-initiation-efficiency. Zenodo; 2024. https://doi.org/10.5281/zenodo.13133725 . Schlusser N. Plot scripts for: Current limitations in predicting mRNA translation with deep learning models. Zenodo; 2024. https://doi.org/10.5281/zenodo.10463090 .

Acknowledgements
We would like to thank Aleksei Mironov for providing us with yeast 5′UTR annotations and, along with Meric Ataman, for helpful discussions.

Review history
The review history is available as Additional file 2.
Peer review information
Andrew Cosgrove was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Funding
Open access funding provided by University of Basel. This work has been supported by the Swiss National Science Foundation grant #310030_204517 to M.Z. Calculations were performed at the sciCORE ( http://scicore.unibas.ch/ ) scientific computing core facility at the University of Basel.

Author information
Authors and affiliations: Biozentrum, University of Basel, Spitalstrasse 41, 4056, Basel, Switzerland: Niels Schlusser, Asier González, Muskan Pandey & Mihaela Zavolan. Departament de Bioquímica i Biologia Molecular and Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, 08193, Cerdanyola del Vallès, Spain: Asier González. Current address: Institute of Molecular Biology and Biophysics, Department of Biology, ETH Zurich, 8093, Zurich, Switzerland: Muskan Pandey.

Contributions
M.Z. and N.S. conceived the study. N.S. implemented the models and carried out the analysis with input from M.Z. M.P. and A.G. provided experimental data.

Corresponding authors
Correspondence to Niels Schlusser or Mihaela Zavolan.

Ethics declarations
Ethics approval and consent to participate: not applicable. Consent for publication: not applicable. Competing interests: the authors declare no competing interests.

Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information
Additional file 1: Contains Supplementary Figures S1-S8 as well as the description of Supplementary Table S1. For the sake of readability, we did not display all 7238 lines in the additional file; instead, the table is provided separately as supp_tab_1.tsv . Additional file 2: Contains the review history.

Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article
Cite this article: Schlusser, N., González, A., Pandey, M. et al. Current limitations in predicting mRNA translation with deep learning models. Genome Biol 25, 227 (2024). https://doi.org/10.1186/s13059-024-03369-6 . Received: 26 December 2023. Accepted: 07 August 2024. Published: 20 August 2024.
Genome Biology ISSN: 1474-760X
J Med Biol Eng 39(4):1–16 Lühmann AV, Ortega-Martinez A, Boas DA et al (2020) Using the General Linear Model to improve performance in fNIRS single trial analysis and classification: a perspective. Front Hum Neurosci 14:30 Makeig S (1993) Auditory event-related dynamics of the eeg spectrum and effects of exposure to tones. Electroencephalogr Clin Neurophysiol 86(4):283–293 Article CAS PubMed Google Scholar Mallat SG (1999) A Wavelet Tour of Signal Processing. Academic, New York Mattia D, Astolfi L, Toppi J et al (2016) Interfacing brain and computer in neurorehabilitation. 4th International Winter Conference on Brain-Computer Interface (BCI). IEEE McGeady C, Vučković A, Zheng YP et al (2021) EEG monitoring is feasible and reliable during simultaneous transcutaneous electrical spinal cord stimulation. Sensors 21(19):6593 Nikhil S, Jean-Claude B (2013) Does motor imagery share neural networks with executed movement: a multivariate fMRI analysis. Front Hum Neurosci 7(564):564 Nirthika R, Manivannan S, Ramanan A et al (2022) Pooling in convolutional neural networks for medical image analysis: a survey and an empirical study. Neural Comput Appl 34(7):5321–5347 O’shea K, Nash R (2015) An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458 Ortega P, Faisal A (2021) HemCNN: Deep Learning enables decoding of fNIRS cortical signals in hand grip motor tasks. 10th International IEEE/EMBS Conference on Neural Engineering Pfurtscheller G, Neuper C (2001) Motor imagery and direct brain-computer communication. Proc IEEE 89(7):1123–1134 Pfurtscheller G, Flotzinger D, Kalcher J (1993) Brain-computer Interface—a new communication device for handicapped persons. J Microcomputer Appl 16(3):293–299 Pfurtscheller G, Neuper C, Schlogl A et al (1998) Separability of EEG signals recorded during right and left motor imagery using adaptive autoregressive parameters. 
IEEE Trans Rehabilitation Eng 6(3):316–325 Pfurtscheller G, Guger C, Muller G et al (2000) Brain oscillations control hand orthosis in a tetraplegic. Neurosci Lett 292(3):211–214 Pfurtscheller G, Brunner C, SchlOGl A et al (2006) Mu rhythm (de)synchronization and EEG single-trial classification of different motor imagery tasks. NeuroImage 31(1):153–159 Pinti P, Merla A, Aichelburg C et al (2017) A novel GLM-based method for the Automatic IDentification of functional events (AIDE) in fNIRS data recorded in naturalistic environments. NeuroImage 155:291–304 Pinti P, Tachtsidis I, Hamilton A et al (2020) The present and future use of functional near-infrared spectroscopy (fNIRS) for cognitive neuroscience, vol 1464. ANNALS OF THE NEW YORK ACADEMY OF SCIENCES, pp 5–29. 1 Pollonini L, Olds C, Abaya H et al (2014) Auditory cortex activation to natural speech and simulated cochlear implant speech measured with functional near-infrared spectroscopy. Hear Res Int J 309:84–93 Qi F, Wang W, Xie X et al (2021) Single-trial eeg classification via orthogonal wavelet decomposition-based feature extraction. Front NeuroSci 15:715855 Ramos-Murguialday A, Schurholz M, Caggiano V et al (2012) Proprioceptive feedback and Brain Computer Interface (BCI) based neuroprostheses. PLoS ONE. 7(10) Saadati M, Nelson J, Ayaz H (2020) Multimodal fNIRS-EEG Classification Using Deep Learning Algorithms for Brain-Computer Interfaces Purposes. Proc. ADVANCES IN NEUROERGONOMICS AND COGNITIVE ENGINEERING Sakhavi S, Guan C, Yan S (2018) Learning temporal information for brain-computer interface using convolutional neural networks. IEEE Trans Neural Networks Learn Syst 29(11):5619–5629 Schirrmeister RT, Springenberg JT, Fiederer LDJ et al (2017) Deep learning with convolutional neural networks for eeg decoding and visualization. Hum Brain Mapp 38(11):5391–5420 Schlögl A, Lee F, Bischof H et al (2005) Characterization of four-class motor imagery EEG data for the BCI-competition 2005. 
J Neural Eng 2(4):L14 Shin J, Lühmann AV, Blankertz B et al (2017) Open Access dataset for EEG + NIRS single-trial classification. IEEE Trans Neural Syst Rehabil Eng 25(10):1735–1745 Song X, Meng L, Shi Q et al (2015) Learning Tensor-Based Features for Whole-Brain fMRI Classification. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing Sun Z, Huang ZH, Duan F et al (2020) A Novel Multimodal Approach for Hybrid Brain&x2013Computer interface. IEEE ACCESS 8:89909–89918 Tabar YR, Halici U (2017) A novel deep learning approach for classification of EEG motor imagery signals. J Neural Eng 14(1):016003 Tang Z, Li C, Sun S (2017) Single-trial EEG classification of motor imagery using deep convolutional neural networks. Int J Light Electron Opt 130:11–18 Townsend G, Graimann B, Pfurtscheller G (2004) Continuous EEG classification during motor imagery–simulation of an asynchronous BCI. IEEE Trans Neural Syst Rehabil Eng 12(2):258–265 Verma P, Heilinger A, Reitner P et al (2019) Performance Investigation of Brain-Computer Interfaces that Combine EEG and fNIRS for Motor Imagery Tasks. 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC). IEEE von Lühmann A, Ortega-Martinez A, Boas DA et al (2020) Using the General Linear Model to improve performance in fNIRS single trial analysis and classification: a perspective. Front Hum Neurosci. 14 Wainer J, Cawley G (2021) Nested cross-validation when selecting classifiers is overzealous for most practical applications. Expert Syst Appl 182:115222 Wang Y, Hong B, Gao X et al (2007) Design of electrode layout for motor imagery based brain computer interface. Electron Lett 43(10):1–2 Wang Q, Xie J, Zuo W et al (2020) Deep CNNs meet global covariance pooling: better representation and generalization. 
IEEE Trans Pattern Anal Mach Intell 43(8):2582–2597 Yang H, Sakhavi S, Kai KA et al (2015) On the use of convolutional neural networks and augmented CSP features for multi-class motor imagery of EEG signals classification. 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society IEEE EMBS. IEEE Ye JC, Tak S, Jang KE NIRS-SPM: Statistical parametric mapping for near-infrared spectroscopy, NeuroImage. 44(2):428–447 Yi W, Qiu S, Qi H et al (2013) EEG feature comparison and classification of simple and compound limb motor imagery. J Neuro Eng Rehabilitation 10(1):106 Yücel MA, Selb JJ, Huppert TJ et al (2017) Functional Near Infrared Spectroscopy: enabling routine functional brain imaging. Curr Opin BIOMEDICAL Eng 4:78–86 Zhang C, Pan X, Li H et al (2018) A hybrid MLP-CNN classifier for very fine resolution remotely sensed image classification. ISPRS J Photogrammetry Remote Sens 140:133–144 Zhang R, Zhu F, Liu J et al (2019) Depth-wise separable convolutions and multi-level pooling for an efficient spatial CNN-based steganalysis. IEEE Trans Inf Forensics Secur 15:1138–1150 Zhao HB, Cooper RJ (2018) Review of recent progress toward a fiberless, whole-scalp diffuse optical tomography system. Neurophotonics. 5(1) Zhong MJ, Lotte F, Girolami M et al (2008) Classifying EEG for brain computer interfaces using gaussian processes. Pattern Recognit Lett 29(3):354–359 Zhu Y, Jayagopal JK, Mehta RK et al (2020) Classifying major depressive disorder using fNIRS during Motor Rehabilitation. IEEE Trans Neural Syst Rehabil Eng 28(4):961–969 Download references AcknowledgementsThis work was supported in part by the National Natural Science Foundation of China (Grant numbers: U21A20388 and 62276262), and in part by the Beijing Natural Science Foundation (J210010 and 7222311). The authors would like to thank the subjects for participating in these experiments. Author informationAuthors and affiliations. 
Laboratory of Brain Atlas and Brain-Inspired Intelligence, Key Laboratory of Brain Cognition and Brain-inspired Intelligence Technology, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China Chenyao Wu, Shuang Qiu & Huiguang He University of Chinese Academy of Sciences, Beijing, 100049, China Chenyao Wu, Yu Wang, Shuang Qiu & Huiguang He National Engineering & Technology Research Center for ASIC Design, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China You can also search for this author in PubMed Google Scholar ContributionsAll authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by [Shuang Qiu], and [Yu Wang]. The first draft of the manuscript was written by [Chenyao Wu] and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript. Corresponding authorsCorrespondence to Shuang Qiu or Huiguang He . Ethics declarationsEthical approval. All authors claim that there are no conflicts of interest. The study was approved by the Research Ethics Committee of the Institute of Automation, Chinese Academy of Sciences. Each subject has signed an informed consent in advance. Additional informationPublisher’s note. Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Rights and permissionsSpringer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. Reprints and permissions About this articleWu, C., Wang, Y., Qiu, S. et al. A bimodal deep learning network based on CNN for fine motor imagery. Cogn Neurodyn (2024). 
https://doi.org/10.1007/s11571-024-10159-0 Download citation Received : 17 November 2023 Revised : 20 June 2024 Accepted : 03 August 2024 Published : 19 August 2024 DOI : https://doi.org/10.1007/s11571-024-10159-0 Share this articleAnyone you share the following link with will be able to read this content: Sorry, a shareable link is not currently available for this article. Provided by the Springer Nature SharedIt content-sharing initiative
Harris adviser Deese calls for Marshall Plan on clean energy
Sign up here. Reporting by Timothy Gardner Editing by Marguerita Choy Our Standards: The Thomson Reuters Trust Principles. , opens new tab Thomson Reuters Timothy reports on energy and environment policy and is based in Washington, D.C. His coverage ranges from the latest in nuclear power, to environment regulations, to U.S. sanctions and geopolitics. He has been a member of three teams in the past two years that have won Reuters best journalism of the year awards. As a cyclist he is happiest outside. SustainabilityChina regulates bond market based on market principles, state media saysChina's financial regulators approach bond market oversight based on market principles and from macro-prudential and compliance perspectives, state media on Saturday, rejecting claims of market intervention.
Oops! Looks like we're having trouble connecting to our server.Refresh your browser window to try again. Related Searches
Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.
What Science and Nature are good for: causing paper cutsPaper that is 65 micrometres thick poses the greatest danger of causing a paper cut, a model shows. Credit: Getty A combination of experiments and theoretical work reveals why only certain types of paper can cut into human skin 1 . Access optionsAccess Nature and 54 other Nature Portfolio journals Get Nature+, our best-value online-access subscription 24,99 € / 30 days cancel any time Subscribe to this journal Receive 51 print issues and online access 185,98 € per year only 3,65 € per issue Rent or buy this article Prices vary by article type Prices may be subject to local taxes which are calculated during checkout doi: https://doi.org/10.1038/d41586-024-02297-6 Arnbjerg-Nielsen, S. F., Biviano, M. D. & Jensen, K. H. Phys. Rev. E. 110 , 025003 (2024). Google Scholar Download references Attosecond delays in X-ray molecular ionization Article 21 AUG 24 Chiral kagome superconductivity modulations with residual Fermi arcs Observation of the antimatter hypernucleus $${}_{\bar{{\boldsymbol{\Lambda }}}}{}^{{\bf{4}}}\bar{{\bf{H}}}$$ H ¯ Λ ¯ 4 Senior Researcher-Experimental Leukemia Modeling, Mullighan LabMemphis, Tennessee St. Jude Children's Research Hospital (St. Jude) Assistant or Associate Professor (Research-Educator)The Center for Molecular Medicine and Genetics in the Wayne State University School of Medicine (http://genetics.wayne.edu/) is expanding its high-... Detroit, Michigan Wayne State University Postdoctoral Fellow – Cancer ImmunotherapyTampa, Florida H. Lee Moffitt Cancer Center & Research Institute Postdoctoral Associate - SpecialistHouston, Texas (US) Baylor College of Medicine (BCM) Postdoctoral Associate- CAR T Cells, Synthetic BiologySign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily. Quick links
Pod Color: Purple. Gender: Male. "To pork, or not to pork, THAT is the question!" (Writer's note: Alas, poor pun-o-meter, I knew him well, Jookieba.) Here is...okay, Jumba will level with you, this was one of my weakest early prototypes to date. All he does is turn things into ham and quote Shaking-Spears.
Experiment Number: 024. Pod Colour: Purple. Name: Hamlette. Gender: Male. Primary Function: Designed to turn everything into ham. Special Abilities: Experiment 024 shoots a beam from his eyes that alters the molecular composition of an object, transforming it into a pork product. One True Place: Working at the local butcher shop.
Primary Function: {Ham Converter} Designed to turn things into ham. (Does so by sneezing; the closest inanimate object is immediately transformed into pork.) One True Place: With Mrs. Hasagawa as one of her cats; periodically seen with Harold the Butcher. Other Info: 024 can be rendered useless by plugging his nose. Activation:
Portrayed by: Kimberly Brooks - Experiment 024, nicknamed Hamlette by Nakine Mahelona, was the twenty-fourth illegal genetic experiment created by Dr. Darlig Kanium, whose one true place was with Lynne Hasegawa.[1] Darlig Kanium - Creator. "Nakine & Blue: A New Ohana" (Experiment...
024 Hamlette: This experiment was activated when Mrs. Hasagawa's cats were activated. 025 Topper (white pod): A small yellow star-shaped creature with a little antenna on his head. Designed to be a beacon to signal the alien attack fleet, but the official Disney website states his purpose is to keep individuals awake with his bright light.
Prior to the events of Lilo & Stitch, every experiment created was dehydrated into a small orb called an "experiment pod" and stored in a special container for transport. In Stitch! ... 024 Hamlette: An experiment designed to turn objects into ham. She was activated when Mrs. Hasagawa's cats were activated, as indicated by Gantu's experiment computer.
On Gantu's experiment analyzer, it says 024 was activated, but this experiment was never seen in this episode. It is unknown whether or not 024, 090, 201, 235, and 284 are also Mrs. Hasagawa's "cats" because they were not shown in this episode. ... Hamlette (024) Gotchu (031) Forehead (044) Hocker (051) Zawp (077) Fetchit (090) Mulch (111 ...
Jumba laughed. "That would be Experiment 024. She is designed to turn objects into pork, ceasing all functionality and causing complete chaos across galaxy! Mwa-ha-ha-ha-ha!" Lilo rubbed her chin thoughtfully. "Then I'm gonna name her Hamlette." "Stitch can help," insisted Stitch, his voice weak and tired.
Limitations of a transmutation power like Hamlette's: it may require eye contact, touch, or some other action; effects may be temporary or irreversible; some targets may be immune; the amount of matter that can be changed at once may be limited; the user may not be able to turn the victim back to normal; and the power could be constantly on, causing the user to kill by accident. Users of Transmutation Immunity / Immutability are unaffected.
Hamlette is the name of genetic experiment 024, a fictional alien experiment from the television program Lilo & Stitch: The Series, designed to turn objects into ham or pork.
"Mrs. Hasagawa's Cats" is the first segment of the sixty-second episode of Lilo & Stitch: The Series. It aired on May 19, 2006. Lilo and Stitch try to help Mrs. Hasagawa out by organizing her fruit stand and cleaning her yard. However, when she tries to offer them a bowl of apricots for their help, they discover that the bowl is actually full of experiment pods, and her house is later full of ...
Experiment 024 - Hamlette Experiment 025 - Topper Experiment 026 - Pawn Experiment 027 - Plushy Experiment 028 - Lori Experiment 029 - Checkers ... Experiment 049 - Picker. Experiment 061 - Anachronator . Experiment 140 - Changeling . Experiment 144 - Samolean. Experiment 194 - Trax.