The domains are structurally defined using the DomainParser method [26]. Only distances within domains are evaluated. The subdivision into domains is crucial to avoid bias due to the size and domain architecture of the protein. The odds ratio is calculated as the observed over the expected clustering value of the mutations in a gene.
The statistical significance of the observations was assessed by calculating the p-value under the null-model assumption of a uniform distribution of the mutations. For spatial clustering and proximity to functional sites, the null distribution has to be obtained empirically from the random control population. Let f be such an empirical null-model distribution with mean m. To assess the robustness of the data against outliers, we applied a jackknife test.
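As an illustration of how a p-value can be read off an empirical null distribution, the following minimal sketch counts how often the random control population yields a value at least as extreme as the observed one. It is not the formula used in this study: the function name `empirical_p_value`, the one-sided test direction and the convention of reporting 1/N as an upper bound when no control value is as extreme are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def empirical_p_value(
    observed: float,
    null_values: Sequence[float],
    greater: bool = True,
) -> Tuple[float, bool]:
    """Return (p, is_upper_bound) for a one-sided empirical test.

    null_values holds the statistic computed on each random control sample.
    greater=True tests for enrichment, greater=False for depletion.
    """
    n = len(null_values)
    if n == 0:
        raise ValueError("null_values must not be empty")

    if greater:
        hits = sum(1 for v in null_values if v >= observed)
    else:
        hits = sum(1 for v in null_values if v <= observed)

    if hits == 0:
        # No control sample was as extreme as the observation, so the
        # population size only gives an upper bound on the p-value.
        return 1.0 / n, True
    return hits / n, False


# Hypothetical control values, for illustration only.
control = [0.8, 0.9, 1.0, 1.1, 1.2]
p_value, is_bound = empirical_p_value(1.1, control)
```

With a control population of size N, the smallest p-value that can be resolved this way is 1/N, which is why a value never observed in the control population is reported as an upper bound.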
This test is a resampling procedure in which the results are recalculated multiple times, each time leaving out one gene from the original dataset. Taking the maximum and the minimum over this procedure for all genes yields an interval around the value of the original dataset. These intervals are shown as error bars in Figure 1.
Figure 1. Structural impact of mutations. The columns show the structural properties of random mutations (Rnd), natural variations (Snp) and cancer mutations (Mut). Cancer mutations are further analyzed separately as mutations of oncogenes (Onc) and mutations of tumor suppressor genes (Sup).
The error bars indicate the variability of the data under the jackknife test. The reported values are the odds ratios averaged over the genes in the dataset. The p-values are calculated over all mutations within a dataset. A, observed over expected fraction of mutations occurring at the protein surface. Onc show significantly more and Sup significantly fewer solvent-accessible mutations. B, observed over expected fraction of destabilizing mutations. Onc mutations are less often destabilizing, while Sup mutations disrupt stability far more often than the controls.
C, observed over expected functional site mutations. Functional sites are more frequently mutated in Onc than in Sup. D, observed over expected spatial clustering of mutations. Mutations particularly in Onc are significantly more clustered than expected by chance.
Linear classifiers were automatically calculated using Fisher's linear discriminant method, which provides a good compromise between finding the optimal solution in the linearly separable case and being robust to outliers [27].
To test the robustness of the classification we applied a leave-one-out cross validation procedure. In each step, one gene is temporarily removed from the training set. The classifier is recalculated on the subset and we test whether it is able to correctly predict the class membership of the excluded gene.
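The combination of Fisher's linear discriminant and leave-one-out cross validation can be sketched as follows for a single pair of features. This is a minimal illustration, not the code used in the study: the function names, the class labels (0 for tumor suppressors, 1 for oncogenes), the midpoint threshold and the feature values are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _mean(points: Sequence[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def _scatter(points: Sequence[Point], m: Point) -> Tuple[float, float, float]:
    """Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix around m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fit_fisher(class0: Sequence[Point], class1: Sequence[Point]) -> Tuple[Point, float]:
    """Return (w, t): the projection direction and the decision threshold."""
    m0, m1 = _mean(class0), _mean(class1)
    a0, b0, c0 = _scatter(class0, m0)
    a1, b1, c1 = _scatter(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1  # within-class scatter Sw
    det = a * c - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # w = Sw^{-1} (m1 - m0), using the closed-form inverse of a 2x2 matrix.
    w = ((c * dx - b * dy) / det, (a * dy - b * dx) / det)
    # Assumed threshold: midpoint between the projected class means.
    t = 0.5 * (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1]))
    return w, t


def predict(w: Point, t: float, p: Point) -> int:
    return 1 if w[0] * p[0] + w[1] * p[1] > t else 0


def leave_one_out_accuracy(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Refit the classifier without each gene in turn and predict that gene."""
    correct = 0
    for i in range(len(points)):
        train0: List[Point] = [p for j, p in enumerate(points) if j != i and labels[j] == 0]
        train1: List[Point] = [p for j, p in enumerate(points) if j != i and labels[j] == 1]
        w, t = fit_fisher(train0, train1)
        if predict(w, t, points[i]) == labels[i]:
            correct += 1
    return correct / len(points)


# Hypothetical odds-ratio pairs per gene, for illustration only.
genes = [(0.5, 1.5), (0.6, 1.7), (0.7, 1.4), (0.4, 1.6),
         (1.5, 0.5), (1.7, 0.6), (1.4, 0.7), (1.6, 0.4)]
classes = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = leave_one_out_accuracy(genes, classes)
```

Refitting inside the loop is the essential point: the excluded gene must not contribute to the class means or the scatter matrix of the classifier that predicts it.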
Information on genes, mutations, SNPs and functional annotations that were used in the analysis is available in electronic form as Additional Tables S1-S4 (Additional Files 2, 3, 4, 5).
In this study we analyze the structural impact of a large number of cancer mutations in oncogenes and tumor suppressors. We evaluate the impact with respect to four structural features. We focused on eight selected tumor entities that are among the most frequent and lethal types.
This set contains many classical cancer genes that are involved in major signaling pathways i. The genes with their corresponding mutations were subdivided into the classes of tumor suppressors (Sup) and oncogenes (Onc) as shown in Table 1, representing two common mechanisms through which tumorigenesis is initiated: via gain-of-function of oncogenes and loss-of-function of tumor suppressors [28]. In the following we present the results for the four structural properties. In Figure 1 we report the average odds ratios over the genes in the respective set (Snp, Mut, Onc, Sup).
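The averaged odds ratio of a gene set and its jackknife interval can be sketched as follows. This is a minimal illustration that assumes the recalculated quantity is the unweighted mean of the per-gene odds ratios; the function name, the gene identifiers and the values are hypothetical.

```python
from typing import Dict, Tuple


def jackknife_interval(odds_by_gene: Dict[str, float]) -> Tuple[float, float, float]:
    """Return (average, lowest, highest) for a set of per-gene odds ratios.

    lowest and highest are the extremes of the averages recomputed with
    one gene left out at a time.
    """
    values = list(odds_by_gene.values())
    n = len(values)
    if n < 2:
        raise ValueError("at least two genes are needed for a jackknife test")

    total = sum(values)
    average = total / n
    # Average over the remaining n - 1 genes after removing each gene in turn.
    leave_one_out = [(total - v) / (n - 1) for v in values]
    return average, min(leave_one_out), max(leave_one_out)


# Hypothetical per-gene odds ratios, for illustration only.
example = {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 3.0}
average, lowest, highest = jackknife_interval(example)
```

A narrow interval means that no single gene dominates the average, whereas a wide one points to an outlier gene.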
As the first property, we investigated whether mutations occur at the surface or in the core of the protein. However, a separate analysis of oncogenes and tumor suppressors reveals that mutations in oncogenes occur significantly more often at the surface 1. We calculated the impact that the mutations of the different datasets have on protein stability.
The calculations were performed with the FoldX software [ 21 ].
A recent assessment has shown that this method is currently among the best methods for calculating stability changes upon mutation [30]. The results of this analysis (Figure 1B) show a distinct difference between oncogenes and tumor suppressors. Tumor suppressors display a significant overrepresentation of mutations that destabilize the protein 1. Next we assessed whether the mutations in our dataset occur proximal to known functional sites and are thus likely to directly influence protein function. For this we extracted annotated functional sites from public databases.
The results are shown in Figure 1C. Cancer mutations in oncogenes Onc have a tendency to specifically target functional sites 1. Functional site mutations are also significantly underrepresented in the Snp data set 0.
Further, we investigated whether particular types of functional sites are more often mutated than expected. Figure 2 shows the observed distribution of functional site mutations in oncogenes and tumor suppressors compared to the distribution expected for randomized mutations. The results for tumor suppressors show no apparent differences between observed and random distribution.
Figure 2. Distribution of functional site mutations. Distribution of mutations affecting functional sites in oncogenes (Onc) and tumor suppressors (Sup) compared to the distribution of random mutations. A and B, distribution obtained by random sampling of positions in Onc and Sup, respectively. C, distribution of functional site mutations in Onc. D, distribution of functional site mutations in Sup. Observed distribution does not differ significantly from expected random distribution.
Next we wanted to test whether cancer mutations have a tendency to co-localize in spatial clusters. Figure 1D shows that cancer mutations in oncogenes are highly clustered 1. The small error bar for Sup indicates that all tumor suppressors have similar clustering behavior. In this case, the p-values result from the fact that a spatial clustering as high as the one for either of the sets Snp, Mut, Onc or Sup was never observed in the random reference population of size Hence, the p-value is at most 1e The dataset contains three members of the RAS family, which exhibit high sequence similarity.
This is a result of the automatic gene selection. To check for a possible bias introduced by this gene family, we recalculated the average values with only one RAS gene and found that the conclusions are unchanged and are still supported by the significance values. Given the distinct average behavior of the two cancer gene classes, we investigated to what extent this behavior is reflected at the individual gene level and to what extent it can be used for predictive purposes. To examine the discriminatory power of the structural features, the features were plotted in pairwise combinations (Figure 3).
Each data point corresponds to one individual gene, with oncogenes and tumor suppressors shown as blue dots and red diamonds, respectively. The values on the axes are the odds ratios for the feature values. We calculated linear classifiers trained on the two sets using Fisher's discriminant method [27].
Figure 3. Linear classification of cancer genes. The different pairs of structural features are shown as scatter plots in A-F. Oncogenes are depicted as blue dots, tumor suppressors as red diamonds.
The separating linear functions have been calculated using Fisher's linear discriminant method. The classifiers in A, D and E show the best training performance.
Visually, the two classes are well separated for the feature combinations shown in Figures 3A, 3D and 3E.