SS vs Sequence Similarity (BLAST result)
Correlation coefficients between BLAST bit scores and semantic similarity.
| Aspect | Resnik | Lin | Jiang |
|---|---|---|---|
| Molecular Function | 0.577 | 0.541 | -0.483 |
| Biological Process | 0.280 | 0.303 | -0.312 |
| Cellular Component | 0.368 | 0.452 | -0.414 |
Correlation coefficients between semantic similarity scores computed over different aspects of GO.
| Aspect | Resnik | Lin | Jiang |
|---|---|---|---|
| Molecular Function - Cellular Component | 0.290 | 0.318 | 0.087 |
| Molecular Function - Biological Process | 0.219 | 0.244 | 0.269 |
| Biological Process - Cellular Component | 0.202 | 0.175 | 0.166 |
The Resnik measure shows the highest correlation with sequence similarity for molecular function (0.577) and the lowest for the other two aspects (0.280 for biological process, 0.368 for cellular component), so it may be the most discriminatory.
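For reference, the three measures compared above can be sketched in terms of information content (IC). This is a minimal illustration that assumes the standard textbook definitions of Resnik, Lin, and Jiang-Conrath; the term names and probabilities are made-up toy values rather than GO data, and the most informative common ancestor is assumed to be supplied by the caller. Under these definitions the Jiang measure is a distance rather than a similarity, which would be consistent with the negative sign of its correlations with BLAST bit scores.

```python
import math

# Hypothetical annotation probabilities p(term); toy values, not GO data.
TERM_PROB = {"root": 1.0, "binding": 0.4, "kinase": 0.05, "phosphatase": 0.02}


def ic(term):
    """Information content: IC(t) = -log p(t)."""
    return -math.log(TERM_PROB[term])


def resnik(ancestor):
    """Resnik similarity: IC of the most informative common ancestor."""
    return ic(ancestor)


def lin(t1, t2, ancestor):
    """Lin similarity: shared IC normalized by the IC of both terms.

    Undefined (division by zero) when both terms have IC = 0.
    """
    return 2 * ic(ancestor) / (ic(t1) + ic(t2))


def jiang(t1, t2, ancestor):
    """Jiang-Conrath distance: 0 for identical terms, larger = less similar."""
    return ic(t1) + ic(t2) - 2 * ic(ancestor)


if __name__ == "__main__":
    # "binding" is assumed to be the common ancestor of the two toy terms.
    print(resnik("binding"))
    print(lin("kinase", "phosphatase", "binding"))
    print(jiang("kinase", "phosphatase", "binding"))
```

Resnik depends only on the shared ancestor, whereas Lin and Jiang also take into account how specific each of the two terms is.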
SS vs Gene Expression
Correlation coefficients between gene expression correlation and semantic similarity.
| Dataset | Aspect | Resnik | Jiang | Lin |
|---|---|---|---|---|
| Marsha | MF | 0.04 | -0.05 | 0.04 |
| Marsha | CC | 0.05 | -0.06 | 0.05 |
| Marsha | BP | 0.06 | -0.03 | 0.05 |
| RAD | MF | 0.12 | 0.00 | 0.10 |
| RAD | CC | 0.14 | -0.06 | 0.10 |
| RAD | BP | 0.14 | -0.05 | 0.12 |
Correlation Coefficients between Gene Expression Correlation and Semantic Similarity When Average Correlations Are Computed over 100 Semantic Similarity Intervals.
| Dataset | Aspect | Resnik | Jiang | Lin |
|---|---|---|---|---|
| Marsha | MF | 0.63 | -0.59 | 0.24 |
| Marsha | CC | 0.72 | -0.32 | 0.12 |
| Marsha | BP | 0.77 | -0.22 | 0.39 |
| RAD | MF | 0.47 | 0.16 | 0.28 |
| RAD | CC | 0.51 | -0.23 | 0.34 |
| RAD | BP | 0.59 | -0.14 | 0.41 |
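The interval averaging behind this second table can be sketched as follows. This is only an illustration of the general idea: it assumes equal-width intervals over the observed similarity range and a Pearson coefficient, neither of which is stated in these notes, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def binned_correlation(sim, expr, n_bins=100):
    """Average expr within n_bins similarity intervals, then correlate.

    sim  : semantic similarity of each gene pair
    expr : expression correlation of the same gene pairs
    Assumes equal-width intervals; empty intervals are skipped.
    """
    lo, hi = min(sim), max(sim)
    if hi == lo:
        raise ValueError("similarity values are all identical")
    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for s, e in zip(sim, expr):
        idx = min(int((s - lo) / width), n_bins - 1)  # clamp the maximum
        sums[idx] += e
        counts[idx] += 1
    centers, means = [], []
    for i in range(n_bins):
        if counts[i]:
            centers.append(lo + (i + 0.5) * width)
            means.append(sums[i] / counts[i])
    return pearson(centers, means)
```

Averaging within intervals suppresses pair-level noise, which may explain why the coefficients in this table are much larger than the pairwise coefficients in the previous one.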
SS vs (Random permutation of GO annotation, Gene Expression)
Correlation between Gene Expression Correlation and Semantic Similarity for Resnik Distance in Two Randomized Experiments.
| Dataset | Aspect | Resnik | GO Random | Exp Random |
|---|---|---|---|---|
| Marsha | MF | 0.63 | -0.13 | 0.10 |
| Marsha | CC | 0.72 | 0.09 | 0.05 |
| Marsha | BP | 0.77 | -0.08 | 0.20 |
| RAD | MF | 0.47 | 0.16 | -0.03 |
| RAD | CC | 0.53 | -0.23 | -0.15 |
| RAD | BP | 0.61 | -0.14 | -0.16 |
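One way to build such a randomized control is to reassign the existing annotation sets (or expression profiles) among genes at random, so that the overall distribution is preserved while the gene-to-data link is broken. The exact randomization procedure is not described in these notes, so the sketch below is an assumption; the helper name and the gene and term identifiers are hypothetical.

```python
import random


def permute_assignments(gene_to_data, seed=0):
    """Return a new mapping with the same genes and the same data items,
    but with the items randomly reassigned among the genes.

    gene_to_data : dict mapping gene id -> annotation list (or expression profile)
    seed         : fixed seed so the permutation is reproducible
    """
    rng = random.Random(seed)
    genes = list(gene_to_data)
    items = [gene_to_data[g] for g in genes]
    rng.shuffle(items)
    return dict(zip(genes, items))


if __name__ == "__main__":
    # Hypothetical toy annotations, for illustration only.
    annotations = {
        "geneA": ["GO:0000001"],
        "geneB": ["GO:0000002", "GO:0000003"],
        "geneC": ["GO:0000004"],
    }
    print(permute_assignments(annotations, seed=1))
```

Recomputing the correlation on the permuted mapping then gives a baseline against which the Resnik column can be compared.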
These results suggest that there is an underlying relationship between gene expression and GO annotation. They also validate the use of Resnik semantic similarity as a measure that is well correlated with gene expression and that can be used to augment the biological knowledge obtained from other sources. For instance, just as existing tools characterize genes according to their expression profiles or similar criteria, tools could be developed that take advantage of semantic similarity to enhance existing information. Semantic similarity could also be used to improve current clustering algorithms and to develop a "semantic search" tool.
ROC curve analysis
The set of interacting protein pairs comprises pairwise interactions among proteins of the same complex and interactions between neighboring proteins within KEGG human regulatory pathways. After discarding proteins with indirect interaction effects, the interaction types of neighboring proteins include activation, inhibition, binding/association, dissociation, state change, phosphorylation, dephosphorylation, glycosylation, ubiquitination and methylation.
We randomly choose two distinct human proteins from the Entrez Gene database as a non-interacting protein pair. This is valid because the chance of identifying a protein–protein interaction at random is very small (0.024% based on the two-hybrid data of Uetz et al., 2000).
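Given semantic similarity scores for the interacting pairs and for the randomly chosen pairs, the ROC curve and its area can be sketched as below. This is a minimal, quadratic-time illustration with hypothetical function names; it assumes that a higher score means a pair is more likely to interact.

```python
def roc_curve(pos_scores, neg_scores):
    """Return (FPR, TPR) points obtained by sweeping a score threshold
    from the highest observed score down to the lowest."""
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


if __name__ == "__main__":
    # Toy scores, for illustration only.
    interacting = [0.9, 0.8, 0.4]
    random_pairs = [0.5, 0.3, 0.1]
    print(auc(roc_curve(interacting, random_pairs)))
```

An area close to 1 means the score ranks interacting pairs above random pairs, while an area near 0.5 means it does no better than chance.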
SS vs Sequence similarity
In summary, these results confirm that functionally related proteins tend to have higher sequence similarity. This is more evident for the MFscore. Nevertheless, a considerable percentage of protein pairs that are orthologous and have high sequence similarity show no functional similarity. The comparison with Lord's approach to combining semantic similarity scores shows significantly different results. In particular, the proposed approach is expected to discriminate better between nonhomologous and orthologous proteins.
Finding functionally related proteins
MDS for yeast-yeast comparison
<latex>NS = {{\sum_{ij}{(d_{ij}^{\prime} - d_{ij})}^2}\over{\sum_{ij}{d_{ij}}^2}}</latex>
<latex>d_{ij}^{\prime}</latex> is the distance between proteins i and j in the low-dimensional space.
<latex>d_{ij}</latex> is the respective distance in the original space.
<latex>CR_k = {{(NS_k - NS_{k-1})}\over{(NS_{k+1} - NS_k )}}</latex>
<latex>k</latex> is the number of dimensions.
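The two quantities above can be computed directly from distance matrices. The sketch below assumes symmetric matrices given as nested lists and sums over pairs with i < j, which gives the same ratio as summing over all ordered pairs when the diagonal is zero. The function names and the example stress values are hypothetical.

```python
def normalized_stress(d_orig, d_low):
    """NS = sum_ij (d'_ij - d_ij)^2 / sum_ij d_ij^2.

    d_orig : distances in the original space (symmetric square matrix)
    d_low  : distances in the low-dimensional embedding (same shape)
    """
    num = 0.0
    den = 0.0
    n = len(d_orig)
    for i in range(n):
        for j in range(i + 1, n):
            num += (d_low[i][j] - d_orig[i][j]) ** 2
            den += d_orig[i][j] ** 2
    return num / den


def change_ratio(ns, k):
    """CR_k = (NS_k - NS_{k-1}) / (NS_{k+1} - NS_k).

    ns : dict mapping number of dimensions -> normalized stress
    """
    return (ns[k] - ns[k - 1]) / (ns[k + 1] - ns[k])


if __name__ == "__main__":
    # Made-up stress values for embeddings in 1, 2, and 3 dimensions.
    stress_by_dim = {1: 0.5, 2: 0.2, 3: 0.1}
    print(change_ratio(stress_by_dim, 2))
```

A large CR_k indicates that moving to k dimensions reduced the stress much more than adding one further dimension would, which is one way to read off a suitable embedding dimension.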
Classification of HDG using GO categories
SS vs pathway
| # of transversal networks | Subgroup count | Description |
|---|---|---|
| 6 | 4 | KEGG annotations are identical or correspond to sibling KEGG pathways |
| | 1 | KEGG annotations correspond to closely related two-level terms |
| | 1 | Annotations are different but reflect the composition of the networks into subnetworks |
| 4 | | Heterogeneous. However, from a biological point of view, the KEGG annotations are complementary |