Cancer Genomics Object Model: An object model for multiple functional genomics data for cancer research

Yu Rang Park1, Hye Won Lee1, Sung Bum Cho1, Ju Han Kim1,2*

1 Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine,
2 Human Genome Research Institute, Seoul National University College of Medicine, Seoul 110-799, Korea

Abstract

The development of functional genomics, including transcriptomics, proteomics and metabolomics, allows us to monitor a large number of key cellular pathways simultaneously. Several technology-specific data models have been introduced for the representation of functional genomics experimental data, including the MicroArray Gene Expression-Object Model (MAGE-OM), the Proteomics Experiment Data Repository (PEDRo), and the Tissue MicroArray-Object Model (TMA-OM). Despite the increasing number of cancer studies that use multiple functional genomics technologies, there is still no integrated data model that covers both multiple functional genomics experimental data and clinical data. We propose an object-oriented data model for cancer genomics research, the Cancer Genomics Object Model (CaGe-OM). We reference four data models: the Functional Genomics Experiment-Object Model (FGE-OM), MAGE-OM, TMA-OM and PEDRo. The clinical and histopathological information models were created by analyzing the cancer management workflow and by referencing the College of American Pathologists Cancer Protocols and the National Cancer Institute Common Data Elements. The CaGe-OM provides a comprehensive data model for the integrated storage and analysis of clinical and multiple functional genomics data.
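As an illustration of what integrating clinical and multi-technology experimental data in one object model can look like, the following is a minimal sketch in Python. It is not the CaGe-OM itself: the class names (Patient, Specimen, Experiment, Technology), their fields, and the helper method are all hypothetical and chosen only to show clinical, histopathological and experimental records linked in a single object graph.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Technology(Enum):
    """Hypothetical labels for the experimental technologies being integrated."""
    MICROARRAY = "transcriptomics (microarray)"
    PROTEOMICS = "proteomics"
    TISSUE_MICROARRAY = "tissue microarray"


@dataclass
class Experiment:
    """One functional genomics experiment (illustrative fields only)."""
    experiment_id: str
    technology: Technology


@dataclass
class Specimen:
    """A specimen with a histopathological note and the experiments run on it."""
    specimen_id: str
    histology: str
    experiments: List[Experiment] = field(default_factory=list)


@dataclass
class Patient:
    """Clinical record that owns specimens, tying clinical data to experiments."""
    patient_id: str
    diagnosis: str
    specimens: List[Specimen] = field(default_factory=list)

    def experiments_by_technology(self, technology: Technology) -> List[Experiment]:
        # Walk patient -> specimen -> experiment and keep the matching technology.
        return [
            experiment
            for specimen in self.specimens
            for experiment in specimen.experiments
            if experiment.technology is technology
        ]


# Usage: one patient, one specimen, two experiments from different technologies.
patient = Patient("P001", "example diagnosis")
specimen = Specimen("S001", "example histology")
specimen.experiments.append(Experiment("E001", Technology.MICROARRAY))
specimen.experiments.append(Experiment("E002", Technology.PROTEOMICS))
patient.specimens.append(specimen)

microarray_runs = patient.experiments_by_technology(Technology.MICROARRAY)
```

Because every experiment is reachable from the patient record, a query such as "all microarray experiments for this patient" needs no cross-referencing between separate technology-specific stores, which is the kind of integration the abstract argues for.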

UML Diagrams

 


Table

Table 1 - Previous approaches to integrated data models.

| Model (reference) | Method | Target data | Reference model | Implementation |
|---|---|---|---|---|
| FGE-OM (Jones A, et al., 2004) | Integrated object model | Transcriptomics and proteomics (2DE and MS) | MAGE-OM, PEDRo, Gla-PSI | RAPAD (microarray, 2DE and MS data) |
| SysBio-OM (Xirasagar S, et al., 2004) | Integrated object model | Transcriptomics, proteomics and metabolomics | MAGE-OM, PEDRo, and a model for protein-protein interaction and metabolite | CEBS (only for microarray data) |
| Genotype Shared Model (HL7 clinical genomics SIG) | Document (XML) | Transcriptomics, proteomics, sequence and clinical data | HL7 CDA | Genetic testing: BRCA; Tissue typing: BMT |
| IBM GMS (Robson B, et al., 2004) | Document (XML) | Clinical and genomics (protein structure) data | HL7 CDA | Genomic Messaging System Language (GMSL) |
| caCORE (Covitz PA, et al., 2003) | Object-oriented API (caBIO) | Clinical and genomics data | Object Model | caBIG, CGAP, MMHCC, caArray, etc. |
| XDesc (Shifman MA, et al., 2004) | EAV and relational model | Clinical and genomics (transcriptomics) data | TrialDB | YMD |


Figure

Figure 1 - Workflow diagram of clinical management of cancer. Diamonds indicate events and rectangles are physical entities.

 

Figure 2 - Relationships among the 26 packages in the Cancer Genomics Object Model (CaGe-OM). Most packages in this model are categorized into three namespaces