Old projects from SNUBI

Microarray Information System Construction

Xperanto is designed to collect all content defined by the MIAME standard and to meet the specific requirements of individual laboratories. The database comprises six components: experiment, hybridization, biomaterial, array, protocol, and gene expression data (both experimentally measured and computationally processed). An experiment here means a set of one or more hybridizations that are related in some way, often corresponding to a particular publication. Values for data items come from a look-up table that manages controlled terms from the MGED Ontology.
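
As a rough illustration of how the six components and the controlled-term look-up might fit together, here is a minimal Python sketch. It is not Xperanto's actual schema: the class names, field names, the `check_term` helper, and the sample terms in `CONTROLLED_TERMS` are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field

# Hypothetical look-up table of controlled terms. In the system described
# above the terms come from the MGED Ontology; these entries are made up.
CONTROLLED_TERMS = {"organism_part": {"liver", "kidney"}}


def check_term(category, value):
    """Return value if the look-up table allows it for category, else raise."""
    if value not in CONTROLLED_TERMS.get(category, set()):
        raise ValueError(f"{value!r} is not a controlled term for {category!r}")
    return value


@dataclass
class Hybridization:
    biomaterial: str   # biomaterial component (here just a controlled term)
    array: str         # array component (hypothetical identifier)
    protocol: str      # protocol component (hypothetical identifier)
    measured: list = field(default_factory=list)   # experimentally measured values
    processed: list = field(default_factory=list)  # computationally processed values


@dataclass
class Experiment:
    name: str
    # An experiment is a set of one or more related hybridizations.
    hybridizations: list = field(default_factory=list)


exp = Experiment(name="example experiment")
exp.hybridizations.append(
    Hybridization(
        biomaterial=check_term("organism_part", "liver"),
        array="array-1",
        protocol="protocol-1",
        measured=[1.2, 0.8],
    )
)
```

A value that is missing from the look-up table, such as `check_term("organism_part", "moon")`, raises `ValueError` in this sketch, which is one simple way to keep free text out of controlled fields.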

Pipelining Microarray Data Analysis

The goal of this project is to create a high-throughput platform for microarray data analysis. This work includes:
    • Develop algorithms for removing systematic effects
    • Develop algorithms for supervised analysis
    • Develop algorithms for unsupervised analysis
    • Develop algorithms for recognizing patterns
    • Develop algorithms for visualization
    • Create a software package for the above purposes

MITree

MITree is a clustering algorithm based on a geometric principle. It was initially designed as a binary hierarchical clustering algorithm for gene expression analysis. It is described in Kim JH, Ohno-Machado L, Kohane IS. Unsupervised learning from complex data: the matrix incision tree algorithm. Pac Symp Biocomput 2001;:30-41. We also applied an Evolution Strategy (an evolutionary algorithm in the same family as genetic algorithms) to the same problem. An implementation of MITree is available here, along with a typical input data format example (Fisher's iris data set): a tab-delimited ASCII text file in which the data start at the 4th column and 2nd row. Related publication: Kim JH, Ohno-Machado L, Kohane IS. Visualization and evaluation of clustering structures for gene expression data analysis. J Biomed Inform 2002;35(1):25-36.
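
The input layout described above (tab-delimited text, data starting at the 4th column and 2nd row) can be read with a few lines of Python. This is only a sketch of a reader for that layout, not part of the MITree distribution. The function name `read_mitree_style_table` is hypothetical, and the meaning of the first row and the first three columns in the sample (treated here as a header and as row annotations) is an assumption, since the original does not say what they hold.

```python
import csv
import io


def read_mitree_style_table(handle, first_data_row=2, first_data_col=4):
    """Split a tab-delimited table into header rows, row annotations, and data.

    Rows and columns are counted from 1, matching the description above.
    """
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    header = rows[: first_data_row - 1]
    annotations, data = [], []
    for row in rows[first_data_row - 1:]:
        # Columns before the first data column are kept as text.
        annotations.append(row[: first_data_col - 1])
        # Everything from the first data column onward is numeric.
        data.append([float(cell) for cell in row[first_data_col - 1:]])
    return header, annotations, data


# Made-up sample in the assumed layout (two iris measurements per row).
sample = (
    "id\tname\tclass\tsepal_length\tsepal_width\n"
    "1\tflower_1\tsetosa\t5.1\t3.5\n"
    "2\tflower_2\tversicolor\t7.0\t3.2\n"
)
header, annotations, data = read_mitree_style_table(io.StringIO(sample))
```

Passing an open file object instead of `io.StringIO(sample)` works the same way, because `csv.reader` accepts any iterable of lines.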

BioEMR

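Comprehensive modeling of clinical and bio information based on international standards, and development of BioEMR integration technology

Development of XML middleware and a BioEMR integration engine for integrating clinical and bio information

Construction and operation of a BioEMR pilot system for real-time clinical trials in patients with major cancers

Development of an integrated bio-information model based on international standards, and of a multi-institution bio-clinical information system

Development of an XML-based high-availability query processor and a knowledge-based data processing engine for BioEMR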

Bioinformatics for Functional Annotation of Genome

Using an automatic annotation pipeline:

High-throughput discovery of novel genes, non-coding RNAs, regulatory regions, repetitive sequences, and sequence variations

Annotation at the gene function and protein levels

Functional prediction and automatic annotation by data mining

Database construction and integration of functional genomics and proteomics information

Transcriptional Control and Microarrays

1) The microarray data used for this scheme are mostly time series. However, the k-means algorithm considers only the similarity of expression patterns and has no notion of time. We want to develop a clustering method that better reflects the temporal patterns in the expression data, so that the transcriptional control network can be reconstructed more reliably.
2) Extend this scheme to microarray data from species other than yeast. This will include developing a Gibbs sampler alignment program optimized for those species.
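
The remark in point 1 about k-means and time can be checked with a small Python sketch. Standard k-means compares expression profiles by Euclidean distance, and that distance does not change when the same reordering of time points is applied to both profiles, so the temporal order carries no weight. The two profiles and the reordering below are made-up values used only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reorder(profile, order):
    """Return the profile with its time points rearranged according to order."""
    return [profile[i] for i in order]


# Hypothetical expression values at four consecutive time points.
gene_a = [0.0, 1.0, 2.0, 3.0]
gene_b = [0.5, 1.5, 1.0, 4.0]

# The same scrambling of time points, applied to both genes.
shuffled_order = [2, 0, 3, 1]

d_original = euclidean(gene_a, gene_b)
d_shuffled = euclidean(
    reorder(gene_a, shuffled_order), reorder(gene_b, shuffled_order)
)

# The two distances agree, so k-means sees no difference between the
# real time course and the scrambled one.
print(math.isclose(d_original, d_shuffled))
```

Any clustering method meant to reflect temporal patterns would need a similarity measure that does change under such a reordering.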

Clinical Document Management System (CDMS)

The goal of our research is to enable communication between Hospital Information Systems (HIS) using XML-based clinical document representation standards.