Title page for ETD etd-12302009-142944

Type of Document Dissertation
Author Jin, Ying
URN etd-12302009-142944
Title New Algorithms for Mining Network Datasets: Applications to Phenotype and Pathway Modeling
Degree PhD
Department Computer Science
Advisory Committee
Advisor Name                Title
Ramakrishnan, Naren         Committee Chair
Fox, Edward Alan            Committee Member
Heath, Lenwood S.           Committee Member
Helm, Richard Frederick     Committee Member
Murali, T. M.               Committee Member
Keywords
  • partial orders
  • biclusters
  • graph separators
  • relative importance methods
  • Biological networks
Date of Defense 2009-12-08
Availability restricted
Abstract
Biological network data is plentiful, with practically every experimental methodology giving ‘network views’ into cellular function and behavior. Bioinformatic screens that yield network data include, for example, genome-wide deletion screens, protein-protein interaction assays, RNA interference experiments, and methods to probe metabolic pathways. Efficient and comprehensive computational approaches are required to model these screens and gain insight into the nature of biological networks. This thesis presents three new algorithms to model and mine network datasets. First, we present an algorithm that models genome-wide perturbation screens by deriving relations between phenotypes and subsequently using these relations in a local manner to derive gene-phenotype relationships. We show that this algorithm outperforms all previously described algorithms for gene-phenotype modeling. We also present theoretical insight into the convergence and accuracy properties of this approach. Second, we define a new data mining problem, constrained minimal separator mining, and propose algorithms as well as applications to modeling gene perturbation screens by viewing the perturbed genes as a graph separator. Both of these data mining applications are evaluated on network datasets from S. cerevisiae and C. elegans. Finally, we present an approach to model the relationship between metabolic pathways and operon structure in prokaryotic genomes. In this approach, we introduce a new pattern class, biclusters over domains with supplied partial orders, and present algorithms for systematically detecting such biclusters. Together, our data mining algorithms provide a comprehensive arsenal of techniques for modeling gene perturbation screens and metabolic pathways.

Filename                   Size      Approximate Download Time (Hours:Minutes:Seconds)
                                     28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
[VT] Jin_Ying_D_2009.pdf   1.59 Mb   00:07:21     00:03:47    00:03:18       00:01:39        00:00:08
[VT] indicates that a file or directory is accessible from the Virginia Tech campus network only.
