Title page for ETD etd-09092004-152600

Type of Document Master's Thesis
Author Shukla, Maulik
Author's Email Address mshukla@vt.edu
URN etd-09092004-152600
Title GeneSieve: A Probe Selection Strategy for cDNA Microarrays
Degree Master of Science
Department Computer Science
Advisory Committee
  Advisor Name          Title
  Heath, Lenwood S.     Committee Chair
  Grene, Ruth           Committee Member
  Murali, T. M.         Committee Member
  Ramakrishnan, Naren   Committee Member
Keywords
  • EST annotation
  • cDNA microarrays
  • probe selection
Date of Defense 2004-09-08
Availability unrestricted
Abstract
The DNA microarray is a powerful tool to study expression levels of thousands of genes simultaneously. Often, cDNA libraries representing expressed genes of an organism are available, along with expressed sequence tags (ESTs). ESTs are widely used as the probes for microarrays. Designing custom microarrays, rich in genes relevant to the experimental objectives, requires selection of probes based on their sequence. We have designed a probe selection method, called GeneSieve, to select EST probes for custom microarrays. To assign annotations to the ESTs, we cluster them into contigs using PHRAP. The larger contig sequences are then used for similarity search against known proteins in model organisms such as Arabidopsis thaliana. We have designed three different methods to assign annotations to the contigs: bidirectional hits (BH), bidirectional best hits (BBH), and unidirectional best hits (UBH). We apply these methods to pine and potato EST sets. Results show that the UBH method assigns unambiguous annotations to a large fraction of contigs in an organism. Hence, we use UBH to assign annotations to ESTs in GeneSieve. To select a single EST from a contig, GeneSieve assigns a quality score to each EST based on its protein homology (PH), cross hybridization (CH), and relative length (RL). We use this quality score to rank ESTs according to seven different measures: length, 3' proximity, 5' proximity, protein homology, cross hybridization, relative length, and overall quality score. Results for pine and potato EST sets indicate that EST probes selected by quality score are relatively long and give better values for protein homology and cross hybridization. Results of the GeneSieve protocol are stored in a database and linked with sequence databases and known functional category schemes such as MIPS and GO. The database is made available via a web interface. A biologist is able to select a large number of EST probes based on annotations or functional categories quickly and easily.
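
The abstract names the three annotation methods but does not define them, so the following is only a minimal sketch under assumed definitions: UBH takes each contig's top-scoring protein, BH keeps contig/protein pairs that hit each other in both search directions, and BBH keeps pairs that are each other's top hit. The function names, identifiers, and scores are hypothetical toy values, not taken from the thesis, and the scores stand in for whatever similarity-search score is used.

```python
from collections import defaultdict


def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, score); the first hit wins on ties.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def unidirectional_best_hits(contig_vs_protein):
    """Assumed UBH: each contig is annotated with its single best protein."""
    return best_hits(contig_vs_protein)


def bidirectional_hits(contig_vs_protein, protein_vs_contig):
    """Assumed BH: keep (contig, protein) pairs found in both directions."""
    forward = {(c, p) for c, p, _ in contig_vs_protein}
    reverse = {(c, p) for p, c, _ in protein_vs_contig}
    pairs = defaultdict(set)
    for contig, protein in forward & reverse:
        pairs[contig].add(protein)
    return dict(pairs)


def bidirectional_best_hits(contig_vs_protein, protein_vs_contig):
    """Assumed BBH: contig and protein must be each other's best hit."""
    forward = best_hits(contig_vs_protein)
    reverse = best_hits(protein_vs_contig)
    return {c: p for c, p in forward.items() if reverse.get(p) == c}


# Hypothetical toy hits: (query, subject, similarity score).
contig_vs_protein = [
    ("contig1", "protA", 250.0),
    ("contig1", "protB", 90.0),
    ("contig2", "protA", 120.0),
    ("contig3", "protC", 80.0),
]
protein_vs_contig = [
    ("protA", "contig1", 240.0),
    ("protA", "contig2", 110.0),
    ("protB", "contig1", 85.0),
]

ubh = unidirectional_best_hits(contig_vs_protein)
bh = bidirectional_hits(contig_vs_protein, protein_vs_contig)
bbh = bidirectional_best_hits(contig_vs_protein, protein_vs_contig)
```

On this toy input, UBH gives every contig exactly one annotation, BH leaves contig1 with two candidate proteins, and BBH annotates only contig1. That pattern is at least consistent with the abstract's remark that UBH assigns unambiguous annotations to a large fraction of contigs, though the thesis's actual definitions and thresholds may differ.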

Files (Approximate Download Time in Hours:Minutes:Seconds)

  Filename        Size      28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  GeneSieve.pdf   1.32 Mb   00:06:06     00:03:08    00:02:45       00:01:22        00:00:07
