Title page for ETD etd-11012004-003013

Type of Document Dissertation
Author Das Neves, Fernando Adrian
Author's Email Address fdasneve@vt.edu
URN etd-11012004-003013
Title Stepping Stones and Pathways: Improving Retrieval by Chains of Relationships between Documents
Degree PhD
Department Computer Science
Advisory Committee
  Advisor Name            Title
  Fox, Edward Alan        Committee Chair
  Kafura, Dennis G.       Committee Member
  Kriz, Ronald D.         Committee Member
  North, Christopher L.   Committee Member
  Ramakrishnan, Naren     Committee Member
Keywords
  • Information retrieval
  • Literature-based discovery
  • Combination of sources of evidence
  • Indexing of scientific literature
Date of Defense 2004-09-16
Availability unrestricted
Abstract
The information retrieval (IR) field has been successful in developing techniques to address many types of information needs. However, there are cases in which traditional approaches to IR are not able to produce adequate results. Examples include cases in which a small set of documents (2-3) is needed as an answer rather than a single document, or in which "query splitting" is required to explore the document space satisfactorily. We explore an alternative model of building and presenting retrieval results for such cases. In particular, we research effective methods for handling information needs that may:

1. Include multiple topics: A typical query is interpreted by current IR systems as a request to retrieve documents that each discuss all topics included in that query. We propose an alternative interpretation based on query splitting. It allows queries to be interpreted as requests to retrieve sets of documents rather than individual documents, with meaningful relationships among the members of each such set.

2. Be interpreted as parts in a chain of relationships: Suppose a query concerns topics t1 and tm. Is there a relation between topics t1 and tm that involves t2 and possibly other topics, as in {t1, t2, …, tm}?

Thus, we propose an alternative interpretation of user queries and presentation of the results. Our interpretation has the potential to improve retrieval results whenever there is a mismatch between the user's understanding of the collection and the actual collection content. We define and refine a retrieval scheme that enhances retrieval through a framework that combines multiple sources of evidence.

Query results in our interpretation are networks of document groups representing topics, each group relating to and connecting to other groups in the network that partially answer the user's information need. We devise new and more effective representations and techniques to visualize results, and incorporate the user as part of the retrieval process.
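
The chain {t1, t2, …, tm} described above can be illustrated with a small sketch. This is not the dissertation's retrieval scheme: it assumes, purely for illustration, that two topics are related whenever at least one document mentions both, and it uses a plain breadth-first search to find the shortest chain. The document IDs, topic labels, and the function names `build_topic_graph` and `find_pathway` are hypothetical.

```python
from collections import deque

def build_topic_graph(docs):
    """Link two topics whenever at least one document mentions both.

    Assumption for illustration only: co-occurrence in a document is the
    sole source of evidence for a relationship between topics.
    """
    graph = {}
    for doc_id, topics in docs.items():
        for a in topics:
            for b in topics:
                if a != b:
                    graph.setdefault(a, {}).setdefault(b, set()).add(doc_id)
    return graph

def find_pathway(graph, start, goal):
    """Breadth-first search for a shortest chain of topics from start to goal."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        topic = queue.popleft()
        for neighbor in sorted(graph.get(topic, {})):
            if neighbor in parents:
                continue
            parents[neighbor] = topic
            if neighbor == goal:
                # Walk back through the parents to rebuild the chain.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None  # no chain connects the two topics

# Hypothetical toy collection: document ID -> topics it discusses.
docs = {
    "d1": {"scheduling", "real-time"},
    "d2": {"real-time", "kernel preemption"},
    "d3": {"kernel preemption", "interrupt latency"},
    "d4": {"file systems", "caching"},
}

graph = build_topic_graph(docs)
pathway = find_pathway(graph, "scheduling", "interrupt latency")
# Documents supporting each link in the chain.
supporting = [sorted(graph[a][b]) for a, b in zip(pathway, pathway[1:])]
print(pathway)
print(supporting)
```

In this toy collection no single document mentions both query topics, so a conventional per-document match would return nothing, whereas the chain connects them through intermediate topics and returns a small group of documents for each link.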

We also evaluate the improvement of the query results based on multiple measures. In particular, we verify the validity of our approach through a study involving a collection of Operating Systems research papers that was specially built for this dissertation.

Files (approximate download time in Hours:Minutes:Seconds)

  Filename          Size     28.8 Modem  56K Modem  ISDN (64 Kb)  ISDN (128 Kb)  Higher-speed Access
  dissertation.PDF  1.45 Mb  00:06:43    00:03:27   00:03:01      00:01:30       00:00:07
