Institutional Repository

Automatic text summarization using lexical chains : algorithms and experiments

dc.contributor.advisor Chali, Yllias
dc.contributor.author Kolla, Maheedhar
dc.contributor.author University of Lethbridge. Faculty of Arts and Science
dc.date.accessioned 2007-05-13T20:28:36Z
dc.date.available 2007-05-13T20:28:36Z
dc.date.issued 2004
dc.identifier.uri http://hdl.handle.net/10133/226
dc.description viii, 80 leaves : ill. ; 29 cm. en
dc.description.abstract Summarization is a complex task that requires understanding of the document content to determine the importance of the text. Lexical cohesion is a method to identify connected portions of the text based on the relations between the words in the text. Lexical cohesive relations can be represented using lexical chains. Lexical chains are sequences of semantically related words spread over the entire text. Lexical chains are used in a variety of Natural Language Processing (NLP) and Information Retrieval (IR) applications. In this thesis, we propose a lexical chaining method that includes the glossary relations in the chaining process. These relations enable us to identify topically related concepts, for instance dormitory and student, and thereby enhance the identification of cohesive ties in the text. We then present methods that use the lexical chains to generate summaries by extracting sentences from the document(s). Headlines are generated by filtering out the portions of the extracted sentences that do not contribute to the meaning of the sentence. The generated headlines can be used in real-world applications to skim through the document collections in a digital library. Multi-document summarization is in growing demand with the explosive growth of online news sources. It requires identification of the several themes present in the collection to attain good compression and avoid redundancy. In this thesis, we propose methods to group the portions of the texts of a document collection into meaningful clusters. Clustering enables us to extract the various themes of the document collection. Sentences from clusters can then be extracted to generate a summary for the multi-document collection. Clusters can also be used to generate summaries with respect to a given query. We designed a system to compute lexical chains for the given text and use them to extract the salient portions of the document.
Some specific tasks considered are: headline generation, multi-document summarization, and query-based summarization. Our experimental evaluation shows that efficient summaries can be extracted for the above tasks. en
dc.language.iso en_US en
dc.publisher Lethbridge, Alta. : University of Lethbridge, Faculty of Arts and Science, 2004 en
dc.relation.ispartofseries Thesis (University of Lethbridge. Faculty of Arts and Science) en
dc.subject Dissertations, Academic en
dc.subject Automatic abstracting en
dc.subject Cluster analysis -- Computer programs en
dc.subject Computational linguistics en
dc.title Automatic text summarization using lexical chains : algorithms and experiments en
dc.type Thesis en
dc.publisher.faculty Arts and Science
dc.publisher.department Department of Mathematics and Computer Science
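
The abstract above describes building lexical chains from related words and then extracting the sentences those chains cover. The following is a minimal illustrative sketch of that general idea, not the thesis's actual system. The relation table RELATED, the vocabulary filter, the greedy chaining rule, and the scoring in summarize are all assumptions made for this example; a real implementation would draw synonym, hypernym, and gloss relations from a lexical resource such as WordNet and would select candidate nouns with a part-of-speech tagger.

```python
from collections import defaultdict

# Toy relation table (hypothetical). It stands in for a lexical resource;
# the dormitory/student pair mimics the gloss-style relation from the abstract.
RELATED = {
    frozenset({"dormitory", "student"}),
    frozenset({"student", "university"}),
    frozenset({"car", "automobile"}),
}

# Candidate words are limited to those in the toy table (an assumption
# replacing noun selection by a part-of-speech tagger).
VOCAB = {word for pair in RELATED for word in pair}


def related(a, b):
    """Return True if two words are identical or linked in the toy table."""
    return a == b or frozenset({a, b}) in RELATED


def build_chains(sentences):
    """Greedily add each candidate word to the first chain holding a related word.

    Each chain is a list of (sentence_index, word) pairs.
    """
    chains = []
    for idx, sentence in enumerate(sentences):
        for token in sentence.lower().split():
            word = token.strip(".,;:!?")
            if word not in VOCAB:
                continue
            for chain in chains:
                if any(related(word, member) for _, member in chain):
                    chain.append((idx, word))
                    break
            else:
                # No related chain found, so start a new one.
                chains.append([(idx, word)])
    return chains


def summarize(sentences, n=1):
    """Return the n sentences best covered by chains of length >= 2, in text order."""
    strong = [c for c in build_chains(sentences) if len(c) >= 2]
    scores = defaultdict(int)
    for chain in strong:
        for idx, _ in chain:
            # Each chain member adds the chain's length to its sentence's score.
            scores[idx] += len(chain)
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return [sentences[i] for i in sorted(ranked[:n])]
```

For example, given a short text where one sentence mentions a student and a dormitory and another mentions a university, a dormitory, and a student, all five words fall into one chain and summarize(sentences, 1) returns the second of those sentences, since it contains more chain members.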
