NDSU Repository

DOM Structure Based Web Pattern Mining

Author/Creator
Pillarikuppam, Naresh
Abstract
The rapid expansion of the Web has motivated several studies that aim to understand and recognize the implementation structure underlying Web interfaces. Although Web pages may look different in presentation, they may share the same semantic structure for organizing information. These common semantic structures are referred to as Web patterns. There are no strict rules for implementing the HTML structure of a Web page, the implementation may be inconsistent across the pages of a single website, and the HTML implementation of one website differs from that of others. This makes it difficult to recognize the Web patterns used to implement websites. This paper proposes Document Object Model (DOM) structure based Web pattern mining, in which both the HTML structure and the common patterns are represented in DOM structure format. To derive the common Web patterns, the implemented patterns observed across different websites are analyzed and summarized manually. These Web patterns are represented in the Pattern Structure Definition (PSD) format, which is derived from the DTD model. An efficient algorithm is then proposed to recognize Web patterns that match the definition and comply with all the properties defined in the PSD. To recognize the pattern structure, a tool was developed that takes a URL as input and recognizes the summarized patterns. The experimental results and evaluation of the tool show the high accuracy of the approach: it achieved 91.35% accuracy in finding the navigation pattern structure in online shopping websites.
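The abstract does not give the PSD syntax or the recognition algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a page is parsed into an element-only DOM tree, and a DTD-style rule with occurrence indicators (`1`, `?`, `+`, `*`) is checked against each node. The rule layout, the names `NAV_PATTERN`, `matches`, and `find_matches`, and the loose counting semantics (child order and unlisted children are ignored) are all hypothetical and are not taken from the paper. Fetching a page from a URL is omitted; the sketch works on an HTML string.

```python
from html.parser import HTMLParser

# Elements that never have children or a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Node:
    """One element in a simplified, element-only DOM tree."""
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class DomBuilder(HTMLParser):
    """Builds a Node tree from HTML; text and attributes are ignored."""
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def parse(html):
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


# Hypothetical PSD-like rule (an assumption, not the paper's format):
#   rule       = (tag, [child_rule, ...])
#   child_rule = (tag, occurrence, [child_rule, ...])
# Occurrence follows DTD indicators: "1", "?", "+", "*".
NAV_PATTERN = ("ul", [("li", "+", [("a", "1", [])])])


def count_ok(n, occurrence):
    return {"1": n == 1, "?": n <= 1, "+": n >= 1, "*": True}[occurrence]


def matches(node, rule):
    """True if node has the rule's tag and satisfies every child rule."""
    tag, child_rules = rule
    if node.tag != tag:
        return False
    for child_tag, occurrence, grandchild_rules in child_rules:
        n = sum(matches(c, (child_tag, grandchild_rules)) for c in node.children)
        if not count_ok(n, occurrence):
            return False
    return True


def find_matches(root, rule):
    """Return all nodes in the tree that match the rule, in document order."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        if matches(node, rule):
            found.append(node)
        stack.extend(reversed(node.children))
    return found


SAMPLE = """
<div><ul><li><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>
<ul><li>plain text</li></ul></div>
"""

if __name__ == "__main__":
    hits = find_matches(parse(SAMPLE), NAV_PATTERN)
    print([n.tag for n in hits])
```

In this sketch only the first list in `SAMPLE` qualifies as a navigation candidate, because the second list has no `li` that contains a link. A stricter matcher could also enforce child order and reject unlisted children, which would be closer to how a DTD content model behaves.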
URI
https://hdl.handle.net/10365/29601
Collections
  • Computer Science Masters Papers
  • Engineering Masters Papers

North Dakota State University - Libraries
Circulation: (701) 231-8888 | Reference: (701) 231-8886
Administration: (701) 231-8753
Main Library address: 1201 Albrecht Boulevard
Mailing address: Dept #2080 PO Box 6050, Fargo, ND 58108-6050