
dc.contributor.author: Pillarikuppam, Naresh
dc.description.abstract: A rapid expansion in the Web has motivated several studies to understand and recognize the implementation structure underlying the interface. Though Web pages may look different in presentation, they may share the same semantic structure for organizing information. Those common semantic structures are referred to as Web patterns. There are no strict rules for implementing the HTML structure of Web pages, and the implementation of each Web page might not be consistent across an entire website. The HTML implementation also varies from one website to another. This makes it difficult to recognize the Web patterns that have been used to implement the websites. In this paper, Document Object Model (called "DOM" hereafter) structure based Web pattern mining is proposed, where the HTML structure and the common patterns are represented in DOM structure format. To derive the common Web patterns, the implemented patterns observed across different websites are analyzed and summarized manually. Those Web patterns are represented using the Pattern Structure Definition (PSD) format, which is derived from the DTD model. Then, an efficient algorithm is proposed to recognize Web patterns that match the definition and comply with all the properties defined in the PSD. To recognize the pattern structure, a tool was developed that takes a URL as input and recognizes the summarized patterns. The experimental results and evaluation of the tool show the high accuracy of the approach. The implemented approach achieved 91.35% accuracy in finding the navigation pattern structure in online shopping websites. [en_US]
dc.publisher: North Dakota State University [en_US]
dc.rights: NDSU Policy 190.6.2
dc.title: DOM Structure Based Web Pattern Mining [en_US]
dc.type: Master's paper [en_US]
dc.date.accessioned: 2019-04-17T19:04:28Z
dc.date.available: 2019-04-17T19:04:28Z
dc.date.issued: 2011
dc.identifier.uri: https://hdl.handle.net/10365/29601
dc.subject.lcsh: Document Object Model (Web site development technology) [en_US]
dc.subject.lcsh: Pattern recognition systems. [en_US]
dc.subject.lcsh: Data mining. [en_US]
dc.rights.uri: https://www.ndsu.edu/fileadmin/policy/190.pdf
ndsu.degree: Master of Science (MS) [en_US]
ndsu.college: Engineering [en_US]
ndsu.department: Computer Science [en_US]
ndsu.program: Computer Science [en_US]
ndsu.advisor: Kong, Jun

