EXTRACTION OF STRUCTURED INFORMATION FROM UNSTRUCTURED OR SEMI-STRUCTURED MACHINE-READABLE WEB PAGES

Vinod Kumar Raavi; Satya P Kumar Somayajula

Abstract

The extraction of structured information from unstructured or semi-structured machine-readable documents now plays a vital role. The World Wide Web is the major resource for such extraction, and many websites fill common templates with content in order to achieve high publishing productivity. Template detection has recently received considerable attention because it can improve the clustering and classification of web documents and the performance of search engines: irrelevant template terms reduce the performance and efficiency of web applications that process pages automatically. In this paper we present a novel algorithm for extracting templates from a large number of web documents generated from heterogeneous templates. We group the web documents by the similarity of their underlying template structure, so that the template for each group is extracted at the same time as the groups are formed. We therefore regard the proposed algorithm as the best of the available template detection algorithms.
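The grouping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes that a page's "template structure" can be approximated by its set of root-to-node tag paths, that similarity is measured with the Jaccard index, and that a greedy pass with a fixed threshold is good enough for illustration. The names `tag_paths`, `cluster_and_extract`, and the `threshold` value are hypothetical.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class PathCollector(HTMLParser):
    """Collects the set of root-to-node tag paths seen in one HTML document."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        self.paths.add("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed inner tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass


def tag_paths(html):
    """Return the structural fingerprint (set of tag paths) of one page."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return frozenset(collector.paths)


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_and_extract(pages, threshold=0.6):
    """Greedily group pages by structure (assumed threshold, not from the paper).

    Returns a list of (template_paths, member_indices) pairs. The template of
    a group is the set of tag paths shared by all of its members, so it is
    refined each time a page joins the group.
    """
    clusters = []  # each entry: [template_paths, member_indices]
    for index, html in enumerate(pages):
        paths = tag_paths(html)
        best = max(clusters, key=lambda c: jaccard(c[0], paths), default=None)
        if best is not None and jaccard(best[0], paths) >= threshold:
            best[0] = best[0] & paths  # keep only the shared structure
            best[1].append(index)
        else:
            clusters.append([paths, [index]])
    return [(template, members) for template, members in clusters]


if __name__ == "__main__":
    pages = [
        "<html><body><div><h1>A</h1><p>x</p></div></body></html>",
        "<html><body><div><h1>B</h1><p>y</p><p>z</p></div></body></html>",
        "<html><body><table><tr><td>1</td></tr></table></body></html>",
    ]
    for template, members in cluster_and_extract(pages):
        print(members, sorted(template))
```

In this sketch the template of each group falls out of the grouping step itself, which mirrors the abstract's point that grouping and template extraction happen together. A real system would also need to handle text content, attributes, and a principled choice of the number of groups.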

Journal of Global Research in Computer Science
