Data alignment, Data annotation, Web databases, Wrapper generation
The Internet offers a vast amount of useful information, but that information is usually formatted for human readers, which makes it difficult to extract relevant data from multiple sources. The World Wide Web (WWW) serves as a repository for all kinds of information and has so far been very successful in disseminating information to humans. For the encoded data units to be machine processable, which is indispensable for many applications such as deep web data collection and Internet comparison shopping, they need to be grouped and assigned meaningful labels. An automatic annotation approach first aligns the data units on a result page into different groups, such that the data units in the same group have the same semantics. Each group is then annotated using different features, and the resulting annotations are combined to predict a final annotation label for the group.
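The align-then-annotate idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the method of this paper: the alignment key is a crude data-type feature, the two basic annotators (`annotate_by_pattern`, `annotate_by_vocabulary`) and the `KNOWN_VALUES` table are hypothetical, and the annotations are combined by a simple majority vote.

```python
import re
from collections import Counter, defaultdict

# Hypothetical lookup table standing in for domain knowledge (an assumption).
KNOWN_VALUES = {"Data Mining": "Title", "Web Wrappers": "Title"}


def data_type(text):
    """Crude data-type feature used here as the alignment key."""
    t = text.strip()
    if re.fullmatch(r"[$€£]\s?\d+(\.\d+)?", t):
        return "currency"
    if re.fullmatch(r"\d{4}", t):
        return "year"
    return "text"


def align(records):
    """Step 1: place data units from all records into groups by shared key."""
    groups = defaultdict(list)
    for record in records:
        for unit in record:
            groups[data_type(unit)].append(unit)
    return dict(groups)


def annotate_by_pattern(units):
    """Basic annotator: derive a label from the common data type, if any."""
    types = {data_type(u) for u in units}
    if types == {"currency"}:
        return "Price"
    if types == {"year"}:
        return "Year"
    return None


def annotate_by_vocabulary(units):
    """Basic annotator: look the units up in the table of known values."""
    hits = Counter(KNOWN_VALUES[u] for u in units if u in KNOWN_VALUES)
    return hits.most_common(1)[0][0] if hits else None


def annotate_group(units, annotators):
    """Step 2: collect one vote per annotator and keep the most common label."""
    votes = [label for label in (a(units) for a in annotators) if label]
    return Counter(votes).most_common(1)[0][0] if votes else None


def annotate_result_page(records):
    """Align the data units, then predict one label per group."""
    annotators = [annotate_by_pattern, annotate_by_vocabulary]
    return {key: annotate_group(units, annotators)
            for key, units in align(records).items()}


records = [
    ["Data Mining", "2006", "$45.00"],
    ["Web Wrappers", "2011", "$30.50"],
]
labels = annotate_result_page(records)
```

A realistic system would align on several features at once and weight the annotators by how reliable each one is; the majority vote is used here only to keep the sketch short.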