Abstract

Graph Clustering and Feature Selection for High Dimensional Data

K. Jaganath, Mr. P. Sasikumar

Feature selection techniques identify the important items in transactional data, and the selected features are then used for classification. Clustering techniques can drive the feature selection process. Graph-based clustering techniques group transactional data by similarity values, and correlation-based similarity measures distinguish relevant features from irrelevant ones. The Features And Subspace on Transactions (FAST) clustering-based feature selection algorithm is used to cluster high-dimensional data and to select features. FAST works in two steps. In the first step, features are divided into clusters using graph-theoretic clustering methods. In the second step, the most representative feature is selected from each cluster to form a feature subset. Because features in different clusters are relatively independent, the clustering-based strategy of FAST has a high probability of producing a subset of useful and independent features. The Minimum Spanning Tree (MST) clustering method is adopted to ensure the efficiency of FAST. A feature subset selection algorithm identifies the features from the clusters. The feature selection process is improved with a set of correlation measures. Dynamic feature intervals can be used to distinguish features. A redundant-feature filtering mechanism removes similar features. A custom threshold is used to improve cluster accuracy.
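The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the correlation measure, so absolute Pearson correlation is assumed here, and the function name `select_features`, the parameters `relevance_min` and `cut`, and the toy data are all hypothetical. Features with low correlation to the target are dropped as irrelevant, an MST is built over the remaining features with edge weight 1 - |correlation|, MST edges above the custom threshold are cut to form clusters, and the most target-relevant feature of each cluster is kept.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(columns, target, relevance_min=0.1, cut=0.5):
    """Sketch of clustering-based selection; thresholds are assumed, not from the paper."""
    # Relevance filter: drop features weakly correlated with the target.
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}
    kept = [name for name in columns if relevance[name] >= relevance_min]
    if not kept:
        return []

    # Step 1a: Prim's algorithm on the complete feature graph,
    # with edge weight 1 - |corr| (strongly correlated features are "close").
    in_tree = {kept[0]}
    mst_edges = []
    while len(in_tree) < len(kept):
        weight, u, v = min(
            (1 - abs(pearson(columns[u], columns[v])), u, v)
            for u in in_tree
            for v in kept
            if v not in in_tree
        )
        mst_edges.append((weight, u, v))
        in_tree.add(v)

    # Step 1b: cut MST edges heavier than the threshold; the surviving
    # edges define the clusters (tracked with a small union-find).
    parent = {name: name for name in kept}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for weight, u, v in mst_edges:
        if weight <= cut:
            parent[find(u)] = find(v)

    clusters = {}
    for name in kept:
        clusters.setdefault(find(name), []).append(name)

    # Step 2: keep the most target-relevant feature from each cluster.
    return sorted(max(members, key=relevance.get) for members in clusters.values())


# Toy data: "b" is a scaled copy of "a" (redundant), "d" is independent of "a",
# and "noise" is uncorrelated with the target.
columns = {
    "a": [0, 1, 0, 1, 0, 1, 0, 1],
    "b": [0, 2, 0, 2, 0, 2, 0, 2],
    "d": [0, 0, 1, 1, 0, 0, 1, 1],
    "noise": [1, 1, 1, 1, 0, 0, 0, 0],
}
target = [0, 1, 1, 2, 0, 1, 1, 2]  # target = a + d
selected = select_features(columns, target)
print(selected)
```

On this toy input the redundant pair `a`/`b` falls into one cluster and contributes a single representative, `d` forms its own cluster, and `noise` is removed by the relevance filter. The quadratic pairwise correlation step is kept for readability; a real high-dimensional setting would likely need a more efficient construction.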

Disclaimer: this abstract was translated using artificial intelligence tools and has not yet been reviewed or verified.