This GitHub repository accompanies our paper, "UBIS: Unigram Bigram Importance Score for feature extraction and selection using graph of words", accepted for publication in Expert Systems with Applications.
The importance of individual unigrams and bigrams can help determine the feature space vector more efficiently. This work proposes a selective feature extraction technique based on the Graph of Words (GoW): the Unigram Bigram Importance Score (UBIS), which is obtained from the node scores and edge scores of the GoW.
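As a rough illustration of the idea, the sketch below builds a Graph of Words from one tokenized short text and derives a score per node (unigram) and per edge (bigram). It is not the scoring defined in the paper: the sliding-window co-occurrence rule, the use of co-occurrence counts as edge scores, the use of weighted degree as node scores, and the names `build_graph_of_words` and `top_features` are all assumptions made for this example.

```python
from collections import Counter


def build_graph_of_words(tokens, window=2):
    """Return (edge_weights, node_scores) for one tokenized text.

    Assumption (not from the paper): an undirected edge links two
    distinct words that occur within `window` tokens of each other,
    and its weight counts how often that happens.
    """
    edge_weights = Counter()
    for i, word in enumerate(tokens):
        # Words that follow `word` inside the sliding window.
        for other in tokens[i + 1 : i + window]:
            if other != word:
                edge_weights[tuple(sorted((word, other)))] += 1

    # Assumed node score: weighted degree (sum of incident edge weights).
    node_scores = Counter()
    for (u, v), weight in edge_weights.items():
        node_scores[u] += weight
        node_scores[v] += weight
    return edge_weights, node_scores


def top_features(tokens, k_uni=3, k_bi=3, window=2):
    """Keep only the highest-scoring unigrams and bigrams (hypothetical helper)."""
    edge_weights, node_scores = build_graph_of_words(tokens, window)
    unigrams = [word for word, _ in node_scores.most_common(k_uni)]
    bigrams = [edge for edge, _ in edge_weights.most_common(k_bi)]
    return unigrams, bigrams


tokens = "graph of words graph of words model".split()
edge_weights, node_scores = build_graph_of_words(tokens)
print(edge_weights[("graph", "of")])  # 2: "graph of" occurs twice
print(node_scores["model"])           # 1: "model" has a single neighbour
```

Selecting only the top-scoring nodes and edges is what shrinks the feature set in this sketch; the actual UIS and BIS definitions are given in the paper.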
- Proposed the Unigram Importance Score (UIS) and Bigram Importance Score (BIS) for the UBIS feature extraction technique. The proposed technique yields a reduced feature set and therefore lower complexity.
- Theoretically validated the semantics of the Graph of Words derived from short text for the UBIS feature extraction technique. Studying the stability of the GoW and modelling its assortativity gives useful insight into the data.
- Analysed UBIS and baseline techniques on three different datasets for varying Ratios of Training and Testing (RTT) sets. The experiments show that the reduced feature set gives significantly improved results.
With UBIS, we improved F-measure by up to 5.7% and accuracy by up to 5.3%.
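For readers unfamiliar with the assortativity mentioned above, here is a minimal sketch of the standard degree assortativity coefficient: the Pearson correlation between the degrees at the two ends of each edge. Whether the paper models assortativity with exactly this measure is an assumption; the function name `degree_assortativity` and the edge-list input format are hypothetical, and the graph is assumed to be simple (no self-loops or duplicate edges).

```python
def degree_assortativity(edges):
    """Degree assortativity of an undirected simple graph.

    `edges` is a list of (u, v) pairs. Returns NaN when the value is
    undefined (no edges, or every node has the same degree).
    """
    if not edges:
        return float("nan")

    # Degree of every node.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Each undirected edge contributes both orientations, so the two
    # degree sequences share the same mean and variance.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var if var > 0 else float("nan")


# A star graph is maximally disassortative: the hub only touches leaves.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
print(degree_assortativity(star))  # -1.0
```

A negative value means high-degree words tend to connect to low-degree words; a positive value means words of similar degree tend to connect to each other.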