Documentation
class ReviewMiner(df: pd.DataFrame = None, id_column: str = None, review_column: str = None)
df: pd.DataFrame, default=None
a data frame where each row is a comment/review. The data frame should have at least an ID column that stores the unique IDs of the comments, and a review column where the actual comments/reviews are stored. You can initialize the class without df if you just want to use some of its methods to analyze external datasets. You can assign a value to df later with <class>.df = <your_data_frame>.
id_column: str, default=None
the name of the column that stores the unique IDs of the comments.
review_column: str, default=None
the name of the column where the actual comments/reviews are stored.
- one_time_analysis(report_interval: int = None)
One-time analysis that displays popular aspects and opinions, the distribution of sentiment scores across comments, sentiment scores for common aspects, and the aspects with the most negative comments.
- Parameters:
report_interval: int, default=None
Extracting the aspects and opinions can take quite a while if the dataset is very large. While extracting, the function reports progress every report_interval comments. If there are more than 500 comments and no report interval is specified, the function reports progress every 10% of the comments. If there are no more than 500 comments and no report interval is specified, the function reports only once, when it has finished all the comments.
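The defaulting rule described above can be sketched as a small standalone function. This is only an illustration of the documented behavior, not the library's code: the name `effective_report_interval` is hypothetical, and rounding "every 10%" down with integer division is an assumption.

```python
from typing import Optional


def effective_report_interval(
    n_comments: int, report_interval: Optional[int] = None
) -> Optional[int]:
    """Return how often progress is reported, or None for 'only at the end'.

    Hypothetical helper mirroring the documented rule; not part of ReviewMiner.
    """
    # An explicitly specified interval always wins.
    if report_interval is not None:
        return report_interval
    # More than 500 comments and no interval: report every 10% of the comments.
    # Rounding down (and never below 1) is an assumption of this sketch.
    if n_comments > 500:
        return max(1, n_comments // 10)
    # 500 comments or fewer and no interval: report only when everything is done.
    return None
```

For example, under these assumptions a dataset of 1000 comments with no interval given would report every 100 comments, while a dataset of 200 comments would report only once at the end.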
- aspect_extractor(sentence: str)
Extract aspects (noun phrases and nouns) from a sentence
- Parameters:
sentence: str
The sentence to analyze
- Returns:
candidate_aspects: list
a list of aspects in the sentence
- aspect_opinion_for_one_comment(comment: str)
Extract aspects and opinions for one comment (which can consist of many sentences)
- Parameters:
comment: str
The comment to analyze
- Returns:
aspect_opinion_dict: dict
a dictionary with the aspects as keys and, for each aspect, its opinions joined into a single space-separated string, e.g. {'bedroom': 'sunny spacious', 'wardrobe': 'beautiful'}
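To make the return format concrete, here is a minimal sketch of how opinions could be packed into, and read back out of, a dictionary of that shape. The helper names `wrap_opinions` and `unwrap_opinions` are hypothetical and are not part of the class. The sketch also assumes each opinion is a single word, since splitting on whitespace cannot recover multi-word opinions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def wrap_opinions(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Group (aspect, opinion) pairs into {aspect: 'opinion1 opinion2 ...'}."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for aspect, opinion in pairs:
        grouped[aspect].append(opinion)
    # Join each aspect's opinions with a single space, as in the documented format.
    return {aspect: " ".join(words) for aspect, words in grouped.items()}


def unwrap_opinions(aspect_opinion_dict: Dict[str, str]) -> Dict[str, List[str]]:
    """Split each space-separated opinion string back into a list of words."""
    return {aspect: words.split() for aspect, words in aspect_opinion_dict.items()}


# The example dictionary from the documentation above.
example = {"bedroom": "sunny spacious", "wardrobe": "beautiful"}
opinion_lists = unwrap_opinions(example)
```

Splitting the strings this way is convenient when you want to count individual opinion words per aspect.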
- aspect_opinon_for_all_comments(report_interval: int)
- Parameters:
report_interval: int, default=None
Extracting the aspects and opinions can take quite a while if the dataset is very large. While extracting, the function reports progress every report_interval comments. If there are more than 500 comments and no report interval is specified, the function reports progress every 10% of the comments. If there are no more than 500 comments and no report interval is specified, the function reports only once, when it has finished all the comments.
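As a rough illustration of how such a loop could report progress, the sketch below processes a list of comments with a stand-in extractor. It is not the library's implementation: `extract_for_all`, `placeholder_extractor`, and the `report` callback are all hypothetical names, and the wording of the progress message is invented for the example.

```python
from typing import Callable, Dict, List, Optional


def placeholder_extractor(comment: str) -> Dict[str, str]:
    """Stand-in for a real per-comment extractor; always returns an empty dict."""
    return {}


def extract_for_all(
    comments: List[str],
    extract_one: Callable[[str], Dict[str, str]],
    report_interval: Optional[int] = None,
    report: Callable[[str], None] = print,
) -> List[Dict[str, str]]:
    """Apply extract_one to every comment, reporting progress along the way."""
    total = len(comments)
    # Documented default: more than 500 comments and no interval means every 10%.
    # Rounding down is an assumption of this sketch.
    if report_interval is None and total > 500:
        report_interval = max(1, total // 10)

    results = []
    for done, comment in enumerate(comments, start=1):
        results.append(extract_one(comment))
        at_interval = report_interval is not None and done % report_interval == 0
        # Always report once at the very end, even without an interval.
        if at_interval or done == total:
            report(f"Processed {done}/{total} comments")
    return results
```

Passing a custom `report` callback (for example a list's `append` method) makes the progress messages easy to capture or redirect to a logger instead of printing them.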