Sentiment analysis is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. For an e-commerce platform like Olist, such a method makes it possible to put the many product reviews to good use and boost overall sales, either by feeding a recommendation algorithm or by extracting better feedback from reviews that have no score.
In this article, we apply sentiment analysis to the review data in the Olist dataset to classify each review as positive, negative, or neutral.
We then train a prediction model. Finally, we assess the predictive performance of the algorithm by comparing our predictions with the actual scores, which are available for each review.
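As a rough illustration of that last step, the sketch below compares predicted sentiment labels with labels derived from review scores. The score thresholds (4 and 5 as positive, 1 and 2 as negative, 3 as neutral), the function names, and the toy data are assumptions made for this example, not details taken from the Olist analysis.

```python
def score_to_label(score):
    """Map a 1-5 review score to a sentiment label (thresholds are an assumption)."""
    if score >= 4:
        return "positive"
    if score <= 2:
        return "negative"
    return "neutral"


def accuracy(predicted, actual_scores):
    """Fraction of reviews whose predicted label matches the score-derived label."""
    if len(predicted) != len(actual_scores):
        raise ValueError("predicted and actual_scores must have the same length")
    if not predicted:
        raise ValueError("cannot compute accuracy on empty input")
    expected = [score_to_label(s) for s in actual_scores]
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)


# Made-up toy data, for illustration only.
toy_predictions = ["positive", "negative", "positive", "neutral"]
toy_scores = [5, 1, 2, 3]
toy_accuracy = accuracy(toy_predictions, toy_scores)
print(f"accuracy: {toy_accuracy:.2f}")
```

Where exactly to draw the line between positive, neutral, and negative scores is a design choice, and shifting the thresholds will change the reported figure.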
To achieve that, we used an auxiliary dataset. It includes 785,814 tweets in Portuguese, each labeled as either negative or positive. The main idea is to use those tweets as training data for our prediction algorithm and then use that algorithm to predict the sentiment of the store's product reviews. Let's get to work!
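To make the train-on-tweets, predict-on-reviews idea concrete, here is a minimal sketch using a small multinomial Naive Bayes classifier written in plain Python. The choice of Naive Bayes, the `NaiveBayes` class, the tokenizer, and the four toy tweets are all assumptions for illustration; the model actually used in this article may differ.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into word tokens (accented letters are kept)."""
    return re.findall(r"\w+", text.lower())


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up stand-ins for the labeled tweets.
tweets = [
    "adorei o dia hoje",
    "que dia horrível",
    "muito bom mesmo",
    "péssimo atendimento odiei",
]
tweet_labels = ["positive", "negative", "positive", "negative"]
model = NaiveBayes().fit(tweets, tweet_labels)

# Made-up stand-ins for product reviews.
reviews = ["produto muito bom adorei", "péssimo produto odiei"]
predicted = [model.predict(r) for r in reviews]
print(predicted)
```

One caveat with this kind of transfer: words that appear in reviews but never in tweets (such as "produto" here) carry no signal, so the approach depends on how much vocabulary the two sources share.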
In the end, we achieved a score of 91% when predicting the sentiment of the product reviews, which is not bad. The next step is to develop a recommendation system.