Game Review Classifier
Introduction
Data mining is the process of finding patterns and correlations in large datasets in order to predict outcomes, using a wide range of techniques from machine learning, statistics, and databases.
The goal of this project: given a game review, predict whether it is a positive or a negative review.
The dataset used is a Steam game review dataset. The gaming community is growing quickly, and Steam reached its highest usage during the pandemic. As one of the users contributing to that usage, I decided to work with this dataset.
This dataset has a lot of scope for future development because many games are linked to it. Focusing on one particular game and collecting that game's data would let us explore a larger dataset and build models that predict how players play, as well as the outcome of games, based on data extracted from the game.
The Game Review Dataset contains the reviews and the recommendations, which I used to train the models and build the classifier.
The process followed is: Data Preprocessing --> Data Visualization --> Data Cleaning --> Building Classification Model
In the preprocessing step, I checked for null and duplicate values, which could have affected the classifier, and deleted those rows. In the visualization step, I looked at the data from different perspectives, which helped me understand it better. In the data cleaning step, I converted all reviews to lower case, removed all special characters because they would affect the model, and mapped the recommendation column to integers to make the later steps easier.
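The preprocessing and cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the label strings "Recommended" and "Not Recommended", the helper names clean_review and preprocess, and the sample rows are all assumptions made for the example.

```python
import re

# Hypothetical label strings; the real dataset may encode recommendations differently.
LABELS = {"Recommended": 1, "Not Recommended": 0}


def clean_review(text):
    """Lower-case a review and strip special characters."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # replace special characters with spaces
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace


def preprocess(rows):
    """Drop null and duplicate rows, then clean the text and encode the label."""
    seen = set()
    cleaned = []
    for review, recommendation in rows:
        # Skip rows with a missing review or an unknown/missing label.
        if not review or recommendation not in LABELS:
            continue
        # Skip exact duplicates of a row that was already kept.
        key = (review, recommendation)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append((clean_review(review), LABELS[recommendation]))
    return cleaned


# Made-up sample rows for illustration only.
rows = [
    ("Great game!!! 10/10", "Recommended"),
    ("Great game!!! 10/10", "Recommended"),  # duplicate row
    (None, "Not Recommended"),  # null review
    ("Crashes on launch :(", "Not Recommended"),
]
print(preprocess(rows))
# [('great game 10 10', 1), ('crashes on launch', 0)]
```

Duplicates are detected on the raw text here, before cleaning; deduplicating after cleaning would also merge reviews that differ only in case or punctuation.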
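The reviews are text data, so they have to be converted into TF-IDF form before a model can use them. The tf-idf weight of a term t in a document d is
tf-idf(t, d) = tf(t, d) * idf(t),
and the idf is computed as
idf(t) = log [ (1 + n) / (1 + df(t)) ] + 1
where n is the total number of documents and df(t) is the number of documents that contain the term t.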
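To make the formula concrete, here is a small sketch that computes these weights by hand. It assumes tf(t, d) is the raw count of t in d, that log is the natural logarithm, and that no normalization is applied afterwards; the function name tfidf and the three toy documents are made up for the example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Uses tf-idf(t, d) = tf(t, d) * idf(t) with
    idf(t) = ln[(1 + n) / (1 + df(t))] + 1, where tf is the raw term count.
    """
    tokenized = [doc.split() for doc in docs]
    n = len(tokenized)  # total number of documents
    # df(t): number of documents that contain term t at least once
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


# Toy documents for illustration only.
docs = ["great game", "bad game", "great great fun"]
weights = tfidf(docs)
for doc, w in zip(docs, weights):
    print(doc, {term: round(value, 4) for term, value in w.items()})
```

Terms that appear in fewer documents ("bad", "fun") get a larger idf than terms shared across documents ("great", "game"), so they carry more weight in the feature vector.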
Model Building
I used Multinomial Naive Bayes, a Support Vector Machine (SVM), and a Random Forest classifier, as these are commonly used for text classification.
Of these, the SVM gave the best accuracy.
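A comparison of the three models could be set up along these lines. This sketch assumes scikit-learn is available, uses a linear-kernel SVC as a stand-in for the SVM (the kernel actually used is not stated here), and runs on a handful of made-up reviews, so the accuracies it prints say nothing about the real results.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Made-up, already-cleaned reviews for illustration only (1 = positive, 0 = negative).
reviews = [
    "great game loved every minute",
    "amazing story and fun gameplay",
    "best game i have played this year",
    "really fun with friends",
    "terrible controls and boring story",
    "crashes on launch waste of money",
    "boring and full of bugs",
    "worst game i have ever played",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    reviews, labels, test_size=0.25, random_state=0, stratify=labels
)

# Hyperparameters below are illustrative defaults, not tuned values.
models = {
    "Multinomial Naive Bayes": MultinomialNB(),
    "Support Vector Machine": SVC(kernel="linear"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    # Each pipeline converts raw text to TF-IDF features, then fits the classifier.
    pipeline = make_pipeline(TfidfVectorizer(), model)
    pipeline.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, pipeline.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Wrapping the vectorizer and the classifier in one pipeline means the TF-IDF vocabulary is learned from the training split only, which keeps the test reviews out of the fitting step.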
Why?
As mentioned above, Steam has a lot of scope, and this review classifier can help me build a game recommender next. Steam has millions of users who try out new games every day, so people will be looking for recommendations, and this app can help them select the right game. The classifier acts as the starting point of the recommender, which can be developed further into a complete working application.