TL;DR
The internet is full of “fake reviews”. To spot them, look at how the review dates cluster, at the reviewers’ profiles, and at how relevant the reviews are to the specific product.
Always Analyze the Reviews
Just like many of us, I rely a lot on product reviews when deciding what to buy. What could be a better indication of an awesome product than reviews from people like myself? Well, that is if the reviews are written by real people, i.e. people who are genuinely interested in sharing their good or bad personal experiences. Unfortunately, many sellers have weaponized the review system for their own benefit and flooded internet stores with “fake reviews”. This post is about spotting these “fake reviews”.
So let’s take a look at one example from Amazon.com. Here is the link: https://www.amazon.com/Lean-Mastery-Collection-Manuscripts-Enterprise/dp/B07LHJXPHK/ref=cm_cr_arp_d_product_top. Let’s call it “the first book”. What sold this book to me was its overall rating: 4.6 out of 5 for a collection of 8 books sounds like a great deal, doesn’t it? I was on the go and needed something to listen to while being bored traveling between multiple countries and meetings. I didn’t have time to analyze the reviews, and I got a good promo price for this book ($0 is a very good price for it), so I bought it on the spot.
After I finished reading the book, I started wondering how it could possibly have received such high praise from the people who read it. That totally didn’t match my impression of it. I’ve read enough books to know what a 4+ star book is supposed to be, and this book totally wasn’t worth more than 3* in my opinion. Let’s click on the ratings link and see if there is anything suspicious about these reviews: https://www.amazon.com/Lean-Mastery-Collection-Manuscripts-Enterprise/dp/B07LHJXPHK/ref=cm_cr_arp_d_product_top?ie=UTF8#customerReviews
Dates Distribution
First of all, sort by “Most Recent” (the default is “Top Reviews”) and start looking at the dates. See how uneven the distribution of the review dates is? For instance, there were:
| Date range | Days in the range | Number of reviews |
| --- | --- | --- |
| Oct 17 – Oct 27, 2019 | 11 | 11 |
| Jul 4 – Oct 1, 2019 | 90+ | 1 |
| Jul 1 – Jul 4, 2019 | 4 | 4 |
| Apr 25 – Jul 1, 2019 | 70+ | 0 |
| Apr 12 – Apr 25, 2019 | 14 | 4 |
| Dec 13 – Dec 17, 2018 | 5 | 5 |
Let’s compare this with the distribution of review dates for another book (let’s call it “the second book”): https://www.amazon.com/Startup-Way-Companies-Entrepreneurial-Management/product-reviews/B074G5T77M/ref=cm_cr_arp_d_viewopt_srt?ie=UTF8&reviewerType=all_reviews&sortBy=recent&pageNumber=1
- 1 review in August 2019
- 1 review in June 2019
- 1 review in April 2019
- 1 review in October 2018
Both books are written on the same topic (Lean startups). Interestingly, both authors share the last name Ries. The author of the second book is Eric Ries, a very well-known entrepreneur, blogger, author and inventor of the Lean Startup concept. The author of the first one is Jeffrey Ries, a person I wasn’t able to find much information about at all. I still wonder if this is a coincidence and if these two people are related in any way.
The second book has a slightly worse rating than the first one: 4.3 vs 4.6. It also has 75 ratings vs the first book’s 29. But the distribution of review dates is completely different! While the first book’s reviews look like they were submitted as a series of campaigns (exactly 1 review per day, clustered within a few days around a handful of dates), the second book’s reviews are spread evenly across the entire period of its existence on Amazon’s virtual shelves.
You may argue that these dates could coincide with promotional events, when the author runs paid advertising campaigns for the book. Well, even in that case the promo dates are the dates when people buy and, hopefully, start reading the book. There is no way all those people would spend the same time reading it and then start posting reviews at the rate of exactly 1 review per day; that’s not how humans work. So this clustering is, at the very least, suspicious.
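The date check is easy to automate. Here is a minimal Python sketch of how I would do it, not anything Amazon actually runs. The function name `find_bursts`, the thresholds `max_gap_days` and `min_size`, and the sample dates are all my own assumptions; the dates are made up to have the same shape as the table above, not scraped from the review pages.

```python
from datetime import date, timedelta


def find_bursts(review_dates, max_gap_days=1, min_size=4):
    """Group review dates into runs where neighbouring reviews are at most
    max_gap_days apart, and return the runs with at least min_size reviews.
    Both thresholds are arbitrary assumptions, tune them to taste."""
    bursts, current = [], []
    for d in sorted(review_dates):
        # A gap larger than the threshold closes the current run.
        if current and (d - current[-1]).days > max_gap_days:
            if len(current) >= min_size:
                bursts.append(current)
            current = []
        current.append(d)
    if len(current) >= min_size:
        bursts.append(current)
    return bursts


# Made-up dates shaped like the first book: one review per day in two
# short windows, plus a single stray review in between.
clustered = (
    [date(2019, 10, 17) + timedelta(days=i) for i in range(11)]
    + [date(2019, 8, 15)]
    + [date(2019, 7, 1) + timedelta(days=i) for i in range(4)]
)

# Made-up dates shaped like the second book: one review every few months.
spread = [date(2019, 8, 10), date(2019, 6, 5), date(2019, 4, 20), date(2018, 10, 3)]

for name, dates in (("clustered", clustered), ("spread", spread)):
    sizes = [len(b) for b in find_bursts(dates)]
    print(name, "burst sizes:", sizes)
```

If most of a product’s reviews fall inside a handful of such bursts, that is the “series of campaigns” pattern described above.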
Reviewers
Ok, let’s keep going and look at the authors of the reviews for the first book.
- Rizwan Khan (https://www.amazon.com/gp/profile/amzn1.account.AHMUNBRPUIQNCSM7GTHQ45VFQVEQ/ref=cm_cr_getr_d_gw_btm?ie=UTF8) is a genius and a bookworm. Just today (Oct 27th, 2019) he finished reading and submitted reviews for 9 books ranging from Keto Meal Prep to Python Machine Learning to High School Placement Test. And he totally loved them all! Overall, Mr. Khan has 2483 reviews of various products.
- Sara Blech is a very shy person: she hid all her 1593 reviews (https://www.amazon.com/gp/profile/amzn1.account.AFEDFIURTHOZYVPA4QOUE5MNQQYA/ref=cm_cr_getr_d_gw_btm?ie=UTF8).
- Nusirat Ishola (https://www.amazon.com/gp/profile/amzn1.account.AFQ7KLACOCRDQOHVEOFQ3VPFZEVQ/ref=cm_cr_getr_d_gw_btm?ie=UTF8) is a dieting cryptocurrency miner currently looking for a job that requires leadership skills while studying Spanish and investing in stocks. At least those are her interests based on the 14 reviews she left on Oct 26. No wonder that with this spectrum of interests and insane reading speed she has already left 3100 reviews on Amazon. I really enjoyed her reviews too, my favorite ones are “High quality made books” and “Very suggest this book”. So Amaze! Much Grammar!
- Compared to the others, Nana Vai is a bit boring (https://www.amazon.com/gp/profile/amzn1.account.AG5QPHN3FZE4KZEN3WQSPDCU4PEQ/ref=cm_cr_getr_d_gw_btm?ie=UTF8). Most of her 1758 reviews are titled either “nice book” or “good book”, but still, she totally loves them all.
In addition to having the supernatural ability to read a mind-boggling number of books per day, every day, most of the reviewers who decided to share their photos work in the modeling business as a side job. At least that’s what their avatars look like: very professional and stock-quality.
In contrast, Eric Ries’ book reviewers are not superhumans. Most of them have fewer than 200 reviews. Some of them didn’t like this book. Most of them liked some things and didn’t like others. Some of them use stock photos and some use standard smartphone-quality photos. I can’t even single out any one specific reviewer; they all look like ordinary people to me.
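The “superhuman” check can be sketched in a few lines of Python too. This is only an illustration under my own assumptions: the function names and the thresholds `max_per_day` and `max_total` are invented, and the “ordinary” reviewer’s dates and total are made up. The “prolific” sample mirrors the 14 reviews in one day and 3100 total reviews mentioned above.

```python
from collections import Counter
from datetime import date


def busiest_day(review_dates):
    """Return (day, count) for the day with the most reviews,
    or (None, 0) if the reviewer has no visible reviews."""
    if not review_dates:
        return None, 0
    return Counter(review_dates).most_common(1)[0]


def looks_superhuman(review_dates, total_reviews, max_per_day=3, max_total=1000):
    """Flag a profile that posts more than max_per_day reviews on a single
    day or has more than max_total reviews overall.
    Both limits are guesses of mine, not anything Amazon publishes."""
    _, peak = busiest_day(review_dates)
    return peak > max_per_day or total_reviews > max_total


# Shaped like the prolific reviewer described above: 14 reviews in one day.
prolific_dates = [date(2019, 10, 26)] * 14

# A made-up ordinary reviewer: a few reviews spread over a year.
ordinary_dates = [date(2019, 8, 10), date(2019, 3, 2), date(2018, 11, 20)]

print(looks_superhuman(prolific_dates, total_reviews=3100))
print(looks_superhuman(ordinary_dates, total_reviews=150))
```

A real check would also look at how unrelated the reviewed products are, but even the raw per-day count separates the two groups described here.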
Reviews
Ok, what about the reviews themselves? Let’s compare the texts of the reviews for the first book and the second one and see how the language of the superhuman reviewers differs from that of what look like regular humans:
The first book:
- “Nice !! The information provided on these pages, as well as the suggestion it gathered, while other readers just starting in business can understand the material,Keep up the excellent work.Highly recommended. it will take some research of terminology on their part.” – note how this review can be about virtually any book. It doesn’t have any specifics. Also watch the grammar and completeness of the sentences!
- “Nice ! Great book really. everything is clear, in principle the book is quite worth its money!” – the author of this review is another person, but it basically mirrors the previous abstract review
- “Loved it. I have best found this book. I really enjoyed this book. Lot of helpful book” – Hmmmm … what say you did?! I confuse.
- “helpful book. Easy and quick to read. ‘Beginner’ level presentation of a new concept. Repetitiveness is likely intentional to help learn a new way of thinking.” – another abstract review, but this time at least it’s somewhat accurate. The book is absolutely basic (if not primitive) and the author repeats himself a lot.
- “Use it frequently at work. I work Lean projects every week at work and I refer to this book often. Well worth its price, especially as an intro to the basics and a quick desk reference.” – one doesn’t simply “work Lean projects”. The person who wrote this didn’t read the book and has no idea about its subject.
The second book … well, I’d rather not copy-paste those reviews here, you should click this link and take a look yourself: https://www.amazon.com/Startup-Way-Companies-Entrepreneurial-Management/product-reviews/B074G5T77M/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews. Reviews for the second book are a lot longer and a lot more detailed. You can’t just copy-paste those reviews into any other book review because they rely on the context of this specific book and contain a lot of references to its contents.
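Even the “could this review be about anything?” test can be roughly approximated in code. Below is a crude Python sketch under my own assumptions: a review counts as generic if it is short or mentions none of the product-specific terms. The function `is_generic`, the `min_words` threshold and the term list are hypothetical, and the detailed review is one I invented for illustration. The short review is one of those quoted above.

```python
import re


def is_generic(review, product_terms, min_words=30):
    """Return True if the review is shorter than min_words words or does not
    mention a single product-specific term. Both criteria are rough guesses."""
    words = re.findall(r"[a-z']+", review.lower())
    mentions = set(words) & product_terms
    return len(words) < min_words or not mentions


# Hypothetical terms a reader of a Lean collection might be expected to mention.
terms = {"lean", "kanban", "kaizen", "sigma", "scrum", "agile"}

# One of the first book's reviews quoted above.
short_review = (
    "Nice ! Great book really. everything is clear, "
    "in principle the book is quite worth its money!"
)

# An invented review, written only to show what a specific one looks like.
detailed_review = (
    "The chapters on value stream mapping and kanban boards were useful, "
    "but the section on six sigma repeats the same definitions three times "
    "and never shows a real project, so I would only recommend this "
    "collection to complete beginners."
)

print(is_generic(short_review, terms))
print(is_generic(detailed_review, terms))
```

This will be fooled by any fake review that drops in a keyword or two (the “I work Lean projects” one would pass the term check), so treat it as a first filter only.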
Bottom Line
First of all, let me say that I have no idea why the first book has so many reviews from so many unusual superhuman beings. I’m leaving these reasons to your imagination.
Second, I’m not necessarily saying that the first book was absolutely terrible. In fact, I found some useful information in it. Personally, I didn’t like it much because it was repetitive and shallow, and because it sounded like the author had never actually dealt with real projects and all his knowledge about agile and lean came from reading other, better books. So to me this book is a solid 2-star read.
The point of this post is to illustrate three simple methods to identify “fake reviews”. I’m using the term “fake reviews” in quotation marks to name the reviews that will be useless to most, if not all, readers because they were not written by people like you and me. They were written by someone (or something) who didn’t even bother reading the book and whose primary goal was to leave a 5-star review, for reasons I’m leaving to your imagination.
The 3 methods are:
- Sort the reviews by “Most Recent” and check whether the dates are distributed evenly or clustered around certain dates. In the latter case the reviews are very likely fake.
- Sort the reviews by “Top Reviews” and check the profiles of the top 3-4 reviewers. If they do not look like regular humans, then the reviews are most likely fake.
- Read several of the “Top Reviews”. If they are extremely short or look like they could be reviews for virtually anything, then they are most likely fake.
There are more methods to identify fake content, but these three are so basic that it makes me wonder why Amazon doesn’t use them. After all, Amazon offers so many AI/ML solutions to its AWS customers, so why not use all these awesome technologies for its own web store? And the three methods listed above do not even require AI/ML; it’s just basic statistics.
Also, please be aware that these methods only work on the cheapest and most primitive fake reviews. There are actually organizations for hire that are capable of producing far more realistic reviews, and those are a lot more difficult to identify. But this is a topic worth a separate post.