I published this post on LinkedIn yesterday.
One week ago, Google announced that it was using a machine-learning artificial intelligence (or AI) system called “RankBrain” to reduce human intervention within its search results. RankBrain is the nickname given to the proprietary AI used by Google to handle search queries. RankBrain is not to be confused with PageRank, the name that Google gave to its first ranking algorithm back in 1998.
I read a post published by Danny Sullivan entitled “FAQ: All About The New Google RankBrain Algorithm”. I read another by Dan Shewan called “RankBrain: A Primer on Google’s Artificial Intelligence Technology”. I read several more, including one by Bloomberg News. What is interesting is that they are all missing a fundamental understanding of Google computing. Not only that, they did not even check some related Google products.
Bottom line: Google’s RankBrain is old hat, and not a good one at that, as Artificial Intelligence machines go!
Let me explain. I am developing an exciting, brand-new site search software package called OBS that uses an AI module. To save time and effort, I turned to the AI and other modules available in the Google Cloud Platform. I handed the task of making it work to my engineer, who is a Software Engineering PhD candidate. After spending several days with the Google Prediction API, he concluded that it was impossible to make it do what we wanted and that “it suffer bugs and glitches”. Google Prediction is nothing new under the sun: it was announced at Google I/O in 2011. Have they made it work?
We met two other challenges during OBS machine-learning development (we ended up developing our own AI system):
Number one is the training set. In a nutshell, you have to “train” the algorithm (here, a neural network) against a dataset. The network processes the inputs and compares its resulting outputs against the desired outputs. This process occurs over and over as the weights are continually tweaked. The set of data that enables the training is called the “training set”. During the training of a network, the same set of data is processed many times as the connection weights are refined further and further (source: University of Toronto).
Number two is the question of how you tweak a dataset within a dynamic e-commerce database. How often do you refresh the database? Can Google really build a dataset to cover all the topics in its huge index?
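To make the training-set idea concrete, here is a minimal sketch in Python. It is not OBS code and not anything Google has published. The names `predict`, `train`, `mean_squared_error` and `toy_training_set` are made up for illustration, and the "network" is a single linear neuron trained with the classic delta rule, chosen only because it is the smallest thing that shows the loop described above: process the inputs, compare the output with the desired output, tweak the weights, and repeat over the same data many times.

```python
import random


def predict(weights, bias, inputs):
    """Output of a single linear neuron: weighted sum of the inputs plus a bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def mean_squared_error(weights, bias, training_set):
    """Average squared gap between the neuron's outputs and the desired outputs."""
    total = 0.0
    for inputs, desired in training_set:
        total += (desired - predict(weights, bias, inputs)) ** 2
    return total / len(training_set)


def train(training_set, epochs=500, learning_rate=0.1, seed=0):
    """Train the neuron on a list of (inputs, desired_output) pairs.

    Returns the learned (weights, bias). With epochs=0 the untrained,
    randomly initialised weights are returned unchanged.
    """
    rng = random.Random(seed)
    n_inputs = len(training_set[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = 0.0
    for _ in range(epochs):  # the same set of data is processed many times
        for inputs, desired in training_set:
            # Compare the resulting output against the desired output.
            error = desired - predict(weights, bias, inputs)
            # Tweak every weight a little in the direction that shrinks the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical toy training set generated by the rule: output = 2*a - b + 1.
toy_training_set = [
    ([0.0, 0.0], 1.0),
    ([1.0, 0.0], 3.0),
    ([0.0, 1.0], 0.0),
    ([1.0, 1.0], 2.0),
]

if __name__ == "__main__":
    untrained = train(toy_training_set, epochs=0)
    trained = train(toy_training_set)
    print("error before training:", mean_squared_error(*untrained, toy_training_set))
    print("error after training: ", mean_squared_error(*trained, toy_training_set))
```

The second challenge shows up here too: the learned weights are only as current as `toy_training_set`. In this sketch, whenever the underlying data changes, the training set has to be rebuilt and `train` run again.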
RankBrain – The Unasked Questions
- Google Prediction. Is Google using the ‘Google Prediction’ machine, or is it an entirely new system?
- Training Set. The training set has to be an essential component of RankBrain. So how often is Google refreshing (retraining) the data set? How big could this data set possibly be?
- Natural Language Processing (NLP). For many years Google has been using an NLP system. What is the difference between NLP and RankBrain?
RankBrain – The Real Truth
In my humble opinion, the above questions lead to one and only one answer: RankBrain is built to work on a very tiny data set that does not have to be refreshed frequently, like Google’s Knowledge Graph. The Knowledge Graph displays results without the need for an extra click, e.g. Answer Boxes (“what is knowledge graph”), Personal Information (“bruce smith”), Nutritional Facts, etc. Thus, RankBrain has very little impact on Search Engine Optimization (SEO).