The Google Sandbox
The Google Sandbox is a term invented to describe the failure of certain websites to be ranked
by Google for certain search phrases as highly as one might expect. Ideas about what the sandbox
is and what it isn't have varied somewhat since the original "discovery", but the following
features are relevant to my own website (http://whatismusic.info/) and its Google ranking for
the term "what is music?":
- Site has reasonable pagerank (in this case 4).
- Low ranking for the search itself. In my case, the website does not occur anywhere in the
top 1000 results (and Google never returns more than 1000 results).
- High ranking if you use one of various special qualifiers, for example intext:"What is Music?" puts my site number 2.
- High rank on Yahoo (number 16) and MSN (number 3).
- Looking in the logs, there are more search engine referrals from Yahoo (about 75%) than from
Google (about 13%). Compare this with about 79% and 14% for my personal website (www.1729.com).
- Looking again in the logs, this time for referrals on the exact search phrase. Unfortunately
my traffic analysis package breaks search phrases into individual words, but the following are
the percentages of hits on the word "what" for a certain time period:
  - Yahoo: 68%
  - Google: 5%
  - MSN: 4%
  - AltaVista: 3%
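The referral percentages above come from my traffic analysis package, but the same kind of tally
can be sketched directly from a list of referrer URLs. The following is only a minimal
illustration: the host markers in ENGINES, the function names and the sample URLs are all
assumptions made for the example, not the actual format of my logs or the method my package uses.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical mapping from a fragment of the referrer host name to an engine name.
ENGINES = {
    "google.": "Google",
    "yahoo.": "Yahoo",
    "msn.": "MSN",
    "altavista.": "AltaVista",
}


def engine_for(referrer):
    """Return the search engine name for a referrer URL, or None if it is not one."""
    host = urlparse(referrer).netloc.lower()
    for marker, name in ENGINES.items():
        if marker in host:
            return name
    return None


def referral_shares(referrers):
    """Return {engine: percentage of all search engine referrals}."""
    counts = Counter()
    for ref in referrers:
        name = engine_for(ref)
        if name is not None:
            counts[name] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Made-up sample referrers, purely for illustration.
sample = [
    "http://search.yahoo.com/search?p=what+is+music",
    "http://search.yahoo.com/search?p=music+theory",
    "http://search.yahoo.com/search?p=what+is+a+song",
    "http://www.google.com/search?q=what+is+music",
    "http://example.com/some/page.html",
]

for engine, share in referral_shares(sample).items():
    print(f"{engine}: {share:.0f}%")
```

Referrers that do not match any known engine (the last sample URL) are ignored, so the
percentages are shares of search engine referrals only, not of all traffic.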
Possible Causes
The major underlying causes of the sandbox are assumed to be filters that Google applies to
new websites to prevent various forms of "link spamming". Since Google gives you rank based on
the rank of incoming links, a quick and easy way to promote a new website is to suddenly point a
whole lot of links at it. The purpose of the filters is
supposedly to distinguish between "legitimate" and "illegitimate" incoming links.
Since not every new website suffers from the "sandbox", or not in such an extreme
fashion as some websites appear to suffer, various speculations have arisen as to specific
aggravating factors. In relation to my own case, some plausible theories include:
- Too many links from one site to the target site. In this case it would be from my
personal www.1729.com website to this site. After observing that
many of the visitors to www.1729.com visited individual pages directly (from search engines or
other referrals), I decided to put a mini text-ad on each page of www.1729.com. I had
no thought of spamming Google; in fact I assumed that Google would automatically count multiple
links from one website as being essentially equivalent to just one link. To make things even
worse, I decided to announce new pages on the whatismusic.info site in an RSS file that I had
on www.1729.com, since apparently some people were subscribed to that RSS file. I have responded
to this possibility by deleting these links, and for RSS I will need to set up a separate file
on the whatismusic.info site. If this factor does contribute to sandboxing, then it
implies that you get penalised for being "advertised", where the intention of the advertisements
has nothing to do with trying to game Google. Presumably Google does not punish sites that
are advertised on Google Ads.
- Too many hits on individual words in the search phrase. One specific threshold that has been
suggested is 80,000,000 hits. In the case of "what is music?", the first two words are words
normally left out of unquoted searches, and "music" is also a common word. Exact hit counts on
Google are:
  - what: 869,000,000
  - is: 3,888,000,000
  - music: 480,000,000
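To make the comparison with the 80,000,000 figure concrete, here is a small sketch that checks
each word's hit count against it. The hit counts are the ones quoted above; the threshold is only
the speculated number, not a confirmed Google rule, and the names THRESHOLD and
words_over_threshold are made up for this example.

```python
# Speculated threshold from the text above; not a documented Google rule.
THRESHOLD = 80_000_000

# Google hit counts for the individual words, as quoted above.
hit_counts = {
    "what": 869_000_000,
    "is": 3_888_000_000,
    "music": 480_000_000,
}


def words_over_threshold(counts, threshold=THRESHOLD):
    """Return {word: hits divided by threshold} for words that exceed the threshold."""
    return {word: n / threshold for word, n in counts.items() if n > threshold}


for word, ratio in words_over_threshold(hit_counts).items():
    print(f"{word}: {ratio:.1f} times the threshold")
```

All three words exceed the figure, and even "music", the least common of them, does so by a
factor of six.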
Possible Cures or Work-arounds
- Patience. Sites are apparently "released" from the sandbox as time passes. Whether this
is due to them passing some time limit, or refinement by Google of their link-spam-detecting strategies,
or something else entirely, nobody really knows (at least nobody outside of Google).
- Avoid multiple links from site A to site B, if that might be a contributing factor. Done, even
though it loses me the opportunity to "self-advertise".
- Write more content. I have been adding content, but this has failed to get me into
the top 1000 for "what is music?".
- Write a letter to Google asking them to explain. (I guess they already get a lot of those.)
- Add a page on an existing older site which is targeted for the given search query,
and have that point to the new site. I haven't yet tried this, but I am tempted to, as long as
I can be sure of doing it in a way that doesn't create some new risk of being identified by Google
as a website that is "trying too hard".
- Pick a new search phrase. It's a bit late for that, given that I have spent over a year writing a book
called "What is Music? Solving a Scientific Mystery", the purpose of which is to explain my theory
which attempts to answer precisely that question.