How to Incorporate User Experience into Web Design and Overall SEO Strategy, Part 1 | Empathy and SEO Keywords

First we should consider: what is the focus of the user experience? Since users will define your business by their experience more than by any other aspect of your marketing plan, you should first assume that the focus is to ensure that your users' experiences meet at least the basic expectations of your business model. This means that in every aspect of your marketing approach, the user experience should become the focus of your efforts, not just conversions and sales.

Using this definition, let's consider how best to ensure a quality user experience from start to finish. The user experience has to start with empathy for what the user wishes to encounter in your digital business profile. More and more, digital customers want the same effort put into their online experience as they would expect when arriving at your business in person. For too many years, the standard method of welcoming users to a site was simply to inundate them with data and articulate information. While both are important, the same can be garnered from a billboard. Instead, we need to consider how to welcome visitors to our sites, even when we rank at the top for a keyword. (What is the point of ranking highest if you also have the highest bounce rate?)

So we start with empathy: why did this customer search this term? Choosing keywords needs to be user oriented. For too many years, marketing directors and SEO gurus have chosen keywords based solely on sales and conversions, with little regard for the additional reasons a customer may come to their site; as a result, the market has been marginalized when it comes to keyword targeting. Users are not completely satisfied with search engine results now, due to the mixed bag of outcomes in the results. Much of this is the fault of SEO strategies that have pushed businesses to the top of search results by providing the best answer in text, even when they may not be the "best answer" in reality.
This has caused the modern consumer to become less and less impressed by search engine placement. Don't get me wrong: local services and immediate-need sales still depend on top placement in the search engines, but non-immediate purchases are becoming more and more thought based. Local searches for plumbers will certainly still benefit from the top organic and map listings, but what about their non-emergency sales? Customers who are not making an impulse purchase are relying more on their own investigation and on "purchase assurance" than on addressing the issue in the easiest way possible.

As a society, we are becoming more diligent about the items we individually care most about, and as a result, our marketing methods have to shift to accommodate digital narcissism. We are obsessed with the desire to garner the attention of our fellows and to be recognized as individual "truth givers" and "knowledgeable people." People inherently work for the acceptance of others, and with the advent of the digital age, people have begun basing their self-worth on how others view their ability to be knowledgeable resources of data and information. As a result, you should start by considering what the individual will be looking for to satisfy that internal need, and proceed with a marketing strategy that incorporates this knowledge. Google coined the term "Zero Moment of Truth"; I would go so far as to say that this is the precursor to it. Empathizing with the average user and breaking him down to his core understanding has to come first. Doing so allows us to choose keywords based on what the user actually wants in the way of an experience, not just what we wish to sell them.

There are many deeper aspects to this line of thought. Next, we will discuss why finding the customer's/visitor's interests before they do is a vital use of this empathy and the next step in your user-experience-based marketing strategy.

How SEO and Reputation Content Countermeasures Are Ruining The Internet Part 1

Marketing has always consisted of presenting a narrative of your product, service, or ideas, and crafting it in a manner pleasing to the audience. With the increase in data sharing and interaction online, this has only grown. The problem with this growth is that it brings an increasing rate of growth in Content Countermeasures: attempts to stifle, erase, or completely distort the truth about specific content online. Content Countermeasures are nothing more than attempts to deceive the public with exaggerated, inflated and, in some cases, invented information. Sometimes this serves a legitimate purpose, like incentivizing positive reviews and ratings from clients to overcome someone who griped about not getting his water and hot bread fast enough when entering a restaurant. There are certainly legitimate reasons for businesses to present a positive image of themselves to the public. But there is a duality to this topic, as there are those who purposefully mislead the public with positive or negative spam about business content, creating a countermeasure to any possible chance of receiving accurate information. While this might seem like a trifling act, it damages the public's reasonable expectation of receiving correct and verifiable data in search results.

Reputation Management Scams

When it comes to the dirtier uses of Content Countermeasures, the "reputation managers" of the world are almost always among the top offenders. The ads heard on talk radio and pitched to local companies are usually nothing short of bragging about how they can spam the public with disinformation about your company or yourself. Sure, it's sold as "restoring your good name," but if you are going to take extreme steps (and reputation management often requires extreme steps) to generate enough faked or duplicated content to push down the negative reviews and ratings, then the entire concept of public relations is in shambles.
Again, this isn't criticism of those who have a solid and honest goal of ensuring honest content about their business (we will talk about the content assassins below). Let's be clear: if you have to market your company or yourself with false or spammed information, then you or your product simply aren't worth what you are trying to present them as.

Scam Spam

The same holds true for people who want to push down legitimate information to hide their concerning content from the public. Content writers are notorious for this one. We get a different one filling our spam folder every week with messages bragging about how they can spam the highest-ranking blogs on the internet... for a price, of course. This is a two-pronged version of Content Countermeasure SEO. First, they spam mass content about themselves on the front end; then, they have to do something to bury complaints from all of the people who received the unsolicited content. There are the guys who have dozens, sometimes hundreds, of Blogger, Weebly, Tumblr, etc. sites with several variations of their own name, all with content claiming to be the most relevant. We've found the same thing done lately by people who have warrants and want to confuse police (from identifiable IP addresses, though, so it's unclear how well that works). Additionally, they will play the game of buying any and all variations of their name in domain format and, again, spamming the internet with redundant and often obscure data in an attempt to divide the relevancy of the person's identity. We ran into one of these guys recently who had over 200 variations of his own name out there, all to push down the 40-50 Ripoff Report complaints filed about his poor business practices. The spam data acts as an SEO countermeasure to prevent people from finding out about his actual business practices, focusing them instead on a false narrative that is completely opposite to reality.
No, I am not coming down on narratives themselves. As I said in the intro to this post, narratives are shaped by the marketer, but as soon as they become completely falsified narratives, nothing is left but a dishonest scam perpetrated on the reader. The petty marketers who believe there is a magical line of lies they can hover on and still keep their integrity intact are some of the most shining examples of cognitive dissonance available.

SEO Assassins

These are the lowest bottom-feeders of the internet. They include "Yelpers," "competing reviewers," and all the others willing to destroy. Content Assassins, or SEO Assassins, are often written off as competitors or disgruntled employees, but we've found more instances lately where little to no association was the culprit. The internet gives strength to those who wish to damage others with impunity. A restaurant client of ours apparently slighted a patron by serving his therapist. The individual saw this as a slight against him and went on to commit a fake review and spam campaign to destroy the restaurant's reputation. Using a photoshopped comment image, the spammer made it appear that the restaurant (run by a gay man) had made anti-gay statements to him on Facebook. This one fake image, posted in several LGBT social media groups, caused over 500 negative reviews in one night. By the time the restaurant came to us, there were over 900 review accounts to send individual requests and explanations to.

Low-Cost, High-Impact Ways Businesses Can Boost Local Sales

If you're a small business owner, you've likely felt some pains because of the tighter lending regulations from banks.  Your advertising budgets are already small or non-existent, making advertising even harder now.  Fortunately, digital advertising has some low-cost options, allowing small businesses to be able to expand their limited budgets.

Many small businesses have Facebook and Twitter pages, and maybe a few listings in local directories, but that is the extent of their advertising. This is not for lack of interest, however. In a recent study released by market research firm BIA/Kelsey, 40 percent of small and midsized businesses said they plan to increase digital spending in the next year.

So how can businesses get the best return on investment from digital, internet and social media advertising methods?

Low Cost Internet Marketing Ideas

It is understandable that companies do not want to put up the cash for traditional outbound marketing techniques, but there are actually a lot of inexpensive (and free) ways to market a business online that are underutilized. If you want to maximize your online presence, look beyond the typical social networks and directories and make use of these options as well:

Register your company with Google's Places for Business. It's estimated that 97 percent of consumers decide which local businesses to frequent based on an online search. This free service from Google literally puts you "on the map" so people can find the products or services you offer on a local level. If you do not have an office that is open to the public, you can choose a service-area option during signup and hide your physical address. If you have multiple locations, you will want to sign up separately for each spot. Additionally, you will want to be added to several citation sites to reinforce the Google Places listing. Ensure that the location, phone, and name of your business are the same on each of these sites. The best results come from consistency.
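That consistency is easy to spot-check programmatically. Below is a minimal sketch (the listing data, source names, and field layout are hypothetical, not pulled from any directory's API) that flags name/address/phone fields that disagree after light normalization:

```python
# Sketch: check that name/address/phone (NAP) data matches across
# citation listings. The listings below are invented for illustration;
# in practice you would collect them from each directory yourself.

def normalize(value: str) -> str:
    """Lowercase and drop punctuation so trivial differences don't count."""
    return "".join(ch for ch in value.lower() if ch.isalnum())

def nap_mismatches(listings):
    """Return (source, field) pairs that differ from the first listing."""
    baseline = listings[0]
    problems = []
    for listing in listings[1:]:
        for field in ("name", "address", "phone"):
            if normalize(listing[field]) != normalize(baseline[field]):
                problems.append((listing["source"], field))
    return problems

listings = [
    {"source": "google", "name": "Naper Design",     "address": "123 Main St", "phone": "630-555-0100"},
    {"source": "yelp",   "name": "Naper Design LLC", "address": "123 Main St", "phone": "630-555-0100"},
]
print(nap_mismatches(listings))  # [('yelp', 'name')]
```

The normalization step keeps "(630) 555-0100" and "630-555-0100" from being flagged while still catching real discrepancies like an added "LLC."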

Join your local Chamber of Commerce. For a small annual fee, you can take advantage of the many networking opportunities your local Chamber of Commerce provides. This is an excellent way to learn how area laws will affect your business operations and to rub elbows with potential clients. New members usually get the chance to promote their business on the Chamber's website or in its newsletter, which could lead to a boost in business.

Seek out barter swaps with other local companies. These can take many different forms depending on the resources you have available to trade. Consider comparable online banner ads, or simply leave business cards at each other's physical locations. You may also want to look into exchanging guest blog posts with other area businesses. By linking to each other, you will build up search engine credibility for people searching for businesses in your area. There are really no limits here, and business swap ideas are free. Try to find companies with which a trade makes sense - for example, a home inspector may find value in a swap relationship with a local Realtor or property management office.

Make a business video. People love to watch videos online, so why not ride that wave? Make a brief video explaining your services or products and post it several places, including your official website and YouTube.  Show your expertise in your field with a video that explains what you do and why you are the best at it.  People like to put a face with a business name, especially when it comes to local companies, so give them a reason to pick you over the other options.


5 SEO Tips for Launching a New Website and New Brand

You may feel that you have the perfect search strategy, but there are still more tactics to consider. Here are some basic SEO tips for launching a new website and new brand:

Rebuilding a Website
  •      As you transition from your old website to a new SEO-targeted website, you can maintain the link equity in which you have already invested. This is vital for getting your new website to rank quickly. The most effective way to achieve this is to leave your URLs untouched. However, this is not always possible if you are improving your URL structure or moving to a new domain. Often, updating to a strong database-built web design will result in some changes to link mapping.
  •      Images are ideal for making your new website engaging and attractive. Unfortunately, search engines have not yet reached a point where they can decode image-based content. The use of alt tags enables search engines to comprehend the meaning of images; this is also valuable for visitors who are visually impaired. Alt tags should be vivid and contain the same keywords for which your page aims to rank.
  •      External links can pass link authority to your website, while internal links distribute that authority across your website's pages. In-page cross-linking offers opportunities for appropriate referrals to additional pages with relevant data; these opportunities are typically overlooked during a redesign. Page copy can be an ideal place to link to deeper website pages, with anchor text pointing to long-tail keywords.
  •      Local results have taken a greater standing in the SERPs. Naturally, it depends on your brand; local can be your whole strategy instead of a small portion of your overall plan. Keep in mind that a locally focused approach works only if your business model is focused locally. There are several locally relevant recommendations to consider in combination with your fresh new website launch, but most importantly, your maps should always be the focus of local SEO. Our Naperville Web Design and SEO Team targets citation listings (map directories) and locally relevant review sites to ensure a cohesive and universal set of data descriptions.
  •      A new Google search feature, authorship, is expected to increase in relevance over time. Author Rank is the future of authorship, where pages are ranked based on the authority of the writer as opposed to the authority of the page/website. An investment in authorship can help you not only in the short term but in the long term as well. We suggest that your content creators work on and off of your sites to develop a stronger authorship ranking.
To find out more information about how SEO can help improve your web visibility, contact our Chicago and Naperville SEO Team.

Avoiding Spam When Building Strong Backlinks Part 2

Last time, we spoke of the dangers of building backlinks through spammy techniques. Here are examples of those dastardly techniques and the proper methods of replacing them when dealing with blogs. Saturday we will discuss how to build strong directory references without being spammy.

You will need to first download the SEOmoz toolbar for Firefox. This toolbar has a function that shows "nofollow" tags on page links. Using this tool will allow you to determine which blogs are worth commenting and interacting with for links and trackbacks.

Almost everyone with a blog has visited or even pursued Technorati (if you haven't, you will after reading this). Within Technorati, you have access to blogs that are separated by category and relevance. There are other blog index sites that maintain a strong list of blogs by content, but Technorati tends to be the most reliable for the search engines. Once you are on the site, search for a blog that resembles the topic of whatever page you wish to link to. If you are choosing to link to your main page, choose a blog that is closely associated with the keywords of your entire site. If you are linking to a specific page, search for a blog that covers that specific topic. Technorati has a ranking system that allows the user to comb through different levels of relevance. Obviously, you want to get links from the site most relevant to the topic being covered.

Using this method, you'll want to start by performing a time-honored blogging tradition: RTFA! Read the article, people. Seriously, if you don't know what the topic is, it will show in what you've written. Once you read the article, it's time to interact. Make sure that the comment you leave is more than the following junk: "nice post," "I agree with the points you made," "good thoughts but I take issue with your points." These comments are common and quite annoying pieces of spam.
These statements will likely be caught in any filter and removed from any site that cares about its relevancy. It is always best to be part of a conversation: if you've read a post, use enough of the material within it to make a valid statement. Here's a list of other rules that will help your blog comments avoid negative treatment:
  • Only leave comments that are a full, comprehensible sentence.
  • Sign up for updates on future comments.  This can lead to many links and a future relationship between your site and the site you are commenting on.
  • Leave only one link on the site: the one in your name field.  Leaving tons of links in the content of your comments makes them strongly resemble unwanted communication, leading many to flag your comments as spam.  If your site links or email addresses become associated with spam, it's very likely that your future comments on other blogs will be filtered as well.
  • Leave the auto commenting software to those who don’t mind being banned from the search engines.  It’s just not worth it.
  • Read the articles!!! There will always be a better exchange of ideas when you do, and you'll likely receive more convertible visitors to your website if they believe your communication to be respectful.

Ensuring Quality Backlinks For Your Website Part 1

This is the first of three posts that will be added over the next few days. There is very little chance that the thoughts will stay completely congruent, but there's just far more than one post within this topic...

While linking structure has always been an important piece of the SEO equation, it seems that a more intuitive linking structure is becoming more and more necessary for modern SEO development. The days of submitting to random directories or commenting on unrelated blogs are long gone. Spam comments and unrelated directory listings damage website relevance, and search engines are constantly looking for ways to counter their effects. That being said, there are highly effective methods of getting related backlinks; they just require more work than the spammers and scammers are likely to give.

If you're wondering, "Is Christian about to rail on the spammers again?" the answer is always yes! Spammers not only made the system harder for all of us, they've caused themselves to become endangered. Yes, you can still find tons of affiliate websites offering their "link building software." This software is usually some form of blog-spamming software. While these sites can be found all over the place, they're owned by only a small group of spam-jockeys who are becoming marginalized as people grow more aware of affiliate scams and MLMs.

OK, now that we've got that out of the way: blog backlink software is going to do nothing for your site but possibly get it banned from Google. Not only is the software disastrous for your reputation with the SERPs, it almost always comes with a very short and limited list of sites to choose from. The vendors claim that their programs search through Google, but this is only a half-truth; they are most often pre-rigged with a select set of search criteria. The resulting searches leave the user with the same tired group of blogs that have been used thousands of times before by other spammers.
Most of the sites and pages that have been bombarded by spam are already flagged by Google. Add your link to such a page, and you risk being considered spam as well. With that out of the way, the question becomes how to get quality links while avoiding being all spammy. Unfortunately, the answer will have to wait until tomorrow. We're out of coffee.

Excellent Free SEO Monitoring Websites

Like everyone else, we are constantly looking for ways to save money on our bottom line. We are often testing and grading monitoring and analytics websites and software to determine which is the most effective and gives the most accurate results. If you've played with even half of the major names, you've likely run into and identified one major problem... they all give different numbers. Many find themselves wondering which ones to trust and what to make of the numbers they've received. Here is our take on the top six site analytics resources and why we use them.
  1. Google Analytics - OK. To those who always wish to bash the resource, I start by agreeing with you that Google loses a lot of information when scripts are blocked. That being said, the number of people using script blockers is still not that high, so it's really a moot point. Google Analytics is the most widely used website traffic index and is a must for web traffic monitoring. (To avoid the debate, that's all we're saying about Google today.)
  2. Majestic SEO Tools - While Majestic has been around and has been a major resource for a few years now, the past year has brought a lot of new and useful tools to the site. While it requires an account to gain access to some of the better tools, they are well worth the free account and the ten minutes it takes to set it up. After a month of indexed tracking, Majestic provides a complete breakdown of all your website backlinks. This is quite useful in determining and targeting where to make more backlinks and what anchor text to use with them. Granted, there are plenty of costly and buggy pieces of software that will perform this function, but most of them accompany an affiliate scam or someone's get-rich-quick pyramid scheme. While the data for the links may lag, it's fairly respectable in determining the needs of a backlink campaign.
  3. Quantcast -  Same as below...
  4. Compete - Do they help? Honestly, not so much. What they do offer is valued information for those considering advertising on your website. While many would argue, and often I'd agree, that the information in both sites is a random guess, they are free and offer a strong form of traffic evidence when looking to sell ad space on your site. Whether you like them or not, it is suggested that you get listed in their database. While you may not like or trust their services, most advertising groups are going to want to see their numbers before putting ads on your site.
  5. SEOmoz Free Tools - This is an incredible assortment of custom tools designed for and by SEOmoz. The SEO and link-structure tools provided here are much stronger than most of what the competition offers. While we have yet to try their Pro Tools, we will soon be putting them to the test. All I can say is that if the Pro Tools excel as far above the free tools, we will likely remain customers and refer our clients. The free tools include several strong options, but the best and most unique is the "Trifecta Tool." This tool gives a full breakdown of a site in page, domain, or blog format. The information given here is usually free in its individual formats, but the way they collate and arrange the data makes for a clean and valuable view of a website's performance. It's well worth signing up for a free account to try it out.
  6. HubSpot Program - What can't be said for the brilliant strategy of inbound marketing that HubSpot has highlighted with this and many of its other websites? This grader is one of the most widely used website evaluation tools in use today. With over 2.6 million sites graded to date and an Alexa ranking around 2,000, it is becoming the most widely used "quick check" for determining any website's effectiveness. We, like anyone else who has received it, value the 99% grade we were given, and we review an updated score on a weekly basis to see what adjustments need to be made. If you have yet to run your site through their system, give it a try. The information given will only serve to increase the reach and effectiveness of your website.

PrestaShop Development for e-Commerce in Chicago

We are pleased to announce that Naper Design will be developing our first major PrestaShop template and website. While we've enjoyed playing with it for some time now, we will finally have the opportunity to develop a major site with this highly versatile e-commerce solution. Here's a description of PrestaShop from the core site: "PrestaShop e-Commerce Solution was built to take advantage of essential Web 2.0 innovations such as dynamic AJAX-powered features and next-generation ergonomy. PrestaShop guides users through your product catalog intelligently and effortlessly, turning intrigued visitors into paying customers." PrestaShop began as a community-maintained e-commerce software project, and the script is maintained in much the same way WordPress is. The most up-to-date version of PrestaShop can be downloaded here. It offers the following:


  • Featured products on homepage
  • Top sellers on homepage
  • New items on homepage
  • 'Free shipping' offers
  • Cross-selling (Accessories)
  • Product image zoom
  • Order out-of-stock items
  • Customer subscription & user accounts
  • Payment by bank wire
  • Google Checkout module
  • Cash-On-Delivery (COD)
  • Preconfigured for Paypal


  • Unlimited categories & subcategories, product attribute combinations, product specs, images, currencies, tax settings, carriers & destinations
  • Real-time currency exchange rates
  • Inventory management
  • Unlimited languages & dialects
  • 100% modifiable graphic themes
  • Newsletter contact export
  • Alias search
  • SSL encryption
  • Visitors online
  • Customer groups
Our first site built on the platform (for mainstream use, anyway) will be Malloys Finest. The site that we are preparing for them can be found here.

Template Monster Releases New PrestaShop Themes

Brooklyn, New York, July 7th, 2010 - TemplateMonster, the Internet's largest template provider, introduces brand-new PrestaShop themes: an eCommerce solution designed specifically to provide customers with a speedy and very lightweight tool for setting up their own PrestaShop stores. PrestaShop is a new, feature-rich, open-source eCommerce software package with a very powerful back-end for its size. Combined with decent and affordable designs, PrestaShop is an efficient eCommerce shopping cart solution for small and medium-sized online businesses. The PrestaShop multilingual eCommerce system is fully customizable, quick, and simple to install, and site owners will appreciate its flexibility, as it supports unlimited categories, subcategories, and product images. Besides, store admins can easily manage inventory, customers, orders, and payments. According to the company, the PrestaShop themes provided by TemplateMonster are developed to take advantage of this new shopping cart system. The themes are HTML/CSS validated and optimized for fast page loading; they support all major browsers and have built-in jQuery elements for even more spectacular design. David Braun, CEO of Template Monster, said, "Our customers have been asking us to launch PrestaShop themes for many months now. And today we can proudly claim that TemplateMonster now offers innovative PrestaShop themes brushed and polished by our website design pros. Dedicated to providing unbeatable functionality for customers' online eCommerce presence, our PrestaShop design solutions ensure you'll fully enjoy the fantastic functionality of PrestaShop. Not to mention that the themes are extremely easy to modify. And of course we are eager to extend our PrestaShop selection by adding more and more new designs to this product type. So be sure to check back for more premium-quality PrestaShop templates!"
Previously the company has launched the Free PrestaShop Theme for cell phone store. This template is still available for download at Free PrestaShop Theme download page.

Statistics a Win for SEO

Posted by bhendrickson

We recently posted some correlation statistics on our blog. We believe these statistics are interesting and provide insight into the ways search engines work (a core principle of our mission here at SEOmoz). As we will continue to make similar statistics available, I'd like to discuss why correlations are interesting, refute the math behind recent criticisms, and reflect on how exciting it is to engage in mathematical discussions where critiques can be definitively rebutted.

I've been around SEOmoz for a little while now, but I don't post a lot. So, as a quick reminder: I designed and built the prototype for SEOmoz's web index, and wrote a large portion of the back-end code for the project. We shipped the index with billions of pages nine months after I started on the prototype, and we have continued to improve it since. Recently I built the machine learning models that are used to compute Page Authority and Domain Authority, and I am working on some fairly exciting stuff that has not yet shipped. As I'm an engineer and not a regular blogger, I'll ask for a bit of empathy for my post - it's a bit technical, but I've tried to make it as accessible as possible.

Why does Correlation Matter?

Correlation helps us find causation by measuring how much variables change together. Correlation does not imply causation; variables can be changing together for reasons other than one affecting the other. However, if two variables are correlated and neither is affecting the other, we can conclude that there must be a third variable that is affecting both. This variable is known as a confounding variable. When we see correlations, we do learn that a cause exists -- it might just be a confounding variable that we have yet to figure out.
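The confounding-variable effect is easy to make concrete with a quick simulation. In this sketch (all numbers are simulated, not real SEO data), a hidden "quality" variable drives both observed variables, which then correlate strongly even though neither affects the other:

```python
# Sketch: two variables that never affect each other can still correlate
# strongly when a third (confounding) variable drives both.
import random

random.seed(0)

# "quality" is the hidden confounder; it pushes both observed variables up.
quality = [random.random() for _ in range(1000)]
links   = [q + random.gauss(0, 0.1) for q in quality]
ranking = [q + random.gauss(0, 0.1) for q in quality]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no external libraries."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Strong correlation, zero causation: only "quality" did any causing.
print(round(pearson(links, ranking), 2))
```

With noise small relative to the spread of the confounder, the correlation comes out high even though `links` and `ranking` were generated independently given `quality` - exactly the situation the paragraph above warns about.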

How can we make use of correlation data? Let's consider a non-SEO example.

There is evidence that women who occasionally drink alcohol during pregnancy give birth to smarter children with better social skills than women who abstain. The correlation is clear, but the causation is not. If there is causation between the variables, then light drinking will make the child smarter. If it is a confounding variable, light drinking could have no effect, or could even make the child slightly less intelligent (which is suggested by extrapolating from the data that heavy drinking during pregnancy makes children considerably less intelligent).

Although these correlations are interesting, they are not black-and-white proof that behaviors need to change. One needs to consider which explanations are more plausible: the causal ones or the confounding variable ones. To keep the analogy simple, let's suppose there were only two likely explanations - one causal and one confounding. The causal explanation is that alcohol makes a mother less stressed, which helps the unborn baby. The confounding variable explanation is that women with more relaxed personalities are more likely to drink during pregnancy and less likely to negatively impact their child's intelligence with stress. Given this, I probably would be more likely to drink during pregnancy because of the correlation evidence, but there is an even bigger take-away: both likely explanations damn stress. So, because of the correlation evidence about drinking, I would work hard to avoid stressful circumstances. *

Was the analogy clear? I am suggesting that as SEOs we approach correlation statistics like pregnant women considering drinking - cautiously, but without too much stress.

* Even though I am a talented programmer and work in the SEO industry, do not take medical advice from me, and note that I contrived the likely explanations for the sake of simplicity :-)

Some notes on data and methodology

We have two goals when selecting a methodology to analyze SERPs:

  1. Choose measurements that will communicate the most meaningful data
  2. Use techniques that can be easily understood and reproduced by others

These goals sometimes conflict, but we generally choose the most common method still consistent with our problem. Here is a quick rundown of the major options we had, and how we decided between them for our most recent results:

Machine Learning Models vs. Correlation Data: Machine learning can model and account for complex variable interactions. In the past, we have reported derivatives of our machine learning models. However, these results are difficult to create, they are difficult to understand, and they are difficult to verify. Instead we decided to compute simple correlation statistics.

Pearson's Correlation vs. Spearman's Correlation: The most common measure of correlation is Pearson's correlation, although it only measures linear correlation. This limitation is important: we have no reason to think interesting correlations to ranking will all be linear. Instead we chose to use Spearman's correlation, which is still quite common and does a reasonable job of measuring any monotonic correlation.

Here is a monotonic example: The count of how many of my coworkers have eaten lunch for the day is perfectly monotonically correlated with the time of day. It is not a straight line and so it isn't linear correlation, but it is never decreasing, so it is monotonic correlation.

Here is a linear example: assuming I read at a constant rate, the amount of pages I can read is linearly correlated with the length of time I spend reading.
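To make the distinction concrete, here is a quick sketch (my own illustration, not data from our study): an exponential relationship is perfectly monotonic, so Spearman's correlation scores it 1, while Pearson's, which only measures linear correlation, scores it well below 1.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def ranks(v):
    """Map each value to its 1-based rank; assumes no ties."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(a, b):
    """Spearman's correlation is Pearson's correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

xs = list(range(1, 21))
ys = [2 ** x for x in xs]   # exponential, but perfectly monotonic

print(spearman(xs, ys))     # 1.0: the ranks line up exactly
print(pearson(xs, ys))      # noticeably below 1: the relationship is not linear
```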

Mean Correlation Coefficient vs. Pooled Correlation Coefficient: We collected data for 11,000+ queries. For each query, we can measure the correlation of ranking position with a particular metric by computing a correlation coefficient. However, we don't want to report 11,000+ correlation coefficients; we want to report a single number that reflects how correlated the data was across our dataset, and we want to show how statistically significant that number is. There are two techniques commonly used to do this:

  1. Compute the mean of the correlation coefficients. To show statistical significance, we can report the standard error of the mean.
  2. Pool the results from all SERPs and compute a global correlation coefficient. To show statistical significance, we can compute standard error through a technique known as bootstrapping.

The mean correlation coefficient and the pooled correlation coefficient would both be meaningful statistics to report. However, the bootstrapping needed to show the standard error of the pooled correlation coefficient is less common than using the standard error of the mean. So we went with #1.
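Here is a sketch of approach #1 on simulated data (the queries, metric values, and noise level are all invented for illustration): compute one Spearman coefficient per SERP, then report the mean of those coefficients along with the standard error of the mean.

```python
import random

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def ranks(v):
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(a, b):
    return pearson(ranks(a), ranks(b))

def mean_and_sem(values):
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)  # sample variance
    return m, (var / n) ** 0.5                         # standard error of the mean

random.seed(1)
coefficients = []
for _ in range(2000):                 # one simulated SERP per query
    positions = list(range(1, 11))    # ranking positions 1..10
    metric = [p + random.gauss(0, 5) for p in positions]  # noisy metric
    coefficients.append(spearman(positions, metric))

m, sem = mean_and_sem(coefficients)
print(m, sem)  # one mean coefficient, plus a measure of its significance
```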

Fisher Transform vs. No Fisher Transform: When averaging a set of correlation coefficients, instead of computing the mean of the coefficients directly, one sometimes computes the mean of the Fisher transforms of the coefficients and then applies the inverse Fisher transform to the result. This would not be appropriate for our problem because:

  1. It will likely fail. The Fisher transform involves dividing by one minus the coefficient, so it explodes when an individual coefficient is near one and fails outright when a coefficient equals one. Because we are computing hundreds of thousands of coefficients, each over a small sample, it is quite likely the Fisher transform would fail for our problem. (Of course, we have a large sample of these coefficients to average over, so our end standard error is not large.)
  2. It is unnecessary, for two reasons. First, the advantage of the transform is that it can bring the expected average closer to the expected coefficient, and we do nothing that assumes this property. Second, when coefficients are near zero this property approximately holds without the transform, and our coefficients were not large.
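The failure mode is easy to demonstrate. The Fisher transform is artanh(r) = ½ ln((1 + r) / (1 − r)), and the sketch below shows it growing without bound as a coefficient approaches one and failing outright at exactly one:

```python
import math

for r in [0.5, 0.9, 0.999999]:
    print(r, math.atanh(r))   # the transformed value blows up as r nears 1

try:
    math.atanh(1.0)           # 1 - r is zero here, so the transform is undefined
except ValueError as e:
    print("fails at r = 1:", e)
```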

Rebuttals To Recent Criticisms

Two bloggers, Dr. E. Garcia and Ted Dzubia, have published criticisms of our statistics.

Eight months before his current post, Ted Dzubia wrote an enjoyable and jaunty post lamenting that criticizing SEO every six to eight months was an easy way to generate controversy, noting "it's been a solid eight months, and somebody kicked the hornet's nest. Is SEO good or evil? It's good. It's great. I <3 SEO." Furthermore, his Twitter feed makes it clear he sometimes trolls for fun. To wit: "Mongrel 2 under the Affero GPL. TROLLED HARD," "Hacker News troll successful," and "mailing lists for different NoSQL servers are ripe for severe trolling." So it is likely we've fallen for trolling...

I am going to respond to both of their posts anyway because they have received a fair amount of attention, and because both posts seek to undermine the credibility of the wider SEO industry. SEOmoz works hard to raise the standards of the SEO industry, and protect it from unfair criticisms (like Garcia's claim that "those conferences are full of speakers promoting a lot of non-sense and SEO myths/hearsays/own crappy ideas" or Dzubia's claim that, besides our statistics, "everything else in the field is either anecdotal hocus-pocus or a decree from Matt Cutts"). We also plan to create more correlation studies (and more sophisticated analyses using my aforementioned ranking models) and thus want to ensure that those who are employing this research data can feel confident in the methodology employed.

Search engine marketing conferences, like SMX, OMS and SES, are essential to the vitality of our industry. They are an opportunity for new SEO consultants to learn, and for experienced SEOs to compare notes. It can be hard to argue against such subjective and unfair criticism of our industry, but we can definitively rebut their math.

To that end, here are rebuttals for the four major mathematical criticisms made by Dr. E. Garcia, and the two made by Dzubia.

1) Rebuttal to Claim That Mean Correlation Coefficients Are Uncomputable

For our charts, we compute a mean correlation coefficient. The claim is that such a value is impossible to compute.

Dr. E. Garcia : "Evidently Ben and Rand don’t understand statistics at all. Correlation coefficients are not additive. So you cannot compute a mean correlation coefficient, nor you can use such 'average' to compute a standard deviation of correlation coefficients."

There are two issues with this claim: a) peer-reviewed papers frequently publish mean correlation coefficients; b) additivity is relevant for determining whether two different meanings of the word "average" will have the same value, not whether the mean is computable. Let's consider each issue in more detail.

a) Peer Reviewed Articles Frequently Compute A Mean Correlation Coefficient

E. Garcia is claiming something is uncomputable that researchers frequently compute and include in peer reviewed articles. Here are three significant papers where the researchers compute a mean correlation coefficient:

"The weighted mean correlation coefficient between fitness and genetic diversity for the 34 data sets was moderate, with a mean of 0.432 +/- 0.0577" (Macquarie University - Reed, Frankham; "Correlation between Fitness and Genetic Diversity"; Conservation Biology, 2003)

"We observed a progressive change of the mean correlation coefficient over a period of several months as a consequence of the exposure to a viscous force field during each session. The mean correlation coefficient computed during the force-field epochs progressively..." (MIT - F. Gandolfo, et al; "Cortical correlates of learning in monkeys adapting to a new dynamical environment," 2000)

"For the 100 pairs of MT neurons, the mean correlation coefficient was 0.12, a value significantly greater than zero" (Stanford - E Zohary, et al; "Correlated neuronal discharge rate and its implications for psychophysical performance", 1994)

SEOmoz is in a camp with reviewers from the journal Nature, as well as researchers from MIT, Stanford, and the authors of 2,400 other academic papers that use the mean correlation coefficient. Our camp is being attacked by Dr. E. Garcia, who argues that we don't "understand statistics at all." It is fine to take positions outside the scientific mainstream, but when Dr. E. Garcia takes such a position he should offer more support for it. Given how commonly Dr. E. Garcia uses the pejorative "quack," I suspect he does not mean to take positions this far outside of academic consensus.

b) Additivity Is Relevant For Determining Whether Different Meanings Of "Average" Agree, Not Whether The Mean Is Computable

Although "mean" is quite precise, "average" is less so. By "average" one might intend "mean", "mode", "median," or something else. One other thing "average" can mean is 'the value of a function applied to the union of the inputs'. This last definition might seem odd, but it is sometimes used. Consider if someone asked, "a car travels 1 mile at 20mph and 1 mile at 40mph; what was the average mph for the entire trip?" The answer they are looking for is not 30mph, the mean of the two measurements, but ~26.7mph, the mph for the whole 2-mile trip. In this case, the mean of the measurements differs from the colloquial average, which is the function for computing mph applied to the union of the inputs (the whole two miles).
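The arithmetic of the car example is worth spelling out:

```python
# The mean of the two speed measurements:
mean_mph = (20 + 40) / 2              # 30.0

# The colloquial "average": the mph function applied to the whole trip.
total_miles = 1 + 1
total_hours = 1 / 20 + 1 / 40         # time spent at each speed
trip_mph = total_miles / total_hours  # ~26.67 (the harmonic mean of the speeds)

print(mean_mph, trip_mph)
```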

This may be what has confused Dr. E. Garcia. Elsewhere, when repeating this claim, he cites Statsweb, which makes the point that this other "average" is different from the mean. Additivity is useful in determining whether these averages will differ. But even if another interpretation of average is valid for a problem, and even if that other average differs from the mean, this makes the mean neither uncomputable nor meaningless.

2) Rebuttal to Claim About Standard Error of the Mean vs Standard Error of a Correlation Coefficient

Although he has stated unequivocally that one cannot compute a mean correlation coefficient, Garcia is quite opinionated on how we ought to have computed standard error for it. To wit:

E. Garcia: "Evidently, you don’t know how to calculate the standard error of a correlation coefficient... the standard error of the mean and the standard error of a correlation coefficient are two different things. Moreover, the standard deviation of the mean is not used to calculate the standard error of a correlation coefficient or to compare correlation coefficients or their statistical significance."

He repeats this claim even after making the point above about mean correlation coefficients, so he is clearly aware that the correlation coefficients being discussed are mean coefficients, not coefficients computed after pooling data points. So let's be clear on exactly what his claim implies. We have some measured correlation coefficients, and we take the mean of these measured coefficients. The claim is that we should have computed the standard error of this mean using the formula for the standard error of a single correlation coefficient. Garcia's claim is incorrect: one uses the formula for the standard error of the mean.

The formula for the mean, and for the standard error of the mean, apply even if there is a way to separately compute standard error for one of the observations the mean was over. If we were computing the mean of the count of apples in barrels, lifespans of people in the 19th century, or correlation coefficients for different SERPs, the same formula for the standard error of this mean applies. Even if we have other ways to measure the standard error of the measurements we are taking the mean over - for instance, our measure of lifespans might only be accurate to the day of death and so could be off by 24 hours - we cannot use how we would compute standard error for an observation to compute standard error of the mean of those observations.

A smaller, related objection is over language. He objects to my using "standard deviations" to describe how far a point is from a mean, in units of the mean's standard error. As Wikipedia notes, the "standard error of the mean (i.e., of using the sample mean as a method of estimating the population mean) is the standard deviation of those sample means." So the count of how many lengths of standard error a number lies from the estimate of a mean would, according to Wikipedia, be standard deviations of our mean estimate. Beyond being technically correct, the phrase also fit the context, which was the accuracy of the sample mean.

3) Rebuttal to Claim That Non-Linearity Is Not A Valid Reason To Use Spearman's Correlation

I wrote "Pearson’s correlation is only good at measuring linear correlation, and many of the values we are looking at are not. If something is well exponentially correlated (like link counts generally are), we don’t want to score them unfairly lower.”

E. Garcia responded by citing a source whom he cited as "exactly right": "Rand your (or Ben’s) reasoning for using Spearman correlation instead of Pearson is wrong. The difference between two correlations is not that one describes linear and the other exponential correlation, it is that they differ in the type of variables that they use. Both Spearman and Pearson are trying to find whether two variables correlate through a monotone function, the difference is that they treat different type of variables - Pearson deals with non-ranked or continuous variables while Spearman deals with ranked data."

E. Garcia's source, and by extension E. Garcia, are incorrect. A desire to measure non-linear correlation, such as exponential correlations, is a valid reason to use Spearman's over Pearson's. The point that "Pearson deals with non-ranked or continuous variables while Spearman deals with ranked data" is true in that to compute Spearman's correlation, one can convert continuous variables to ranked indices and then apply Pearson's. However, the original variables do not need to originally be ranked indices. If they did, Spearman's would always produce the same results as Pearson's and there would be no purpose for it.

My point that E. Garcia objects to, that Pearson's only measures linear correlation while Spearman's can measure other kinds of correlation such as exponential correlations, was entirely correct. We can quickly quote Wikipedia to show that Spearman's measures any monotonic correlation (including exponential) while Pearson's only measures linear correlation.

The Wikipedia article on Pearson's Correlation starts by noting that it is a "measure of the correlation (linear dependence) between two variables".

The Wikipedia article on Spearman's Correlation starts with an example in the upper right showing that a "Spearman correlation of 1 results when the two variables being compared are monotonically related, even if their relationship is not linear. In contrast, this does not give a perfect Pearson correlation."

E. Garcia's position neither makes sense nor agrees with the literature. I would go into the math in more detail, or quote more authoritative sources, but I'm pretty sure Garcia now knows he is wrong. After E. Garcia made his incorrect claim about the difference between Spearman's and Pearson's correlation, and after I corrected E. Garcia's source (in a comment on our blog), E. Garcia stated the difference between Spearman's and Pearson's correctly. However, we want there to be a good record of the points, along with an explanation of what is true and why.

4) Rebuttal To Claim That PCA Is Not A Linear Method

This example is particularly interesting because it is about Principal Component Analysis (PCA), which is related to PageRank (something many SEOs are familiar with). In PCA one finds principal components, which are eigenvectors; PageRank is also an eigenvector. But I digress; let's discuss Garcia's claim.

After Dr. E. Garcia criticized a third party for using Pearson's Correlation because Pearson's only shows linear correlations, he criticized us for not using PCA. Like Pearson's, PCA can only find linear correlations, so I pointed out his contradiction:

Ben: "Given the top of your post criticizes someone else for using Pearson’s because of linearity issues, isn’t it kinda odd to suggest another linear method?"

To which E. Garcia has respond: "Ben’s comments about... PCA confirms an incorrect knowledge about statistics" and "Be careful when you, Ben and Rand, talk about linearity in connection with PCA as no assumption needs to be made in PCA about the distribution of the original data. I doubt you guys know about PCA...The linearity assumption is with the basis vectors."

But before we get to the core of the disagreement, let me point out that E. Garcia is close to correct in his actual statement. PCA defines basis vectors such that they are linearly de-correlated, so it does not need to assume that they will be. But this is a minor quibble. The issue with Dr. E. Garcia's position is the implication that the linear aspect of PCA lies not in the correlations it finds in the source data, as I claimed, but only in the basis vectors.

So, there is the disagreement - analogous to how Pearson's Correlation only finds linear correlations, does PCA also only find linear correlations? Dr. E. Garcia says no. SEOmoz, and many academic publications, say yes. For instance:

"PCA does not take into account nonlinear correlations among the features" ("Kernel PCA for HMM-Based Cursive Handwriting Recognition"; Andreas Fischer and Horst Bunke 2009)

"PCA identifies only linear correlations between variables" ("Nonlinear Principal Component Analysis Using Autoassociative Neural Networks"; Mark A. Kramer (MIT), AIChE Journal 1991)

However, besides citing authorities, let's consider why his claim is incorrect. As E. Garcia imprecisely notes, the basis vectors are linearly de-correlated. As the source he cites points out, PCA tries to represent the source data as linear combinations of these basis vectors. This is how PCA shows us correlations: by creating basis vectors that can be linearly combined to get close to the original data. We can then look at these basis vectors and see how aspects of our source data vary together, but because PCA only combines them linearly, it only shows us linear correlations. Therefore, PCA provides insight into linear correlations -- even for non-linear data.
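A minimal sketch of this limitation (my own example, not from either post): points on a parabola are perfectly dependent, yet their linear correlation is zero, so the covariance matrix that PCA diagonalizes is already diagonal and PCA reveals nothing about the relationship.

```python
def covariance(a, b):
    """Covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

# y is completely determined by x, but the dependence is not linear.
xs = [i / 100 for i in range(-100, 101)]
ys = [x * x for x in xs]

# The off-diagonal entry of the covariance matrix PCA operates on:
print(covariance(xs, ys))  # ~0: PCA sees no (linear) correlation at all
```

Nonlinear extensions such as kernel PCA, the subject of the Fischer and Bunke paper cited above, exist precisely to work around this.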

5) Rebuttal To Claim About Small Correlations Not Being Published

Ted Dzubia suggests that small correlations are not interesting, or at least are not interesting because our dataset is too small. He writes:

Dzubia: "out of all the factors they measured ranking correlation for, nothing was correlated above .35. In most science, correlations this low are not even worth publishing. "

Academic papers frequently publish correlations of this size. On the first page of a Google Scholar search for "mean correlation coefficient" I see:

  1. The Stanford neurology paper I cited above to refute Garcia is reporting a mean correlation coefficient of 0.12.
  2. "Meta-analysis of the relationship between congruence and well-being measures"  a paper with over 200 citations whose abstract cites coefficients of 0.06, 0.15, 0.21, and 0.31.
  3. "Do amphibians follow Bergmann's rule" which notes that "grand mean correlation coefficient is significantly positive (+0.31)."

These papers were not cherry-picked from a large number of papers. Contrary to Ted Dzubia's suggestion, the size of correlation that is interesting varies considerably with the problem. For our problem, looking at correlations in Google results, one would not expect any single feature we examined to show a high correlation unless one believed Google predominantly ranks results with a single factor and one were only interested in that factor. We do not believe that. Google has stated on many occasions that they employ more than 200 features in their ranking algorithm. In our opinion, this makes correlations in the 0.1 - 0.35 range quite interesting.

6) Rebuttal To Claim That Small Correlations Need A Bigger Sample Size

Dzubia: "Also notice that the most negative correlation metric they found was -.18.... Such a small correlation on such a small data set, again, is not even worth publishing."

Our dataset was over 100,000 results across over 11,000 queries, which is much more than sufficient for the size of correlations we found. The risk when having small correlations and a small dataset is that it may be hard to tell if correlations are statistical noise. Generally 1.96 standard deviations is required to consider results statistically significant. For the particular correlation Dzubia brings up, one can see from the standard error value that we have 52 standard deviations of confidence the correlation is statistically significant. 52 is substantially more than the 1.96 that is generally considered necessary.

We use a sample size so much larger than usual because we wanted to make sure the relative differences between correlation coefficients were not misleading. Although we feel this adds value to our results, it is beyond what is generally considered necessary to publish correlation results.
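As a back-of-the-envelope sketch of how a mean correlation of -0.18 can carry roughly 52 standard deviations of confidence (the per-query standard deviation below is an illustrative assumption, not a number from our study):

```python
import math

n_queries = 11_000        # queries in the dataset
per_query_sd = 0.36       # assumed spread of per-query coefficients (illustrative)
mean_coefficient = -0.18  # the most negative mean correlation Dzubia mentions

sem = per_query_sd / math.sqrt(n_queries)  # standard error of the mean
z = abs(mean_coefficient) / sem            # standard deviations away from zero

print(sem, z)  # z is far beyond the 1.96 usually required for significance
```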


Some folks inside the SEO community have had disagreements about our interpretations and opinions regarding what the data means (and where/whether confounding variables exist to explain some points). As Rand carefully noted in our post on correlation data and in his presentation, we certainly want to encourage this. Our opinions about where/why the data exists are just that - opinions - and shouldn't be ascribed any value beyond their use in informing your own thinking about the data sources. Our goal was to collect data and publish it so that our peers in the industry could review and interpret it.

It is also healthy to have a vigorous debate about how statistics such as these are best computed, and how we can ensure accuracy of reported results. As our community is just starting to compute these statistics (Sean Weigold Ferguson, for example, recently submitted a post on PageRank using very similar methodologies), it is only natural there will be some bumbling back and forth as we develop industry best practices. This is healthy and to our industry's advantage that it occur.

The SEO community is the target of a lot of ad hominem attacks which try to associate all SEOs with the behavior of the worst. Although we can answer such attacks by pointing out great SEOs and great conferences, it is exciting that we've been able to elevate some attacks to include mathematical points, because when they are arguing math they can be definitively rebutted. On the six points of mathematical disagreement, the tally is pretty clear - SEO community: Six, SEO bashers: zero. Being SEOs doesn't make us infallible, so surely in the future the tally will not be so lopsided, but our tally today reflects how seriously we take our work and how we as a community can feel good about using data from this type of research to learn more about the operations of search engines.



Patience is an SEO Virtue

Posted by Kate Morris

We have all been there once or twice, maybe a few more times than that even. You just launched a site or a project, and a few days pass. You log in to analytics and webmaster tools to see how things are going. Nothing is there.

WAIT. What?!?!?! 

Scenarios start running through your mind, and you check to make sure everything is working right. How could this be?

It doesn't even have to be a new project. I've spotted things on clients' sites that needed fixing: XML sitemaps, link building efforts, title tag duplication, or even 404 redirection. The right changes are made, and a week later, nothing has changed in rankings or in webmaster consoles across the board. You are left thinking "what did I do wrong?"


A few client sites, major sites mind you, have had issues recently like 404 redirection and toolbar PageRank drops. One even had to change a misplaced setting in Google Webmaster Tools that pointed to the wrong version of their site (www vs. non-www). We fixed it, and their homepage then dropped in rankings for their own name.

That looks bad. Real bad. Especially to the higher-ups. They want answers and the issue fixed now ... yesterday really.

Most of these things are being measured for performance, and some can even have a major impact on the bottom line. It is so hard to tell them this, and even harder to do, but the changes just take ...


That homepage drop? They called on Friday; as of Saturday night things were back to normal. The drop most likely lasted 2-3 days, but this is a large site. Another, smaller client had redesigned their entire site. We put all the correct 301 redirects in place for the old pages and launched the site. It took Google almost 4 weeks to completely remove the old pages from the index. Edits to URLs that caused 404 errors were fixed within a day, but took over a week to be reflected in Google Webmaster Tools.

These are just a few examples where changes were made immediately, but the actions had no immediate return. We live in a society that thrives on the present, the immediate return. As search marketers, we make c-level executives happy with our ability to show immediate returns on our campaigns. But like the returns on SEO, the reflection of changes in SEO takes time.

The recent Mayday and Caffeine updates are sending many sites to the bottom of rankings because of a lack of original content. Many of them are doing everything "right" in terms of onsite SEO, but now that isn't enough. They can change their site all they want to, but until there is relevant, good content plus traffic, those rankings are not going to return for long tail terms.

There has also been a recent crackdown on over-optimized local search listings. I have seen a number of accounts suspended or simply not ranking well because they are, in effect, trying too hard. There is such a thing as over-optimizing a site, and too many changes at once can raise a flag with the search engines.

One Month Rule


Here is my rule: make a change, leave it, go do social media/link building, and come back to the issue a month later. It may not take a month; for smaller sites, 2 weeks is a good time to check on the status of a few things. A month is when things should start returning to normal if there have been no other large changes to the site.

We say this all the time with PPC accounts. As in statistical analysis, you have to have enough data to see results. And when you are waiting for a massive search engine to make changes, once they do take effect in the system, you then have to give them time to work.

So remember the next time something seems to be not working in Webmaster Tools or SERPs:

  1. If you must, double check the code (although you’ve probably already done this 15 times) to ensure it’s set up correctly. But then,
  2. Stop. Breathe. There is always a logical explanation. (And yes, Google being slow is a logical one)
  3. When did you last change something to do with the issue?
  4. If it's less than 2 weeks ago, give it some more time.
  5. Major changes, give it a month. (Think major site redesigns and URL restructuring)



Whiteboard Friday – Facebook’s Open Graph WON’T Replace Google

Posted by great scott!

Earlier this week Facebook announced its 'Open Graph' at F8. There was all sorts of hubbub (much of it the by-product of well-orchestrated buzz) about Facebook finally making strides to kill Google's dominance of the web. So should you hang up your white hat, your black hat, your grey hat, and trade it all in for a blue hat? Much as we love Facebook, the answer, dear reader, is no: SEO is not dead.

Watch this week's video to hear Rand's take on how Facebook's 'Open Graph' will impact web marketing and all the ways it won't.  There are all sorts of opportunities that will likely emerge out of this new technology, so you should pay attention. So go ahead and keep an eye out for a nice fitting blue hat in the near future, but don't plan to throw away your white hat anytime soon.





Facebook Sticker
The sticker we received
