SEOmoz Feed

Statistics a Win for SEO

Posted by bhendrickson

We recently posted some correlation statistics on our blog. We believe these statistics are interesting and provide insight into the ways search engines work (a core principle of our mission here at SEOmoz). As we will continue to make similar statistics available, I'd like to discuss why correlations are interesting, refute the math behind recent criticisms, and reflect on how exciting it is to engage in mathematical discussions where critiques can be definitively rebutted.

I've been around SEOmoz for a little while now, but I don't post a lot. So, as a quick reminder: I designed and built the prototype for SEOmoz's web index and wrote a large portion of the back-end code for the project. We shipped the index with billions of pages nine months after I started on the prototype, and we have continued to improve it since. More recently I built the machine learning models used to produce Page Authority and Domain Authority, and I am working on some fairly exciting stuff that has not yet shipped. As I'm an engineer and not a regular blogger, I'll ask for a bit of empathy for my post - it's a bit technical, but I've tried to make it as accessible as possible.

Why does Correlation Matter?

Correlation helps us find causation by measuring how much variables change together. Correlation does not imply causation; variables can be changing together for reasons other than one affecting the other. However, if two variables are correlated and neither is affecting the other, we can conclude that there must be a third variable that is affecting both. This variable is known as a confounding variable. When we see correlations, we do learn that a cause exists -- it might just be a confounding variable that we have yet to figure out.

How can we make use of correlation data? Let's consider a non-SEO example.

There is evidence that women who occasionally drink alcohol during pregnancy give birth to smarter children with better social skills than women who abstain. The correlation is clear, but the causation is not. If the relationship is causal, then light drinking will make the child smarter. If a confounding variable is at work, light drinking could have no effect or could even make the child slightly less intelligent (which is suggested by extrapolating from the finding that heavy drinking during pregnancy makes children considerably less intelligent).

Although these correlations are interesting, they are not black-and-white proof that behaviors need to change. One needs to consider which explanations are more plausible: the causal ones or the confounding-variable ones. To keep the analogy simple, let's suppose there were only two likely explanations - one causal and one confounding. The causal explanation is that alcohol makes a mother less stressed, which helps the unborn baby. The confounding-variable explanation is that women with more relaxed personalities are more likely to drink during pregnancy and less likely to negatively impact their child's intelligence with stress. Given this, I probably would be more likely to drink during pregnancy because of the correlation evidence, but there is an even bigger take-away: both likely explanations damn stress. So, because of the correlation evidence about drinking, I would work hard to avoid stressful circumstances. *

Was the analogy clear? I am suggesting that as SEOs we approach correlation statistics like pregnant women considering drinking - cautiously, but without too much stress.

* Even though I am a talented programmer and work in the SEO industry, do not take medical advice from me, and note that I contrived the likely explanations for the sake of simplicity :-)

Some notes on data and methodology

We have two goals when selecting a methodology to analyze SERPs:

  1. Choose measurements that will communicate the most meaningful data
  2. Use techniques that can be easily understood and reproduced by others

These goals sometimes conflict, but we generally choose the most common method still consistent with our problem. Here is a quick rundown of the major options we had, and how we decided between them for our most recent results:

Machine Learning Models vs. Correlation Data: Machine learning can model and account for complex variable interactions. In the past, we have reported derivatives of our machine learning models. However, these results are difficult to create, they are difficult to understand, and they are difficult to verify. Instead we decided to compute simple correlation statistics.

Pearson's Correlation vs. Spearman's Correlation: The most common measure of correlation is Pearson's Correlation, although it only measures linear correlation. This limitation is important: we have no reason to think interesting correlations to ranking will all be linear. Instead we chose to use Spearman's correlation, which is still pretty common and does a reasonable job of measuring any monotonic correlation.

Here is a monotonic example: The count of how many of my coworkers have eaten lunch for the day is perfectly monotonically correlated with the time of day. It is not a straight line and so it isn't linear correlation, but it is never decreasing, so it is monotonic correlation.

Here is a linear example: assuming I read at a constant rate, the amount of pages I can read is linearly correlated with the length of time I spend reading.
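If it helps to make the distinction concrete, here is a minimal sketch in Python (illustrative only - the data is invented and this is not code from our pipeline) showing that an exponentially related pair of variables earns a perfect Spearman score but a weaker Pearson score:

    # Spearman's captures any monotonic relationship; Pearson's only the linear
    # part. The data below is made up purely for the example.
    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    x = np.arange(1, 21)            # e.g. positions 1..20
    y = np.exp(x / 3.0)             # a metric growing exponentially with x

    print("Pearson:  %.3f" % pearsonr(x, y)[0])   # noticeably below 1.0
    print("Spearman: %.3f" % spearmanr(x, y)[0])  # exactly 1.0 (monotonic)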

Mean Correlation Coefficient vs. Pooled Correlation Coefficient: We collected data for 11,000+ queries. For each query, we can measure the correlation of ranking position with a particular metric by computing a correlation coefficient. However, we don't want to report 11,000+ correlation coefficients; we want to report a single number that reflects how correlated the data was across our dataset, and we want to show how statistically significant that number is. There are two techniques commonly used to do this:

  1. Compute the mean of the correlation coefficients. To show statistical significance, we can report the standard error of the mean.
  2. Pool the results from all SERPs and compute a global correlation coefficient. To show statistical significance, we can compute standard error through a technique known as bootstrapping.

The mean correlation coefficient and the pooled correlation coefficient would both be meaningful statistics to report. However, the bootstrapping needed to show the standard error of the pooled correlation coefficient is less common than using the standard error of the mean. So we went with #1.
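For those who want to reproduce the approach, here is a rough sketch of option #1 in Python. The per-query data is randomly generated stand-in data, not our actual SERPs, and scipy's sem() is simply the sample standard deviation divided by the square root of the number of coefficients:

    # Sketch of option #1: one Spearman coefficient per SERP, then the mean
    # coefficient and the standard error of that mean. Stand-in data only.
    import numpy as np
    from scipy.stats import spearmanr, sem

    rng = np.random.default_rng(0)

    def fake_serp(n_results=20):
        # Return (ranking positions, metric values) for one imaginary query.
        positions = np.arange(1, n_results + 1)
        metric = -positions + rng.normal(scale=8.0, size=n_results)
        return positions, metric

    coefficients = np.array([spearmanr(*fake_serp())[0] for _ in range(11000)])

    print("mean correlation coefficient: %.3f" % coefficients.mean())
    print("standard error of the mean:   %.4f" % sem(coefficients))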

Fisher Transform vs. No Fisher Transform: When averaging a set of correlation coefficients, instead of computing the mean of the coefficients themselves, one sometimes computes the mean of the Fisher transforms of the coefficients and then applies the inverse Fisher transform. This would not be appropriate for our problem because:

  1. It will likely fail. The Fisher transform involves dividing by one minus the coefficient, so it explodes when an individual coefficient is near one and fails outright when a coefficient equals one (see the sketch after this list). Because we are computing hundreds of thousands of coefficients, each with a small sample size, it is quite likely the Fisher transform will fail for our problem. (Of course, we have a large sample of these coefficients to average over, so our end standard error is not large.)
  2. It is unnecessary, for two reasons. First, the advantage of the transform is that it can make the expected average closer to the expected coefficient, and we do nothing that assumes this property. Second, when coefficients are near zero this property approximately holds without the transform, and our coefficients were not large.
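To see why point #1 matters in practice, here is a tiny illustration (arbitrary example values, not coefficients from our study) of how the Fisher transform behaves as a coefficient approaches one:

    # The Fisher transform z = arctanh(r) = 0.5 * ln((1 + r) / (1 - r)) grows
    # without bound as r nears 1 and is undefined (infinite) at exactly 1.
    import numpy as np

    for r in [0.30, 0.90, 0.99, 0.999, 1.0]:
        print("r = %.3f  ->  Fisher z = %s" % (r, np.arctanh(r)))
    # The last line prints inf: with many small-sample SERPs, a few perfect
    # coefficients are almost inevitable, and the averaging breaks down.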

Rebuttals To Recent Criticisms

Two bloggers, Dr. E. Garcia and Ted Dzubia, have published criticisms of our statistics.

Eight months before his current post, Ted Dzubia wrote an enjoyable and jaunty post lamenting that criticism of SEO every six to eight months was an easy way to generate controversy, noting "it's been a solid eight months, and somebody kicked the hornet's nest. Is SEO good or evil? It's good. It's great. I <3 SEO." Furthermore, his Twitter feed makes it clear he sometimes trolls for fun. To wit: "Mongrel 2 under the Affero GPL. TROLLED HARD," "Hacker News troll successful," and "mailing lists for different NoSQL servers are ripe for severe trolling." So it is likely we've fallen for trolling...

I am going to respond to both of their posts anyway because they have received a fair amount of attention, and because both posts seek to undermine the credibility of the wider SEO industry. SEOmoz works hard to raise the standards of the SEO industry, and protect it from unfair criticisms (like Garcia's claim that "those conferences are full of speakers promoting a lot of non-sense and SEO myths/hearsays/own crappy ideas" or Dzubia's claim that, besides our statistics, "everything else in the field is either anecdotal hocus-pocus or a decree from Matt Cutts"). We also plan to create more correlation studies (and more sophisticated analyses using my aforementioned ranking models) and thus want to ensure that those who are employing this research data can feel confident in the methodology employed.

Search engine marketing conferences, like SMX, OMS and SES, are essential to the vitality of our industry. They are an opportunity for new SEO consultants to learn, and for experienced SEOs to compare notes. It can be hard to argue against such subjective and unfair criticism of our industry, but we can definitively rebut their math.

To that end, here are rebuttals for the four major mathematical criticisms made by Dr. E. Garcia, and the two made by Dzubia.

1) Rebuttal to Claim That Mean Correlation Coefficients Are Uncomputable

For our charts, we compute a mean correlation coefficient. The claim is that such a value is impossible to compute.

Dr. E. Garcia : "Evidently Ben and Rand don’t understand statistics at all. Correlation coefficients are not additive. So you cannot compute a mean correlation coefficient, nor you can use such 'average' to compute a standard deviation of correlation coefficients."

There are two issues with this claim: a) peer reviewed papers frequently publish mean correlation coefficients; b) additivity is relevant for determining whether two different meanings of the word "average" will have the same value, not whether the mean is uncomputable. Let's consider each issue in more detail.

a) Peer Reviewed Articles Frequently Compute A Mean Correlation Coefficient

E. Garcia is claiming something is uncomputable that researchers frequently compute and include in peer reviewed articles. Here are three significant papers where the researchers compute a mean correlation coefficient:

"The weighted mean correlation coefficient between fitness and genetic diversity for the 34 data sets was moderate, with a mean of 0.432 +/- 0.0577" (Macquare University - "Correlation between Fitness and Genetic Diversity", Reed, Franklin; Conversation Biology; 2003)

"We observed a progressive change of the mean correlation coefficient over a period of several months as a consequence of the exposure to a viscous force field during each session. The mean correlation coefficient computed during the force-field epochs progressively..." (MIT - F. Gandolfo, et al; "Cortical correlates of learning in monkeys adapting to a new dynamical environment," 2000)

"For the 100 pairs of MT neurons, the mean correlation coefficient was 0.12, a value significantly greater than zero" (Stanford - E Zohary, et al; "Correlated neuronal discharge rate and its implications for psychophysical performance", 1994)

SEOmoz is in a camp with reviewers from the journal Nature, as well as researchers from MIT and Stanford and the authors of 2,400 other academic papers that use the mean correlation coefficient. Our camp is being attacked by Dr. E. Garcia, who argues our camp doesn't "understand statistics at all." It is fine to take positions outside of the scientific mainstream, but when Dr. E. Garcia takes such a position he should offer more support for it. Given how commonly Dr. E. Garcia uses the pejorative "quack," I suspect he does not mean to take positions this far outside of academic consensus.

b) Additivity Is Relevant For Determining Whether Different Meanings Of "Average" Are The Same, Not Whether The Mean Is Computable

Although "mean" is quite precise, "average" is less precise. By "average" one might intend the words "mean", "mode", "median," or something else. One of these other things that it could be used as meaning is 'the value of a function on the union of the inputs'. This last definition of average might seem odd, but it is sometimes used. Consider if someone asked "a car travels 1 mile at 20mph, and 1 mile at 40mph, what was the average mph for the entire trip?" The answer they are looking for is not 30mph, which is mean of the two measurements, but ~26mph, which is the mph for the whole 2 mile trip. In this case, the mean of the measurements is different from the colloquial average which is the function for computing mph applied to the union of the inputs (the whole two miles).

This may be what has confused Dr. E. Garcia. Elsewhere he cites Statsweb when repeating this claim, which makes the point that this other "average" is different from the mean. Additivity is useful in determining whether these averages will be different. But even if another interpretation of average is valid for a problem, and even if that other average is different from the mean, it neither makes the mean uncomputable nor meaningless.

2) Rebuttal to Claim About Standard Error of the Mean vs Standard Error of a Correlation Coefficient

Although he has stated unequivocally that one cannot compute a mean correlation coefficient, Garcia is quite opinionated on how we ought to have computed standard error for it. To wit:

E. Garcia: "Evidently, you don’t know how to calculate the standard error of a correlation coefficient... the standard error of the mean and the standard error of a correlation coefficient are two different things. Moreover, the standard deviation of the mean is not used to calculate the standard error of a correlation coefficient or to compare correlation coefficients or their statistical significance."

He repeats this claim even after making the point above about mean correlation coefficients, so he is clearly aware that the correlation coefficients being discussed are mean coefficients and not coefficients computed after pooling data points. So let's be clear on exactly what his claim implies. We have some measured correlation coefficients, and we take the mean of these measured coefficients. The claim is that, for this mean, we should have used the same standard error formula we would have used for a single coefficient. Garcia's claim is incorrect. One would use the formula for the standard error of the mean.

The formulas for the mean, and for the standard error of the mean, apply even if there is a way to separately compute standard error for one of the observations the mean is taken over. Whether we are computing the mean of counts of apples in barrels, lifespans of people in the 19th century, or correlation coefficients for different SERPs, the same formula for the standard error of this mean applies. Even if we have other ways to measure the standard error of the individual measurements - for instance, our measure of lifespans might only be accurate to the day of death and so could be off by 24 hours - we cannot use the way we would compute standard error for a single observation to compute the standard error of the mean of those observations.

A smaller but related objection is over language. He objects to my use of "standard deviations" to describe how far a point is from a mean, in units of the mean's standard error. As Wikipedia notes, the "standard error of the mean (i.e., of using the sample mean as a method of estimating the population mean) is the standard deviation of those sample means." So the count of how many lengths of standard error a number is away from the estimate of a mean, according to Wikipedia, would be standard deviations of our mean estimate. Beyond being technically correct, it also fits the context, which was the accuracy of the sample mean.

3) Rebuttal to Claim That Non-Linearity Is Not A Valid Reason To Use Spearman's Correlation

I wrote "Pearson’s correlation is only good at measuring linear correlation, and many of the values we are looking at are not. If something is well exponentially correlated (like link counts generally are), we don’t want to score them unfairly lower.”

E. Garcia responded by citing a source whom he cited as "exactly right": "Rand your (or Ben’s) reasoning for using Spearman correlation instead of Pearson is wrong. The difference between two correlations is not that one describes linear and the other exponential correlation, it is that they differ in the type of variables that they use. Both Spearman and Pearson are trying to find whether two variables correlate through a monotone function, the difference is that they treat different type of variables - Pearson deals with non-ranked or continuous variables while Spearman deals with ranked data."

E. Garcia's source, and by extension E. Garcia, are incorrect. A desire to measure non-linear correlation, such as exponential correlations, is a valid reason to use Spearman's over Pearson's. The point that "Pearson deals with non-ranked or continuous variables while Spearman deals with ranked data" is true only in the sense that, to compute Spearman's correlation, one can convert continuous variables to ranked indices and then apply Pearson's. However, the original variables do not need to be ranked indices to begin with. If they did, Spearman's would always produce the same results as Pearson's and there would be no purpose for it.
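A small demonstration of that relationship, using invented data (this is just an illustration, not our analysis code): Spearman's coefficient equals Pearson's coefficient computed on the rank-transformed values, and the inputs themselves are ordinary continuous variables.

    import numpy as np
    from scipy.stats import pearsonr, spearmanr, rankdata

    rng = np.random.default_rng(1)
    x = rng.normal(size=50)                        # continuous, non-ranked data
    y = np.exp(x) + rng.normal(scale=0.1, size=50) # monotonic but non-linear

    print(round(spearmanr(x, y)[0], 6))                    # Spearman
    print(round(pearsonr(rankdata(x), rankdata(y))[0], 6)) # identical value
    print(round(pearsonr(x, y)[0], 6))                     # lower: linear only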

My point that E. Garcia objects to - that Pearson's only measures linear correlation while Spearman's can measure other kinds of correlation, such as exponential correlations - was entirely correct. We can quickly quote Wikipedia to show that Spearman's measures any monotonic correlation (including exponential) while Pearson's only measures linear correlation.

The Wikipedia article on Pearson's Correlation starts by noting that it is a "measure of the correlation (linear dependence) between two variables".

The Wikipedia article on Spearman's Correlation starts with an example in the upper right showing that a "Spearman correlation of 1 results when the two variables being compared are monotonically related, even if their relationship is not linear. In contrast, this does not give a perfect Pearson correlation."

E. Garcia's position neither makes sense nor agrees with the literature. I would go into the math in more detail, or quote more authoritative sources, but I'm pretty sure Garcia now knows he is wrong. After E. Garcia made his incorrect claim about the difference between Spearman's correlation and Pearson's correlation, and after I corrected E. Garcia's source (in a comment on our blog), E. Garcia has since stated the difference between Spearman's and Pearson's correctly. However, we want to make sure there's a good record of these points, and to explain the what and the why.

4) Rebuttal To Claim That PCA Is Not A Linear Method

This example is particularly interesting because it is about Principal Component Analysis (PCA), which is related to PageRank (something many SEOs are familiar with). In PCA one finds principal components, which are eigenvectors; PageRank is also an eigenvector. But I am digressing - let's discuss Garcia's claim.

After Dr. E. Garcia criticized a third party for using Pearson's Correlation because Pearson's only shows linear correlations, he criticized us for not using PCA. Like Pearson's, PCA can only find linear correlations, so I pointed out his contradiction:

Ben: "Given the top of your post criticizes someone else for using Pearson’s because of linearity issues, isn’t it kinda odd to suggest another linear method?"

To which E. Garcia responded: "Ben’s comments about... PCA confirms an incorrect knowledge about statistics" and "Be careful when you, Ben and Rand, talk about linearity in connection with PCA as no assumption needs to be made in PCA about the distribution of the original data. I doubt you guys know about PCA...The linearity assumption is with the basis vectors."

But before we get to the core of the disagreement, let me point out that E. Garcia is close to correct with his literal statement. PCA defines basis vectors such that they are linearly de-correlated, so it does not need to assume that they will be. But this is a minor quibble. The issue with Dr. E. Garcia's position is the implication that the linear aspect of PCA lies not in the correlations it finds in the source data, as I claimed, but only in the basis vectors.

So, there is the disagreement - analogous to how Pearson's Correlation only finds linear correlations, does PCA also only find linear correlations? Dr. E. Garcia says no. SEOmoz, and many academic publications, say yes. For instance:

"PCA does not take into account nonlinear correlations among the features" ("Kernel PCA for HMM-Based Cursive Handwriting Recognition"; Andreas Fischer and Horst Bunke 2009)

"PCA identifies only linear correlations between variables" ("Nonlinear Principal Component Analysis Using Autoassociative Neural Networks"; Mark A. Kramer (MIT), AIChE Journal 1991)

However, rather than just citing authorities, let's consider why his claim is incorrect. As E. Garcia imprecisely notes, the basis vectors are linearly de-correlated. As the source he cites points out, PCA tries to represent the source data as linear combinations of these basis vectors. This is how PCA shows us correlations - by creating basis vectors that can be linearly combined to get close to the original data. We can then look at these basis vectors and see how aspects of our source data vary together, but because it only combines them linearly, it only shows us linear correlations. Therefore, PCA provides insight into linear correlations - even for non-linear data.
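Here is a toy illustration of that point (invented data, nothing to do with our SERP metrics): y is a perfect function of x, but because the relationship is non-linear and symmetric, the covariance is roughly zero and the principal components simply line up with the original axes, revealing no relationship at all.

    import numpy as np

    x = np.linspace(-1, 1, 201)
    y = x ** 2                               # perfect, but non-linear, dependence
    data = np.column_stack([x, y])
    data -= data.mean(axis=0)                # center the data, as PCA requires

    cov = np.cov(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # principal components = eigenvectors

    print("cov(x, y) = %.4f" % cov[0, 1])    # ~0: no linear correlation to find
    print(eigvecs.round(3))                  # essentially just the x and y axes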

5) Rebuttal To Claim About Small Correlations Not Being Published

Ted Dzubia suggests that small correlations are not interesting, or at least are not interesting because our dataset is too small. He writes:

Dzubia: "out of all the factors they measured ranking correlation for, nothing was correlated above .35. In most science, correlations this low are not even worth publishing. "

Academic papers frequently publish correlations of this size. On the first page of a Google Scholar search for "mean correlation coefficient" I see:

  1. The Stanford neurology paper I cited above to refute Garcia is reporting a mean correlation coefficient of 0.12.
  2. "Meta-analysis of the relationship between congruence and well-being measures"  a paper with over 200 citations whose abstract cites coefficients of 0.06, 0.15, 0.21, and 0.31.
  3. "Do amphibians follow Bergmann's rule" which notes that "grand mean correlation coefficient is significantly positive (+0.31)."

These papers were not cherry-picked from a large number of candidates. Contrary to Ted Dzubia's suggestion, the size of a correlation that is interesting varies considerably with the problem. For our problem, looking at correlations in Google results, one would not expect any single feature we examined to show a high correlation unless one believed Google had a single factor it predominantly uses to rank results, and one were only interested in that factor. We do not believe that. Google has stated on many occasions that they employ more than 200 features in their ranking algorithm. In our opinion, this makes correlations in the 0.1 - 0.35 range quite interesting.

6) Rebuttal To Claim That Small Correlations Need A Bigger Sample Size

Dzubia: "Also notice that the most negative correlation metric they found was -.18.... Such a small correlation on such a small data set, again, is not even worth publishing."

Our dataset was over 100,000 results across over 11,000 queries, which is much more than sufficient for the size of correlations we found. The risk when having small correlations and a small dataset is that it may be hard to tell if correlations are statistical noise. Generally 1.96 standard deviations is required to consider results statistically significant. For the particular correlation Dzubia brings up, one can see from the standard error value that we have 52 standard deviations of confidence the correlation is statistically significant. 52 is substantially more than the 1.96 that is generally considered necessary.
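For readers who want to see where a figure like that comes from, the confidence is just the mean coefficient divided by the standard error of the mean. The numbers below are illustrative stand-ins, not the exact values from our published charts:

    mean_coefficient = -0.18
    standard_error_of_mean = 0.0035          # assumed value for illustration

    z = abs(mean_coefficient) / standard_error_of_mean
    print("%.1f standard errors from zero (vs. the usual 1.96 cutoff)" % z)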

We used a sample size so much larger than usual because we wanted to make sure the relative differences between correlation coefficients were not misleading. Although we feel this adds value to our results, it is beyond what is generally considered necessary to publish correlation results.

Conclusions

Some folks inside the SEO community have had disagreements with our interpretations and opinions regarding what the data means (and where/whether confounding variables exist to explain some points). As Rand carefully noted in our post on correlation data and in his presentation, we certainly want to encourage this. Our opinions about where/why the data exists are just that - opinions - and shouldn't be ascribed any value beyond their use in informing your own thinking about the data sources. Our goal was to collect data and publish it so that our peers in the industry could review and interpret it.

It is also healthy to have a vigorous debate about how statistics such as these are best computed, and how we can ensure the accuracy of reported results. As our community is just starting to compute these statistics (Sean Weigold Ferguson, for example, recently submitted a post on PageRank using very similar methodologies), it is only natural there will be some bumbling back and forth as we develop industry best practices. This is healthy, and it is to our industry's advantage that it occurs.

The SEO community is the target of a lot of ad hominem attacks which try to associate all SEOs with the behavior of the worst. Although we can answer such attacks by pointing out great SEOs and great conferences, it is exciting that we've been able to elevate some attacks to include mathematical points, because when they are arguing math they can be definitively rebutted. On the six points of mathematical disagreement, the tally is pretty clear - SEO community: Six, SEO bashers: zero. Being SEOs doesn't make us infallible, so surely in the future the tally will not be so lopsided, but our tally today reflects how seriously we take our work and how we as a community can feel good about using data from this type of research to learn more about the operations of search engines.



Must-Have SEO Recommendations: Step 7 of the 8-Step SEO Strategy

Posted by laura

This post was originally in YOUmoz, and was promoted to the main blog because it provides great value and interest to our community. The author's views are entirely his or her own and may not reflect the views of SEOmoz, Inc.

You know the client.  The one that really needs your help.  The one that gets pumped when you explain how keywords work.  The one that has an image file for a site.  Or maybe the one that insists that if they copy their competitor’s title tags word-for-word, they’ll do better in search results (I had a product manager make his team do that once. Needless to say (I was thrilled when) it didn’t work). 

In Step 6 of the SEO Strategy document I noted that this strategy document we’ve been building isn’t a best practices document, and it’s more than a typical SEO audit.  It is a custom set of specific, often product-focused recommendations and strategies for gaining search traffic.  For that reason I recommended linking out to SEO basics and best practices elsewhere (in an intranet or a separate set of documents).

But most of the time you'll still need to call out some horizontal things that must be put right in front of this client's face, or else they will be missed completely. SEO/M is your area of expertise, not theirs, so help them make sure they've got their bases covered. You can create an additional section for these call-outs wherever you feel it is appropriate in your document.

WHAT CAN I INCLUDE HERE?

Here are some examples of things you could include if you felt your client needed this brought to their attention:

  1. Press Release optimization and strategy
  2. SEO resources for specific groups in the company:
    1. SEO for business development (linking strategies in partner deals)
    2. SEO for writers/editorial
    3. SEO for designers
  3. SEO for long term results rather than short term fixes
  4. International rollout recommendations
  5. Content management system – how it is impairing their SEO
  6. Risks and avoidances
  7. Anything that you feel should be covered in more detail for this particular client, that wasn’t covered in your strategy in the last step. This is a catchall – a place to make sure you cover all bases.
  8. Nothing - if you don't feel it's needed.

If the client really needs a lot of help, you'd want to provide training and best practices, either as separate deliverables along with the strategy document or, better yet, work on training and best practices with them first, then dive into more specific strategy. You don't want to end up with a 15-page (or even 4-page, for that matter) best practices document in your strategy doc. Remember, we're beyond best practices here - unless, in this case, there's something specific that needs to be called out.

If the client needs more than one thing called out, do it.  If it’s several things, consider either adding an appendix, or as I mentioned, creating a separate best practices document.

The reason I recommend best practices as a separate document is because it is really a different project, often for an earlier phase.

EXAMPLE 1:

Let's say, for example, my client has the type of content the press loves to pick up. They don't do press releases, mostly because they don't know exactly how to write them and where to publish them, but they want to. I'll add a Press Releases section after the strategy, and I might give them these simple tidbits:

  • High level benefit of doing press releases
  • What person or group in the company might be best utilized to manage press releases
  • Examples of what to write press releases about
  • Channels they can publish press releases to
  • Optimization tips
  • References they can go to for more detailed information

EXAMPLE 2:

My client gets it. They’re pretty good at taking on most SEO on their own. This strategy document I’m doing for them is to really dig in and make sure all gaps are closed, and that they’re taking advantage of every opportunity they should.  Additionally, in a few months they are going to roll out the site to several international regions. 

My dig into the site and its competitors (and search engines) for this strategy has all been for the current site in this country. Because the Intl rollout hasn't started yet, I will add a section to my document with specific things they need to keep in mind when doing this rollout.

  • Localized keyword research (rather than using translate tools)
  • ccTLD  (country code top level domain) considerations
  • Tagging considerations (like “lang”)
  • Proper use of Google Webmaster Tools for specifying region
  • Potential duplication issues
  • Maybe even a list of popular search engines in those countries
  • Point to more resources or list as a potential future contract project

Make sense?  Use your judgment here. Like we’ve seen in the rest of the steps, this strategy document is your work of art, so paint it how your own creative noggin sees it, Picasso.

Other suggestions for what you might include here? Love it? Hate it? Think this step stinks or mad I didn’t include music to listen to for this one? Let’s hear about it in the comments!



Patience is an SEO Virtue

Posted by Kate Morris

We have all been there once or twice, maybe a few more times than that even. You just launched a site or a project. A few days pass, and you log in to analytics and webmaster tools to see how things are going. Nothing is there.

WAIT. What?!?!?! 

Scenarios start running through your mind, and you check to make sure everything is working right. How could this be?

It doesn't even have to be a new project. I've realized things on clients' sites that needed fixing: XML sitemaps, link building efforts, title tag duplication, or even 404 redirection. The right changes are made, and a week later, nothing has changed in rankings or in webmaster consoles across the board. You are left thinking "what did I do wrong?"


A few client sites, major sites mind you, have had issues recently like 404 redirection and toolbar PageRank drops. One even had to change a misplaced setting in Google Webmaster Tools pointing to the wrong version of their site (www vs non-www). We fixed it, and their homepage dropped from results for their own name.

That looks bad. Real bad. Especially to the higher ups. They want answers and the issue fixed now ... yesterday really.

Most of these things are being measured for performance and some can even have a major impact on the bottom line. And it is so hard to tell them this, even harder to do, but the changes just take ...

Patience

That homepage drop? They called on Friday; as of Saturday night things were back to normal. The drop most likely lasted 2-3 days, but this is a large site. Another, smaller client had redesigned their entire site. We put all the correct 301 redirects in place for the old pages and launched the site. It took Google almost 4 weeks to completely remove the old pages from the index. Edits to URLs that caused 404 errors, fixed within a day, took over a week to be reflected in Google Webmaster Tools.

These are just a few examples where changes were made immediately but the actions had no immediate return. We live in a society that thrives on the present, the immediate return. As search marketers, we make C-level executives happy with our ability to show immediate returns on our campaigns. But like the returns on SEO, the reflection of changes in SEO takes time.

The recent Mayday and Caffeine updates are sending many sites to the bottom of rankings because of a lack of original content. Many of them are doing everything "right" in terms of onsite SEO, but now that isn't enough. They can change their site all they want, but until there is relevant, good content plus traffic, those rankings are not going to return for long tail terms.

There has also been a recent crackdown on over-optimized local search listings. I have seen a number of accounts suspended or just not ranking well because they are, in effect, trying too hard. There is such a thing as over-optimizing a site, and too many changes at once can raise a flag with the search engines.

One Month Rule


Here is my rule: Make a change, leave it, go do social media/link building, and come back  to the issue a month later. It may not take a month, but for smaller sites, 2 weeks is a good time to check on the status of a few things. A month is when things should start returning to normal if there have been no other large changes to the site. 

We say this all the time with PPC accounts. It's like statistical analysis: you have to have enough data to work with before you can see results. And when you are waiting for a massive search engine to make changes, once those changes do take effect in the system, you then have to give them time to work.

So remember the next time something seems to be not working in Webmaster Tools or SERPs:

  1. If you must, double check the code (although you’ve probably already done this 15 times) to ensure it’s set up correctly. But then,
  2. Stop. Breathe. There is always a logical explanation. (And yes, Google being slow is a logical one)
  3. When did you last change something to do with the issue?
  4. If it's less than 2 weeks ago, give it some more time.
  5. Major changes, give it a month. (Think major site redesigns and URL restructuring)



Keyword Research Tools – Build Your Own Adventure

Posted by Sam Crocker

Hi there Mozzers! My name is Sam Crocker and I work for Distilled. This is my first post here at SEOmoz and I am looking forward to your feedback!

Background

My mother used to scold me for misusing my toys, playing with my food, and for having a bit too much energy. She was well within her rights, as I was a bit of a handful, but even now one particular phrase really sticks out in my mind:

“Is that what that was made for Sam? Use it the right way, please.”

Whether I was riding down the stairs in a sleeping bag, having sword fights with my sister using paper towel tubes, or using my skateboard as a street luge - I've always been big on using things for purposes other than their intended design. It should be no surprise that I do the same with some of the fancy and powerful tools we have become quite dependent on in the SEO world. Much like when I was little, it seems that by using things the "wrong way" there's scope to have a bit more fun and to discover some new and different ways of accomplishing the same goals.

Young Sam Crocker
Me As a Little Guy. Snow Scraper = Renegade Fighting Stick?

I spoke about my most recent adventures in using things the wrong way at SMX Advanced London. I don't think too many people who came to the keyphrase research session were expecting to hear about how a scraper like Mozenda could be used to save all sorts of time and effort and generate new keyphrase ideas. You may want to have a quick read through that before watching the screencast.*

It's also important to point out that Mozenda is best used as a discovery tool in the instance I provide here. If this method were a perfect solution to keyword research, you could very easily build a tool that does it better. The beauty of Mozenda, however, is that it can be just about any tool you want. If you need to generate brand new content around a subject area you know nothing about, you can use it to explore tags on Delicious or another social media platform.

Given the great deal of interest in this technique from attendees at the presentation and in the twittersphere, I decided it was worth providing a full walkthrough to cover some of the nuances I wasn't able to cover in a 12-minute presentation and to share with the folks who weren't able to attend the conference.

 

 *It’s worth noting that for the sake of consistency I used the same Google Suggest tool in the video as I used for my initial research and discussed at SMX London. Since then Rob Milard built his own keyphrase expander tool based on this work and it is considerably more versatile than the original tool (you can search Google.com or Google.co.uk and export the file as a CSV). The output of this version isn’t in XML and provides the “search volume” data missing from the first tool. So congratulations and a BIG thank you to Rob from me and the search community in general!

Next Steps

The above screencast is an introduction of a technique we have been experimenting with to broaden the keyphrases targeted on a site (particularly, it can be used to increase the number of longtail keyphrases and provide insights into terminology you may not be targeting in your current list of keyphrases). This can be particularly useful if you work for an agency dealing with clients from a number of different sectors. For the sake of demonstration I have only input 7 terms into the Google Suggest tool in an effort to pull out a workable dataset for the screencast and for my presentation but Mozenda is a pretty powerful tool, so there’s really nothing stopping you from using more keyphrases. As a matter of courtesy, however, I would suggest setting up some delays when running any large scraping task to prevent overwhelming servers or hogging bandwidth. For more information on this, please have a read through Rich Baxter's latest piece on indexation.
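For anyone who wants to experiment without Mozenda, here is a rough sketch of the same idea in Python. The suggestqueries.google.com endpoint it hits is unofficial and undocumented - treat it as an assumption that may change or be rate-limited - and the seed terms are just examples; keep the courtesy delays mentioned above in any real run.

    import json
    import time
    import urllib.parse
    import urllib.request

    def suggestions(term):
        # Unofficial Google Suggest endpoint; returns [query, [suggestions]]
        url = ("http://suggestqueries.google.com/complete/search?client=firefox&q="
               + urllib.parse.quote(term))
        with urllib.request.urlopen(url) as response:
            return json.loads(response.read().decode("utf-8", "replace"))[1]

    seed_terms = ["glee episodes", "glee schedule"]   # illustrative seeds only
    phrases = set()
    for term in seed_terms:
        phrases.update(suggestions(term))
        time.sleep(2)                                 # be polite between requests

    print(sorted(phrases))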

One of the questions I was asked (by a number of people) was “what next?” As in: “what on earth am I going to do with these extra 10,000 keyphrases?” And although this presentation was intended as a proof of concept, I don’t want anyone to think we are trying to keep anything secret here so here are a few ideas about what you might consider doing next.

Option 1: Ask For Help!

For the people who find themselves thinking “I’m not really sure what to do with this data” I would suggest enlisting the help of a numbers guy or gal (Excel Wizards or other nerdy warriors). Odds are if you find looking at this sort of data daunting, you’re going to need their help making sense of the numbers later anyways.

Option 2: Outsource

The second option, for those of you who know exactly what you want to do with this data but don't have the time to go through it all, is to enlist the help of cheap labour. Either find yourself an intern or make use of Amazon's Mechanical Turk to find someone who can accomplish just what you need. The nice thing about services like this is that it's a 24/7 workforce and you can get a feel for how helpful someone will be fairly quickly and painlessly.

Option 3: Jump Right In

Finally, the third option is for those of you with some Excel skillz and a bit of time. There will definitely still be some manual work to be done, and some weeding out of terms that are not at all relevant; the suggestions where you usually say aloud "no, Google, I did NOT mean..." will clearly need to go.

The best use of this data will be the general themes or "common words" that you can quite easily sort through or filter for using Excel and that you may have been oblivious to prior to starting.

Ikea Boxcutting Instructions

 Feel Free to Sing Along if You Know The Words! (image via: Kottke)


Step 1: Remove all duplicates. In this example there were no duplicates created though I can only assume that with 10,000 keyphrases run through the tool there will be some duplicate output.

Step 2: Remove URL suggestions. I know we like to think otherwise, but if the user was searching for “gleeepisodes.net” they probably aren’t interested in TV listings from your site. It would also be a fairly cheeky move to try to optimise a page about someone else’s website.

Step 3: Remember your target audience. If you only operate in the UK “Glee schedule Canada” and “Glee schedule Fox” can probably be eliminated as well. Now would be a good time to eliminate any truly irrelevant entries as well (e.g. “Gleevec” – although some of your viewers may have leukemia this probably is not what most visitors to your site are looking for).
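If you prefer to script the cleanup, here is a minimal sketch of steps 1-3 (the file name, URL pattern, and filter terms are hypothetical - adjust them to your own export and market):

    import csv
    import re

    with open("suggest_output.csv", newline="") as f:     # assumed export file
        phrases = [row[0].strip().lower() for row in csv.reader(f) if row]

    phrases = list(dict.fromkeys(phrases))                # step 1: de-duplicate

    url_like = re.compile(r"(www\.|http|\.(com|net|org|co\.uk)\b)")
    phrases = [p for p in phrases if not url_like.search(p)]   # step 2: drop URLs

    off_target = ("canada", "fox", "gleevec")             # step 3: your own list
    phrases = [p for p in phrases if not any(t in p for t in off_target)]

    print(len(phrases), "phrases left for sense-checking and volume lookups")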

Step 4: With the remaining terms and phrases run them through the usual sense checking routines. This is a good time to check global/local search volume for these terms and look at some of the competitiveness metrics as well. Search volume will probably be quite high for most of these terms (at least enough for Google to think someone might be looking for them regularly), though competitiveness probably will be too, so choose wisely.

Identifying the patterns at this stage will be essential to the value of the research you are conducting. You can try to filter for common phrases or suggestions at this stage, and if, as in this example, you realise "rumors" is a relevant term you've not targeted anywhere on the site, it is high time you consider adding content targeting this area for all of the television shows on the site.

Last Step: Come up with a sensible strategy to attack all this new content. Look at these terms as jumping off points for new content, new blog posts, and new ways of talking about this and other related products/services/subjects on the site.

Conclusions

A lot can be learned through this sort of exercise. In addition to finding some new high volume search terms, it may help you identify trends in search for which you have not been competing and have implications across the whole site rather than on one page. For example, maybe you didn’t think about “spoilers” or “rumors.” For a site dedicated to television programmes this sort of terminology will likely be valuable for a number of other shows as well!

The moral of the story? If you build it they will come.

Sometimes it is worth developing your own tool to make use of existing technology. Whilst I still feel Mozenda is the right tool for the job when handling larger datasets, the tool Rob built is a perfect example of how a little creativity and building on others' ideas can lead to benefit for everyone. Rob's tool effectively rendered my Mozenda workaround unnecessary for most small to medium sites, and that's awesome.

Doing it Wrong!
Image via: Motivated Photos

A final word of warning: I’m not suggesting that you replace all other keyphrase research with this idea. This technique is best utilised either during creation of a site about an area you know very little about (it’s rare, but it happens), or when you’ve run out of ideas and tried some of the more conventional approaches. It’s all about thinking outside of the box and trying new things to save you time. Onpage optimisation, linkbuilding and more traditional keyphrase research needs to be done but sometimes the best results come from trying something a bit experimental and using things for purposes other than that which they were designed.

If you have any questions, comments or concerns feel free to shame me publicly either in the below section or on Twitter.



May 2010 Linkscape Update (and Whiteboard Explanations of How We Do It)

Posted by randfish

As some of you likely noticed, Linkscape's index updated today with fresh data crawled over the past 30 days. Rather than simply provide the usual index update statistics, we thought it would be fun to do some whiteboard diagrams of how we make a Linkscape update happen here at the mozplex. We also felt guilty because our camera ate tonight's WB Friday (but Scott's working hard to get it up for tomorrow morning).

Rand Writing on the Whiteboard

Linkscape, like most of the major web indices, starts with a seed set of trusted sites from which we crawl outwards to build our index. Over time, we've developed more sophisticated methods around crawl selection, but we're quite similar to Google in that we crawl the web primarily in descending order of (in our case) mozRank importance.

Step 1 - We Crawl the Web

For those keeping track, this index's raw data includes:

  • 41,404,250,804 unique URLs/pages
  • 86,691,236 unique root domains

After crawling, we need to build indices on which we can process data, metrics and sort orders for our API to access.

Step 2: We Build an Index

When we started building Linkscape in late 2007, early 2008, we quickly realized that the quantity of data would overwhelm nearly every commercial database on the market. Something massive like Oracle may be able to handle the volume, but at an exorbitant price that a startup like SEOmoz couldn't bear. Thus, we created some unique, internal systems around flat file storage that enable us to hold data, process it and serve it without the financial and engineering burdens of a full database application.

Our next step, once the index is in place, is to calculate our key metrics as well as tabulate the standard sort orders for the API.

Step 3: We Conduct Processing

Algorithms like PageRank (and mozRank) are iterative and require a tremendous amount of processing power to compute. We're able to do this in the cloud, scaling up our need for number-crunching, mozRank-calculating goodness for about a week out of every month, but we're pretty convinced that in Google's early days, this was likely a big barrier (and may even have been a big part of the reason the "GoogleDance" only happened once every 30 days).
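To give a feel for what "iterative" means here, below is a toy power-iteration sketch of a PageRank-style calculation on a four-page link graph. This is purely illustrative - it is not SEOmoz's mozRank code, and a real index with tens of billions of URLs needs a distributed, out-of-core version of this same loop:

    # Toy PageRank-style power iteration on a hypothetical four-page graph.
    links = {
        "A": ["B", "C"],
        "B": ["C"],
        "C": ["A"],
        "D": ["C"],
    }
    pages = sorted(links)
    n = len(pages)
    damping = 0.85

    rank = dict.fromkeys(pages, 1.0 / n)
    for _ in range(50):                               # iterate until values settle
        new_rank = dict.fromkeys(pages, (1 - damping) / n)
        for page, outlinks in links.items():
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank

    print({p: round(r, 4) for p, r in rank.items()})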

After processing, we're ready to push our data out into the SEOmoz API, where it can power our tools and those of our many partners, friends and community members.

Step 4: Push the Data to the API

The API currently serves more than 2 million requests for data each day (and an average request pulls ~10 metrics/pieces of data about a web page or site). That's a lot, but our goal is to more than triple that quantity by 2011, at which point we'll be closer to the request numbers going into a service like Yahoo! Site Explorer.

The SEOmoz API currently powers some very cool stuff:

  • Open Site Explorer - my personal favorite way to get link information
  • The mozBar - the SERPs overlay, analyze page feature and the link metrics displayed directly in the bar all come from the API
  • Classic Linkscape - we're on our way to transitioning all of the features and functionality in Linkscape over to OSE, but in the meantime, PRO members can get access to many more granular metrics through these reports
  • Dozens of External Applications - things like Carter Cole's Google Chrome toolbar, several tools from Virante's suite, Website Grader and lots more (we have an application gallery coming soon)

Each month, we repeat this process, learning big and small lessons along the way. We've gotten tremendously more consistent, redundant and error/problem free in 2010 so far, and our next big goal is to dramatically increase the depth of our crawl into those dark crevices of the web as well as ramping up the value and accuracy of our metrics.

We look forward to your feedback around this latest index update and any of the tools powered by Linkscape. Have a great Memorial Day Weekend!



All Links are Not Created Equal: 10 Illustrations on Search Engines’ Valuation of Links

Posted by randfish

In 1997, Google's founders created an algorithmic method to determine importance and popularity based on several key principles:

  • Links on the web can be interpreted as votes that are cast by the source for the target
  • All votes are, initially, considered equal
  • Over the course of executing the algorithm on a link graph, pages which receive more votes become more important
  • More important pages cast more important votes
  • The votes a page can cast are a function of that page's importance, divided by the number of votes/links it casts

That algorithm, of course, was PageRank, and it changed the course of web search, providing tremendous value to Google's early efforts around quality and relevancy in results. As knowledge of PageRank spread, those with a vested interest in influencing the search rankings (SEOs) found ways to leverage this information for their websites and pages.

But, Google didn't stand still or rest on their laurels in the field of link analysis. They innovated, leveraging signals like anchor text, trust, hubs & authorities, topic modeling and even human activity to influence the weight a link might carry. Yet, unfortunately, many in the SEO field are still unaware of these changes and how they impact external marketing and link acquisition best practices.

In this post, I'm going to walk through ten principles of link valuation that can be observed, tested and, in some cases, have been patented. I'd like to extend special thanks to Bill Slawski from SEO By the Sea, whose recent posts on Google's Reasonable Surfer Model and What Makes a Good Seed Site for Search Engine Web Crawls? were catalysts (and sources) for this post.

As you read through the following 10 issues, please note that these are not hard and fast rules. They are, from our perspective, accurate based on our experiences, testing and observation, but as with all things in SEO, this is opinion. We invite and strongly encourage readers to test these themselves. Nothing is better for learning SEO than going out and experimenting in the wild.

#1 - Links Higher Up in HTML Code Cast More Powerful Votes

Link Valuation of Higher vs. Lower Links

Whenever we (or many of the other SEOs we've talked to) conduct tests of page or link features in (hopefully) controlled environments on the web, we/they find that links higher up in the HTML code of a page seem to pass more ranking ability/value than those lower down. This certainly fits with the recently granted Google patent application - Ranking Documents Based on User Behavior and/or Feature Data - which suggested a number of items that may be considered in the way link metrics are passed.

Higher vs. Lower Links Principle Makes Testing Tough

Those who've leveraged testing environments also often struggle against the power of the "higher link wins" phenomenon, and it can take a surprising amount of on-page optimization to overcome the power the higher link carries.

#2 - External Links are More Influential than Internal Links

Internal vs. External Links

There's little surprise here, but if you recall, the original PageRank concept makes no mention of external vs. internal links counting differently. It's quite likely that other, more recently created metrics (post-1997) do reward external links over internal links. You can see this in the correlation data from our post a few weeks back noting that external mozRank (the "PageRank" sent from external pages) had a much higher correlation with rankings than standard mozRank (PageRank):

Correlation of PageRank-Like Metrics

I don't think it's a stretch to imagine Google separately calculating/parsing out external PageRank vs. Internal PageRank and potentially using them in different ways for page valuation in the rankings.

#3 - Links from Unique Domains Matter More than Links from Previously Linking Sites

Domain Diversity of Links

Speaking of correlation data, no single, simple metric is better correlated with rankings in Google's results than the number of unique domains containing an external link to a given page. This strongly suggests that a diversity component is at play in the ranking systems and that it's better to have 50 links from 50 different domains than to have 500 more links from a site that already links to you. Curiously again, the original PageRank algorithm makes no provision for this, which could be one reason sitewide links from domains with many high-PageRank pages worked so well in those early years after Google's launch.

#4 - Links from Sites Closer to a Trusted Seed Set Pass More Value

Trust Distance from Seed Set

We've talked previously about TrustRank on SEOmoz and have generally referenced the Yahoo! research paper - Combating Webspam with TrustRank. However, Google's certainly done plenty on this front as well (as Bill covers here), and this patent application on selecting trusted seed sites certainly speaks to the ongoing need for and value of this methodology. Linkscape's own mozTrust score functions in precisely this way, using a PageRank-like algorithm that's biased to only flow link juice from trusted seed sites rather than equally from across the web.

#5 - Links from "Inside" Unique Content Pass More Value than Those from Footers/Sidebar/Navigation

Link Values Based on Position in Content

Papers like Microsoft's VIPS (Vision Based Page Segmentation), Google's Document Ranking Based on Semantic Distance, and the recent Reasonable Surfer stuff all suggest that valuing links from content more highly than those in sidebars or footers can have net positive impacts on avoiding spam and manipulation. As webmasters and SEOs, we can certainly attest to the fact that a lot of paid links exist in these sections of sites and that getting non-natural links from inside content is much more difficult.

#6 - Keywords in HTML Text Pass More Value than those in Alt Attributes of Linked Images

HTML Link Text vs. Alt Attributes

This one isn't covered in any papers or patents (to my knowledge), but our testing has shown (and testing from others supports) that anchor text carried through HTML is somehow more potent or valued than that from alt attributes in image links. That's not to say we should run out and ditch image links, badges or the alt attributes they carry. It's just good to be aware that Google seems to have this bias (perhaps it will be temporary).

#7 - Links from More Important, Popular, Trusted Sites Pass More Value (even from less important pages)

Link Value Based on Domain

We've likely all experienced the sinking feeling of seeing a competitor with fewer links, and links from what appear to be less powerful pages, outranking us. This may be somewhat explained by a domain's ability to pass along value via a link in ways that are not fully reflected in page-level metrics. It can also help search engines combat spam and provide more trusted results in general: if links from sites that rarely link to junk pass significantly more value than links from sites whose practices and impact on the web overall may be questionable, the engines can much better control quality.

NOTE: Having trouble digging up the papers/patents on this one; I'll try to revisit and find them tomorrow.

#8 - Links Contained Within NoScript Tags Pass Lower (and Possibly No) Value

Noscript Tag Links

Over the years, this phenomenon has been reported and contradicted numerous times. Our testing certainly suggested that noscript links don't pass value, but that may not be true in every case. This is why we included the ability to filter noscript links in Linkscape, though the overall quantity of links inside this tag on the web is quite small.
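
If you want to audit your own pages for this, a quick sketch with BeautifulSoup (hypothetical HTML) separates links inside noscript tags from the rest:

    from bs4 import BeautifulSoup

    html = """
    <p><a href="/visible">Normal link</a></p>
    <noscript><a href="/hidden">Fallback link</a></noscript>
    """

    soup = BeautifulSoup(html, "html.parser")
    all_links = soup.find_all("a", href=True)

    # Split links by whether they sit anywhere inside a <noscript> block
    noscript_links = [a["href"] for a in all_links if a.find_parent("noscript")]
    regular_links = [a["href"] for a in all_links if not a.find_parent("noscript")]

    print("noscript links:", noscript_links)  # ['/hidden']
    print("regular links:", regular_links)    # ['/visible']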

#9 - A Burst of New Links May Enable a Document to Overcome "Stronger" Competition Temporarily (or in Perpetuity)

Temporal Link Values

Even apart from Google's QDF (Query Deserves Freshness) algorithm, which may value more recently created and linked-to content for certain "trending" searches, it appears that the engine also uses temporal signals around linking both to evaluate spam/manipulation and to reward pages that earn a large number of references in a short period of time. Google's patent on Information Retrieval Based on Historical Data first suggested the use of temporal data, but the model has likely seen revision and refinement since that time.
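
As a toy illustration of the "burst" idea (made-up numbers, and certainly not Google's model), you could flag any week whose count of newly discovered links dwarfs the trailing average:

    from statistics import mean

    # Hypothetical counts of newly discovered links to one page, per week
    new_links_per_week = [12, 15, 9, 14, 11, 13, 96, 88]

    # Flag any week whose count is several times the trailing four-week average,
    # a crude stand-in for the kind of "burst" signal described above
    for week in range(4, len(new_links_per_week)):
        trailing = mean(new_links_per_week[week - 4:week])
        if new_links_per_week[week] > 3 * trailing:
            print(f"week {week}: link burst ({new_links_per_week[week]} vs ~{trailing:.0f}/week)")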

#10 - Pages that Link to WebSpam May Devalue the Other Links they Host

Spam and its Impact on Link Value

I was fascinated to see Richard Baxter's own experiments on this in his post - Google Page Level Penalty for Comment Spam. Since then, I've been keeping an eye on some popular, valuable blog posts that have received similarly overwhelming spam and, lo and behold, the pattern seems verifiable. Webmasters would be wise to keep up to date on their spam removal to avoid triggering potential ranking penalties from Google (and the possible loss of link value).


But what about classic "PageRank" - the score of which we get a tiny inkling from the Google toolbar's green pixels? I'd actually surmise that while many (possibly all) of the features about links discussed above make their way into the ranking process, PR has stayed relatively unchanged from its classic concept. My reasoning? SEOmoz's own mozRank, which correlates remarkably well with toolbar PR (off by 0.42 on average, where 0.25 would be "perfect" given the two extra digits of precision we display) and is calculated with intuition very similar to that of the original PageRank paper. If I had to guess (and I really am guessing), I'd say that Google has maintained classic PR because they find the simple heuristic useful for some tasks (likely including crawling/indexation priority), and have adopted many more metrics to fit into the algorithmic pie.
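
One way to read that "0.25 would be perfect" figure: if the underlying score were identical but the toolbar rounds to whole numbers while a mozRank-style display keeps two extra digits, rounding alone creates an average gap of about 0.25. A quick simulation with synthetic scores (not real data) shows the effect:

    import random

    random.seed(0)
    gaps = []
    for _ in range(100_000):
        true_score = random.uniform(0, 10)   # hypothetical underlying score
        toolbar = round(true_score)          # toolbar PR: whole numbers only
        detailed = round(true_score, 2)      # a two-decimal display like mozRank's
        gaps.append(abs(detailed - toolbar))

    print(sum(gaps) / len(gaps))  # ~0.25, the floor imposed by rounding alone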

As always, we're looking forward to your feedback and hope that some of you will take up the challenge to test these on your own sites or inside test environments and report back with your findings.

p.s. I finished this post at nearly 3am (and have a board meeting tomorrow), so please excuse the odd typo or missed link. Hopefully Jen will take a red pen to this in the morning!



Overcome the Google Analytics Learning Curve in 20 Minutes

Posted by Danny Dover

 As recently as a month ago I was a victim of a state of mind I call Analytics Dismissal Disorder. This mindset is common after hearing about the importance of analytics, installing the tracking code and then getting overwhelmed by all of the graphs and scary numbers. When I suffered from analytics dismissal disorder (which my doctors called A.D.D. for short), I knew Google Analytics was important but avoided the extra effort necessary to learn how to get the most out of the software. This post explains what I needed to learn to get over this.

Fat Danny Dover

After learning the basics of Google Analytics, you can learn interesting facts like what search terms people use to find your website. In this case, web searchers are more interested in fat people falling than they are in me.

Here is the problem with Google Analytics:

It is obviously potentially useful, but who has the time to study how to use a product? I don't even read the text-less IKEA manuals, so why would I read software documentation? Sounds boring.

This all changed when SEOmoz offered to pay for me to go to WebShare's Google Analytics Seminar (Wait, you are paying me to leave the office? Mission Accomplished). This 16-hour class walked me through Google Analytics and pushed me through the massive learning curve.

This post distills what I learned in those 16 hours of employer-paid-learning into something you can understand and act on in 20 minutes. Nerd High Five! (*Pushes up glasses*)

Overcome the Google Analytics Learning Curve in 20 Minutes:


An actionable guide to learning what you need to know about Google Analytics.

First Things First:

What are Accounts and Profiles and how are they different?

When you first log in to Google Analytics you need to navigate to your desired data set. This is much more confusing than it ought to be.

Accounts are like folders on a computer. They can contain a lot of different files (profiles) and serve mostly just for organization. An example of an account might be Work Websites or Personal Websites. (Be forewarned, this is not intuitive on setup. Don't make the mistake I did and name an account after a website. That naming convention is more appropriate for a profile).

Accounts

Profiles, on the other hand, are like files on a computer. They can't contain additional profiles or accounts. They represent one view of a website (although not necessarily the only view). An example of a profile might be api.seomoz.org or SEOmoz minus Office IP addresses. You can limit a profile to whatever view of a website you want by using filters.

What are Filters and Segments and how are they different?

This is also more complicated than it ought to be. (grrr)

Filters are attached to website profiles (i.e. "SEOmoz minus office IP addresses") and are permanent. If a profile includes traffic data from all IP addresses except SEOmoz's office computers, there is absolutely no way to reinclude this excluded data in the given profile at a later time. Filters are irreversible and kinda mean (thus the anal in Google Analytics). You can set them up on the profiles page. (See Below)

Filters
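
Because a filter's effects can't be undone, it's worth sanity-checking an exclude pattern before you apply it. For an IP-exclusion filter that takes a regular expression, a quick test in Python (with a hypothetical office range) looks like this:

    import re

    # Hypothetical office range: 203.0.113.0 - 203.0.113.31
    office_ips = re.compile(r"^203\.0\.113\.([0-9]|[12][0-9]|3[01])$")

    for ip in ["203.0.113.7", "203.0.113.31", "203.0.113.32", "198.51.100.4"]:
        print(ip, "-> excluded" if office_ips.match(ip) else "-> kept")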

Segments are similar to filters except they are profile agnostic and their effects are temporary. In addition, they can be compared against each other. The example segments below show all visitors (blue line), new visitors (orange line), and returning visitors (green line) and their distribution across the top content of the given website.

Segments

What are "raw" profiles and why use them? (Ctrl+Z won’t save you here) 

Google Analytics is different from other Google products in that it doesn't provide a way to undo certain types of data processing (i.e. filters). In order to give you freedom to explore (and potentially ruin) your profiles, it is important that you create an unfiltered (raw) profile of your website that you can use in case something goes wrong with one of your other profiles. In SEOmoz's case, this profile is literally called "Do Not Touch! Backup Profile". This is the backup profile we will use to get historical data when Joanna Lord screws up our other profiles. (Danny!)

What if I don't trust a specific metric?

Tough beans! The key to getting the most out of Google Analytics is to trust it. This is very similar to how we measure time. We all know that our bedroom clock is probably not exactly synced with our office clock, but we trust each timepiece as close enough. You need to make the same leap of faith for Google Analytics. The metrics might not be 100% accurate all of the time, but like a clock, at least they are consistent. This makes Google Analytics metrics good enough. (And quite frankly, it is as accurate as its competitors.)

 

Navigating Google Analytics:


GA Navigation
Google Analytics Navigation

Dashboard (Mostly Useless High-level Metrics)

As you would expect, the dashboard shows you the high-level status of your website. The problem is that these metrics tend not to change drastically very often, so if you keep looking at your dashboard, you won't likely see any big changes. ZzzzzzzzZZZzzzzz.

Real analytics pros don't let friends rely on the default dashboard stats.

Intelligence (Automated e-mail alerts) - Check Monthly

Intelligence is Google's confusing name for automatic alerts. Did traffic to your homepage jump 1000% over last week? Are visits from New Zealand down 80% from yesterday? Intelligence alerts will, with your permission, e-mail you if anything unexpected happens on your website.
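
You can approximate the same idea outside of GA with nothing more than last week's numbers; here's a minimal sketch (hypothetical visit counts) of the kind of week-over-week check an alert performs:

    # Hypothetical daily visits for one segment, same weekday a week apart
    last_week, this_week = 420, 4830

    change = (this_week - last_week) / last_week
    if abs(change) > 0.5:  # alert on a 50% swing in either direction
        print(f"Alert: visits changed {change:+.0%} week over week")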

Visitors (The type of people that come to your site) - Check Monthly

As the name implies, this section reveals information about your visitors. Want to know what percentage of your users have Flash enabled or how many people viewed your website on an iPad? This section will tell you. (Long live Steve Jobs!)

Traffic Sources (Where people are coming from to reach your site) - Check Weekly

This section shows you different reports on the sources that drove traffic to your site.

Content (Metrics on your pages) - Check Weekly

Whereas Traffic Sources shows you information about other people's pages as they relate to yours, the Content section only shows you information about what happens on your own pages.

Goals (Metrics on whether or not people are doing what you want them to do) - Check Daily

Goals are predefined actions on your website that you want others to perform. It is important to note that you must configure these manually. Google can't auto detect these. This section shows metrics on how people completed these goals or where they dropped off if they didn't complete them.
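
The arithmetic behind those goal and funnel reports is simple; a small sketch with hypothetical step counts shows the drop-off math:

    # Hypothetical goal funnel: visitors reaching each predefined step
    funnel = [("Landing page", 10000), ("Sign-up form", 2400),
              ("Form submitted", 900), ("Confirmation", 780)]

    for (prev_name, prev_n), (name, n) in zip(funnel, funnel[1:]):
        print(f"{prev_name} -> {name}: {n / prev_n:.1%} continued, {1 - n / prev_n:.1%} dropped off")

    print(f"Overall conversion rate: {funnel[-1][1] / funnel[0][1]:.1%}")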

 

Report Interface:


The bread and butter of Google Analytics is the reports. These are the frameworks for learning about how people interact with your website.

Graph:

The graphs/reports in Google Analytics have 6 important options. The first three are detailed below:

Graph Left

  • Export. This is pretty self-explanatory. You can export to PDF, XML, CSV, CSV for Excel, or, if you are too good for commas, TSV.
  • E-mail. This is one of Google Analytics' more useful features. This tab allows you to schedule recurring e-mails or one-time reports for your co-workers. As an added bonus, if you set up these auto-reports, the recipients don't even need to log into Google Analytics to access the data.
  • Units (in this case Pageviews). This is a report-dependent unit that you can change based on the context.

Graph Right

  • Advanced Segments. This is an extremely powerful feature that allows you to slice and dice your data to your liking.
  • Date Range (in this case, Apr 24 2010 - May 24 2010).
  • Graph By. This feature allows you to choose the scope of the graph in relation to time intervals. For some reports you can even break down data to the hour.

 

Data:

Data is your tool for seeing specifics and making quantifiable decisions.

  • Views. This feature actually affects both the graphs and the data. It dictates the type of graph or the format of the data.
  • ?. This is your source for help on any given metric.
  • Secondary Dimension (in this case, None). This allows you to slice the data table by specific data dimensions (cities, sources, etc.).

 

Which Reports To Track and When:


I recommend using this as a starting point and tailoring it to your needs as you learn more about the unique needs for your website.

Daily

  • Goals -> Total Conversions
  • Content -> Top Content (at the page level)
  • Traffic Sources -> All Traffic Sources
  • Traffic Sources -> Campaigns (Optional)

Weekly (or bi-weekly if you have a content intensive website)

  • Goals -> Funnel Visualization
  • Goals -> Goal Abandoned Funnels
  • Content -> Site Search
  • Traffic Sources -> Direct Traffic
  • Traffic Sources -> Referring Sites
  • Traffic Sources -> Keywords

Monthly

  • Visitors -> Overview
  • Intelligence -> Overview
  • Content -> Content Drilldown (at the folder level)
  • Content -> Top Landing Pages
  • Content -> Top Exit Pages
  • Traffic Sources -> Adwords (Optional)

 

Which Reports to Ignore:


Visitors -> Benchmarking

Installation validation tools estimate that as many as 70% of Google Analytics installs are either incomplete or incorrect. This means the data these benchmarks rely on is very likely inaccurate.

Visitors -> Map Overlay

While this is one of the most popular features of Google Analytics, it is also one of the least useful. The data these maps present is not normalized, so areas with high populations tend to dominate the screen. The maps are not completely useless, as they do show trends, but they are not something to rely on heavily either. Use your best judgment when viewing this report.

Content -> Site Overlay

This feature seems like a good idea, but it isn't implemented in a way that makes it accurate. Put simply, for this tool to work, Google Analytics would need more information about the location of a link on a page and a mechanism for tracking which instance of a link gets clicked. Clicktale and Crazy Egg are nice alternatives.

 

Conclusion:


Tracking the metrics above is only the first step. Imagine Google Analytics as a magical yard stick (for you sissies on the metric system, a yard stick is like a meter stick but better). It is essential for measuring the success or failure of a given online strategy, but it is not an online strategy on its own. It is best used as a supplement to your current activities and should be treated as such.

I am surely going to get some flak from some Analytics gurus who know more than me. (You want to go Kaushik?) Remember, this guide is intended to help people get over the GA learning curve, not to be a comprehensive guide. If you are looking for the latter, check out the hundreds of blog posts at the Google Analytics Blog.

One last thing, if you’re interested in taking the Seminars for Success classes, here’s the upcoming schedule.

Phoenix, AZ June 9-11, 2010
Chicago, IL June 23-25, 2010
Berkeley, CA July 28-30, 2010
Los Angeles, CA Aug 18-20, 2010
San Diego, CA Sep 1-3, 2010
Salt Lake City, UT Sep 15-17, 2010
Vancouver, BC Oct 6-8, 2010
Atlanta, GA Oct 27-29, 2010
Orlando, FL Nov 3-5, 2010
Washington, DC Dec 8-10, 2010

Danny Dover Twitter

If you have any other advice that you think is worth sharing, feel free to post it in the comments. This post is very much a work in progress. As always, feel free to e-mail me if you have any suggestions on how I can make my posts more useful. All of my contact information is available on my profile: Danny. Thanks!



Wrong Page Ranking in the Results? 6 Common Causes & 5 Solutions

Posted by randfish

Sometimes, the page you're trying to rank - the one that visitors will find relevant and useful to their query - isn't the page the engines have chosen to place first. When this happens, it can be a frustrating experience trying to determine what course of action to take. In this blog post, I'll walk through some of the root causes of this problem, as well as five potential solutions.

Asparagus Pesto Rankings in Google with the Wrong Page Ranking First

When the wrong page from your site appears prominently in the search results, it can spark a maddening conflict of emotion - yes, it's great to be ranking well and capturing that traffic, but it sucks to be delivering a sub-optimal experience to searchers who visit, then leave unfulfilled. The first step should be identifying what's causing this issue and to do that, you'll need a process.

Below, I've listed some of the most common reasons we've seen for search engines to rank a less relevant page above a more relevant one.

  1. Internal Anchor Text
    The most common issue we see when digging into these problems is internal anchor text optimization gone awry. Many sites take the keyword they're targeting on the intended page and use it as anchor text linking to another URL (or several) on the site, in a way that can mislead search engines. If you want to be sure that the URL yoursite.com/frogs ranks for the keyword "frogs," make sure that anchor text that says "frogs" points to that page (see the audit sketch after this list). See this post on keyword cannibalization for more on this specific problem.
    _
  2. External Link Bias
    The next most common issue we observe is the case of external links preferring a different page than you, the site owner or marketer, might. This often happens when an older page on your site has discussed a topic, but you've more recently produced an updated, more useful version. Unfortunately, links on the web tend to still reference the old URL. The anchor text of these links, the context they're in, and the reference to the old page may make it tough for a new page to overcome the prior page's rankings.
    _
  3. Link Authority & Importance Metrics
    There are times when a page's raw link metrics - high PageRank, large numbers of links and linking root domains - will simply overpower other relevance signals and cause it to rank well despite barely targeting (and sometimes barely mentioning) a keyword phrase. In these situations, it's less about the sources of links, the anchor text or the relevance and more a case of powerful pages winning out through brute force. On Google, this happens less than it once did (at least in our experience), but can still occur in odd cases.
    _
  4. On-Page Optimization
    In some cases, a webmaster/marketer may not realize that the on-page optimization of a URL for a particular keyword term/phrase is extremely similar to another. To differentiate and help ensure the right page ranks, it's often wise to de-emphasize the target keyword on the undesirable page and target it more effectively (without venturing into keyword stuffing or spam) on the desired page. This post on keyword targeting can likely be of assistance.
    _
  5. Improper Redirects
    We've seen the odd case where an old redirect has pointed a page that heavily targeted a keyword term/phrase (or had earned powerful links around that target) to the wrong URL. These can be very difficult to identify because the content of the 301'ing page no longer exists and it's hard to know (unless you have the history) why the current page might be ranking despite no effort. If you've been through the other scenarios, it's worth looking to see if 301 redirects from other URLs point to the page in question and running a re-pointing test to see if they could be causing the issue.
    _
  6. Topic Modeling / Content Relevance Issues
    This is the toughest to identify and to explain, but that won't stop us from trying :-) Essentially, you can think of the search engines doing a number of things to determine the degree of relevancy of a page to a keyword. Determining topic areas and identifying related terms/phrases and concepts is almost certainly among these (we actually hope to have some proof of Google's use of LDA, in particular, in the next few months to share on the blog). Seeing as this is likely the case, the engine may perceive that the page you're trying to rank isn't particularly "on-topic" for the target keyword while another page that appears less "targeted" from a purely SEO/keyphrase usage standpoint is more relevant.
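
As promised in #1 above, here's a minimal audit sketch (BeautifulSoup, hypothetical markup) that flags any anchor text pointing at more than one internal URL, the classic cannibalization symptom:

    from collections import defaultdict
    from bs4 import BeautifulSoup

    html = """
    <a href="/frogs">frogs</a>
    <a href="/blog/frog-facts">frogs</a>
    <a href="/about">about us</a>
    """

    anchors = defaultdict(set)
    for a in BeautifulSoup(html, "html.parser").find_all("a", href=True):
        anchors[a.get_text(strip=True).lower()].add(a["href"])

    # Any anchor text pointing at more than one URL is a cannibalization candidate
    for text, urls in anchors.items():
        if len(urls) > 1:
            print(f'"{text}" points at multiple URLs: {sorted(urls)}')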

Once you've gone through this list and determined which issues might be affecting your results, you'll need to take action to address the problem. If it's an on-page or content issue, it's typically pretty easy to fix. However, if you run into external linking imbalances, you may need more dramatic action to solve the mismatch and get the right page ranking.

Next, we'll tackle some specific, somewhat advanced, tactics to help get the right page on top:

  1. The 301 Redirect (or Rel Canonical) & Rebuild
    In stubborn cases or those where a newer page is replacing an old page, it may be wise to simply 301 redirect the new page to the old page (or the other way around) and choose the best-converting/performing content for the page that stays. I generally like the strategy of maintaining the older, ranking URL and redirecting the newer one, simply because the metrics for that old page may be very powerful and a 301 does cause some loss of link juice (according to the folks at Google). However, if the URL string itself isn't appropriate, it can make sense to 301 to the new page instead.

    Be aware that if you're planning to use rel=canonical rather than a 301 (which is perfectly acceptable), you should first ensure that the content is exactly the same on both pages. Trying to maintain two different versions of a page with one canonicalizing to the other isn't specifically against the engines' guidelines, but it's also not entirely white hat (and it may not always work, since the engines sometimes check for a content match before honoring rel=canonical). A quick way to verify what a URL currently returns is sketched after this list.
    _
  2. The Content Rewrite
    If you need to maintain the old page and have a suspicion that content focus, topic modeling or on-page optimization may be to blame, a strategy of re-authoring the page from scratch and focusing on both relevance and user experience may be a wise path. It's relatively easy to test and while it will suck away time from other projects, it may be helpful to give the page more focused, relevant, useful and conversion-inducing material.
    _
  3. The Link Juice Funnel
    If you're fairly certain that raw link metrics like PageRank or link quantities are to blame for the issue, you might want to try funnelling some additional internal links to the target page (and possibly away from the currently ranking page). You can use a tool like Open Site Explorer to identify the most important/well-linked-to pages on your site and modify/add links to them to help channel juice into the target page and boost its rankings/prominence.
    _
  4. The Content Swap
    If you strongly suspect that the content of the pages rather than the link profiles may be responsible and want to test, this is the strategy to use. Just swap the on-page and meta data (titles, meta description, etc) between the two pages and see how/if it impacts rankings for the keyword. Just be prepared to potentially lose traffic during the test period (this nearly always happens, but sometimes is worth it to confirm your hypothesis). If the less-well-ranked page rises with the new content while the better-ranked page falls, you're likely onto something.
    _
  5. The Kill 'Em with External Links
    If you can muster a brute force, external link growth strategy, either through widgets/badges, content licensing, a viral campaign to get attention to your page or just a group of friends with websites who want to help you out, go for it. We've often seen this precise strategy lift one page over another and while it can be a lot of work, it's also pretty effective.
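
As mentioned in tactic #1 above, it's worth verifying what a URL actually returns before (or after) consolidating. A small sketch with the requests library and a placeholder URL checks for a 301 and for a rel=canonical declaration:

    import requests
    from bs4 import BeautifulSoup

    url = "http://www.example.com/old-page"  # placeholder URL

    # Don't follow redirects: see exactly what this URL returns
    resp = requests.get(url, allow_redirects=False, timeout=10)
    if resp.status_code in (301, 302):
        print(f"{url} -> {resp.status_code} redirect to {resp.headers.get('Location')}")
    else:
        canonical = BeautifulSoup(resp.text, "html.parser").find("link", attrs={"rel": "canonical"})
        target = canonical.get("href") if canonical else "none declared"
        print(f"{url} returned {resp.status_code}; rel=canonical: {target}")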

While this set of recommendations may not always fix the issue, it can almost always help identify the root cause(s) and give you a framework in which to proceed. If you've got other suggestions, I look forward to hearing about them in the comments!



3 Key Takeaways from Search & Social

Posted by Lindsay

Last week Jen and I attended the Search & Social Summit here in my backyard of Tampa Bay. This isn't your typical conference recap post, though. I wanted to focus on the action items that still stand out for me a week later, the things that will make a difference in what I do or how I do it. Perhaps you'll rethink the way you do a thing or two as well.

Outsource, Seriously.

Kevin Henrikson is a low-key guy, and one that I hadn't met until the Search & Social Summit. You won't see him spouting off on Twitter or elaborating on his accomplishments on LinkedIn. He beats even me in the blog neglect category. Personally, I wish he'd publish more. He has strong business acumen and seems to find his comfort zone well outside the boundaries that most of us create in our own DIY vs. outsource struggles.

Kevin’s presentation was about outsourcing. I expected the standard cliché we’ve all heard 100 times, “Do what you do best. Outsource the rest.” Good advice, absolutely, but now what? Kevin's presentation was different. It outlined real, actionable strategies for outsourcing the things you’d expect - like copywriting and development - but he also spoke about his experience delegating some pretty unusual stuff like the hiring of a housekeeper for his parents out-of-state.

Kevin covered more than a dozen solid online sources for building your outsourced empire, including craigslist (for local needs), Amazon's Mechanical Turk, and the old standby Elance. None of those excited me like oDesk and 99designs.

oDesk describes itself as a marketplace for online workstreams. Don't have time to sift through your email to identify the important messages that require a response? Hire a personal assistant to do the drudge work for you. Need a new site design converted to work with your WordPress blog? You'll be surprised by the rates. I created my account while listening to Kevin's presentation and can't wait to get started.

99designs provides a platform and a 192K-strong community to facilitate your own 'design contest'. Open an account, outline your project in seven simple fields, pay a few hundred dollars, and within a week you'll have dozens of designs to choose from, created by the 99designs community. I did a hack job of my own blog logo design a few years ago. I figured there was no time like the present, so I jumped onto 99designs and kicked off my own contest. For a few hundred dollars I've received around 200 logo designs. You can check out the contest entries and maybe even help me choose a winner from the frontrunners.

If you want more information on how to leverage the outsourcing vehicles like the ones mentioned above, check out Rand's recent post on the topic here.

Targeted Promotion on Niche Social News Sites

If you're like me, when you think 'social news', examples like Digg and Reddit stand out. Though the traffic from these sites is astounding - IF you can get your story to the front page - obtaining traction is hit or miss and the competition is intense. Brent Csutoras is a whiz in the world of social marketing, and another speaker who presented some refreshing content at the Search & Social Summit last week.

Brent highlighted Kirtsy.com as a great place to post content that would appeal to a female audience, for example. This isn't the kind of place to post the latest pus video from PopThatZit (view at your own risk, eww), but if you take a look at the current list of most popular content on the Kirtsy homepage, you'll get an idea of what is possible there. I was surprised to see a few listings from small personal blogs on topics like crafts and parenting.

Brent says that, despite being more than a year old, this list of niche social media sites from Chris Winfield over at 10e20 is still the best out there. Think about the opportunities for the sites you represent. No doubt a few more niche social news sites have cropped up since then. If you have another one that works for you, I'd love to hear about it in the comments.

Get New Content Indexed Faster

Michael Gray recommends creating small sitemaps of <100 pages, in addition to your regular sitemap(s), to help get  new content indexed faster.

Michael has found that for sites that add a lot of new pages, or want to get the pages they do add indexed quickly, a dedicated sitemap for fresh content is the key. In his testing, deep pages on large sites that would sometimes take weeks or months to make it into the index took just 1-3 days with a dedicated fresh-content XML sitemap. He suggests playing with the '100' number: that is what has worked for his clients, but if you are working with a site that has a larger fresh-content output, you may achieve the same effect by including more.
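
If you want to try this, a supplemental fresh-content sitemap is easy to generate; here's a minimal sketch (placeholder URLs; a real script would pull them from your CMS) that keeps to the <100-URL idea:

    from datetime import date

    # Placeholder URLs; a real script would pull recently published pages from your CMS
    fresh_urls = [
        "http://www.example.com/blog/new-post-1",
        "http://www.example.com/blog/new-post-2",
    ][:100]  # keep the dedicated fresh-content sitemap small, per the suggestion above

    entries = "\n".join(
        f"  <url><loc>{u}</loc><lastmod>{date.today().isoformat()}</lastmod></url>"
        for u in fresh_urls
    )
    with open("sitemap-fresh.xml", "w") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n'
                '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
                f"{entries}\n</urlset>\n")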

I'll be testing this one out for sure! Let us know how it goes for you, too.

Action Items

  1. Are you making the most of your time? Think about the things that someone else could do for you and outsource it. Check out 99designs for graphics work and oDesk for nearly everything else.
  2. Look through Chris Winfield's list of niche social news sites. Maybe your content can 'make popular' on social news after all.
  3. Try creating a supplemental fresh content XML sitemap to see if it helps you get your content indexed faster.

Happy Optimizing!

Lindsay Wassell (aka @lindzie)



Define Competitors: Step 4 of the 8-Step SEO Strategy

Posted by laura

This post was originally in YOUmoz, and was promoted to the main blog because it provides great value and interest to our community. The author's views are entirely his or her own and may not reflect the views of SEOmoz, Inc.

Congratulations on making it halfway through building this SEO Strategy document with me!  Do you feel your value as an SEO rising?

If you’re jumping into the 8 Step SEO Strategy here in Step 4, or just need a recap, you can find the previous three steps here:

DEFINING CATEGORY COMPETITORS

Step 4 is a simple one: we'll define our competitors in the SERPs for dissection in Step 5.

We’ll only be looking at search engine competitors here, and not comScore, Hitwise or other types of industry-defined competition by Uniques or Page Views, or any other metric.  For the SEO Strategy we’re building here, we’re concerned with Search, therefore we’ll stick to competitors in search results only. 

I can already hear you saying – this is easy – just do a search for your keywords and see who shows up.  True.  That’s part of it.  But because we’re going to do some serious dissection in Step 5, we’ll want to make sure we get the right competitors to dissect and compare ourselves against.

We broke our keyword research out into categories in Step 2, so we’ll want to define competitors for each category (or pick just a few important categories – especially if you're working on large enterprise-sized sites).

What I mean when I mention defining competitors by categories is this: If I am working on a site all about celebrities, my competitors might be OMG, TMZ, Perez Hilton, etc.  But that’s only at the high level.  My keyword categories from step 2 might cover subtopics like celebrity photos, celebrity news and more. Each of those subtopics has someone who is dominating those rankings.  It may be the same one or two sites across the board, but it’s likely that each subtopic will have different high-ranking competitors.  We want to know specifically who’s doing well for each topic.

HOW TO FIND YOUR COMPETITORS

There are several ways you can do this. If you’ve already got a method you like and want to stick with – by all means do (and if you’re compelled to share your method with us in the comments – you know we love to hear it).  I’m going to give you an example of how I pull this data together. 

Here’s how I set it up:

Grab a new Excel worksheet and name it something like ‘Competitors’.  Create one tab to keep track of your overall site competitors, and if you’re tracking any subtopics on your site (likely the keyword categories we defined in step 2), create a tab for each one of those that you’re going to do competitive research for.  We’re not going to do any calculations or fancy stuff with this worksheet – it’s just for keeping track of your competitors in one place.  You can use a Word doc or good ol’ pen and paper if you want too.

Excel category tabs

The easy way to figure out who your competitors are is to type a couple of terms into the search box and see who shows up.  So let’s look at that method. Here’s what I see in the top 5 results for [celebrity gossip]. 

Google Search results for celebrity gossip

Take note in your Excel sheet of who’s appearing in the top rankings for a couple of terms for each tab/topic.  You don’t have to look up the competitors for every term in your keyword group, just pick a few and make note of what comes up.

You can also choose to check the top rankings in all three search engines, or just pick one. It’s up to you.  In the end you’ll be looking for which site(s) show up the most often for this keyword group.
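
If you'd rather have a script do the tallying, a short sketch like this one (hypothetical results) counts how often each domain shows up across the top results you've collected for a keyword group:

    from collections import Counter
    from urllib.parse import urlparse

    # Hypothetical top results for a few terms in one keyword category
    serps = {
        "celebrity gossip": ["http://site-a.com/1", "http://site-b.com/2", "http://site-c.com/3"],
        "celebrity news":   ["http://site-b.com/4", "http://site-a.com/5", "http://site-d.com/6"],
    }

    domain_counts = Counter(urlparse(url).netloc
                            for results in serps.values() for url in results)

    # The domains that appear most often across terms are your targets for this category
    for domain, count in domain_counts.most_common(5):
        print(domain, count)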

Another method of doing this is to use SEOmoz’s Keyword Difficulty Tool.  The cool thing about the Difficulty Tool is that you get extra insights along with your top competitors.  But for this example I just want to get my top-ranked competitors in a downloadable csv file that I’ll just copy and paste into my Excel sheet.

To get this info, type in one of your terms:

Enter keyword into SEOmoz Keyword Difficulty Tool

Below the difficulty score and authority comparison graph are the top-ranked results...

SEOmoz Keyword Difficulty Results - Top ranked competitors for celebrity gossip

...and at the bottom of the page you can export the results.  I’ll do the same thing for a few more terms that represent the topic I’m researching, and add the results all to the tab for the topic.

In the end I have something that looks like this – here are my general terms (there are only two in this example, but the more terms you use, the better idea you'll get of who shows up in the rankings most often):

Comparing top-ranked competitors for general celebrity terms in Excel

I’ve highlighted the sites that show up in the top 5 rankings for both terms and made a note of it on the top.  This is a competitor I know I want to target.

Here’s another example of one of my subcategories:

comparing top-ranked competitors for celebrity news topic in Excel worksheet

Here I see two sites appearing for multiple keywords. I've highlighted them and made note of them at the top. These are the competitors I'll be targeting for my competitive dissection of sites for the Celebrity News subtopic in Step 5. Again, there are only 3 terms in the screenshot example above – I recommend pulling data for at least 5-10 terms per topic.

Note that you can also choose to target 2 competitors or 5 competitors for each category – whatever you prefer (I usually like to do at least 3).  The more sites you choose the more work you have to do in Step 5, but the more insight you’ll get back. 

That’s the jist of it folks.  Now you have targeted competitors defined for each topic you’re interested in.  In the next post we’ll look at how to dig into the competitive landscape to uncover site features, content, and SEO strategy that should be built into your site in order to outrank your competitors. This is where we really start to take SEO to another level. 

In the meantime, if you use any of the vast selection of SEO tools out there to define your competitors, or just do it in a different way, please share with the readers in the comments!

 



Whiteboard Friday – Facebook’s Open Graph WON’T Replace Google

Posted by great scott!

Earlier this week Facebook announced its 'Open Graph' at F8. There was all sorts of hubbub (much of it the by-product of well-orchestrated buzz) about Facebook finally making strides to kill Google's dominance of the web. So should you hang up your white hat, your black hat, your grey hat, and trade it all in for a blue hat? Much as we love Facebook, the answer, dear reader, is no: SEO is not dead.

Watch this week's video to hear Rand's take on how Facebook's 'Open Graph' will impact web marketing and all the ways it won't. There are all sorts of opportunities that will likely emerge out of this new technology, so you should pay attention. Go ahead and keep an eye out for a nice-fitting blue hat in the near future, but don't plan to throw away your white hat anytime soon.

 

 

 

 

Facebook Sticker
The sticker we received



Competitive Intelligence: Purpose & Process

Posted by JoannaLord

When it comes to marketing your brand online, there is just so much to do. We spend our days researching, creating, implementing, and then measuring the success of our efforts. There are dozens of channels to participate in, and obviously thousands of ways to go about marketing your brand, but however you slice it, online marketing comes down to introducing new audiences to your brand, keeping your current brand users happy, and evolving the brand/company itself.

Unfortunately, I think the first two steps often overshadow the third step in the process (evolving the brand/company itself), probably because to grow as a company you really need to pause and evaluate where you currently stand. As marketers, we equate pausing with losing momentum, which scares the hell out of us all. This industry moves too quickly, and pausing to reflect on where your brand stands compared to your competitors seems like time poorly spent.

I am here to argue just the opposite. A few weeks ago I gave a presentation at PubCon South on "Competitive Intelligence on the Social Web," and I wanted to extract a few of my key arguments and offer them up to the SEOmoz audience, both as thought provokers and for feedback. In my opinion, competitive intelligence is one of those marketing steps we all say we did, but few of us actually do. It's true. Most of us are big fat liars when it comes to "doing competitive intelligence."

For example, competitive intelligence IS NOT:

  • Sitting in a room and ranting about your competitor's latest marketing move
  • Grabbing lunch with your Product Manager and creating a roadmap based on what your competitors have that you don't
  • Putting together a grid of your and your competitors' website traffic stats, never to be looked at again
  • Googling your competitor's brand name to see what latest things are noted in the SERPs


Sorry, friends, that is not competitive intelligence.

However, competitive intelligence IS:

  • Understanding what direction your competitors are headed and how that might intersect or parallel your own
  • Knowing what products you are pushing out and how they match up with or differ from your competitors'
  • Mapping out a list of key differentiators and attributes for your biggest competitors and yourself
  • Researching & monitoring a variety of platforms to better understand your competitors


Okay, now that we all have a better sense of what it is, let's talk about how to do it. Instead of throwing a 20-slide PowerPoint at you, I thought I would distill it down to a few key steps toward understanding your competitive landscape, and perhaps more importantly, tie those into how you can use this information for company gains.

The Grid of Awesomeness:
Okay maybe that name is a bit of an exaggeration, but either way, the first key step toward understanding your competitors is getting them all down on paper and forcing yourself to research key attributes. I have included below an example grid that you can use to get you started.

You might ask yourself: how do I know which competitors to include? This can differ depending on the size of your company and the scope of your industry, but a great place to start is the "3-1-1 rule." I usually suggest you pick 3 brands that are often grouped with yours, either in roundup articles or in conversation. Those are your primary competitors. Then choose one "dreamer," the brand in your vertical you hope to be one day. Lastly, I suggest including one "newbie" in your competitive analysis (assuming that isn't you, of course). By picking a newbie in your industry you can often gain perspective on where your industry is moving and which marketing channels to consider, since newcomers tend to operate pretty lean.

After you have chosen your competitors, I suggest filling out the following for each of them: name, size, products, features, price points, affiliate program description (do they have one? what are the key attributes?), playing grounds (what channels, platforms, and communities are they dominating?), advocates/influencers (who is lobbying for them?), and notes. Don't forget to fill this out for your own company as well!

Example Grid:
Competitive Analysis Grid

Product Growth & Benchmarking:
This is perhaps the most time-consuming element of competitive intelligence when it is done well. There needs to be someone in charge of competitive intelligence maintenance. This person should subscribe to your competitors' blogs so you hear about product launches and all company announcements in real time. You can also gain a lot of insight from reading the comments on those posts.

In addition to this, you should set up Google Alerts for your competitors' brands plus the words "launches" and "announces." We all know that Google Alerts are limited and somewhat unreliable, but you should have a daily digest set to notify you of any big moves your competitors are making. You never know which could be a real momentum changer.
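
If you want something a bit more direct than alerts, a small sketch using the feedparser library (placeholder feed URLs) can poll competitors' blog feeds and flag entries that mention launch-style keywords:

    import feedparser

    # Placeholder competitor feeds and the announcement keywords to watch
    feeds = ["http://competitor-one.example/feed", "http://competitor-two.example/feed"]
    keywords = ("launches", "announces", "introducing")

    for feed_url in feeds:
        for entry in feedparser.parse(feed_url).entries:
            text = (entry.get("title", "") + " " + entry.get("summary", "")).lower()
            if any(word in text for word in keywords):
                print(f"[{feed_url}] {entry.get('title')} -> {entry.get('link')}")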

The last step is to keep a pulse on traffic growth to their sites by checking Alexa or Compete monthly. While it may seem a strain on your time and resources, it's beneficial to know what momentum trajectory your competitors are on.

Monitoring Mentions:
This is what most people think competitive intelligence is. While it's not the only piece of the competitive intelligence puzzle, it certainly is an important one. There are so many tools available to us (most of them free) that help us keep an eye on what our competitors do… it's actually a bit creepy how many tools and sites are out there to help us be shady. I personally support this shadiness.

Some examples include sites like Whostalkin, SocialMention, and Backtype. All of these allow you to search a competitor's brand or products and find the latest things said about them. These social web aggregators search a number of channels like images, videos, blogs, news feeds, etc. They are great for understanding how a product launch might have gone for a competitor, or how any other announcement was received.

Other ways to spy on your competitors in the social web: create private Twitter lists to monitor their brand and employees' feeds, sign up for competitors' newsletters, etc. The key is knowing where they push out the most crucial information and then making sure you have someone dabbling in that space.

Hiring Espionage:
Now that you have a sense of where your competitors currently stand and what they are doing right now, it's time to spy on them and try to figure out their next moves. Hiring espionage is a great way to do this. You can get a great sense of where your competitors are moving by looking at who they are investing in from an employee perspective.

A great way to do this is to keep an eye on their company job listings, and occasionally throw their brand into a job meta-engine. The best possible place to spy on hiring moves is LinkedIn: find their company profile page, and near the bottom there is a section that shows recent hires. You can infer tons of information from this section - are they hiring a bunch of salespeople? Top-level engineers? Whatever team they are staffing up is probably the team they are focusing on.

The Takeaway:
The important thing to remember is that competitive intelligence isn’t something you do once and never revisit again. It also isn’t something that you can base on intuition or informal conversations with coworkers. Competitive intelligence is a key process that can be used to inform instrumental decisions you make. The better you understand your competitors the clearer perspective you have on your industry and audience as a whole. Competitive intelligence enables you to better speak on your strengths, brainstorm ideas for quick gains, and make more data-driven decisions all around.

Plus you get to pretend you are a spy which is just all sorts of fun (please note trench coat and night vision goggles are optional).
 

