Monday, September 2, 2013

Google Rolling Out First Panda Refresh of 2013 Today

Beware the Panda. According to a tweet from the official @Google Twitter account this morning, a new data refresh is rolling out today.
This update, according to the notice, should only affect 1.2 percent of English language queries. No other information is available so far.
[Screenshot: @Google's Panda tweet]
This is the first Panda data refresh of 2013. It also marks the third consecutive month of Panda data updates.
The first Panda update was nearly two years ago in February 2011. Google's stated goal of Panda is to reward "high-quality sites."
While Google has never formally defined what a "high-quality site" is, it did publish its own list of bullet points in a blog post from early 2011. The rationale has always been the same: to surface more high-quality sites in search.

Google Goes Boom on Low-Quality Sites...So They Say

Chances are good that you or someone you know has seen ranking changes today as Google rolled out a new algorithmic update. This follows recent announcements aimed at "low-quality sites" (which many interpret to mean content farms); less than two weeks ago, Google said it was exploring new methods to detect spam.
"This update is designed to reduce rankings for low-quality sites--sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites--sites with original content and information such as research, in-depth reports, thoughtful analysis and so on," Google announced last night.
No one can say this one came out of left field. Google launched an algorithm tweak in January to combat spam and scraper sites, though that affected a much smaller number of sites.
This IS a big one. We're talking 11.8 percent of queries across the board. Now, the big question is: did they do it right?
From the looks of it, Google is not simply devaluing sites serving duplicated content; it appears to be going after sites with specific types of backlinks, possibly using data gathered through Chrome extensions, and this is only within the first 24 hours! More will become clear once site owners see drastic changes in their traffic stats.
As with every major Google update, SEO forums are dedicating threads to this, and they are filling up fast with reactions and reports. Since BackLinkForum.com tends to attract the skilled gray/black hat crowd, and because this update is only happening in the U.S. (for now), BLF is a great place to see what is really happening down in the trenches.
Two possible things happening there worth noting:
  1. Sites with the majority of their backlink profiles consisting of profile links could be a target.
  2. Not every content farm was red flagged. This may have been a response to the scraper update along with bigger content farm sites. Similar to the recent Blekko update.
[Screenshot: eHow and wiki pages ranking high in the SERPs]
Although coming to a conclusion about the update within 24 hours is extremely risky, I would be willing to bet that this is targeting self-service linking as much as content farms.
However, sites like eHow, Answers.com, and even low-level scraper sites still seem to be saturating the SERPs. That leaves me asking, "Who was penalized then?"
As with any Google algorithm changes, some innocent sites are going to be slammed. Some SEOs have reported seeing 40 percent traffic drops to their sites.
This latest update may just be more evidence that Google simply can't distinguish between "good" and "bad" content.
Let us know what you're seeing today -- the good, bad, and disastrous.

Blekko Removes Content Farms From Search Results

In an effort to combat web spam, Blekko will block from its search results 20 of the worst-offending, SERP-clogging content farms, including Demand Media's eHow and Answerbag, TechCrunch reports. The list of barred sites is as follows:

  • ehow.com
  • experts-exchange.com
  • naymz.com
  • activehotels.com
  • robtex.com
  • encyclopedia.com
  • fixya.com
  • chacha.com
  • 123people.com
  • download3k.com
  • petitionspot.com
  • thefreedictionary.com
  • networkedblogs.com
  • buzzillions.com
  • shopwiki.com
  • wowxos.com
  • answerbag.com
  • allexperts.com
  • freewebs.com
  • copygator.com
[Screenshot: Blekko's spam clock]
Blekko seems to be taking spam seriously. Last month, the newest search engine introduced its spam clock, which claims that 1 million new spam pages are created every hour. As of this morning, the total number of spam pages was at 750 million and counting (though Blekko admits the number is "more illustrative than scientifically accurate").
The reasoning for the spam clock, according to Blekko CEO Rich Skrenta:
"Millions upon millions of pages of junk are being unleashed on the web, a virtual torrent of pages designed solely to generate a few pennies in ad revenue for its creator. I fear that we are approaching a tipping point, where the volume of garbage soars beyond and overwhelms the value of what is on the web."
So Blekko seems to be doing its small part for cleaning up its own search results.
Meanwhile, Google has also announced an algorithm change to combat spam. But as Mike Grehan notes in his column today "The Google Spam-Jam," spam "is a problem that Google has had from day one and it's not likely to go away anytime soon" with its current search model.

Google's War on Spam Begins: New Algorithm Live

Google's Matt Cutts today announced the launch of a new algorithm that is intended to better detect and reduce spam in Google's search results and lower the rankings of scraper sites and sites with little original content. Google's main target is sites that copy content from other sites and offer little useful, original content of their own.
Posting on Hacker News, Cutts wrote:
"The net effect is that searchers are more likely to see the sites that wrote the original content. An example would be that stackoverflow.com will tend to rank higher than sites that just reuse stackoverflow.com's content. Note that the algorithmic change isn't specific to stackoverflow.com though."
On his blog, Cutts wrote:
This was a pretty targeted launch: slightly over 2% of queries change in some way, but less than half a percent of search results change enough that someone might really notice. The net effect is that searchers are more likely to see the sites that wrote the original content rather than a site that scraped or copied the original site's content.
Cutts said the change was approved last Thursday and launched earlier this week. Cutts announced Google's intention to up the fight against spam in an Official Google Blog post last Friday.
In response to criticism that Google's results were deteriorating and showing more spam in recent months, Cutts said a newly redesigned document-level classifier will better detect repeated spammy words, such as those found in "junky" automated, self-promoting blog comments. He also said that spam levels today are much lower than five years ago.
At Webmaster World, there is discussion about big drops in traffic. Are you seeing any changes as a result of this change?


Negative SEO Case Study: How to Uncover an Attack Using a Backlink Audit

Ever since Google launched the Penguin update back in April 2012, the SEO community has debated the impact of negative SEO, a practice whereby competitors can point hundreds or thousands of negative backlinks at a site with the intention of causing harm to organic search rankings or even completely removing a site from Google's index. Just jump over to Fiverr and you can find many gigs offering thousands of wiki links, or directory links, or many other types of low-quality links for $5.
By creating the Disavow Links tool, Google acknowledged this very real danger and gave webmasters a tool to protect their sites. Unfortunately, most people wait until it's too late to use the Disavow tool; they look at their backlink profile and disavow links after they've been penalized by Google. In reality, the Disavow Links tool should be used before your website suffers in the SERPs.
Backlink audits have to be added to every SEO professional's repertoire. These are as integral to SEO as keyword research, on-page optimization, and link building. In the same way that a site owner builds links to create organic rankings, now webmasters also have to monitor their backlink profile to identify low quality links as they appear and disavow them as quickly as they are identified.
Backlink audits are simple: download your backlinks from your Google Webmaster account, or from a backlink tool, and keep an eye on the links pointing to your site. What is the quality of those links? Do any of the links look fishy?
As soon as you identify fishy links, you can then try to remove the links by emailing the webmaster. If that doesn't work, head to Google's disavow tool and disavow those links. For people looking to protect their sites from algorithmic updates or penalties, backlink audits are now a webmaster's best friend.
If your website has suffered from lost rankings and search traffic, here's a method to determine whether negative SEO is to blame.

A Victim of Negative SEO?

[Screenshot: Google Analytics organic traffic, 2012 vs. 2013]
A few weeks ago I received an email from a webmaster whose Google organic traffic dropped by almost 50 percent within days of Penguin 2.0. He couldn't understand why, given that he'd never engaged in SEO practices or link building. What could've caused such a massive decrease in traffic and rankings?
The site is a 15-year-old finance magazine with thousands of news stories and analysis, evergreen articles, and nothing but organic links. For over a decade it has ranked quite highly for very generic informational financial keywords – everything from information about the economies of different countries, to very detailed specifics about large corporations.
With a long tail of over 70,000 keywords, it's a site that truly adds value to the search engine results and has always used content to attract links and high search engine rankings.
The site received no notifications from Google. They simply saw a massive decrease in organic traffic starting May 22, which leads me to believe they were impacted by Penguin 2.0.
In short, he did exactly what Google preaches as safe SEO. Great content, great user experience, no manipulative link practices, and nothing but value.
So what happened to this site? Why did it lose 50 percent of its organic traffic from Google?

Backlink Audit

I started by running a LinkDetox report to analyze the backlinks. Immediately I knew something was wrong:
[Link Detox report: average Link Detox Risk of 1,251, rated "Deadly Risk"]
Upon further investigation, 55 percent of his links were suspicious, while 7 percent (almost 500) of the links were toxic:
[Chart: toxic vs. suspicious vs. healthy links]
So the first step was to research those 7 percent toxic links, how they were acquired, and what types of links they were.
In LinkDetox, you can segment by Link Type, so I was able to first view only the links that were considered toxic. According to Link Detox, toxic links are links from domains that aren't indexed in Google, as well as links from domains whose theme is listed as malware, malicious, or having a virus.
Immediately I noticed that he had many links from sites that ended in .pl. The anchor text of the links was the title of the page that they linked to.
It seemed that the sites targeted "credit cards", which is very loosely in this site's niche. It was easy to see that these were scraped links to be spun and dropped on spam URLs. I also saw many domains that had expired and were re-registered for the purpose of creating content sites for link farms.
Also, check out the spike in backlinks:
[Chart: spike in new backlinks]
From this I knew that most of the toxic links were spam, and links that were not generated by the target site. I also saw many links to other authority sites, including entrepreneur.com and venturebeat.com. It seems this site was classified as an "authority site" and was being used as part of spammers' strategy of adding authority links to their outbound link profiles.

Did Penguin Cause the Massive Traffic Loss?

I further investigated the backlink profile, checking for other red flags.
His Money vs Brand ratio looked perfectly healthy:
[Chart: money vs. brand keyword ratio]
His ratio of "Follow" links was a little high, but this was to be expected given the source of his negative backlinks:
[Chart: follow vs. nofollow links]
Again, he had a slightly elevated number of text links as compared to competitors, which was another minor red flag:
[Chart: text links vs. competitors]
One finding that was quite significant was his Deep Link Ratio, which was much too high when compared with others in his industry:
[Chart: deep link ratio]
In terms of authority, his link distribution by SEMrush keyword rankings was average when compared to competitors:
[Chart: link distribution by SEMrush keyword rankings]
Surprisingly, his backlinks had better TitleRank than competitors, meaning that the target site's backlinks ranked for their exact match title in Google – an indication of trust:
[Chart: TitleRank comparison with competitors]
Penalized sites don't rank for their exact match title.
The final area of analysis was the PageRank distribution of the backlinks:
[Chart: link profile by Google PageRank]
Even though he has a great number of high-quality links, the percentage of links that aren't indexed in Google is substantial: close to 65 percent of the site's backlinks aren't indexed in Google.
In most cases, this indicates poor link building strategies, and is a typical profile for sites that employ spam link building tactics.
In this case, the high quantity of links from pages that are penalized, or not indexed in Google, was a case of automatic links built by spammers!
As a result of having a prominent site that was considered by spammers to be an authority in the finance field, this site suffered a massive decrease in traffic from Google.

Avoid Penguin & Unnatural Link Penalties

A backlink audit could've prevented this site from being penalized by Google and losing close to 50 percent of its traffic. If a backlink audit had been conducted, the site owner could've disavowed these spam links, performed outreach to get the links removed, and documented his efforts in case of future problems.
If the toxic links had been disavowed, all of the ratios would've been normalized and this site would've never been pegged as spam and penalized by Penguin.

Backlink Audits

Whatever tool you use (Ahrefs, LinkDetox, or Open Site Explorer, for example), it's important that you pull and evaluate your links on a monthly basis. Once you have the links, make sure you have metrics for each one in order to evaluate their health.
Here's what to do:
  • Identify all the backlinks from sites that aren't indexed in Google. If they aren't indexed in Google, there's a good chance they are penalized. Take a manual look at a few to make sure nothing else is going on (e.g., perhaps they just moved to a new domain, or there's an error in reporting). Add all the N/A sites to your file.
  • Look for backlinks from link or article directories. These are fairly easy to identify. LinkDetox will categorize those automatically and allow you to filter them out. Scan each of these to make sure you don't throw out the baby with the bathwater, as perhaps a few of these might be healthy.
  • Identify links from sites that may be virus infected or have malware. These are identified as Toxic 2 in LinkDetox.
  • Look for paid links. Google has long been at war with link buying and it's an obvious target. Find any links that have been paid and add them to the list. You can find these by sorting the results by PageRank descending. Evaluate all the high PR links as those are likely the ones that were purchased. Look at each and every one of the high quality links to assess how they were acquired. It's almost always pretty obvious if the link was organic or purchased.
  • Take the list of backlinks and run it through the Juice Tool to scan for other red flags. One of my favorite metrics to evaluate is TitleRank. Generally, pages that aren't ranking for their exact match title have a good chance of having a functional penalty or not having enough authority. In the Juice report, you can see the exact title to determine if it's a valid title (for example, if the title is "Home", of course they won't rank for it, whether or not they have a penalty). If the TitleRank is 30+, review that link with a quick check, and if the site looks spammy, add it to your "Bad Links" file. Do a quick scan for other factors, such as PageRank and Domain Authority, to see if anything else seems out of place.
By the end of this stage, you'll have a spreadsheet with the most harmful backlinks to a site.
Upload this disavow file to make sure the worst of your backlinks aren't harming your site. Then re-upload the same disavow file when performing further tests in Link Detox, since excluding these domains will affect your ratios.
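The workflow above can be sketched in a few lines of code. This is a minimal illustration only: the `url`/`status` fields are hypothetical (every tool exports differently), while the `domain:` lines and `#` comments are the actual syntax Google's disavow tool accepts.

```python
# Sketch: turn a backlink-audit export into a Google disavow file.
# The "url"/"status" fields are hypothetical; adjust for your tool's export.
from urllib.parse import urlparse

def build_disavow(rows, bad_statuses={"toxic", "not_indexed"}):
    """Collect referring domains flagged as harmful and emit disavow lines."""
    domains = set()
    for row in rows:
        if row["status"].strip().lower() in bad_statuses:
            host = urlparse(row["url"]).netloc.lower()
            if host.startswith("www."):
                host = host[4:]
            if host:
                domains.add(host)
    # Google's disavow format: optional "#" comment lines, then one
    # "domain:example.com" (or a bare URL) per line.
    lines = ["# Disavow file generated from backlink audit"]
    lines += [f"domain:{d}" for d in sorted(domains)]
    return "\n".join(lines)

rows = [
    {"url": "http://spamlinks.pl/credit-cards", "status": "toxic"},
    {"url": "http://www.deadsite.example/page", "status": "not_indexed"},
    {"url": "https://news.example.com/story",   "status": "healthy"},
]
print(build_disavow(rows))
```

Healthy links are left untouched; only the flagged domains end up in the file you upload to the disavow tool.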

Don't be a Victim of Negative SEO!

Negative SEO works; it's a very real threat to all webmasters. Why spend the time, money, and resources building high quality links and content assets when you can work your way to the top by penalizing your competitors?
There are many unethical people out there; don't let them cause you to lose your site's visibility. Add backlink audits and link profile protection as part of your monthly SEO tasks to keep your site's traffic safe. It's no longer optional.

To Be Continued...

At this point, we're still working on link removals, so there is nothing conclusive to report yet on a recovery. However, once the process is complete, I plan to write a follow-up post here on SEW to share additional learnings and insights from this case.

Google Penguin 2.0 Update is Live

Webmasters have been watching for Penguin 2.0 to hit the Google search results since Google's Distinguished Engineer Matt Cutts first announced in March that a next generation of Penguin was coming. Late Wednesday afternoon, Cutts officially announced on "This Week in Google" that Penguin 2.0 was rolling out.
"It's gonna have a pretty big impact on web spam," Cutts said on the show. "It's a brand new generation of algorithms. The previous iteration of Penguin would essentially only look at the home page of a site. The newer generation of Penguin goes much deeper and has a really big impact in certain small areas."
In a new blog post, Cutts added more details on Penguin 2.0, saying that the rollout is now complete and affects 2.3 percent of English-U.S. queries, and that it affects non-English queries as well. Cutts wrote:
We started rolling out the next generation of the Penguin webspam algorithm this afternoon (May 22, 2013), and the rollout is now complete. About 2.3% of English-US queries are affected to the degree that a regular user might notice. The change has also finished rolling out for other languages world-wide. The scope of Penguin varies by language, e.g. languages with more webspam will see more impact.
This is the fourth Penguin-related launch Google has done, but because this is an updated algorithm (not just a data refresh), we’ve been referring to this change as Penguin 2.0 internally. For more information on what SEOs should expect in the coming months, see the video that we recently released.
Webmasters first got a hint that the next generation of Penguin was imminent when back on May 10 Cutts said on Twitter, “we do expect to roll out Penguin 2.0 (next generation of Penguin) sometime in the next few weeks though.”
[Screenshot: Matt Cutts tweet about Google Penguin]
Then in a Google Webmaster Help video, Cutts went into more detail on what Penguin 2.0 would bring, along with what new changes webmasters can expect over the coming months with regards to Google search results.
He detailed that the new Penguin would specifically target black hat spam, but would have a significantly larger impact on spam than the original Penguin and subsequent Penguin updates have had.
Google's initial Penguin update rolled out in April 2012, and was followed by two data refreshes of the algorithm last year, in May and October.
Twitter is full of people commenting on the new Penguin 2.0, and there should be more information in the coming hours and days as webmasters compare SERPs that have been affected and what kinds of spam specifically got targeted by this new update.
Let us know if you've seen any significant changes, or if the update has helped or hurt your traffic/rankings in the comments.
UPDATE: Google has set up a Penguin Spam Report form.

Google Penguin Tightens the Noose on Manipulative Link Profiles [Report]

Portent, a Seattle-based Internet marketing agency, has released a report offering new insight into Google’s Penguin algorithm. The report, based on primary data gathered by the agency, suggested that Google has been “applying a stricter standard over time.”
In part, the report reads:

In the initial Penguin update, the only sites we saw penalized had link profiles comprised of more than 80 percent manipulative links. Within two months, Google lowered the bar to 65 percent. Then in October 2012, the net got much wider. Google began automatically and manually penalizing sites with 50 percent manipulative links.
Although the report refers to Penguin as a penalty, Penguin isn't a penalty. A penalty is a manual action taken against a site.
Yes, the Penguin update has demoted the rankings of sites, but as Google's Distinguished Engineer Matt Cutts has explained, Penguin is an algorithmic change, not a penalty. We explain this more in "Google Penalty or Algorithm Change: Dealing With Lost Traffic."
If Portent's findings are correct, then Google is likely becoming more confident with the accuracy of its Penguin algorithm in terms of minimizing false positives.
What does this mean for webmasters and SEO professionals? Continue to diligently clean up your inbound link profile.
Identify bad inbound links, then remove them or disavow them. Google’s next iteration of Penguin could lower the tolerance level for spammy inbound links even further; this might even be what Matt Cutts was referring to when he stated at this year’s SXSW that the next Penguin release would be significant and one of the more talked about Google algorithm updates this year.

Google Penguin, the Second (Major) Coming: How to Prepare

Unless you've had your head under a rock, you've undoubtedly heard the rumblings of a coming Google Penguin update of significant proportions.
To paraphrase Google’s web spam lead Matt Cutts: the algorithm filter has "iterated" to date, but there is a "next generation" coming that will have a major impact on SERPs.
Having watched the initial rollout take many by surprise, it makes sense this time to at least attempt to prepare for what may be lurking around the corner.

Google Penguin: What We Know So Far

We know that Penguin is purely a link quality filter that sits on top of the core algorithm, runs sporadically (the last official update was in October 2012), and is designed to take out sites that use manipulative techniques to improve search visibility.
And while there have been many examples of this being badly executed, with lots of site owners and SEO professionals complaining of injustice, it is clear that web spam engineers have collected a lot of information over recent months and have improved results in many verticals.
That means Google's team is now on top of the existing data pile and its test output, and as a result is hungry for another major structural change to the way the filter works.
We also know that months of manual resubmissions and disavows have helped the Silicon Valley giant collect an unprecedented amount of data about the "bad neighborhoods" of links that had powered rankings until very recently, for thousands of high profile sites.
They have even been involved in specific and high profile web spam actions against sites like Interflora, working closely with internal teams to understand where links came from and watch closely as they were removed.
In short, Google’s new data pot makes most big data projects look like a school register! All the signs therefore point towards something much more intelligent and all-encompassing.
The question is how can you profile your links and understand the probability of being impacted as a result when Penguin hits within the next few weeks or months?
Let’s look at several evidence-based theories.

The Link Graph – Bad Neighborhoods

Google knows a lot about what bad links look like now. They know where a lot of them live and they also understand their DNA.
And once they start looking it becomes pretty easy to spot the links muddying the waters.
The link graph is a kind of network graph and is made up of a series of "nodes" or clusters. Clusters form around IPs and as a result it becomes relatively easy to start to build a picture of ownership, or association. An illustrative example of this can be seen below:
[Illustration: link graph clusters ("nodes") forming around IPs]
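As a toy model of how such clusters can be found (purely illustrative, and in no way Google's actual method), consider grouping linking domains that share hosting IPs into connected components:

```python
# Toy model: cluster domains by shared hosting IP to sketch how
# "neighborhoods" emerge in a link graph. Purely illustrative; Google's
# actual clustering is far more sophisticated than IP overlap.
from collections import defaultdict

def cluster_by_ip(domain_ips):
    """Group domains into clusters connected via shared IPs."""
    ip_to_domains = defaultdict(set)
    for domain, ips in domain_ips.items():
        for ip in ips:
            ip_to_domains[ip].add(domain)
    # Union-find over domains that share any IP (connected components).
    parent = {d: d for d in domain_ips}
    def find(d):
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path halving
            d = parent[d]
        return d
    def union(a, b):
        parent[find(a)] = find(b)
    for domains in ip_to_domains.values():
        domains = list(domains)
        for d in domains[1:]:
            union(domains[0], d)
    clusters = defaultdict(set)
    for d in domain_ips:
        clusters[find(d)].add(d)
    return [sorted(c) for c in clusters.values()]

# Hypothetical domains: three blog-farm sites overlap on two IPs,
# while an unrelated news site sits on its own host.
sites = {
    "blogfarm1.example": {"10.0.0.1"},
    "blogfarm2.example": {"10.0.0.1", "10.0.0.2"},
    "blogfarm3.example": {"10.0.0.2"},
    "news.example":      {"192.0.2.7"},
}
print(cluster_by_ip(sites))
```

The three blog-farm domains fall into a single cluster, the kind of "neighborhood" that could be devalued wholesale, while the independent site stands alone.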
Google assigns weight or authority to links using its own PageRank currency, but like any currency it is limited and that means that we all have to work hard to earn it from sites that have, over time, built up enough to go around.
This means that almost all sites that use "manipulative" authority to rank higher will be getting it from an area or areas of the link graph associated with other sites doing the same. PageRank isn't limitless.
These "bad neighborhoods" can be "extracted" by Google, analyzed and dumped relatively easily to leave a graph that looks a little like this:
[Illustration: link graph with bad neighborhoods extracted]
They won’t disappear, but Google will devalue them and remove them from the PageRank picture, rendering them useless.
Expect this process to accelerate now that the search giant has so much data on "spammy links", with swathes of link profiles getting knocked out overnight.
The concern of course is that there will be collateral damage, but with any currency rebalancing, which is really what this process is, there will be winners and losers.

Link Velocity

Another area of interest at present is the rate at which sites acquire links. In recent months there has definitely been a noticeable change in how new links are being treated. While this is very much theory, my view is that Google has become very good at spotting link velocity "spikes", and anything out of the ordinary is immediately devalued.
Whether this devaluation is indefinite or limited by time (in the same way the "sandbox" works) I am not sure, but there are definite correlations between sites that earn links consistently and good ranking increases. Those that earn lots of links quickly do not get the same relative effect.
It would also be relatively straightforward to move this into the Penguin model, if it isn't there already. The chart below shows an example of a "bumpy" link acquisition profile; anything above the "normalized" line could be devalued.
[Chart: "bumpy" link acquisition profile; links above the "normalized" line could be devalued]
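A crude version of this spike detection is easy to sketch. The three-month window and 2x threshold below are arbitrary assumptions for illustration, not anything Google has published:

```python
# Sketch: flag link-velocity "spikes" -- months where new-link counts jump
# far above a trailing baseline. Window and factor are illustrative guesses.
def flag_spikes(monthly_counts, window=3, factor=2.0):
    """Return indices of months whose count exceeds `factor` times
    the average of the previous `window` months."""
    spikes = []
    for i in range(window, len(monthly_counts)):
        baseline = sum(monthly_counts[i - window:i]) / window
        if baseline > 0 and monthly_counts[i] > factor * baseline:
            spikes.append(i)
    return spikes

# A "bumpy" acquisition profile: steady ~100 links/month, then a burst.
counts = [95, 110, 100, 105, 480, 120, 100]
print(flag_spikes(counts))  # the 480-link burst (index 4) is flagged
```

A site earning links at a steady rate produces no flags at all; only the burst months stand out, which matches the observation that consistent link earners fare better.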

Link Trust

The "trust" of a link is also something of interest to Google. Quality is one thing (how much juice the link carries), but trust is entirely another thing.
Majestic SEO has captured this reality best with the launch of its new Citation and Trust flow metrics to help identify untrusted links.
How is trust measured? In simple terms it is about good and bad neighborhoods again.
In my view Google uses its Hilltop algorithm, which identifies so-called "expert documents" (websites) across the web that are seen as shining beacons of trust and delight! The closer your site is to those documents, the better the neighborhood. It’s a little like living on the "right" road.
If your link profile contains a good proportion of links from trusted sites then that will act as a "shield" from future updates and allow some slack for other links that are less trustworthy.

Social Signals

Many SEO pros believe that social signals will play a more significant role in the next iteration of Penguin.
While social authority, as it is becoming known, makes a lot of sense in some markets, it also has limitations. Many verticals see little to no social interaction and without big pots of social data a system that qualifies link quality by the number of social shares across site or piece of content can't work effectively.
In the digital marketing industry it would work like a dream, but for others it is a non-starter, for now. Google+ is Google’s attempt to fill that void; by pushing as many people as possible to work logged in, Google is moving everyone closer to Plus and to handing over that missing data.
In principle it is possible though that social sharing and other signals may well be used in a small way to qualify link quality.

Anchor Text

Most SEO professionals will point to anchor text as the key telltale metric when it comes to identifying spammy link profiles. The first Penguin rollout would undoubtedly have used this data to begin drilling down into link quality.
I asked a few prominent SEO professionals their opinions on what the key indicator of spam was in researching this post and almost all pointed to anchor text.
“When I look for spam the first place I look is around exact match anchor text from websites with a DA (domain authority) of 30 or less," said Distilled’s John Doherty. "That’s where most of it is hiding.”
His thoughts were backed up by Zazzle’s own head of search Adam Mason.
“Undoubtedly low value websites linking back with commercial anchors will be under scrutiny and I also always look closely at link trust,” Mason said.
The key is the relationship between branded and non-branded anchor text. Any natural profile would be heavily led by branded anchors (e.g., the brand name or bare domain) and "white noise" anchors (e.g., "click here", "website", etc.).
The allowable percentage is tightening. A recent study by Portent found that the percentage of "allowable" spammy links has been reducing for months now, standing at around 80 percent pre-Penguin and 50 percent by the end of last year. The same is true of exact match anchor text ratios.
Expect this to tighten even more as Google’s understanding of what natural "looks like" improves.
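As a rough way to check where your own profile stands, anchors can be bucketed into branded, "white noise", and commercial groups. The brand terms and noise list below are hypothetical examples; a real audit would lean on a tool's categorization:

```python
# Sketch: classify anchor text as branded / "white noise" / commercial and
# compute the share of commercial (potentially spammy) anchors.
# The brand terms and noise list are hypothetical examples.
def anchor_ratio(anchors, brand_terms,
                 noise={"click here", "website", "here", "link"}):
    commercial = 0
    for a in anchors:
        text = a.strip().lower()
        if text in noise:
            continue  # "white noise" anchors are natural
        if any(term in text for term in brand_terms):
            continue  # branded anchors are natural
        commercial += 1  # anything left is treated as commercial/exact-match
    return commercial / len(anchors)

anchors = ["Acme Widgets", "acmewidgets.com", "click here",
           "cheap credit cards", "best credit cards online"]
ratio = anchor_ratio(anchors, brand_terms={"acme"})
print(f"{ratio:.0%} commercial anchors")
```

A profile dominated by commercial, exact-match anchors is exactly the pattern this section warns about; the lower this ratio, the more "natural" the profile looks.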

Relevancy

One area that will certainly be under the microscope as Google looks to improve its semantic understanding is relevancy. As it builds up a picture of relevant associations that data can be used to assign more weight to relevant links. Penguin will certainly be targeting links with no relevance in future.

Traffic Metrics

While traffic metrics probably fall more under Panda than Penguin, the lines between the two are increasingly blurring to a point where the two will shortly become indistinguishable. Panda has already been subsumed into the core algorithm and Penguin will follow.
On that basis Google could well look at traffic metrics such as visits from links and the quality of those visits based on user data.

Takeaways

No one is in a position to accurately predict what the next coming will look like, but what we can be certain of is that Google will turn the knife a little more, making link building in its former sense a riskier tactic than ever. As numerous posts have pointed out in recent months, it is now about earning links by contributing and adding value via content.
If I were asked what my money was on, I would say we will see the allowable level of spam tightened still further, some attempt to begin measuring link authority by the neighborhood it comes from, and more weight on any associated social signals. The rate at which links are earned will also come under more scrutiny, and that means you should think about:
  • Understanding your link profile in much great detail. Tools and data from companies such as Majestic, Ahrefs, CognitiveSEO, and others will become more necessary to mitigate risk.
  • Where you link comes from not just what level of apparent "quality" it has. Link trust is now a key metric.
  • Increasing the use of brand and "white noise" anchor text to remove obvious exact and phrase match anchor text problems.
  • Looking for sites that receive a lot of social sharing relative to your niche and build those relationships.
  • Running back link checks on the site you get links from to ensure their equity isn’t coming from bad neighborhoods as that could pass to you.
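These checks can start simple. As a rough illustration (not any tool's actual API or threshold), here's a minimal Python sketch that computes the share of exact-match "money" anchors in a backlink export; the anchor list and keyword set below are made-up examples:

```python
from collections import Counter

def anchor_text_ratio(anchors, money_terms):
    """Rough share of backlink anchors that exactly match a money keyword.

    anchors: list of anchor-text strings exported from a backlink tool.
    money_terms: set of exact-match keywords you are worried about.
    """
    counts = Counter(a.strip().lower() for a in anchors)
    total = sum(counts.values())
    exact = sum(n for text, n in counts.items() if text in money_terms)
    return exact / total if total else 0.0

# Hypothetical export: 3 of 5 anchors are exact-match "cheap golf carts"
anchors = ["cheap golf carts", "Acme Carts", "cheap golf carts",
           "https://example.com", "cheap golf carts"]
print(anchor_text_ratio(anchors, {"cheap golf carts"}))  # 0.6
```

A 60 percent exact-match ratio like this would stand out badly against the tightening thresholds discussed above.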

Penguin 2.0 Forewarning: The Google Perspective on Links

First and foremost, I don't work for Google. This article represents my opinions, but my company has worked on helping large numbers of sites get Google penalties removed.
The hardest part of these projects is always to get the client to understand what constitutes a bad link. This starts at the very core of how they think about online marketing and search engine optimization (SEO).
There are many who argue that this problem is of Google's own making. They created a world in which abuse wasn't just possible, but, in the beginning, very easy. As some would say, Google was talking the talk, but not walking the walk.
Some people even got mad, yelling at Google that they couldn't follow its guidelines because doing so was like bringing a knife to a gunfight.
But, as Greg Boser said on stage at SMX Advanced in 2012, Google is now not just talking the talk, but they are walking the walk as well. Penguin and their various attacks on unnatural links have dramatically reshaped their ability to detect and act on link building practices they consider detrimental to their algorithms.
These will continue to see dramatic improvements. Penguin 2.0 is just around the corner, and I expect it to have a bigger impact than Penguin 1.0. So let's step back and discuss what Google wants a link to represent.

Links Must Be Citations

This is the core concept. Think of a professor's research paper, which lists the other papers referenced in creating it.
The professor cites (links to) only the papers most relevant and most important to their own work. You can't buy that placement, and it never occurred to researchers to try. This system was pure at its heart.
This notion is at the very core of the original PageRank thesis. Any deviation from it at all is a problem. In fact, here's what Google's Distinguished Engineer Matt Cutts said about it in my last interview with him, when I asked him if he felt the concept of link building was itself problematic:
It segments you into a mindset, and people get focused on the wrong things.
I always wondered why people who read the interview didn't pick up on that a lot more. There was a lot of buzz about the comments he made on infographics and boilerplate content on web sites, but nothing on this comment, which I thought was the most telling statement in the entire interview.
Later on, when I asked him about how publishers can help themselves he said:
By doing things that help build your own reputation, you are focusing on the right types of activity. Those are the signals we want to find and value the most anyway.
With this in mind, let's look at four link building practices that are still common today:
Infographics
This was also featured in my interview with Cutts. The biggest problem these face is that the sex appeal of the infographic is so high that many publishing sites don't care what they need to do to get it.
On top of that, many infographics are inaccurate or topically unrelated to the page receiving the link. Even without these problems, it is likely that the great majority of people republishing infographics aren't thoughtfully endorsing the page they end up linking to.
Including rich anchor text links inside a guest post
If the New York Times accepted a guest post from you, what are the chances that they would let you load rich anchor text links inside your post back to the blatant money-making page on your site? Not a chance.
So when Google sees these rich anchor text links inside a guest post, it is a clear signal of a lack of editorial standards at the site publishing the content. This could even hurt the publisher of the content. Note that rich anchor text to other content that is a source is a very different matter.
Guest posts that are only loosely related to the topic of the page receiving a link
Let's say you run a business selling golf carts. So you write a decent article on the best golf courses in Bermuda. You don't put rich anchor text in the body of the article, but in the attribution at the bottom you include a link with the anchor text "premium golf carts" to your site.
As before, these are links where the citation value is weak, and the editorial standards of the site are questionable. A link like this smells more like "payment" than a legitimate endorsement.
Award Badges
This is an oldie but (not) goodie that is sadly still being promoted by a few companies. What really makes these programs trivial for Google or Bing to flag is when the award badges appear only on the less authoritative sites of a market segment. Like waving a red flag at the bull in the bullfighting ring, you're going to attract some attention!
These are just four examples. Note that I did not even bother with sites that have lots of footer links, lots of links on the right rail of pages, links from foreign language sites, links from markets where you don't sell/promote your stuff, etc. Those things are already being actively attacked by Google.
There are many other examples similar to the four above that can be constructed with a little thought – add yours to the comments if you like! Or, send some examples and I will give you my opinion.

Some Closing Questions For Qualifying Your Links

These questions were included in my recent article on 10 Common Link Building Problems, but I am going to expand upon them here:
1. Would you build the link if Google and Bing did not exist?
Any good link is something that has value even without search engines. Treat this question seriously, as it mirrors the behavior that Google and Bing want to see.
For example, would you spend 2 hours of a marketing person's time and $200 in expenses writing an article for Nameless Blog 23 just because they let you put that rich anchor text link in the article?
Oh, right. Without search engines that rich anchor text notion might not even be in your vocabulary.
2. If you have 2 minutes with a customer, and the law required that you show a random sampling of your links to customer prospects, would you happily show the link to a target customer? Or would it embarrass you?
This supports the notion that every link should be brand building in nature. A nice variant of this question is - would you proudly show it to your children?
3. Did the person giving you the link intend it as a genuine endorsement?
If not, Google wants to torch it, and so should you. This relates to the infographics and badge examples above, for sure, but it also relates to the blog examples. As soon as proper attribution slips into a model that looks a bit like "payment" you are no longer looking at a citation.
4. Do you have to make an argument to justify that it's a good link?
This is my favorite one. No argument is required with good links; when you see one, you know it right away. Sometimes I simplify this statement (for fun) by saying, "If you have to argue it is, it isn't."

Summary

If you've been building links that would be exposed by these questions, the best thing you can do is start getting in front of it now. Don't wait for Penguin 2.0, or for the next wave of link warnings and penalties, to come out. Start getting your business on a sound long-term footing now.
Start actively building unimpeachable links, and start working on eliminating the bad ones. I am not saying that you need to stand on the rooftops and yell out "hey Google I sinned come punish me", but you can begin asking sites that are the source of dangerous links to remove them.
I know that this is a hard decision to make. You have revenue, and you're paying people salaries, etc. See if you can find a way to add the good ones fast enough to make up for removals and keep your business moving forward.
And, for the record and disclaimer purposes, your mileage may vary. I can't project the exact best strategy for sites I have not even looked at, but I do know that Google is making large investments in fighting bad links, and I do know that Cutts let us know a new Penguin update is coming, one that he referred to as a "major update".
The last time Cutts made a similar statement, at SXSW last year, we got Penguin 1.0. You can trust that this new update is coming. You can also trust that it is not the last thing that Google will do.
Even if you get by this next update, learn to truly appreciate the meaning of a true citation and adjust your marketing strategy accordingly.

Google Penguin 2013: How to Evolve Link Building into Real SEO

Google has just rolled out Penguin 2.0, a large algorithmic update promising to go "deeper" than the 2012 Penguin release, which put a hurting on websites with a high number of manipulative links in their profiles.
This prospect creates fear for many small businesses who depend on search engine optimization (SEO) for their livelihoods. But there is also a sense of confusion, as the line often shifts and the message from Google is frequently contradictory.

Sorting out Panda, Penguin, and Manual Actions

Google's Panda update is a different release than Penguin. Panda is geared toward duplicative, thin, or spun content on websites.
Google's Distinguished Engineer Matt Cutts recently stated that Google is actually pulling back on Panda because of too many false positives. This is good for news aggregators and other sites that reuse content appropriately and have been hit hard by the Panda filter.
Penguin is much harder to understand, focusing on backlink patterns, anchor text, and manipulative linking tactics that provide little value to end users. To make matters worse, Google likes to take large manual actions just prior to major algorithm updates. In 2012 we saw the removal of BuildMyRank from the index just prior to Penguin.
Earlier this year we saw major manual action taken against advertorials. Last week Google announced the removal of thousands of link selling websites and we are hearing of a manual spam penalty against Sprint this week.
The proximity of these manual actions with major algorithmic updates is brilliant PR as it associates them together in our memories, discussions and debates - but they are very different things.

Is SEO Enough?

As small business owners move through the "here we go again" feelings to actually decide what to do in response to Penguin 2013, sorting out the truth is paramount. Google is clearly beating the familiar drum with the same core messages:
  1. Build a great website.
  2. Make awesome content with high end-user value.
  3. Visitors will magically appear.
But the reality is that visitors don’t magically come, at least on any reasonable scale, without organized promotional activities. Many excellent websites have died a slow death due to lack of promotion. And this is where the contradictions emerge in SEO, which has demonstrated extremely high ROI compared to other marketing channels.

Long Live Online Marketing

While discussed many times, webmasters still struggle with shifting their link building activities to real SEO strategy. They fail to see that SEO in 2013 is now integral to online marketing and no longer a standalone activity.
Whereas SEO used to be about tuning a website for optimal consumption by spiders, today’s SEO is about earning recognition, social spread, and backlinks through excellent content marketing. This means SEO is now ongoing, integrated, and strategic – whereas it used to be one-time, isolated, and technical.

Real SEO

Real SEO is the prescription for those who fear Penguin 2013. Here are practical activities that need to be done every month to achieve real SEO:
  • Continually Identify Audience Demand: Your SEO won't be successful if it isn't useful. To serve a need, webmasters must understand what the audience is seeking. Keyword research, as always, is critical. While doing keyword research don’t over-emphasize head terms or money keywords. Focusing on long-tail keywords renders more immediate results, increases the breadth of a website (remember Panda), and builds authority that will ultimately help the head term.
  • Content marketing: In my opinion, content marketing is the new link building. Earn recognition, social spread, and backlinks by giving away valuable information for free. Excellent content has high audience value and points readers to other resources via cocitation. Video is an excellent form of content marketing that is still under-utilized by small businesses. And newsjacking is an emerging form of content marketing that specifically targets hot news topics for viral spread.
  • Work on brand: There is increasing evidence that branded mentions are an important legitimacy signal to Google. Promoting the brand has traditional marketing benefits and also now helps SEO. But be careful not to turn SEO content marketing into an endorsement, as this crosses the line. Find traditional marketing tactics, such as press releases, to drive branding while announcing news-worthy events.
  • Syndicate: The "build it and they will come" philosophy doesn't work on an Internet with more than 500 million active domain names. This is why even excellent content needs to be promoted. Email marketing, social media, community engagement in forums, and guest blog posting are efficient mechanisms for spreading the word about engaging content. Interviews, PPC ads, and local event sponsorship will also get your name and content noticed. Any activity that broadcasts your message, your brand, and builds real community discussion will ultimately support SEO, and should be considered part of the SEO process.

Conclusions

The arrival of Penguin 2013 has many small business owners scared and confused. But SEO remains one of the best online marketing channels.
Real SEO is the path forward for those who wish to make a long-term investment in online marketing. Forward-looking webmasters can prepare their sites for Penguin 2014, 2015, and beyond with well-researched, end-user focused content marketing that provides strong audience value.
Using modern syndication tactics, they can broadcast their message, gain audience mind-share, and earn recognition. By spreading valuable content, small businesses can build their brands and earn bulletproof backlinks.

The Myth of Content Marketing, the New SEO & Penguin 2.0

"What Should Lead Your Online Marketing Strategy: SEO or Content". "Why Content Marketing is the New SEO". "Is Google's love affair with content marketing usurping SEO?" "Content Marketing is the New SEO".
These are actual titles from articles in the top search results for [content marketing and SEO].

Content Marketing Isn't New

CONTENT MARKETING! DO IT! It's the NEW SEO!
In fact, it's so awesome you don't need anything else! Just produce awesome content and you will be in SEO nirvana! It's like double rainbows and Matt Cutts got together and had baby NyanCats!
Old SEO is dead. This is the new SEO and it's beautiful!
Sound too good to be true? That's because it is.
Content marketing isn't new. It's just a new buzzword picked up by other industries that suddenly found out they could "do SEO", but didn't want to "do SEO", so they tried to make it sound more special. It isn't.
Content marketing has been around for as long as SEO on Google has been called SEO. To not understand this is to not understand what Google and its algorithms measure and how this might affect your site.
Now with the arrival of Penguin 2.0, you might be just setting yourself up for a fall – right out of the rankings. And yes despite all our talk of rankings not mattering, they do, because if you go from somewhere on Page 1 (with personalization) to nowhere on page 51, you will suddenly say, "Oh no! My rankings!"
Rankings matter. SEO matters. And content marketing is SEO. It always has been, and always will be – well, at least until the search engines don't use algorithms and content, but that's a long way off.
Need more proof of the power of content? Back in 2008, I ranked a website in the top 15 for a one-word term in a competitive vertical with no links, on a domain that was less than a year old, four weeks from launch, with 1,500 pages of unique, solid, quality content. Every word on the site was original, even the Contact Us page.
How do I know content was the reason for getting the site ranked in the top 15? Content! To be fair, I can only be 99 percent sure it was the content, thanks to a Google engineer at a party at an SES conference who confirmed it was "most likely the reason".
Like I said, the importance of unique, quality content isn't a new concept.

Just What is "Content Marketing"?

If you want the best literal explanation, this quote from Quora (found via Ann Smarty and Authority Labs) works very well:
"Content marketing is the umbrella of all techniques that are used to generate traffic, leads, online visibility, and brand awareness/fidelity."
If you want the one that really gets it, then this one from Sugar Rae says it best:
"Content marketing isn't a new strategy, it's merely a new word.
Why ... do we as an industry feel the need to invent a new buzzword for the same services every few years? We've been doing "content marketing" forever.
  • Website = content
  • Promotion of that website = marketing
Website + promotion of said website = content marketing."
And there you go. It is content that you put on your website and promote. That can be text, video, infographics, images – whatever you think of and put on your site. When you release it as part of your site's marketed materials, it is "content marketing".
It's really that simple. Again, it's not new, it's just a new buzzword.
Now that we have that straightened out, what does content marketing have to do with Penguin, SEO, future penalties, and you?
Content marketing is not the new SEO. It is SEO and so are a lot of other things.

It's All SEO Now

One client's site I recently reviewed was brilliant. The company had never bought a link, was completely legit, and worked feverishly on their content marketing – yet they had 16 warnings and penalties. Why? Because while content is great and certainly a very important part of any SEO strategy, it isn't all or even most of what you need to be concerned about when thinking about the algorithm.

Taking Your Eyes Off The Ball

So while you were spending all that time concentrating on your content marketing, what were you doing about the rest of the 200+ signals in the algorithm? What about the other things Penguin was meant to control?
How is your internal linking? Your anchor text either coming in or internally?
How about where your sites are linking externally? Where are you linking to and are you linking to other sites you own? (triangulation - crosslinking)
What about the other changes Google announced are coming this summer (which I will just term the "no one is home" penalties, for lack of a better term)? You know, like spam comments in your forums or blogs? Or your page speed and usability?
How about your page crawls? Sitemaps? Are you showing Google no one is at the helm while you spend all your time focused on cultivating the latest viral video or super infographic?
Starting to see the issue?
Content marketing isn't separate from SEO and isn't the new SEO. It doesn't replace SEO. It is SEO just like all the other items mentioned are SEO.
SEO isn't just search engine optimization anymore. It is, as Cutts suggested a little while back, search experience optimization and it covers everything on the website, either directly or relationally.
Once you realize that "content marketing" is just using good content practices and that you might have been neglecting the rest of your site SEO, what should you do?

12 Other Google Update Checks (Penguin Included)

1. Titles and Descriptions

Titles and descriptions remain among the most misunderstood items on any site, and they are still as important as ever. Know what each means and how to write it properly. Make sure you don't have duplicates, tags that are too long, or over-optimized tags.
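A quick way to start: export your URLs and titles from a crawler and flag duplicates and overlong tags. A minimal sketch of that check follows; the 60-character cutoff is a common rule of thumb, not an official Google limit, and the sample pages are made up:

```python
from collections import defaultdict

MAX_TITLE_LEN = 60  # rule-of-thumb cutoff; search results truncate around here

def audit_titles(pages):
    """pages: dict of {url: title}. Returns (duplicates, too_long)."""
    by_title = defaultdict(list)
    for url, title in pages.items():
        by_title[title.strip().lower()].append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    too_long = [u for u, t in pages.items() if len(t) > MAX_TITLE_LEN]
    return duplicates, too_long

pages = {
    "/": "Acme Widgets - Home",
    "/about": "Acme Widgets - Home",  # duplicate of the homepage title
    "/blog": "A" * 80,                # far too long
}
dupes, long_titles = audit_titles(pages)
print(dupes)        # {'acme widgets - home': ['/', '/about']}
print(long_titles)  # ['/blog']
```

The same grouping idea works for meta descriptions; only the length threshold changes.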

2. Anchor Text

Is your anchor text over-optimized with keywords? Are you using keywords when domain names should be used? What is the natural way someone would link to your site? This counts with inbound links as well as internally. Beware of over-optimized and overused keyword anchor text.

3. Links – Inbound & Outbound

Run a link check. How do your inbound links look? The threshold for spammy links was about 80 percent; it is now down to about 50 percent. That means 50 percent questionable links can keep your site or a page out of the index.
Know your link profile.
With outbound links, make sure you are not sending link juice through ad links, but do still link offsite. Google doesn't like it when you hoard that link power all for yourself. Share with worthwhile sites, but never with ad links.

4. Links Cross or Triangulate

Sometimes, even by accident, sites crosslink to other sites they own or partner with while sitting on the same IP addresses or C-classes. Do you know if yours do? If they do, delink your sites or put rel=nofollow on those links, or Google may think you are attempting to build a link network of your own.
Remember, Google can't discern intent, so the appearance of impropriety is all it takes to earn yourself a penalty.

5. Page Speed

Google likes to say page speed is a small factor, and maybe for some industries this is the case, but in others our experience shows it isn't. This only makes sense: faster-loading sites lower the load on Google's end. So take the page speed tool, check your site, and get it above 90 percent if you can. That seems to be the magic threshold for most.

6. User-Generated Content Spam

User-generated content spam on your site is directly linked to a penalty now at Google. (Heard about Sprint's latest fiasco?)
It doesn't take a lot to indicate to Google "No One is Home" keeping an eye on things.
Make sure you have checked your blogs and comment areas for things like multiple http:// links or for words such as "free shipping", using a database crawler or a Google search like site:domain.com "words go here", and see: is someone scamming you?
Note: If the spammers are very good you may not be able to see it without a Google search.
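If you want to automate a first pass over exported comments, a crude heuristic like the sketch below can flag the obvious cases. The URL threshold and phrase list are illustrative guesses, not anything Google publishes:

```python
import re

# Illustrative phrase list; extend with whatever spam you actually see
SPAM_PHRASES = ("free shipping", "viagra", "payday loan")

def looks_spammy(comment):
    """Heuristic check for a single user comment.

    Flags comments that contain several URLs or a known spam phrase.
    """
    urls = re.findall(r"https?://", comment, flags=re.IGNORECASE)
    if len(urls) >= 2:
        return True
    lowered = comment.lower()
    return any(phrase in lowered for phrase in SPAM_PHRASES)

print(looks_spammy("Great post, thanks!"))                         # False
print(looks_spammy("FREE SHIPPING on pills http://spam.example"))  # True
```

Run something like this over your comment database periodically; anything it flags deserves a human look before deletion.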

7. Redirects

Get a tool like Screaming Frog and check your site pages for redirects, then make sure those redirected pages use a 301 permanent redirect, which tells Google the page has permanently moved and that it should follow the redirect to the new location.
It's rare you need a different type of page redirect and if you do, then remove the page from the index with a noindex tag in the header. (There are rare cases where this won't be the case, this is just the general rule.)
Also make sure you have your canonicals in place and that they are correct. This should go without saying, but not all sites do it.
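For spot-checks outside a crawler, a small script can report what status code a URL actually answers with. This is a sketch using only the Python standard library; the status classifications reflect the general rule above, not an exhaustive policy:

```python
import http.client
from urllib.parse import urlsplit

def redirect_status(url):
    """Request a URL without following redirects; return (status, location).

    Useful for verifying that moved pages answer 301 rather than 302.
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    conn.request("HEAD", parts.path or "/")
    resp = conn.getresponse()
    status, location = resp.status, resp.getheader("Location")
    conn.close()
    return status, location

def classify_redirect(status):
    """Quick verdict on a status code for a redirect audit."""
    if status == 301:
        return "permanent - OK for moved pages"
    if status in (302, 303, 307):
        return "temporary - consider a 301 instead"
    if status == 200:
        return "no redirect"
    return "check manually"

print(classify_redirect(301))  # permanent - OK for moved pages
print(classify_redirect(302))  # temporary - consider a 301 instead
```

Feed `redirect_status` the old URLs from your redirect map and classify each result; anything not answering 301 goes on the fix list.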

8. Over-Optimization on Non-Content Items

A common type of over-optimization happens in the navigation, the header or footer.
This is where someone either adds a keyword to every (or almost every) item to try to rank for the term, or adds an overabundance of header or footer links to "help" a site position for known keywords. This won't help and is likely to earn the site a penalty.

9. Alt Attributes

How are you using the alt attribute on your images? Don't stuff keywords into this text. Using good alt text, especially when images are replacing text in links, can be very good for a site. In fact, Google will treat this alt text as actual text in these cases.
Go to http://webaim.org to learn the rules for "alt text" content generation.

10. Ad Issues

Google doesn't like it when a site seems to only be there to support the ads on it, so an overabundance of above the fold ads can cause the site to receive a penalty.
What is too much? Google is a little obtuse about this, but find out what is above the fold for your site's screen size (not your own screen), then hold up a Post-it note; if the ad area takes up more space than the note, it is probably too large.

11. Crawl Issues

When is the last time you got into your Webmaster Tools and checked how your crawls were going? How is your crawl rate? Are the spiders having any crawl issues?
We once had a client who had 28k crawl errors. These will affect your site strength and authority with the "No One Is Home" devaluations. So keep an eye on your crawl rate and if it is not crawling well, find out why as quickly as possible and fix it!

12. Malware or Rogue Sites

For the most part, we're fortunate that Google will email you and tell you that you have malware on your site – but be careful: this isn't always the case. Periodically, search for your site and see if you trigger malware warnings in a site search or on mobile, then check your analytics to make sure no one is running anything untoward on your site, like, say, a rogue Viagra site. If you want to see how prevalent this is, go to Google search and put in ".gov" Viagra.
Not only can these intruders be doing things on your site that cause "hack" issues, but they may also be sending links to their pages on your site, damaging your link profile.

What Else?

This was just a partial list to get you started. We haven't touched authorship, structured data, URL construction, or a whole host of other things you should be doing; these are just some things you need to be checking. Hopefully you get the idea that myopic SEO is not SEO at all.
If you haven't been doing much more than content marketing, believing there was a "new SEO" and the "old SEO" was dead, then my best advice, with the arrival of Penguin 2.0 and several other changes still on the horizon, is to conduct a site audit.
This is going to be the summer of change on Google, and this article has only touched on some of the items known to be part of the Penguin and Panda algorithms and the coming attractions. Don't get caught with your proverbial pants down, wondering, what happened?
With SEO proactive is always better than reactive, because only a small percentage of sites hit by the first Penguin have ever fully recovered. If (or when) your site gets hit, sometimes all you can do is start again.

Tuesday, August 20, 2013

How to Train a Link Builder From Scratch

Over the past few years I've helped train more than 60 link builders, almost all of whom didn't know the first thing about building links. Most of them had very little, if any, SEO knowledge either. That was our intention though, as we have a specific way of doing things.
Although having more experienced people could provide some great new ideas and proven tactics, sometimes training people from scratch is better because they have no preconceived notions.
From what I've learned with my link builders, there are a few signs that someone will be fantastic at the job:

  • Excellent written and interpersonal communication skills.
  • Curiosity about anything and everything.
  • Awesome organizational skills.
  • Creative mind.
  • A really, really strong work ethic.
  • The ability to understand why we build links.
  • Willingness to try new techniques and listen to feedback.
The things that are red flags to me? A person who is easily frustrated and complains early on that link building is too hard.

Link building is hard. It's one of the most tedious things I've ever done in my SEO career and if you're going to get upset and give up easily, I don't see you having a big future in link building.

Why Build Links?

While I could argue that we could train a link builder to be successful without fully explaining much about why people build links, it's ridiculous not to give your link builders the knowledge of why links matter. The more they know, the better they'll perform.
However, in a few cases, some link builders became so overwhelmed thinking about the potential SEO benefits and implications of what they were doing that, while they did build some great links, they overthought things to the point where they weren't as efficient as they needed to be.

Still, I don't enjoy doing something when I'm not told why I should be doing it. Giving your link builders the respect of explaining why their efforts matter is critical, especially if you want them to enjoy what they do.

How to Look at a Site's Backlinks

There are a lot of free link check tools, and some paid ones have free versions or trials. Have your link builders find one that they like and that fits your budget.
Google and Bing also report inbound links in their respective Webmaster Tools consoles, but they don't show nearly as many as a proper link tool will.
Link builders need to be trained on how to run at least a basic report so they can see a site's backlinks, look at the anchor text and metrics, etc.
Link builders need to know how to distinguish between number of links total and unique linking domains, too. That has been an area of confusion both to clients and link builders starting out, so take a look at where you can find that information in the tools that I list below.
My usual roster of backlink tools include the following:
Ahrefs
Majestic SEO
Open Site Explorer
Notice how the counts are different from tool to tool? That's because each tool uses its own database.
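The total-links vs. unique-domains distinction is easy to demonstrate. Assuming you've exported the source URLs of your backlinks from one of these tools, a few lines can produce both counts (the URLs below are made up):

```python
from urllib.parse import urlsplit

def link_counts(backlink_urls):
    """Return (total_links, unique_linking_domains).

    backlink_urls: source URLs of pages linking to you, as exported
    from a backlink tool such as Ahrefs or Majestic.
    """
    domains = set()
    for url in backlink_urls:
        host = urlsplit(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.com and example.com as one
        domains.add(host)
    return len(backlink_urls), len(domains)

links = [
    "https://www.example.com/post-1",
    "https://example.com/post-2",     # same domain as the first link
    "https://blog.other.net/review",
]
print(link_counts(links))  # (3, 2)
```

Three total links, but only two linking domains – exactly the distinction new link builders (and clients) tend to miss.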

How to Find Contact Information

There are some cases where a new link builder will find a great site but it's almost impossible to find the contact information. When that happens, here's what to do:
  1. Search the web for the site's email pattern. For example, you could search for "@example.com".
  2. Search within the site itself: site:example.com "@example.com" often brings up an address like info@example.com in the results.
  3. Check the About page or Contact page. If there is not an email address listed, view the source code and search for "@url.com" there.
  4. Check whois. This will usually list the email address of the person who registered the domain.
  5. Search the site for "email me", "my address", "email address", etc. (e.g., "email me" site:example.com)
  6. (last resort)
    • Ping the site to get its IP address.
    • Go to http://webhosting.info/ and enter the IP address into the search box, choosing IP Address. Note: do not use the top box, which is for searching the web. Hit the tiny Go button.
    • You'll see a list of other websites hosted on the same IP. Click on a few to see if you can find contact info, or if any look related. It won't work in many cases, but if you're dealing with a casino site, can't find info, and do this… if you see six other casino sites listed, maybe one of them will be your guy.
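Steps 1-3 can be partly automated. This sketch pulls addresses out of raw page source with a simple regex; the regex is deliberately loose and the sample HTML is a made-up example:

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def find_emails(page_source, domain=None):
    """Pull email addresses out of raw page source.

    Contact details are often hidden in the HTML (e.g. mailto: links)
    rather than the rendered page. Optionally keep only addresses at
    the target domain.
    """
    emails = set(EMAIL_RE.findall(page_source))
    if domain:
        emails = {e for e in emails if e.lower().endswith("@" + domain.lower())}
    return sorted(emails)

html = '<a href="mailto:info@example.com">Contact</a> webmaster@example.com'
print(find_emails(html, "example.com"))
# ['info@example.com', 'webmaster@example.com']
```

Fetch the About/Contact page source first, then run it through a function like this before falling back to whois or the reverse-IP trick.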

How to Write Code for a Link

Sounds simple but you'd be amazed at how many webmasters don't know how to write code for a link. Make sure link builders know how to write code for both text and image links.
  • Text link code example: <a href="http://searchenginewatch.com/article/2285837/Broken-Link-Building-How-to-Find-Thousands-of-Broken-Link-Opportunities-at-a-Time" title="Broken Link Building: How to Find Thousands of Broken Link Opportunities at a Time">Broken Link Building: How to Find Thousands of Broken Link Opportunities at a Time</a>
  • Image link code example: <a title="Broken Link Building: How to Find Thousands of Broken Link Opportunities at a Time" class="ukn-article-image" href="http://searchenginewatch.com/article/2285837/Broken-Link-Building-How-to-Find-Thousands-of-Broken-Link-Opportunities-at-a-Time"><img src="http://cms.searchenginewatch.com/IMG/525/266525/broken-links-320x198.jpg" alt=""/></a>

How to Check Code

Following on the heels of knowing how to write very basic code, it's critical to know how to check code for times when a webmaster tries to give you a link and blows up the page. (Yes, it happens. Quotes are left out, usually.) It's also important to know how to do the following:
  • Read a robots.txt file (and make sure the page your link is on won't be blocked). A robots syntax checker is always handy, but link builders should be able to read these files and understand what they mean. There are tools that can create the correct robots.txt file for you, so if you're new to this they're pretty useful; I'd recommend http://www.mcanerin.com/EN/search-engine/robots-txt.asp. A very useful summary of robots.txt is http://www.robotstxt.org/robotstxt.html, which gives you the kiss of death for robots:
    User-agent: *
    Disallow: /
    This code will block every bot from the entire site, and I've seen it left over from code changes often enough that it's one of the first things I look for when anything seems to be going wrong.
  • Check for nofollows (in the code…not just through a nofollow plugin), which will look something like <a href="http://www.cnn.com/" target="_blank" rel="nofollow">CNN</a>.
  • Look up anything else in the code that might be important: nowadays you'll see all sorts of code in links, most of it harmless. It's just important to know what it all means so that you don't needlessly hassle webmasters to make changes. If you see something you don't recognize, slap it into a search engine and see what it does. One exceptionally good reference is http://www.w3schools.com/.
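Both of those checks are easy to script with Python's standard library. Here's a minimal sketch; the robots.txt rules and the CNN link are just the examples from this section fed in as strings, not live fetches:

```python
import urllib.robotparser
from html.parser import HTMLParser

# --- robots.txt: will the page your link is on be crawlable? ---
rp = urllib.robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /private/"])
print(rp.can_fetch("*", "http://example.com/links.html"))      # True
print(rp.can_fetch("*", "http://example.com/private/a.html"))  # False

# --- nofollow: collect every link that won't pass value ---
class NofollowFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "nofollow" in (attrs.get("rel") or "").lower():
            self.nofollowed.append(attrs.get("href"))

finder = NofollowFinder()
finder.feed('<a href="http://www.cnn.com/" target="_blank" rel="nofollow">CNN</a>')
print(finder.nofollowed)  # ['http://www.cnn.com/']
```

In practice you'd point RobotFileParser.set_url() at the site's live robots.txt and call read() instead of parsing a hard-coded string.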

Metrics

How to Evaluate a Site

  • Whatever metrics you think are important, you need to train a link builder on where to look for them and what they mean. Ahrefs has a URL Rank and an Ahrefs Domain Rank, both of which will appear when you run a basic link report. Open Site Explorer has Domain Authority and Page Authority, both of which also appear when you run a basic link report. Majestic uses Citation Flow and Trust Flow metrics, and as with the others, they're shown for any basic report. You can see where to find these metrics in the images listed in the backlink tool section a few sections up.
  • Quick ways to see if a site/page has problems: Always, always, always check that the site you want to work with is indexed in Google by doing a simple site:url.com search. If it isn't, a link builder should be taught that this is a sign of a serious problem, and he or she needs to know how to diagnose what that problem is. Is the site blocked in the robots.txt file, perhaps accidentally? If not, and the site still isn't in Google's index, that's an indication the site has been penalized in some way. Link builders should also be taught that a few other things can indicate the need for further digging:
  • Toolbar PR of site seems too high or low for the age of the site and its backlink profile. If you see a site with a homepage TBPR of 5 and you can only find 2 backlinks, something's not right. If you see a site that has 10k backlinks, is 6 years old, and has a TBPR of 0, again, something's off. Since TBPR is updated infrequently, this could be a sign that the site you're looking at has recently been penalized but the TBPR has not been updated yet.
  • Site doesn't load properly or reliably. Sites that take ages to load can see less frequent bot visits which causes their content to take longer to get indexed. 
  • The SERP snippet shown is full of irrelevant or spammy content that doesn't match the site. This can be a sign that the site has been hacked or injected with some form of malware. Just search for something like "cialis online" and you should see what I mean, but here's an example:
[Screenshot: a hacked site's SERP snippet stuffed with pharma spam]

How to Conduct Discovery

Finding linking partners can be the best and worst part of link building. It's simple enough to type a related keyword into Google and dig through the resulting SERPs, but that's a very inefficient method. While you will hopefully find relevant sites, you may run into the issue of finding the exact same sites that everyone else in your niche is contacting, lessening your chances of getting a good link.
Discovery can be conducted differently depending upon your goal, too:
Broken link building
The goal of this is to find sites that link to 404 pages on other sites in your niche, contact them, point them to your own site, and get a link for your trouble. It can be very time-consuming, but if you find a few great sites that have outdated links and you can convert those links to point at your own site, you can score some major link power.
Garrett French wrote a great piece on broken link building, so I'd suggest reading that and familiarizing your link builders with the method. The hardest part, of course, is finding sites that do have outdated links, but you can use some of the tips in the Resource lists section below to help you get started.
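The grunt work of checking each outbound link on a prospect's page can be scripted. Here's a rough standard-library Python sketch; the URLs at the bottom are placeholders, and a real run would want politeness delays between requests:

```python
import urllib.request
import urllib.error

def classify(status: int) -> str:
    """Bucket an HTTP status code for link-prospecting triage."""
    if status >= 400:
        return "dead"      # 404, 410, 500...: a broken-link opportunity
    if status >= 300:
        return "redirect"  # worth checking where it finally lands
    return "ok"

def check(url: str, timeout: float = 5.0) -> str:
    """Request only the headers, then classify the link."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return classify(resp.status)
    except urllib.error.HTTPError as e:
        return classify(e.code)
    except (urllib.error.URLError, OSError):
        return "dead"      # DNS failure, timeout, connection refused

if __name__ == "__main__":
    # Placeholder candidates scraped from a prospect's resource page.
    for url in ("http://example.com/", "http://example.com/long-gone"):
        print(url, check(url))
```

Every URL that comes back "dead" is a page you can pitch a replacement link for.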
Resource lists
These can be very spammy so be careful and make sure that a link builder understands the difference between a good resource list and a bad one. A bad one will generally have links to irrelevant and unrelated sites and will seemingly exist for no purpose other than to host links. A good one will generally have its own backlinks and will rank well and have a decent Toolbar PageRank.
A lot of good resources will be on .edu sites but you'll find some great ones on any TLD so don't ignore the rest of them. Searches like this should help a link builder find good resource lists:
    inurl:.edu medicine resources
    inurl:.edu medicine "resource list"
    medicine resources
Link builders should be trained on how to ask to be included in a list like this…politely. Garrett French (yes again) also wrote a piece about resource pages a couple of years ago so I'd suggest bookmarking that one.
Directory submissions
These can also be spammy and potentially dangerous as some free directories will give a listing to anyone at all. Avoid those (and make sure the directory is actually indexed in Google.) Generally if there's no review process, I'd avoid it.
Guest blogging potentials
While the future of this is uncertain, link builders need to know how to look for quality guest posting opportunities, write an outreach letter, and compose a guest post that will get published. Since guest posting is so popular right now, many sites openly advertise that they accept guest posts, so searching for terms like "guest post accepted" and "guest bloggers welcome" negates the need for intensive discovery.
Asking for a link/paying for a link
However you plan to get the link, it's important to train your link builders to use creative discovery and not just robotically type in one or two keywords, expecting to find amazing and relevant sites that are going to be thrilled to give them a link. (We actually had a link builder who would add curse words to his keyword searches because he found some amazing sites that way.)
Searching in search engines vs. other methods
The main difference in using search engines vs. social media is that you'll usually find new content faster through social, and depending upon who you're following and who they're following/retweeting, you'll see a lot of content that you might not have found in the search engines.
I'd suggest training link builders to use both methods for discovery and switch off, as that can help avoid burnout. There are some great social media tools that can help with this, and I especially like IceRocket as they have a good "Search All" functionality that gives you recent results from the blogosphere, Facebook, and Twitter.

How to Write a Great Outreach Email

There are some wonderful articles out there that give examples of effective outreach emails but the principles are the same:
  • Be nice.
  • Be engaging.
  • State your case without being too blunt or going on for 18 paragraphs.
Many popular sites get loads of emails. Link builders should be trained on how to make theirs stand out (and not get slapped into the spam folder.) For a great recent post on this topic, see Simon Penson's piece.

How to Create Great Content

This is so much harder than almost anything else. Great content isn't just about great writing, and you can spend ages on a post and have it go nowhere.
It's tough to learn what works for different industries and target markets. Unless you spend time trying to find out what works, you'll never have much success, so it's important that link builders spend lots of time not only writing, but also reading what other successful writers are producing.

How to Promote Content

I cannot find a better representation of the best way to socialize content than Rae Hoffman's, which lists the steps to take. I've referenced it in a presentation and in posts, and I truly think it's the best. If link builders are producing content, they need to know how to promote it, period.

Keeping Up With Link Building

This could end up being one of the most critical parts of a link builder's job. Make sure they read articles about link building and stay informed about the industry as much as possible, as link builders should never stop learning.
It's important to be open to new ideas and to give up old ones at times. Keeping up to date on what others are doing successfully, trying your own methods, and thinking concretely and creatively about the job all up the chance that a link builder will continue to enjoy the work.
A link builder who hates what he does is not a link builder that will do the best job. Most big industry sites have a link column but there are loads of good blogs that talk about link building regularly.
I'd also recommend that a link builder follow some of the industry leaders on Twitter as not everyone writes about links but you will pick up some awesome tips from Twitter. Here are a few Twitter lists to get you started:

Summary

Make sure that you fully bring out the creative potential in your link builders when you train them, and make a big effort to brainstorm and listen to them. I've been involved with links for years and I still learn from even my newest link builders.
Brainstorming sessions where management isn't present might lead to more creative ideas that otherwise might be dismissed. Some of those ideas will be impractical, if not fairly absurd, but with some effort just maybe they can be turned into workable ones.

Always make sure that you're open to questions and be willing to let them have a bit of freedom, no matter what you're doing.