Monday, September 2, 2013

Google Penguin 2.0 Update is Live

Webmasters have been watching for Penguin 2.0 to hit the Google search results since Google's Distinguished Engineer Matt Cutts first announced in March that a next generation of Penguin was coming. Cutts officially announced late Wednesday afternoon on "This Week in Google" that Penguin 2.0 was rolling out.
"It's gonna have a pretty big impact on web spam," Cutts said on the show. "It's a brand new generation of algorithms. The previous iteration of Penguin would essentially only look at the home page of a site. The newer generation of Penguin goes much deeper and has a really big impact in certain small areas."
In a new blog post, Cutts added more details on Penguin 2.0, saying that the rollout is now complete and affects 2.3 percent of English-U.S. queries, and that it affects non-English queries as well. Cutts wrote:
We started rolling out the next generation of the Penguin webspam algorithm this afternoon (May 22, 2013), and the rollout is now complete. About 2.3% of English-US queries are affected to the degree that a regular user might notice. The change has also finished rolling out for other languages world-wide. The scope of Penguin varies by language, e.g. languages with more webspam will see more impact.
This is the fourth Penguin-related launch Google has done, but because this is an updated algorithm (not just a data refresh), we’ve been referring to this change as Penguin 2.0 internally. For more information on what SEOs should expect in the coming months, see the video that we recently released.
Webmasters first got a hint that the next generation of Penguin was imminent when back on May 10 Cutts said on Twitter, “we do expect to roll out Penguin 2.0 (next generation of Penguin) sometime in the next few weeks though.”
Then in a Google Webmaster Help video, Cutts went into more detail on what Penguin 2.0 would bring, along with what new changes webmasters can expect over the coming months with regards to Google search results.
He detailed that the new Penguin was specifically going to target black hat spam, and would have a significantly larger impact on spam than the original Penguin and subsequent Penguin updates have had.
Google's initial Penguin update originally rolled out in April 2012, and was followed by two data refreshes of the algorithm last year – in May and October.
Twitter is full of people commenting on the new Penguin 2.0, and there should be more information in the coming hours and days as webmasters compare SERPs that have been affected and what kinds of spam specifically got targeted by this new update.
Let us know if you've seen any significant changes, or if the update has helped or hurt your traffic/rankings in the comments.
UPDATE: Google has set up a Penguin Spam Report form.

Google Penguin Tightens the Noose on Manipulative Link Profiles [Report]

Portent, a Seattle-based Internet marketing agency, has released a report offering new insight into Google’s Penguin algorithm. The report, based on primary data gathered by the agency, suggested that Google has been “applying a stricter standard over time.”
In part, the report reads:

In the initial Penguin update, the only sites we saw penalized had link profiles comprised of more than 80 percent manipulative links. Within two months, Google lowered the bar to 65 percent. Then in October 2012, the net got much wider. Google began automatically and manually penalizing sites with 50 percent manipulative links.
Although the report refers to Penguin as a penalty, Penguin isn't a penalty. A penalty is a manual action taken against a site.
Yes, the Penguin update has demoted the rankings of sites, but as Google's Distinguished Engineer Matt Cutts has explained, Penguin is an algorithmic change, not a penalty. We explain this more in "Google Penalty or Algorithm Change: Dealing With Lost Traffic."
If Portent's findings are correct, then Google is likely becoming more confident with the accuracy of its Penguin algorithm in terms of minimizing false positives.
What does this mean for webmasters and SEO professionals? Continue to diligently clean up your inbound link profile.
Identify bad inbound links, then remove them or disavow them. Google’s next iteration of Penguin could lower the tolerance level for spammy inbound links even further; this might even be what Matt Cutts was referring to when he stated at this year’s SXSW that the next Penguin release would be significant and one of the more talked about Google algorithm updates this year.
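To see where a profile sits against the levels Portent reported, a rough calculation like the one below can help. This is only a sketch: the link list and the `is_manipulative` flags are hypothetical and assumed to come from your own review, and the threshold labels are descriptive shorthand for the report's figures, not anything published by Google.

```python
# Shares of manipulative links at which Portent reported sites being hit.
# The labels are descriptive only; the figures come from the report quoted above.
REPORTED_THRESHOLDS = [
    ("initial Penguin (April 2012)", 0.80),
    ("two months later", 0.65),
    ("October 2012", 0.50),
]


def manipulative_share(links):
    """Return the fraction of links flagged as manipulative.

    `links` is a list of (url, is_manipulative) pairs. The flag is assumed
    to come from a manual or tool-assisted review of each link.
    """
    if not links:
        return 0.0
    flagged = sum(1 for _, is_bad in links if is_bad)
    return flagged / len(links)


def thresholds_exceeded(share):
    """List the reported thresholds that a given share meets or exceeds."""
    return [label for label, limit in REPORTED_THRESHOLDS if share >= limit]


# Hypothetical profile: 6 of 10 inbound links judged manipulative.
profile = [(f"http://example.com/page{i}", i < 6) for i in range(10)]
share = manipulative_share(profile)
print(share, thresholds_exceeded(share))
```

A profile like this one would have passed under the earlier reported levels but not the October 2012 one, which is why ongoing cleanup matters even for sites that were not hit the first time.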

Google Penguin, the Second (Major) Coming: How to Prepare

Unless you've had your head under a rock, you've undoubtedly heard the rumblings of a coming Google Penguin update of significant proportions.
To paraphrase Google’s web spam lead Matt Cutts, the algorithm filter has "iterated" to date but there will be a "next generation" coming that will have a major impact on SERPs.
Having watched the initial rollout take many by surprise, it makes sense this time to at least attempt to prepare for what may be lurking around the corner.

Google Penguin: What We Know So Far

We know that Penguin is purely a link quality filter that sits on top of the core algorithm, runs sporadically (the last official update was in October 2012), and is designed to take out sites that use manipulative techniques to improve search visibility.
And while there have been many examples of this being badly executed, with lots of site owners and SEO professionals complaining of injustice, it is clear that web spam engineers have collected a lot of information over recent months and have improved results in many verticals.
That means Google's team is now on top of the existing data pile and testing output and as a result they are hungry for a major structural change to the way the filter works once again.
We also know that months of manual resubmissions and disavows have helped the Silicon Valley giant collect an unprecedented amount of data about the "bad neighborhoods" of links that had powered rankings until very recently, for thousands of high profile sites.
They have even been involved in specific and high profile web spam actions against sites like Interflora, working closely with internal teams to understand where links came from and watch closely as they were removed.
In short, Google’s new data pot makes most big data projects look like a school register! All the signs therefore point towards something much more intelligent and all encompassing.
The question is: how can you profile your links and gauge the probability of being impacted when Penguin hits within the next few weeks or months?
Let’s look at several evidence-based theories.

The Link Graph – Bad Neighborhoods

Google knows a lot about what bad links look like now. They know where a lot of them live and they also understand their DNA.
And once they start looking it becomes pretty easy to spot the links muddying the waters.
The link graph is a kind of network graph and is made up of a series of "nodes" or clusters. Clusters form around IPs and as a result it becomes relatively easy to start to build a picture of ownership, or association. An illustrative example of this can be seen below:
[Image: node illustration]
Google assigns weight or authority to links using its own PageRank currency, but like any currency it is limited and that means that we all have to work hard to earn it from sites that have, over time, built up enough to go around.
This means that almost all sites that use "manipulative" authority to rank higher will be getting it from an area or areas of the link graph associated with other sites doing the same. PageRank isn't limitless.
These "bad neighborhoods" can be "extracted" by Google, analyzed and dumped relatively easily to leave a graph that looks a little like this:
[Image: link graph with bad neighborhoods extracted]
They won’t disappear, but Google will devalue them and remove them from the PageRank picture, rendering them useless.
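The effect of removing a bad neighborhood from the PageRank picture can be illustrated with the simplified, published form of PageRank. The sketch below is not Google's production system; the graph, the page names, and the set of "bad" nodes are all hypothetical.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `graph` maps each page to the list of pages it links to.
    """
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                # A page splits its rank evenly across its outbound links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank


def drop_neighborhood(graph, bad_nodes):
    """Return a copy of the graph without the flagged nodes or links to them."""
    return {
        src: [t for t in targets if t not in bad_nodes]
        for src, targets in graph.items()
        if src not in bad_nodes
    }


# Hypothetical graph: "shop" is propped up by an interlinked spam cluster,
# while "rival" earns its links from "news" and "blog".
graph = {
    "news": ["rival", "blog"],
    "blog": ["rival", "news"],
    "rival": ["news"],
    "shop": ["news"],
    "spam1": ["spam2", "shop"],
    "spam2": ["spam3", "shop"],
    "spam3": ["spam1", "shop"],
}
bad = {"spam1", "spam2", "spam3"}

before = pagerank(graph)
after = pagerank(drop_neighborhood(graph, bad))
print(before["shop"] / before["rival"], after["shop"] / after["rival"])
```

Once the spam cluster is taken out, "shop" is left with only the baseline rank every page receives, so its standing relative to "rival" falls. That is the "currency rebalancing" effect in miniature.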
Expect this process to accelerate now that the search giant has so much data on "spammy links", with swathes of link profiles getting knocked out overnight.
The concern of course is that there will be collateral damage, but with any currency rebalancing, which is really what this process is, there will be winners and losers.

Link Velocity

Another area of interest at present is the rate at which sites acquire links. In recent months there definitely has been a noticeable change in how new links are being treated. While this is very much theory, my view is that Google have become very good now at spotting link velocity "spikes" and anything out of the ordinary is immediately devalued.
Whether this is indefinite or limited by time (in the same way the "sandbox" works), I am not sure, but there are definite correlations between sites that earn links consistently and good ranking increases. Those that earn lots quickly do not get the same relative effect.
And it would be relatively straightforward to build this into the Penguin model, if it isn't there already. The chart below shows an example of a "bumpy" link acquisition profile and, as in the example, anything above the "normalized" line could be devalued.
[Chart: link acquisition over time, with links above the "normalized" line ignored]
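The "normalized line" idea can be approximated in a few lines. This is a sketch of the theory only: how Google actually sets such a line, if it does at all, is unknown, so the ceiling here is simply assumed to be a multiple (`tolerance`) of the median monthly count, and the monthly figures are made up.

```python
from statistics import median


def cap_link_spikes(monthly_new_links, tolerance=1.5):
    """Split each month's new links into counted and excess links.

    The "normalized" line is assumed to be `tolerance` times the median
    monthly count; anything above it is treated as excess.
    """
    ceiling = tolerance * median(monthly_new_links)
    counted = [min(n, ceiling) for n in monthly_new_links]
    excess = [max(0, n - ceiling) for n in monthly_new_links]
    return ceiling, counted, excess


# Hypothetical "bumpy" profile with two spikes.
monthly = [10, 12, 11, 60, 13, 9, 55, 12]
ceiling, counted, excess = cap_link_spikes(monthly)
print(ceiling)
print(excess)
```

Running the same check on your own monthly link counts (exported from a backlink tool) gives a quick view of which months would stand out under this theory.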

Link Trust

The "trust" of a link is also something of interest to Google. Quality is one thing (how much juice the link carries), but trust is entirely another thing.
Majestic SEO has captured this reality best with the launch of its new Citation and Trust flow metrics to help identify untrusted links.
How is trust measured? In simple terms it is about good and bad neighborhoods again.
In my view Google uses its Hilltop algorithm, which identifies so-called "expert documents" (websites) across the web, which are seen as shining beacons of trust and delight! The closer your site is to those documents the better the neighborhood. It’s a little like living on the "right" road.
If your link profile contains a good proportion of links from trusted sites then that will act as a "shield" from future updates and allow some slack for other links that are less trustworthy.
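The "closeness to trusted documents" idea can be pictured as a hop count through the link graph. The sketch below is neither Hilltop nor Majestic's Trust Flow; it is a plain breadth-first search from a hand-picked set of trusted seed sites, with a hypothetical graph and site names.

```python
from collections import deque


def distance_from_trusted(graph, trusted_seeds):
    """Return the number of link hops from the nearest trusted seed.

    `graph` maps each site to the sites it links to. Sites that cannot be
    reached from any seed are left out of the result.
    """
    distance = {seed: 0 for seed in trusted_seeds}
    queue = deque(trusted_seeds)
    while queue:
        site = queue.popleft()
        for target in graph.get(site, []):
            if target not in distance:
                distance[target] = distance[site] + 1
                queue.append(target)
    return distance


# Hypothetical neighborhood: one trusted seed and one link farm.
graph = {
    "university": ["trade-journal"],
    "trade-journal": ["your-site"],
    "link-farm": ["your-rival"],
}
hops = distance_from_trusted(graph, ["university"])
print(hops)
```

In this picture a site two hops from a trusted seed lives on a better "road" than one reachable only through a link farm, which never appears in the result at all.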

Social Signals

Many SEO pros believe that social signals will play a more significant role in the next iteration of Penguin.
While social authority, as it is becoming known, makes a lot of sense in some markets, it also has limitations. Many verticals see little to no social interaction, and without big pots of social data a system that qualifies link quality by the number of social shares across a site or piece of content can't work effectively.
In the digital marketing industry it would work like a dream but for others it is a non-starter, for now. Google+ is Google’s attempt to fill that void and by forcing as many people as possible to work logged in they are getting everyone closer to Plus and the handing over of that missing data.
In principle it is possible though that social sharing and other signals may well be used in a small way to qualify link quality.

Anchor Text

Most SEO professionals will point to anchor text as the key telltale metric when it comes to identifying spammy link profiles. The first Penguin rollout would undoubtedly have used this data to begin drilling down into link quality.
I asked a few prominent SEO professionals their opinions on what the key indicator of spam was in researching this post and almost all pointed to anchor text.
“When I look for spam the first place I look is around exact match anchor text from websites with a DA (domain authority) of 30 or less," said Distilled’s John Doherty. "That’s where most of it is hiding.”
His thoughts were backed up by Zazzle’s own head of search Adam Mason.
“Undoubtedly low value websites linking back with commercial anchors will be under scrutiny and I also always look closely at link trust,” Mason said.
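The check Doherty describes is easy to run against a backlink export. The sketch below assumes each link is a dictionary with hypothetical `anchor` and `da` fields and that you supply your own list of target keywords; real tool exports will use different column names.

```python
def suspicious_links(links, money_keywords, max_da=30):
    """Return links with an exact match anchor from a low-authority domain.

    A link is flagged when its anchor text exactly matches one of the target
    keywords and the linking domain's authority score is at or below `max_da`.
    """
    keywords = {k.lower() for k in money_keywords}
    return [
        link
        for link in links
        if link["anchor"].strip().lower() in keywords and link["da"] <= max_da
    ]


# Hypothetical backlink export.
links = [
    {"source": "tiny-directory.example", "anchor": "premium golf carts", "da": 12},
    {"source": "big-magazine.example", "anchor": "premium golf carts", "da": 78},
    {"source": "small-blog.example", "anchor": "click here", "da": 20},
]
flagged = suspicious_links(links, ["premium golf carts"])
print([link["source"] for link in flagged])
```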
The key is the relationship between branded and non-branded anchor text. Any natural profile would be heavily led by branded (e.g., www.example.com/xxx.com) and "white noise" anchors (e.g., "click here", "website", etc).
The allowable percentage is tightening. A recent study by Portent found that the percentage of "allowable" spammy links has been reducing for months now, standing at around 80 percent pre-Penguin and 50 percent by the end of last year. The same is true of exact match anchor text ratios.
Expect this to tighten even more as Google’s understanding of what natural "looks like" improves.
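A quick way to see the balance between branded, "white noise", and commercial anchors is to bucket them and compare shares. This is a rough sketch: the white noise list, the brand terms, and the sample anchors are all assumptions, and simple substring matching will misclassify some anchors in a real profile.

```python
from collections import Counter

# Assumed examples of generic "white noise" anchors; extend as needed.
WHITE_NOISE = {"click here", "website", "here", "read more", "this site"}


def classify_anchor(anchor, brand_terms):
    """Bucket an anchor as branded, white noise, or commercial."""
    text = anchor.strip().lower()
    if any(term in text for term in brand_terms):
        return "branded"
    if text in WHITE_NOISE:
        return "white_noise"
    return "commercial"


def anchor_profile(anchors, brand_terms):
    """Return the share of anchors falling into each bucket."""
    counts = Counter(classify_anchor(a, brand_terms) for a in anchors)
    total = sum(counts.values())
    kinds = ("branded", "white_noise", "commercial")
    if total == 0:
        return {kind: 0.0 for kind in kinds}
    return {kind: counts[kind] / total for kind in kinds}


# Hypothetical anchors for a site whose brand is "Example".
anchors = [
    "Example Co", "www.example.com", "click here", "cheap golf carts",
    "cheap golf carts", "website", "example.com", "buy golf carts online",
]
shares = anchor_profile(anchors, brand_terms=["example"])
print(shares)
```

A profile where the commercial share matches or outweighs the branded share, as in this made-up sample, is the kind of pattern the tightening ratios are likely to catch.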

Relevancy

One area that will certainly be under the microscope as Google looks to improve its semantic understanding is relevancy. As it builds up a picture of relevant associations that data can be used to assign more weight to relevant links. Penguin will certainly be targeting links with no relevance in future.

Traffic Metrics

While traffic metrics probably fall more under Panda than Penguin, the lines between the two are increasingly blurring to a point where the two will shortly become indistinguishable. Panda has already been subsumed into the core algorithm and Penguin will follow.
On that basis Google could well look at traffic metrics such as visits from links and the quality of those visits based on user data.

Takeaways

No one is in a position to be able to accurately predict what the next coming will look like but what we can be certain of is that Google will turn the knife a little more making link building in its former sense a more risky tactic than ever. As numerous posts have pointed out in recent months it is now about earning those links by contributing and adding value via content.
If I was asked what my money was on, I would say we will see a tightening of what is an allowable level of spam still further, some attempt to begin measuring link authority by the neighborhood it comes from and any associated social signals that come with it. The rate at which links are earned too will come under more scrutiny and that means you should think about:
  • Understanding your link profile in much greater detail. Tools and data from companies such as Majestic, Ahrefs, CognitiveSEO, and others will become more necessary to mitigate risk.
  • Where your link comes from, not just what level of apparent "quality" it has. Link trust is now a key metric.
  • Increasing the use of brand and "white noise" anchor text to remove obvious exact and phrase match anchor text problems.
  • Looking for sites that receive a lot of social sharing relative to your niche and building those relationships.
  • Running backlink checks on the sites you get links from to ensure their equity isn’t coming from bad neighborhoods, as that could pass to you.

Penguin 2.0 Forewarning: The Google Perspective on Links

First and foremost, I don't work for Google. This article represents my opinions, but my company has worked on helping large numbers of sites get Google penalties removed.
The hardest part of these projects is always to get the client to understand what constitutes a bad link. This starts at the very core of how they think about online marketing and search engine optimization (SEO).
There are many who argue that this problem is of Google's own making. They created a world in which abuse wasn't only possible, but was even very easy in the beginning. As some would say, they were talking the talk, but not walking the walk.
Some people even got mad. They would yell at Google that they couldn't follow their guidelines because it put them in a situation that was like bringing a knife to a gunfight.
But, as Greg Boser said on stage at SMX Advanced in 2012, Google is now not just talking the talk, but they are walking the walk as well. Penguin and their various attacks on unnatural links have dramatically reshaped their ability to detect and act on link building practices they consider detrimental to their algorithms.
These will continue to see dramatic improvements. Penguin 2.0 is just around the corner, and I expect it to have a bigger impact than Penguin 1.0. So let's step back and discuss what Google wants a link to represent.

Links Must Be Citations

This is the core concept. Think of a professor's research paper, which lists the other research papers the professor referenced in creating it.
The professor only lists (links to) the other papers most relevant and most important to their paper. You can't buy that, and it never occurred to researchers to try and do that with each other. This system was pure at its heart.
This notion is at the very core of the original PageRank thesis. Any deviation from it at all is a problem. In fact, here's what Google's Distinguished Engineer Matt Cutts said about it in my last interview with him, when I asked him if he felt the concept of link building was itself problematic:
It segments you into a mindset, and people get focused on the wrong things.
I always wondered why people who read the interview didn't pick up on that a lot more. There was a lot of buzz about the comments he made on infographics and boilerplate content on web sites, but nothing on this comment, which I thought was the most telling statement in the entire interview.
Later on, when I asked him about how publishers can help themselves he said:
By doing things that help build your own reputation, you are focusing on the right types of activity. Those are the signals we want to find and value the most anyway.
With this in mind, let's look at four link building practices that are still common today:
Infographics
This was also featured in my interview with Cutts. The biggest problem these face is that the sex appeal of the infographic is so high that many publishing sites don't care what they need to do to get it.
On top of that, many infographics are inaccurate or unrelated topically to the page receiving the link. Even without these problems it is likely that the great majority of people republishing infographics aren't thoughtfully endorsing the page they end up linking to.
Including rich anchor text links inside a guest post
If the New York Times accepted a guest post from you, what are the chances that they would let you load rich anchor text links inside your post back to the blatant money-making page on your site? Not a chance.
So when Google sees these rich anchor text links inside a guest post, it is a clear signal of a lack of editorial standards at the site publishing the content. This could even hurt the publisher of the content. Note that rich anchor text to other content that is a source is a very different matter.
Guest posts that are only loosely related to the topic of the page receiving a link
Let's say you run a business selling golf carts. So you write a decent article on the best golf courses in Bermuda. You don't put rich anchor text in the body of the article, but in the attribution at the bottom you include a link with the anchor text "premium golf carts" to your site.
As before, these are links where the citation value is weak, and the editorial standards of the site are questionable. A link like this smells more like "payment" than a legitimate endorsement.
Award Badges
This is an oldie but (not) goodie that is sadly still being promoted by a few companies. What really makes these programs trivial for Google or Bing to flag is when the award badges seem to appear only on the less authoritative sites of a market segment. Like waving the red flag at the bull in the bullfighting ring, you're going to attract some attention!
These are just four examples. Note that I did not even bother with sites that have lots of footer links, lots of links on the right rail of pages, links from foreign language sites, links from markets where you don't sell/promote your stuff, etc. Those things are already being actively attacked by Google.
There are many other examples similar to the four above that can be constructed with a little thought – add yours to the comments if you like! Or, send some examples and I will give you my opinion.

Some Closing Questions For Qualifying Your Links

These questions were included in my recent article on 10 Common Link Building Problems, but I am going to expand upon them here:
1. Would you build the link if Google and Bing did not exist?
Any good link is something that has value even without search engines. Treat this question seriously, as it mirrors the behavior that Google and Bing want to see.
For example, would you spend 2 hours of a marketing person's time and $200 in expenses writing an article for Nameless Blog 23 just because they let you put that rich anchor text link in the article?
Oh, right. Without search engines that rich anchor text notion might not even be in your vocabulary.
2. If you have 2 minutes with a customer, and the law required that you show a random sampling of your links to customer prospects, would you happily show the link to a target customer? Or would it embarrass you?
This supports the notion that every link should be brand building in nature. A nice variant of this question is - would you proudly show it to your children?
3. Did the person giving you the link intend it as a genuine endorsement?
If not, Google wants to torch it, and so should you. This relates to the infographics and badge examples above, for sure, but it also relates to the blog examples. As soon as proper attribution slips into a model that looks a bit like "payment" you are no longer looking at a citation.
4. Do you have to make an argument to justify that it's a good link?
This is my favorite one. A good link shouldn't be the subject of an argument.
No argument is required with good links. When you see a good link, you know it right away. Sometimes I simplify this statement (for fun) by saying, "If you have to argue it is, it isn't."

Summary

If you've been building links that are exposed by these questions, the best thing you can do is start getting in front of it now. Don't wait for Penguin 2.0, or the next wave of link message penalties to come out. Start getting your business on a sound long-term footing now.
Start actively building unimpeachable links, and start working on eliminating the bad ones. I am not saying that you need to stand on the rooftops and yell out "hey Google I sinned come punish me", but you can begin asking sites that are the source of dangerous links to remove them.
I know that this is a hard decision to make. You have revenue, and you're paying people salaries, etc. See if you can find a way to add the good ones fast enough to make up for removals and keep your business moving forward.
And, for the record, and disclaimer purposes, your mileage may vary. I can't project the exact best strategy for sites that I have not even looked at, but I do know that Google is making large investments in the areas of fighting bad links, and I do know that Cutts let us know that a new Penguin update is coming that he referred to as a "major update".
The last time Cutts made a similar statement, at SXSW last year, we got Penguin 1.0. You can trust that this new update is coming. You can also trust that it is not the last thing that Google will do.
Even if you get by this next update, learn to truly appreciate the meaning of a true citation and adjust your marketing strategy accordingly.

Google Penguin 2013: How to Evolve Link Building into Real SEO

Google has just rolled out Penguin 2.0, a large algorithmic update promising to go “deeper” than the 2012 Penguin release, which put a hurting on websites with a number of manipulative links in their profile.
This prospect creates fear for many small businesses who depend on search engine optimization (SEO) for their livelihoods. But there is also a sense of confusion, as the line often shifts and the message from Google is contradictory.

Sorting out Panda, Penguin, and Manual Actions

Google's Panda update is a different release than Penguin. Panda is geared toward duplicative, thin, or spun content on websites.
Google's Distinguished Engineer Matt Cutts recently stated that Google is actually pulling back on Panda because of too many false positives. This is good for news aggregators and other sites that reuse content appropriately and have been hit hard by the Panda filter.
Penguin is much harder to understand, focusing on backlink patterns, anchor text, and manipulative linking tactics that provide little value to end users. To make matters worse, Google likes to take large manual actions just prior to major algorithm updates. In 2012 we saw the removal of BuildMyRank from the index just prior to Penguin.
Earlier this year we saw major manual action taken against advertorials. Last week Google announced the removal of thousands of link selling websites and we are hearing of a manual spam penalty against Sprint this week.
The proximity of these manual actions with major algorithmic updates is brilliant PR as it associates them together in our memories, discussions and debates - but they are very different things.

Is SEO Enough?

As small business owners move through the "here we go again" feelings to actually decide what to do in response to Penguin 2013, sorting out the truth is paramount. Google is clearly beating the familiar drum with the same core messages:
  1. Build a great website.
  2. Make awesome content with high end-user value.
  3. Visitors will magically appear.
But the reality is that visitors don’t magically come, at least on any reasonable scale, without organized promotional activities. Many excellent websites have died a slow death due to lack of promotion. And this is where the contradictions emerge in SEO, which has demonstrated extremely high ROI compared to other marketing channels.

Long Live Online Marketing

Although this has been discussed many times, webmasters still struggle with shifting their link building activities to real SEO strategy. They fail to see that SEO in 2013 is now integral to online marketing and no longer a standalone activity.
Whereas SEO used to be about tuning a website for optimal consumption by spiders, today’s SEO is about earning recognition, social spread, and backlinks through excellent content marketing. This means SEO is now ongoing, integrated, and strategic – whereas it used to be one-time, isolated, and technical.

Real SEO

Real SEO is the prescription for those who fear Penguin 2013. Here are practical activities that need to be done every month to achieve real SEO:
  • Continually Identify Audience Demand: Your SEO won't be successful if it isn't useful. To serve a need, webmasters must understand what the audience is seeking. Keyword research, as always, is critical. While doing keyword research don’t over-emphasize head terms or money keywords. Focusing on long-tail keywords renders more immediate results, increases the breadth of a website (remember Panda), and builds authority that will ultimately help the head term.
  • Content marketing: In my opinion, content marketing is the new link building. Earn recognition, social spread, and backlinks by giving away valuable information for free. Excellent content has high audience value and points readers to other resources via cocitation. Video is an excellent form of content marketing that is still under-utilized by small businesses. And newsjacking is an emerging form of content marketing that specifically targets hot news topics for viral spread.
  • Work on brand: There is increasing evidence that branded mentions are an important legitimacy signal to Google. Promoting the brand has traditional marketing benefits and also now helps SEO. But be careful not to turn SEO content marketing into an endorsement, as this crosses the line. Find traditional marketing tactics, such as press releases, to drive branding while announcing news-worthy events.
  • Syndicate: The "build it and they will come" philosophy doesn't work on an Internet with more than 500 million active domain names. This is why even excellent content needs to be promoted. Email marketing, social media, community engagement in forums, and guest blog posting are efficient mechanisms for spreading the word about engaging content. Interviews, PPC ads, and local event sponsorship will also get your name and content noticed. Any activity that broadcasts your message, your brand, and builds real community discussion will ultimately support SEO, and should be considered part of the SEO process.

Conclusions

The arrival of Penguin 2013 has many small business owners scared and confused. But SEO remains one of the best online marketing channels.
Real SEO is the path forward for those who wish to make a long-term investment in online marketing. Forward-looking webmasters can prepare their sites for Penguin 2014, 2015, and beyond with well-researched, end-user focused content marketing that provides strong audience value.
Using modern syndication tactics, they can broadcast their message, gain audience mind-share and earn recognition. By spreading valuable content, small businesses can build their brands and earn bulletproof backlinks.