Search Engine Optimization !! Black-Hat Techniques VS White-Hat Techniques

posted January 2, 2008 - 11:36am

Here I provide a clear definition of the various spamming techniques so that you can recognize and avoid them in your SEO campaigns. Many of these methods are quite old, and the search engine robots have already been updated to recognize them. The rest are more inventive and technically complicated, and it is hard to say for certain whether spiders can recognize them.

The only thing to be sure of is that every unethical method that may be used today to promote a site will sooner or later become visible and recognizable to the search engines.

Common sense is your best guide during an SEO campaign. To determine whether or not a search engine optimization strategy will be considered spam, Matt Cutts, Software Engineer for Google, said that webmasters should ask themselves the following questions:
Does your Web page's content help end users?
Would you perform this optimization strategy if the search engines did not exist?
Are your pages automated? If so, Google does not want them in its index.
"Essentially, we want the best search results on top," Cutts said. "We want to get end users off of Google as soon as possible."

You can access and read Google's guidelines concerning unethical promotion techniques here:

http://www.google.com/webmasters/seo.html

http://www.google.com/webmasters/guidelines.html

http://www.google.com/contact/spamreport.html

Outdated spamming methods

These are easily detected by spiders. The first thing to do when you receive a site for optimization (apart from assessing site quality) is to make sure that none of these techniques is used anywhere on the site, even accidentally.

Hidden or invisible text. This is defined as the use of text of the same color as the background. The characters are not visible to the naked eye, but search engine spiders are able to match your font color with the background settings. How do you spot this yourself? An easy way is to press CTRL + A to select the whole page once it has loaded in the browser. If some text appears that wasn't visible before the selection was made, the page uses the same color for font and background.

Sometimes you may fall into accidental spamming by using a white page background and then white text in a colored table cell. Visitors will see this text, since it is placed against (say) a red cell background, but the search engine spider will probably ignore the table cell background color and detect that the text is the same color as the general page background. Avoid using white text even within colored table cells (if your page background is white, of course).
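To check a page the way a simple spider might, here is a minimal Python sketch. It assumes colors are set with the old-style `bgcolor` attribute on `<body>` and `<font color=...>` tags; the function name `find_same_color_text` and the tiny color-name table are my own illustrations, not part of any search engine's code. Note that it also flags the white-text-in-a-red-cell case described above, because like the spider it ignores the cell background.

```python
from html.parser import HTMLParser


def normalize_color(value):
    """Lower-case a color and expand #fff to #ffffff so values compare equal."""
    value = value.strip().lower()
    named = {"white": "#ffffff", "black": "#000000"}  # illustrative subset only
    value = named.get(value, value)
    if len(value) == 4 and value.startswith("#"):
        value = "#" + "".join(ch * 2 for ch in value[1:])
    return value


class SameColorTextFinder(HTMLParser):
    """Collects text inside <font color=...> that matches the page background."""

    def __init__(self):
        super().__init__()
        self.page_background = "#ffffff"  # assumed default when none is set
        self.font_colors = []             # stack of colors for open <font> tags
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "body" and attrs.get("bgcolor"):
            self.page_background = normalize_color(attrs["bgcolor"])
        elif tag == "font":
            self.font_colors.append(normalize_color(attrs.get("color") or ""))

    def handle_endtag(self, tag):
        if tag == "font" and self.font_colors:
            self.font_colors.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.font_colors and self.font_colors[-1] == self.page_background:
            self.suspicious.append(text)


def find_same_color_text(html):
    """Return the text fragments whose font color equals the page background."""
    finder = SameColorTextFinder()
    finder.feed(html)
    finder.close()
    return finder.suspicious


page = (
    '<body bgcolor="#FFFFFF"><p>Visible copy</p>'
    '<font color="white">cheap furniture cheap sofas</font>'
    '<table><tr><td bgcolor="red"><font color="#fff">Sale!</font></td></tr></table>'
    '</body>'
)
print(find_same_color_text(page))
```

Both the deliberately hidden phrase and the innocent "Sale!" cell are reported, which is exactly why white text in colored cells is best avoided on a white page.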

Anything that you try to hide from the viewer is considered spam. All content must be visible on the page to be considered relevant. After all, if you're not willing to display it, it can't be that important.

A trickier technique is to make the text invisible to visitors with the help of CSS. Most experts agree that spiders cannot parse and understand CSS. Besides, a style sheet can be linked from outside the HTML page, and the actual CSS file may further be blocked from search engine spiders via the "robots.txt" file.

There are two explicit ways to make text invisible with the help of CSS:
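1. display: none, as in <span style="display: none">This text will not be visible</span>
2. visibility: hidden, as in <span style="visibility: hidden">This text will not be visible either</span>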
Although such techniques can't be tracked by the spiders, I don't recommend using them unless you are forced to initially hide some text or sub-menu items in the course of using DHTML to make your page more interactive. The first reason to avoid them is the Code of Ethics, and the second is that sooner or later spiders will be updated to recognize them.
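If you inherit a site and want to find such hidden blocks before a spider does, a rough check is easy to script. The sketch below is only an assumption-laden illustration: it looks at inline `style` attributes only (rules in external style sheets are not seen), it expects reasonably well-formed markup where every non-void tag is closed, and the names `InlineHiddenTextFinder` and `find_css_hidden_text` are hypothetical.

```python
import re
from html.parser import HTMLParser

# The two CSS declarations that remove text from view.
HIDING_RULE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)

# Tags that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class InlineHiddenTextFinder(HTMLParser):
    """Collects text that sits inside an element hidden by inline CSS."""

    def __init__(self):
        super().__init__()
        self.open_tags = []    # one flag per open element: does it hide content?
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        self.open_tags.append(bool(HIDING_RULE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.open_tags:
            self.open_tags.pop()

    def handle_data(self, data):
        # Text is hidden if any enclosing element carries a hiding rule.
        if data.strip() and any(self.open_tags):
            self.hidden_text.append(data.strip())


def find_css_hidden_text(html):
    """Return text fragments hidden with display:none or visibility:hidden."""
    finder = InlineHiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text


snippet = (
    '<div><span style="display: none">hidden keywords</span>'
    '<span style="visibility:hidden">more</span><p>Shown</p></div>'
)
print(find_css_hidden_text(snippet))
```

A hit is not proof of spam, since DHTML menus legitimately start hidden, but each one deserves a look.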

A separate type of hidden text is hidden links. Links placed on a Web page solely for search engine spiders to follow, and not meant for visitors as a means of navigating the site, are considered spam. The same applies to tiny images (1x1 pixel) used to link to other pages or sites. If you make an image a link, try to keep its size larger than 10x10 pixels.

Tiny text

Text that is hard to read because the font size is too small is in fact another obstacle to getting the page into the search engine index. A common practice is to write disclaimers in a font size several points smaller than the rest of the text on the page. The spiders will probably just ignore this, but I warn you against using a font size smaller than 6pt or writing the whole Web page in tiny text. Try to set your font sizes to at least HTML font size 3 (which corresponds to 12pt and is also considered the default font size).

Excessive or irrelevant use of keywords

This is probably the oldest sort of spamming, yet many still use it today in hopes of better rankings. Do not over-repeat keywords in your META keywords tag, META description tag or TITLE tag. Do not use keywords that are plainly irrelevant to your Web site: you will not achieve good rankings, you will only dilute the relevant keywords in the mess and run the risk of being penalized. Another old trick is to use multiple TITLE and META tags to help boost rankings, but today it is labeled as spam.

This kind of spam is often called keyword stuffing, and it can be applied to any area of the HTML page. The META keywords tag is a typical target.

Actually, assuming you optimize for "furniture", the right tactic is to use keyword combinations once or twice at the beginning of the keywords tag, then use many synonyms (but not numerous stem derivatives).

Another mistake is to enumerate your comma-separated keywords in the META description tag when the purpose of it is to describe your page in one or two common human-readable phrases.
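To make the "once or twice" rule of thumb concrete, here is a small Python sketch that counts how often each word recurs in a META keywords value. The function name `repeated_keyword_terms`, the limit of two repeats and both sample keyword strings are my own hypothetical illustrations; no search engine publishes an exact threshold.

```python
from collections import Counter


def repeated_keyword_terms(meta_keywords, max_repeats=2):
    """Return the words that occur more than max_repeats times."""
    # Treat commas as separators and compare words case-insensitively.
    words = meta_keywords.lower().replace(",", " ").split()
    counts = Counter(words)
    return {word: n for word, n in counts.items() if n > max_repeats}


# Hypothetical stuffed tag content: the same word in every phrase.
stuffed = ("furniture, cheap furniture, furniture store, buy furniture, "
           "furniture sale, best furniture")

# Hypothetical balanced content: the keyword twice, then synonyms.
balanced = "furniture, home furniture, sofas, armchairs, bookcases"

print(repeated_keyword_terms(stuffed))   # {'furniture': 6}
print(repeated_keyword_terms(balanced))  # {}
```

The same function can be pointed at a TITLE or META description string, although a description should read as a sentence rather than a list in the first place.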

Redirects. We recommend that you don't use on-page redirects with the help of the META refresh tag. Its use is only justified on technical pages that spiders do not reach via links or are denied access to by the robots.txt file. We also recommend that you do not use JavaScript commands to redirect visitors, unless it simply can't be avoided. The best choice is a server-side redirect with the HTTP 301 response code (Moved Permanently). One more thing: multiple Web page URLs that all redirect to the same URL are considered spam.
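For readers who have not set up a server-side redirect before, the following Python sketch shows the idea using only the standard library. It is a toy under stated assumptions: the paths in `MOVED`, the `resolve` helper and the `serve` function are hypothetical, and on a production site you would normally configure the redirect in the Web server itself rather than in a script like this.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping of retired paths to their replacements.
MOVED = {"/old-catalog.html": "/catalog/"}


def resolve(path):
    """Return (status_code, headers) for a requested path."""
    if path in MOVED:
        # 301 = Moved Permanently; Location names the new address.
        return 301, {"Location": MOVED[path]}
    return 200, {"Content-Type": "text/html; charset=utf-8"}


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers GET requests, redirecting moved pages with a 301."""

    def do_GET(self):
        status, headers = resolve(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        if status == 200:
            self.wfile.write(b"<p>Current page</p>")


def serve(port=8000):
    """Run the demo server until interrupted (not called automatically)."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling `serve()` and requesting `/old-catalog.html` should produce a 301 response whose Location header points to `/catalog/`. Unlike a META refresh, the redirect happens before any page content is sent.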

Pages or sites with duplicate or mirror content. Search engines do not strictly ban this practice; however, Yahoo! is famous for its negative attitude toward groups of satellite sites that revolve around one major site. In fact, URLs of different Web pages with duplicate or mirror content are thrown out of a search engine's index (with only the oldest one being left in the database).

Such domain spam can sometimes be the result of a corporation's attempt to have websites for each of its company departments or subsidiaries. Those with many subsidiaries get a big boost from these domains. Realizing this, spammers are increasingly encouraging clients to have sites hosted on different IP addresses and even in different geographical locations.

In terms of contemporary optimization, this technique does little to help with rankings.

Link farms

The term "link farm" stands for a page that contains only links to other pages, or only banner ads, and no other textual content. Many site maps can also be considered link farms, and to avoid this I do not recommend using more than 30 hyperlinks on any single site map page.
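Counting the links on a site map page by hand is tedious, so here is a minimal Python sketch that does it. The limit of 30 is the rule of thumb given above, not a published search engine threshold, and the names `count_links` and `site_map_too_dense` are hypothetical.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry a non-empty href attribute."""

    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self.links += 1


def count_links(html):
    """Return the number of hyperlinks found in the HTML string."""
    counter = LinkCounter()
    counter.feed(html)
    counter.close()
    return counter.links


def site_map_too_dense(html, limit=30):
    """True when the page has more hyperlinks than the chosen limit."""
    return count_links(html) > limit
```

If a site map page exceeds the limit, splitting it into several themed pages keeps each one below it.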

Linking to a link farm (or to any so-called "Free-For-All link page", FFA) from your site is also a prohibited practice and can result in losing search engine positions.

Doorway pages

The term "doorway pages" stands for machine-generated pages with minimal content (or no content at all) that contain a link inviting the visitor to enter another site. Their sole purpose is to artificially boost the link popularity of the site they link to. All kinds of invisible content may also be stuffed into such pages, which earns them even more spam points from the search engine spider.

Spamming with <noframes> and similar tags

Sometimes webmasters try to increase keyword density by putting a <noframes> tag on a non-framed page. Search engines will detect this and see it as spam.

Spammers often use these tags to include keyword-stuffed content and links to their other pages that the visitors will not see but the spiders will follow. This technique may be used to secretly interlink a number of sites "in the underground". Here's an even more intricate example of the code:

[The markup of this example was not preserved. Its links used the anchor texts: new homes, concrete, design, precast, mantel, home decorating, home improvement, luxury homes.]

As you can see, this code is full of links to dynamic pages. When the spider follows them, the responding server will take the keywords from the query string and generate a page especially optimized for these keywords and feed it to the spider.

Of course, visitors to this site won't notice any links. However, this is one of the most dangerous techniques out there and can result in a lifetime penalty.

Excessive submission. Search engines do not like being spammed by repeated submissions of the same page over and over again. This practice will not help you get into the index, and can result in IP banning. Limit Web page URL submissions to a maximum of once a month or when your page content changes.

Advanced spamming techniques

Here we provide an overview of more intricate techniques that are by definition spamming, as their sole purpose is to attain a high PageRank and search engine ranking rather than to deliver unique, quality content to visitors. However, it's difficult to know whether the spiders recognize these techniques; most probably, they can't.

We continue to discourage use of any of these techniques for a serious SEO campaign. As technologies rapidly become more and more intelligent, the effort you put into implementing a cunning strategy will be nullified when this strategy comes into disfavor with the crawlers.

Publishing Empires. This refers to a Web publisher building numerous interlinked websites with the purpose of generating high PageRank and, in turn, high rankings. This form of spam is difficult for a search engine to penalize, because the links are completely legitimate: any single business entity has the right to interlink its own websites. When the content themes of the sites also overlap, the links receive even more weight in the search engines' eyes. When you perform a site search on one of these sites, you're practically guaranteed to see one of the company's other Web properties in the search results.

Many of the most successfully ranked sites use this system; that is, this form of spamming is extremely widespread among powerful companies and businesses. PageRank and link reputation are collected within the network and used creatively to dominate the best keyword phrases. Search engines haven't found a way to stop this technique, but they will eventually, since this form of spamming is a major threat to the quality of search results.

Wiki, blogs and forums. "Wiki" stands for a type of public content management environment where anyone can post content. The best known example is "wikipedia.org", the collectively edited online encyclopedia. Wikis can be a great way to present and edit ideas without close censorship, and have proven extremely successful for creation and maintenance of projects that require input from users around the globe.

The popularity-based search engines tend to rank Wiki-based sites high, and this makes them valuable sources of PageRank and therefore a favorable platform for search engine spammers. Without close human control, users may simply add their links to a Wiki as a means of taking advantage of the Wiki's PR. Until another user of the Wiki removes the link, the linked site enjoys the benefits of this unscrupulous activity.

Blog is an abbreviation for "Web log". It is another kind of content management environment, usually run by a single person or small organization, although other participants can also post comments, ideas and other content related to the theme of the blog. Blogs of famous experts can be a source of precise, up-to-date and technically detailed information, which makes them valuable to info-hungry searchers and extremely popular. Blogs also tend to reach relatively high positions in the search engines. As such, they became a spammers' heaven, but only for a while: in 2005 Google, Yahoo! and MSN introduced the rel="nofollow" link attribute, and links that carry it, as links in blog comments now commonly do, pass no ranking credit.

Finally, a forum is yet another kind of website, designed so that a community can post comments on a certain topic to a single topic page. Like blogs, forums can be a rich source of relevant information. Unfortunately, some forum participants post comments whose only purpose is to publish links back to their own sites. This may be acceptable if the user provides help or assistance to another forum member. However, when the posts become excessive and consist solely of irrelevant comments, the value of the links and of the whole forum is called into question. Some forum owners only start forums in the hope that they will raise search engine rankings.

In terms of contemporary optimization, forums cannot be regarded as a serious SEO strategy.

Dynamic page generation and cloaking. It is possible for a Web server to produce and serve different, optimized pages depending on who makes the page request. Search engine robots come from known IP addresses; these addresses are no secret. So a dynamic server-side script can detect the visitor's IP address: if it belongs to a spider, the script serves a highly optimized (mostly plain text) page, and if it belongs to an ordinary visitor, quite another document is served.

This practice is called cloaking.
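One way to audit a site you have been handed is to compare what an ordinary browser receives with what a spider receives. The Python sketch below covers only the comparison step and rests on several assumptions: you have already fetched the two versions yourself (for example by requesting the page with two different User-Agent headers), the 0.5 threshold is an arbitrary illustration, and the names `similarity` and `looks_cloaked` are hypothetical. Cloaking keyed strictly on spider IP addresses cannot be seen this way from your own machine.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Word-level similarity between two documents, from 0.0 to 1.0."""
    return SequenceMatcher(None, text_a.split(), text_b.split()).ratio()


def looks_cloaked(html_for_browser, html_for_spider, threshold=0.5):
    """True when the two versions of a page differ more than the threshold allows."""
    return similarity(html_for_browser, html_for_spider) < threshold


browser_version = "<html><body><h1>Welcome</h1><p>Our spring collection</p></body></html>"
spider_version = "furniture cheap furniture sofas furniture sale buy furniture online"

print(looks_cloaked(browser_version, browser_version))  # False
print(looks_cloaked(browser_version, spider_version))   # True
```

Small differences are normal (dates, rotating banners), which is why the check uses a threshold rather than an exact match.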

Another use of cloaking is to load a site in response to the spider's request with hundreds of phantom pages (dynamic URL addresses) that act as affiliate links to another site. When the search engines began to spider dynamic URL addresses, they made themselves vulnerable to this kind of spamming. As their technology gets more mature, this spamming strategy will fall into disfavor with search engines.

DHTML spamming. DHTML stands for "Dynamic HTML". Although it involves the word "dynamic", all the dynamics are purely browser-side and consist of advanced usage of CSS (cascading style sheets) and JavaScript to make the pages interact with the visitors.

DHTML allows content to be visually organized into several different layers, one above the other. Using layering, spammers can hide layers of keywords beneath graphics. One layer covers the other visually, yet the text hidden on the lower layer is readable by the search engine robot. As you have already guessed, this is just another example of a strictly prohibited technique.

The power of CSS also allows the unscrupulous SEO to hide the content of table cells stuffed with keywords and heading tags. CSS permits the flexible positioning of Web page elements, and search engines do not fully understand it at present. In this example, the CSS affects the display of the body of the Web page, whose width is set to 97%:

body {font-family: Arial, Helvetica, sans-serif; width: 97%; font-size: 10pt; overflow: hidden; color: #000000; margin: 0px;}

Within the regular code, image spacer files can be placed to extend the page to a width of 150%, while the directive "overflow: hidden" within the CSS makes the extra 50% invisible to visitors, ensuring that part of the page is not seen. A spammer then stuffs that space with plenty of keywords in heading tags and other strategically important page areas.

Machine-generated content. One of the simplest tricks is to automatically generate hundreds (if not thousands) of pages with only several keyword phrases stretched out all across them. The pages are built with templates and the sentences within them are basically shuffled from one page to the next. Finally, unique title tags are plugged into each page that's generated.

This technique basically sees the same page repeated hundreds to thousands of times. It can even be done using a computer program that systematically stuffs the text sentences, paragraphs and headings, including keywords, into pages. E-commerce sites that have a limited range of products for sale often make use of this technique. Often, the products are simply re-organized or shuffled to create another page that appears to be unique. It's actually the same selection of products presented in countless different ways.
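Shuffled template pages are easy to recognize mechanically, which is one reason not to rely on them. As an illustration only, the Python sketch below compares two pages by their overlapping three-word sequences ("shingles") and the Jaccard ratio of the two sets. This is a standard near-duplicate measure that I am using for demonstration, not a description of how any particular search engine works, and the sample texts are hypothetical.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(text_a, text_b, size=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 1.0  # two texts too short to shingle count as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical template pages: the same sentences in a different order.
page_a = "Buy oak tables online. Free delivery on all orders. Oak tables for every home."
page_b = "Oak tables for every home. Buy oak tables online. Free delivery on all orders."
unrelated = "Our workshop restores antique clocks by hand."

print(round(jaccard(page_a, page_b), 2))  # 0.71
print(jaccard(page_a, unrelated))         # 0.0
```

Even though the sentences were reordered, most shingles survive, so the two "unique" pages score as near duplicates.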

Presently, it's difficult to say whether this activity can help with search engine rankings. In any case, I advise against this technique: it is potentially dangerous, because the spiders are highly aware of various kinds of keyword stuffing.


