Richar Green

Blocking MajesticSEO, OpenSiteExplorer, Ahrefs, SpyGlass, etc.


I agree with the latter.

If Google went into stealth mode and changed its user agent to one of the blocked bots, then compared what it saw to what its normal user agent gets, it could isolate sites that have applied this technique quite quickly.
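Worth noting: Google actually documents a way to tell real Googlebot traffic apart from an impostor borrowing its user agent: a reverse DNS lookup on the visiting IP, followed by a forward lookup to confirm. A minimal sketch in Python (the function names are my own):

```python
import socket

# Domains Google publishes for its crawler hosts.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def is_google_hostname(hostname):
    """True if a reverse-DNS hostname falls under Google's crawler domains."""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)

def verify_googlebot(ip):
    """Reverse-resolve the IP, check the domain, then forward-confirm.

    Needs network access; returns False on any DNS failure, so a
    spoofed user agent from a non-Google IP fails the check.
    """
    try:
        hostname = socket.gethostbyaddr(ip)[0]             # reverse lookup
        if not is_google_hostname(hostname):
            return False
        return ip in socket.gethostbyname_ex(hostname)[2]  # forward confirm
    except OSError:
        return False
```

A site worried about stealth crawlers could run a check like this server-side before deciding whom to block, rather than trusting the user-agent header alone.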


Cool. This would be awesome to apply. But won't that make the Googlebot even more suspicious of the site's activity?

The question you have to ask yourself is, "What does the Googlebot have to do with the SpyFu bot, the Ahrefs bot, or any other spam bot or link-checking bot that belongs to a private company?"

Once you answer that question, you quickly realize that the Googlebot has nothing to do with any of the non-Google bots that belong to private companies.


But what you may be missing is the fact that blocking certain bots, site rippers, and crawlers from accessing one's site is a common SEO practice. Consequently, blocking the aforementioned bots is nothing out of the ordinary.



It has as much to do with those other bots as it has to do with YOUR site (which might also be owned by a private company), your domain, your content, your links, your authority, your trust, etc.

If Google sees you blocking some bots from certain pages of your site (in effect, a kind of cloaking), that could plausibly raise a red flag.




Oh dear, Google isn't going to care about something like this; they might only take notice if this practice went mainstream, which it won't. And even then, not everyone who wants to hide their links is doing something shady; they may just want to hide their high-PR blog comment sources, for instance.


You lost me here; maybe one of us is not getting the picture?


To be clear, I am not talking about blocking content; nor am I talking about presenting certain content to viewers but different content to the search engines, which is the whole premise behind cloaking.

I'm talking about blocking certain bots, which has absolutely nothing to do with cloaking. I have blocked a number of bots for years: some because they were attempting to scrape my sites, others because I found them to be site scrapers, and others because they were eating up bandwidth for no reason.

Again, blocking certain bots is a common SEO practice, and I cannot fathom how blocking a spam bot, a site-scraper bot, or any other bot could be taken as a means to put Google on some kind of alert.

If someone can explain that, I'd sure like to hear it, because I have never had an issue from blocking certain bots.

Edited by SirKnight



Of course it's not an issue. Gee, you can even block the Googlebot from certain pages and not suffer any undesired consequences. Some people are just jumpy!
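For example, keeping the Googlebot out of just one section while leaving the rest crawlable is a single scoped rule in robots.txt (the directory name here is made up):

```text
User-agent: Googlebot
Disallow: /private-stuff/
```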


I always block these bots and tell all the people under me to do so.

@igl00, how are you doing it? It would be awesome if you could share the code you are using in the .htaccess file.

And do you know how to block the SEO SpyGlass bot?

The thread on WarriorForum is a little unclear.

Edited by Adams


This is the code I'm using in my .htaccess, which I found from a Google search, with the user agents from the WF thread.

Please login or register to see this code.

Is this code working for you and has it had any effect on the rankings?

I have found this one in the WF thread:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^rogerbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^exabot [OR]
RewriteCond %{HTTP_USER_AGENT} ^MJ12bot [OR]
RewriteCond %{HTTP_USER_AGENT} ^dotbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^gigabot [OR]
RewriteCond %{HTTP_USER_AGENT} ^AhrefsBot
RewriteRule .* - [F]

Is there any difference between those two?
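One caveat worth flagging about the visible snippet: `^rogerbot` is anchored and case-sensitive, but these crawlers typically send user agents like `Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)`, which an anchored pattern never matches. A variant that matches the bot token anywhere in the string, case-insensitively, might look like:

```apache
RewriteEngine On
# [NC] makes the match case-insensitive; no ^ anchor, because real
# user-agent strings start with "Mozilla/5.0 (compatible; ...)".
RewriteCond %{HTTP_USER_AGENT} (rogerbot|exabot|MJ12bot|dotbot|gigabot|AhrefsBot) [NC]
RewriteRule .* - [F]
```

As for `[F]` versus `[F,L]`: there is no practical difference, since in mod_rewrite the `F` flag implies `L`.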


Not too sure. There is more than one way to skin a cat with Apache. I'm not getting crawled by said bots, so I assume it's working.

Just a thought: aren't the links off the domain, so wouldn't robots.txt and .htaccess blocks be pointless? They don't need to crawl my site to know a backlink is pointing to it.


Credit goes to OP of this thread at WF.

Please login or register to see this link.

I think this is interesting for protecting your network from being exposed to your competitors, and for saving it from getting reported as well.

Any thoughts?

Props to Ricar for bringing this to everybody's attention. Excellent!

People need to understand how powerful .htaccess is in general. You can globally handle all affiliate codes from one file, you can control how your website address resolves (www vs. non-www), you can block IPs, and you can block countries. I block certain countries routinely because all they do is rip content.

I do know one thing, though: bots can ignore robots.txt, but as far as I know they cannot get past a block in the .htaccess file. Every bot must by definition connect from an IP address, so that is probably a smarter way to stop them; after all, they could change their name, and then a name-based block won't work.


This is a great comment. For those of you looking to stop Majestic, Ahrefs, etc. from logging all of your links so people cannot diagnose or reverse engineer them: I don't think this stops those bots from recording who links to you, unless, of course, you do business with somebody like fraggler who blocks those bots on his end.
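A sketch of both ideas (www resolution and IP-level blocking) in .htaccess, assuming Apache 2.4 `Require` syntax (Apache 2.2 used `Order`/`Deny from` instead). The domain and the IP ranges below are placeholders, not any crawler's real addresses; pull the actual ones from your own logs:

```apache
# Force the www form of the site (example.com is a placeholder):
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Deny by address rather than by user-agent name, so a renamed bot
# is still blocked (the ranges below are documentation examples):
<RequireAll>
    Require all granted
    Require not ip 203.0.113.0/24
    Require not ip 198.51.100.7
</RequireAll>
```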


Great to see this thread back in discussion. This is what I'm using to successfully block those bots and protect my network.

.htaccess file code

RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^rogerbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^exabot [OR]
RewriteCond %{HTTP_USER_AGENT} ^MJ12bot [OR]
RewriteCond %{HTTP_USER_AGENT} ^dotbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^gigabot [OR]
RewriteCond %{HTTP_USER_AGENT} ^AhrefsBot
RewriteRule ^.* - [F,L]

---

...in .htaccess blocked Ahrefs and OSE but, for some reason, not Majestic. So, I also put...

robots.txt file code

User-agent: Googlebot
Disallow:

User-agent: msnbot
Disallow:

User-agent: Slurp
Disallow:

User-agent: Teoma
Disallow:

User-agent: rogerbot
Disallow: /

User-agent: exabot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: dotbot
Disallow: /

User-agent: gigabot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /

---

...in robots.txt which should block everything except Google, Yahoo and Bing.
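If you want to check what that robots.txt actually permits (to well-behaved crawlers, at least), Python's standard-library parser can simulate it. The file below is an abbreviated version of the rules above:

```python
from urllib import robotparser

# Abbreviated version of the robots.txt quoted above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("Googlebot", "/"))     # True: an empty Disallow allows everything
print(parser.can_fetch("AhrefsBot", "/"))     # False: fully disallowed
print(parser.can_fetch("SomeRandomBot", "/")) # False: falls through to the * group
```

Keep in mind this only models polite crawlers; robots.txt is advisory, which is why the thread leans on .htaccess as the enforcement layer.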


Just a thought: aren't links off the domain so wouldn't robots.txt and htaccess blocks be pointless? They don't need to crawl my site to know a backlink is pointing to it.

Props to Ricar for bringing this to everybody's attention. Excellent!

People need to understand how powerful htaccess is in general. You can globally handle all affiliate codes from one file, you can handles how your website address resolves www etc., you can block IP's, and you can block countries. I block certain countries routinely because all they do is rip content.

I do know one thing though - bots can ignore robots.txt files, but they cannot get past a block on the htaccess file (as far as I know). I believe every bot must by definition have an IP, so that is probably a smarter way to stop them. After all, they could change their name, and then the block won't work.

This is a great comment. For those of you looking to stop majestic, ahrefs, etc. from logging all of your links so people cannot diagnose/reverse engineer your links, I don't think this stops the bots from recording who links to you. Unless of course you do business with somebody like fraggler who blocks those bots on his end.

Using your .htaccess file is much better than your robots.txt file as bots can and many do ignore the robots.txt file but if you block them using your .htaccess file they can not access your site at all.

You are correct that blocking bots from your money sites does not really do anything to keep the crawlers from finding your links. The best use for that purpose is to use it on your network sites that you are using to rank your sites. If you block the bots, it makes it much harder for anyone to find and out your network.


Do I add the specific lines inside the existing rewrite rules, or do I just add another

RewriteEngine On
RewriteBase /

OK, that was confusing; let me try again. Is this correct? I included it with what was already in the rewrite rules:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
RewriteCond %{HTTP_USER_AGENT} ^rogerbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^exabot [OR]
RewriteCond %{HTTP_USER_AGENT} ^MJ12bot [OR]
RewriteCond %{HTTP_USER_AGENT} ^dotbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^gigabot [OR]
RewriteCond %{HTTP_USER_AGENT} ^AhrefsBot
RewriteRule ^.* - [F,L]
</IfModule>
RewriteEngine On
RewriteBase /
# END WordPress
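To answer the placement question: `RewriteCond` lines only attach to the single `RewriteRule` that immediately follows them, and WordPress's catch-all rule rewrites every request to index.php before your bot rules are reached, so the bot block is usually placed before the WordPress rules. The trailing second `RewriteEngine On` / `RewriteBase /` pair is redundant and can be dropped. One caveat: WordPress regenerates everything between the BEGIN/END markers when permalinks change, so custom rules are safer kept outside the markers; they are shown inline here only to illustrate the ordering:

```apache
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

# Bot block first, so it fires before the WordPress catch-all:
RewriteCond %{HTTP_USER_AGENT} (rogerbot|exabot|MJ12bot|dotbot|gigabot|AhrefsBot) [NC]
RewriteRule .* - [F,L]

RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```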



Unfortunately, Rich, this method didn't work for me.

Ahrefs and SpyGlass were still able to pick up some private network links with those mods in place.



They must be using bots with other names.

Maybe if we can track their IP addresses and block those instead, it would work.
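Tracking their IPs is straightforward if you have raw access logs. A sketch in Python that pulls candidate IPs for known crawler tokens out of Apache combined-format log lines (the sample lines and addresses below are invented):

```python
# Tokens that identify the link-index crawlers discussed in this thread.
BOT_MARKERS = ("AhrefsBot", "MJ12bot", "rogerbot", "DotBot")

def bot_ips(log_lines):
    """Return the sorted set of client IPs whose UA contains a bot token."""
    ips = set()
    for line in log_lines:
        if any(marker.lower() in line.lower() for marker in BOT_MARKERS):
            ips.add(line.split()[0])   # first field of a combined-log line is the client IP
    return sorted(ips)

sample = [
    '203.0.113.5 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"',
    '198.51.100.9 - - [01/Jan/2024:00:00:02 +0000] "GET /about HTTP/1.1" 200 312 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)"',
    '192.0.2.1 - - [01/Jan/2024:00:00:03 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"',
]
print(bot_ips(sample))  # ['198.51.100.9', '203.0.113.5']
```

The resulting addresses could then feed the .htaccess IP-deny approach mentioned earlier in the thread, though bear in mind crawlers rotate IPs, so this needs re-running periodically.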


I would like to know this too. I have been using robots.txt as above to block these sites with success, but I have just noticed that all my backlinks are now showing up. Has something changed? I'd really like to find a working solution here...

Edit: I contacted support at Ahrefs and they say they can still find links to our sites "at other sites". They didn't say which ones; does anyone have any idea?

Any update on this, guys? Is there a complete .htaccess file that has been proven to block all the big link-analysis tools?

Edited by the_judge

