Should I fix this via robots.txt or an .htaccess redirect, or something else?
#1
Posted 05 February 2012 - 22:38
www.mydomain.com/Translate.php?Lang=de&Page=http://www.mydomain.com/inner-post
Should I do some sort of redirect of all these pages or add them to robots.txt or how can I fix this junk so it no longer is in the index?
Thanks
#2
Posted 06 February 2012 - 00:41
I am not sure how robots.txt would fix it.
Is it coming up in your error logs?
Is it giving a 404?
Does it show in WM Tools as a crawl error?
#3
Posted 06 February 2012 - 01:16
1) If you've got WMT activated on your site you can use the parameter blocking settings to block the "Lang" and "Page" parameters. After you do that they should deindex themselves over time.
2) You could use a wildcard in the .htaccess to 301 redirect everything with those parameters in it. Just tested this on my local server and it seemed to do the trick:
RewriteEngine on
RewriteRule ^translate\.php$ /? [NC,L,R=301]
3) Or you can add something like this to your robots.txt file, under the relevant User-agent group (the parameter names are case-sensitive there):
Disallow: /*Page=
Disallow: /*Lang=
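For option 2, a slightly more explicit sketch, in case it helps: RewriteRule only looks at the path, not the query string, so the check for the parameters goes in a RewriteCond. This assumes the script sits at the site root and the rules live in the .htaccess in the document root, so adjust to taste.

```apache
RewriteEngine On

# Only fire when the query string carries a Lang= or Page= parameter.
# NC makes the match case-insensitive, so lang=de and Lang=de both count.
RewriteCond %{QUERY_STRING} (^|&)(Lang|Page)= [NC]

# Match the script itself (no leading slash inside .htaccess).
# The trailing "?" on the target drops the old query string from the redirect.
RewriteRule ^translate\.php$ /? [NC,L,R=301]
```

With that in place, a request for /Translate.php?Lang=de&Page=... should get a 301 to the home page, while Translate.php without those parameters is left alone.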
Edited by malphas, 06 February 2012 - 06:13.
#4
Posted 09 February 2012 - 21:14
My thinking on adding it to robots.txt was that it would block G from seeing it at all, therefore negating the issue.
Gonna go with the redirect.
Actually almost didn't check it, but this is like the only site I own that is actually in Webmaster Tools.
DAMNIT! I thought I had fixed an old problem... Months ago some asshole totally spammed my site with all sorts of porno injected pages, got in through my host somehow. I nuked them all, but they are all still showing up as 'not found' pages in the crawl errors section of GWT. Can/should I add them to my .htaccess as a 301 as well?
Though back on point, I do not see any of the translation pages in the crawl errors in GWT, just all the old porn pages?
In reply to Fishingman1 (06 February 2012 - 00:41) and malphas (06 February 2012 - 01:16):
Think option 2 seems the best all around and I am going to implement that.
Could I fix the old porn issue I mentioned above in the same way? All of the xxx pages were uploaded to a separately created folder by the spammer/hacker. How would I tweak that rewrite rule to address everything in a folder?
THANKS SO MUCH!
#6
Posted 10 February 2012 - 00:22
Not a lot of what you are saying makes sense.
Happy to help, but I need more details.
I am a better WM than SEO.
PM if you like.
#7
Posted 10 February 2012 - 01:17
RewriteEngine on
RewriteRule ^xxx/(.*) / [L,R=301]
Edited by malphas, 10 February 2012 - 06:17.
#8
Posted 10 February 2012 - 09:07
This is what I'm looking at with one of the pages in the spam folder: http://www.myblog.com/gfjtal/Charlie-Chase
I have RewriteEngine on further up in my .htaccess (though I did add it in right above the new rule to test; it didn't seem to make a difference, and is it redundant?)
RewriteRule ^gfjtal/(.*) / [L,R=301]
is not working. What am I missing?
I also have an issue on another OpenCart-based site where I need to redirect an old blog, so this will be great to figure out. I greatly appreciate it.
#9
Posted 10 February 2012 - 10:19
Right, I just tested it on my local server and it worked fine. However, I tried it on a live website and it did not work.
I managed to fix it by putting it as early as possible in the .htaccess file. I'm not sure what your .htaccess file looks like, but mine had a lot of stuff like canonical fixes and W3 cache stuff. Try messing about with the placement; failing that, strip out your URLs and PM the file over to me and I'll put it up on my server to work out what's breaking it.
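My guess at why placement matters, as a sketch (the second block is a generic stand-in for whatever catch-all rules a CMS adds, not your actual file): once a catch-all rule with [L] has rewritten the request to index.php, a redirect rule further down never gets to see the original path.

```apache
RewriteEngine On

# 1) Specific redirects first, before any catch-all rules.
RewriteRule ^gfjtal/ / [L,R=301]

# 2) Generic front-controller rules afterwards (hypothetical example):
#    anything that is not a real file or directory goes to index.php.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```

If the two blocks were the other way round, /gfjtal/Charlie-Chase would be handed to index.php first and the redirect would not fire.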
Edited by malphas, 10 February 2012 - 10:21.
#10
Posted 10 February 2012 - 11:02
I have a similar yet different issue as well that I'm not sure how to address. Example URL: I had this structure in place on an OpenCart site, domain.com/buy keyword (notice the space). I just had it switched to domain.com/buy-keyword, adding the hyphen to have a continuous URL with no spaces,
but the old space-based URLs are still coming up. How do I redirect or otherwise fix that properly? I know a space in a URL comes up as %20 or something like that in some text readers, but I'm at a loss how to address this properly.
If I can build some nice links for you or something let me know... do you do any sort of freelance web work?
Edited by googlealchemist, 10 February 2012 - 11:07.
#11
Posted 10 February 2012 - 11:30
redirect 301 "/old url" "http://www.newurl.com"
That should work regardless of whether the old URL is requested with a space or a %20. The quotes are what let the path contain a space.
Nahh seriously dude, don't worry about it, happy to help!
#12
Posted 13 February 2012 - 16:02
I suppose there is no similar option to the one for the translation page issue or the whole-folder issue, as those were all just redirected to the root, whereas these all need to go to their individual pages.
Was poking around the G forums and found these comments...
"On a minor point, if you ever need to (e.g.) do redirections using the .htaccess you may have to adopt special syntax to get over the problem of the spaces in the URLs."
Which you seem to have addressed in your post above...
"Consider using punctuation in your URLs. The URL http://www.example.com/green-dress.html is much more useful to us than http://www.example.com/greendress.html."
So G is reading both URLs as different? Currently on my site, the URL with a space, with a hyphen instead of the space, and with neither hyphen nor space (the two words back to back) all take me to the same page. I was concerned G would see all these URLs as different and hit me for onsite dup content. Then I started hoping it wouldn't acknowledge the spaces or hyphens on a practical level so it wouldn't matter; now I'm doubting that again, doh!
#13
Posted 14 February 2012 - 01:27
You might be able to automate it, but if there isn't any kind of pattern and it's just a list of URLs that need to be fixed, you'll need to look at other options. Have you got any examples of the URLs?
If there wasn't any kind of pattern I would use Screaming Frog to spider the website, export the CSV of URLs and do some finding/replacing to create the rules.
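Roughly what the two routes could look like, as a sketch. The URLs below are made up for illustration, and the pattern rule assumes every old URL is exactly two words separated by one space, which may not hold for your site.

```apache
# Route 1: no pattern, one line per old URL (e.g. generated from the exported CSV).
# Quotes are needed because the old paths contain a space.
Redirect 301 "/buy keyword" "http://www.example.com/buy-keyword"
Redirect 301 "/cheap widgets" "http://www.example.com/cheap-widgets"

# Route 2: there is a pattern, so one rule covers every "word word" URL.
# mod_rewrite matches against the decoded path, so a requested %20 is seen as a space.
RewriteEngine On
RewriteRule "^([^ /]+) ([^ /]+)$" "/$1-$2" [L,R=301]
```

You'd pick one route or the other for a given set of URLs, not both.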
#14
Posted 14 February 2012 - 12:28
Running Screaming Frog now, very cool, thanks for that tip.
#15
Posted 15 February 2012 - 03:01