Robots.txt missing some entries
Problem reported by Simon Sprott - July 30, 2017 at 2:26 AM
Submitted
The community area links for "Report Abuse", "Edit Reply", etc. launch a JavaScript form, but they also present a link that appends 'ThreadPanels' to the URL and returns the same page. When spiders crawl the site, they keep following these links recursively.
 
This causes a number of issues: spiders sometimes keep digging quite a long way, generating a lot of traffic, and the SEO/crawl reports I've looked at don't like these URLs either.
 
Could you add a rule to the robots.txt file you generate to stop spiders from following links with 'ThreadPanels' in them?
Maybe something like this:
Disallow: /community/*/ThreadPanels/
 
Sample URLs being requested:
http ://www.example.com/community/a10/ThreadPanels/%3Ca%20target='_blank'%20href='https :/www.example.com/my-downloads'%3Ehttps :/www.example.com/my-downloads%3C/a%3E
http ://www.example.com/community/a10/ThreadPanels/ThreadPanels/%3Ca%20target='_blank'%20href='https :/www.example.com/my-downloads'%3Ehttps :/www.example.com/my-downloads%3C/a%3E
http ://www.example.com/community/a10/ThreadPanels/ThreadPanels/ThreadPanels/%3Ca%20target='_blank'%20href='https :/www.example.com/my-downloads'%3Ehttps :/www.example.com/my-downloads%3C/a%3E
http ://www.example.com/community/a10/ThreadPanels/ThreadPanels/ThreadPanels/ThreadPanels/%3Ca%20target='_blank'%20href='https :/www.example.com/my-downloads'%3Ehttps :/www.example.com/my-downloads%3C/a%3E
etc
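
In case it helps, here is a rough Python sketch of how I would expect a wildcard-aware crawler to apply the suggested rule to paths like the ones above. It assumes the common wildcard semantics ('*' matches any run of characters, a trailing '$' anchors the end of the path); not every spider supports wildcards, and the helper names (robots_pattern_to_regex, is_disallowed) and the shortened test paths are made up for illustration. The matcher is hand-rolled and is not how any particular crawler is implemented.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, pattern: str) -> bool:
    """Return True if the URL path is covered by the Disallow pattern."""
    # re.match anchors at the start, mirroring robots.txt prefix matching.
    return robots_pattern_to_regex(pattern).match(path) is not None


RULE = "/community/*/ThreadPanels/"

# Hypothetical, shortened versions of the sample paths.
for path in [
    "/community/a10/ThreadPanels/my-downloads",
    "/community/a10/ThreadPanels/ThreadPanels/my-downloads",
    "/community/a10/some-thread",
]:
    print(path, "blocked" if is_disallowed(path, RULE) else "allowed")
```

With this rule the recursive 'ThreadPanels' paths are blocked at every depth, while ordinary thread paths under /community/ stay crawlable. The Disallow line would still need to sit under a User-agent line in the generated file.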
 
