Support #9692

Addition to robots.txt for newlaborforum.cuny.edu

Added by Diane Krauthamer almost 2 years ago. Updated almost 2 years ago.

Status: Rejected
Priority name: Normal
Assignee: -
Category name: SEO
Target version: -
Start date: 2018-05-01
Due date:
% Done: 0%
Estimated time:

Description

I am doing some SEO cleanup for newlaborforum.cuny.edu and have found that Google crawls a number of URLs that point to the main login page. To prevent this, I would like to change the robots.txt file, which should be on the server. Specifically, I would like the following line added to the robots.txt file:

Disallow: http://newlaborforum.cuny.edu/wp-login.php

This should prevent Google from crawling those specific pages. If this is not an option, could you please advise on alternatives to this? Please let me know if you have any questions, and thank you for your help.

Diane Krauthamer
Web Designer, New Labor Forum

History

#1 Updated by Boone Gorges almost 2 years ago

  • Status changed from New to Rejected

Hi Diane - As discussed in https://redmine.gc.cuny.edu/issues/9684#note-2, our robots.txt is shared by the entire installation, so it cannot be customized for the needs of specific subsites. Moreover, the robots.txt standard does not allow paths to be defined in an absolute way (using http://...); paths must be relative. See e.g. https://developers.google.com/search/reference/robots_txt#group-member-records.
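As a quick illustration of how a relative-path rule behaves (this snippet uses Python's standard urllib.robotparser and the placeholder hostname example.com; it is not part of the Commons configuration):

```python
from urllib.robotparser import RobotFileParser

# Parse a minimal robots.txt with a relative Disallow path,
# as the robots.txt standard requires.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /wp-login.php",
])

# The login page is blocked; other pages remain crawlable.
print(rules.can_fetch("*", "http://example.com/wp-login.php"))  # False
print(rules.can_fetch("*", "http://example.com/some-post/"))    # True
```

A rule written with an absolute URL (Disallow: http://example.com/wp-login.php) would not match anything, which is why crawlers require the relative form.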

It's worth noting that the login page on Commons sites already has a meta tag like this:

<meta name='robots' content='noindex,follow' />

So no changes at the robots.txt level should be necessary in this specific case.

For further customizations, you may want to look into the Yoast SEO plugin, which allows the site admin to set up noindex meta tags for various pieces of content.

#2 Updated by Diane Krauthamer almost 2 years ago

Thanks. I understand that, but it seems the feature usually available in Yoast (the ability to update the robots.txt file) is not included in this edition. It does allow users to block indexing of some types of URLs, but not duplicates such as these. However, I understand that figuring this out goes beyond the developer team, so I will search for another plugin or another way to do this.
