Bug #16937

closed

Link checker checks transient links

Added by Raffi Khatchadourian 2 months ago. Updated 2 months ago.

Status: Rejected
Priority name: Normal
Assignee: -
Category name: -
Target version: -
Start date: 2022-09-30
Due date:
% Done: 0%
Estimated time:
Deployment actions:

Description

From https://redmine.gc.cuny.edu/issues/16889:

...[the link checker] seems to be going into the linked web pages and checking for broken links. I think it is more desirable only to check the links from common sites. I can't fix the links on the sites I link to, so I don't know what to do with that information. Also, I think this is causing extra work for the CAC. What do you think?


Related issues

Related to CUNY Academic Commons - Bug #16889: Broken link checker not working (Resolved, Boone Gorges, 2022-09-24)

Actions #1

Updated by Boone Gorges 2 months ago

  • Related to Bug #16889: Broken link checker not working added
Actions #2

Updated by Boone Gorges 2 months ago

  • Status changed from New to Reporter Feedback

Thanks for the report. This does indeed seem like incorrect behavior.

As I look closer, I wonder if this is not a bug with broken-link-checker, but a quirk of your link setup. See eg https://khatchad.commons.gc.cuny.edu/wp-admin/post.php?post=4166&action=edit - the entire content of the source page is in the content field. Since it's saved in WP, it perhaps makes sense that broken-link-checker would check the links therein. So the question is, why is the content of the source pages in that field? Do you recall whether you entered it manually, or is it something that your Link Library plugin did for you?

Actions #3

Updated by Raffi Khatchadourian 2 months ago

Oops, sorry, I meant "transitive," not transient.

Actions #4

Updated by Raffi Khatchadourian 2 months ago

Boone Gorges wrote in #note-2:

Thanks for the report. This does indeed seem like incorrect behavior.

As I look closer, I wonder if this is not a bug with broken-link-checker, but a quirk of your link setup. See eg https://khatchad.commons.gc.cuny.edu/wp-admin/post.php?post=4166&action=edit - the entire content of the source page is in the content field. Since it's saved in WP, it perhaps makes sense that broken-link-checker would check the links therein.

Ah! You're right. I think that's a feature of the link library, i.e., to allow you to save the page content in case the link breaks in the future. But this indeed does not seem to be the intended behavior. I would like the link itself checked, not its "cached" content.

So the question is, why is the content of the source pages in that field?

See above.

Do you recall whether you entered it manually, or is it something that your Link Library plugin did for you?

I did it manually.

Actions #5

Updated by Raffi Khatchadourian 2 months ago

So, I think I would need an enhancement of the link checker to ignore certain fields in content types.
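
To illustrate the kind of enhancement I mean, here is a rough sketch in Python. It is not how broken-link-checker or Link Library actually work; the field names (`link_url`, `post_content`) and the helper `links_to_check` are hypothetical and only show the idea of skipping certain fields when collecting links to check.

```python
import re

# Hypothetical field names, chosen for illustration only; they are not
# taken from the broken-link-checker or Link Library codebases.
IGNORED_FIELDS = {"post_content"}

# Simple URL matcher, good enough for a sketch (not a full URL grammar).
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def links_to_check(record, ignored_fields=IGNORED_FIELDS):
    """Return the URLs found in every string field of `record`,
    skipping the fields listed in `ignored_fields`."""
    urls = []
    for field, value in record.items():
        if field in ignored_fields:
            continue  # skip saved page content and the links inside it
        urls.extend(URL_PATTERN.findall(value))
    return urls


# A made-up link entry: the link itself plus a saved copy of the page.
link_entry = {
    "link_url": "https://example.org/article",
    "post_content": '<a href="https://example.org/dead-page">old</a>',
}

print(links_to_check(link_entry))
```

With the saved-content field ignored, only the link I actually maintain is checked; the links inside the saved copy of the page are left alone.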

Actions #6

Updated by Raffi Khatchadourian 2 months ago

Actually, Link Library has its own checker.

Actions #7

Updated by Raffi Khatchadourian 2 months ago

Raffi Khatchadourian wrote in #note-6:

Actually, Link Library has its own checker.

But, it looks like you can't put that on a schedule.

Actions #8

Updated by Raffi Khatchadourian 2 months ago

I've disabled the link library (and the blogroll), and it seems to be OK. It would be nice to check the link library links through the schedule, but that doesn't seem to be possible right now.

Actions #9

Updated by Boone Gorges 2 months ago

  • Status changed from Reporter Feedback to Rejected

I've looked through the codebase of broken-link-checker and I don't see an obvious way that we can make it exclude post_content of links without forking the plugin. Sorry we can't help further with this! The best you can probably do for now is to occasionally run a manual check via Link Library.
