[5.2] Duplicate content when accessing from 2 different URLs #44299

Closed
david-fores opened this issue Oct 17, 2024 · 1 comment

david-fores commented Oct 17, 2024

Steps to reproduce the issue

With the latest changes to the SEF plugin, many scenarios have been improved to avoid duplicate content and the erroneous indexing of URLs that are marked as canonical but are not.

However, I have found a case where the redirect does not occur: the content of an article is shown under a URL that should not serve it, and that URL is also marked as canonical.

  1. Content > Categories
    Create a category called “category-1”.

  2. Content > Articles
    Create an article “article-1” assigned to the category “category-1”, with any text as its content.

  3. Menus > Manage
    Create a menu called "Junk Menu".

  4. Menus > Junk Menu
    Create a menu item of type “Articles > Single Article”.
    Then select the article “article-1”.

  5. Menus > Main Menu
    Create a menu item of type “System Links > Menu Heading” with title "Blog".

  6. Menus > Main Menu
    Create a menu item of type “System Links > Menu Item Alias” with title "alias-to-article-1" and "Blog" as its parent item.
    Then select the menu item for “article-1” (created in step 4) as its target.

Expected result

If someone accesses the URL that follows the menu path (for example, by typing it manually in the browser):
https://www.example.com/blog/alias-to-article-1

They should be redirected to the URL of the article:
https://www.example.com/article-1
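
The expected behaviour can be turned into a small check. This is only a sketch: `redirects_to` and `fetch_status_and_location` are hypothetical helper names, and it accepts any common redirect status rather than assuming which one Joomla would send.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Common HTTP redirect statuses; which one Joomla would use is not assumed here.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirects_to(requested_url, status, location, expected_url):
    """Return True if the response is a redirect whose target is expected_url."""
    if status not in REDIRECT_CODES or not location:
        return False
    # The Location header may be relative, so resolve it against the request URL.
    target = urljoin(requested_url, location)
    return target.rstrip("/") == expected_url.rstrip("/")


def fetch_status_and_location(url):
    """Issue a single GET; http.client does not follow redirects on its own."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("GET", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Against a real site, the status and header returned by `fetch_status_and_location("https://www.example.com/blog/alias-to-article-1")` would be passed to `redirects_to` together with `https://www.example.com/article-1` as the expected target.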

Actual result

The content of "article-1" is displayed without any redirect.
The URL remains https://www.example.com/blog/alias-to-article-1.
The page source also shows this URL marked as canonical:

<link href="https://www.example.com/blog/alias-to-article-1" rel="canonical">

Search engines would treat these two URLs as duplicate content, since both are marked as canonical yet serve the same content:
https://www.example.com/article-1
https://www.example.com/blog/alias-to-article-1
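
To spot this situation automatically, something like the following could be run over the HTML of both pages. It is a minimal sketch using only the Python standard library; `CanonicalFinder`, `canonical_of` and `self_canonical_urls` are hypothetical names, and the HTML is assumed to have been fetched already.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated tokens; valueless attributes are None.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attributes.get("href")


def canonical_of(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def self_canonical_urls(pages):
    """Given {url: html}, return the URLs that declare themselves as canonical."""
    return [url for url, html in pages.items() if canonical_of(html) == url]
```

If both URLs above come back from `self_canonical_urls` while serving the same article, that is the duplicate described in this report.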

System information (as much as possible)

PHP Version: 8.3.6
Database Version: 10.6.17-MariaDB
Web Server: Apache/2.4.58 (Win64) OpenSSL/3.1.3 PHP/8.3.6
WebServer to PHP Interface: apache2handler
Joomla! Version: Joomla! 5.2.0 Stable [ Uthabiti ] 15-October-2024 16:00 GMT

No external plugins installed.

Additional comments

As I mentioned before, this scenario would occur if the user manually entered that path in the browser.

Judging from the HTML that Joomla generates, I don't think that path can be reached by following links, because the menu link is rendered with the correct URL of the article.

However, I don't know whether a sitemap generator or the crawlers themselves could still reach it.

I think it is better to have all the possibilities covered.

Hackwar (Member) commented Oct 21, 2024

Hello @david-fores, thank you for your report. Yes, Joomla can still produce duplicate content despite the improvements in 5.2. However, this is a very complex problem and we will not be able to solve it for every case. Improving this will be a continuous process. Since feature request #44310 already covers this, I'm closing this issue.

Hackwar closed this as completed Oct 21, 2024