Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filtered post content is truncated in post-content block #68605

Open
ryelle opened this issue Jan 10, 2025 · 12 comments · May be fixed by #68926
Open

Filtered post content is truncated in post-content block #68605

ryelle opened this issue Jan 10, 2025 · 12 comments · May be fixed by #68926
Assignees
Labels
[Block] Post Content Affects the Post Content Block [Feature] Block hooks [Status] In Progress Tracking issues with work in progress [Type] Bug An existing feature does not function as intended

Comments

@ryelle
Copy link
Contributor

ryelle commented Jan 10, 2025

When the content is filtered with the_content filter, the replaced content is truncated because it's not wrapped in the post-content block.

Screenshots of the issue

On Five for the Future we inject the company description:

Image

In "notice" blocks, we run the_content to parse markup:

Image

To reproduce

  • Add the following to your site, which will replace all content with 123456789
     add_filter(
     	'the_content',
     	function() {
     		return '123456789';
     	},
     	1
     );
    
  • View any page, and you'll only see 456.

More details

This is due to #67272 adding add_filter( 'the_content', 'remove_serialized_parent_block', 8 );, and remove_serialized_parent_block does not check for serialized blocks before running substr. This results in it trimming from 3, -3 if there is no wrapper.

Additionally, any blocks that run the_content on attributes are also truncated, as this filter runs in the course of rendering the block.

I'm not sure if the better solution is to add a check in remove_serialized_parent_block, or to change how the filter is applied in core/post-content, but I'm happy to draft a core patch if needed.

@ryelle ryelle added [Block] Post Content Affects the Post Content Block [Feature] Block hooks [Type] Bug An existing feature does not function as intended labels Jan 10, 2025
@Mamaduka
Copy link
Member

Related #68614.

@t-hamano
Copy link
Contributor

Since this issue is likely to first appear in Gutenberg 20.0, i.e. the WP 6.8 release, I will add it to the project board.

@Mamaduka
Copy link
Member

Resharing @PaulREnglish's finding, which matches the ones in the description - #68614 (comment).

@richtabor
Copy link
Member

Noting I'm getting this as well—my blog. Any ideas to resolve? 😬

Image

@Mamaduka
Copy link
Member

@ockham or @gziolo might know a temp solution.

@t-hamano
Copy link
Contributor

In any case, it seems necessary to fix the remove_serialized_parent_block() function itself.

echo ( remove_serialized_parent_block( 'Hello World') );
// Output: lo Wo

echo ( remove_serialized_parent_block( '<p>Hello World</p>') );
// Otuput: Hello World<

@Mamaduka
Copy link
Member

@audrasjb re-opened related Trac ticket - https://core.trac.wordpress.org/ticket/61074.

So, functions like remove_serialized_parent_block should check for serialized block content. Something like str_contains( $content, '<!-- wp:' ) and early return should do the trick.

@gziolo
Copy link
Member

gziolo commented Jan 28, 2025

I can confirm that I've seen the same problem that @richtabor reported in #68605 (comment). It was for the post authored with blocks, so the comment mentioned by @Mamaduka will probably be needed but it won't cover all issues. There is has_block utility that can be used to not run the logic for post content that doesn't have blocks:

https://github.com/WordPress/wordpress-develop/blob/7623fc1c051e63ae18e81a5ba1894e9398d71184/src/wp-includes/blocks.php#L770-L782

Other reports need to be further investigated.

@ockham
Copy link
Contributor

ockham commented Jan 28, 2025

Thanks for the ping! I'll work on a fix.

@ockham
Copy link
Contributor

ockham commented Jan 28, 2025

I think the correct fix will be to absorb the remove_serialized_parent_block logic (as well as the logic that adds the serialized parent block in the first place) into apply_block_hooks_to_content (which is something I've discussed with @gziolo before), or introduce a new function that combines them.

The problem is that these operations aren't "atomic" at the moment. Consider the relevant code in the Post Content block:

// Wrap in Post Content block so the Block Hooks algorithm can insert blocks
// that are hooked as first or last child of `core/post-content`.
$content = get_comment_delimited_block_content(
'core/post-content',
$attributes,
$content
);
// We need to remove the `core/post-content` block wrapper after the Block Hooks algorithm,
// but before `do_blocks` runs, as it would otherwise attempt to render the same block again --
// thus recursing infinitely.
add_filter( 'the_content', 'remove_serialized_parent_block', 8 );
/** This filter is documented in wp-includes/post-template.php */
$content = apply_filters( 'the_content', str_replace( ']]>', ']]&gt;', $content ) );
unset( $seen_ids[ $post_id ] );
remove_filter( 'the_content', 'remove_serialized_parent_block', 8 );

We're adding the parent block wrapper on L59-63. We then add the remove_serialized_parent_block filter on L68, apply all the_content filters (which include apply_block_hooks_to_content), and remove the filter again on L74.

As @ryelle pointed out, this breaks the moment that another the_content filter is added which replaces the content with markup that isn't wrapped in the expected parent block markup.

So I don't think we should fix this by making remove_serialized_parent_block smarter, as that would always be a "heuristic" fix that looks at the content and tries to determine from it if it's wrapped in a parent wrapper block. This is never going to be as reliable as keeping the sequence of operations (wrap content in parent block, apply block hooks, remove parent wrapper block) intact.

@ockham
Copy link
Contributor

ockham commented Jan 28, 2025

I think the correct fix will be to absorb the remove_serialized_parent_block logic (as well as the logic that adds the serialized parent block in the first place) into apply_block_hooks_to_content (which is something I've discussed with @gziolo before), or introduce a new function that combines them.

Turns out I've already filed a ticket for that: https://core.trac.wordpress.org/ticket/62716

@ockham ockham linked a pull request Jan 28, 2025 that will close this issue
1 task
@ockham
Copy link
Contributor

ockham commented Jan 28, 2025

PR (WIP): #68926

@github-actions github-actions bot added the [Status] In Progress Tracking issues with work in progress label Jan 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Block] Post Content Affects the Post Content Block [Feature] Block hooks [Status] In Progress Tracking issues with work in progress [Type] Bug An existing feature does not function as intended
Projects
Development

Successfully merging a pull request may close this issue.

6 participants