Rewrite URLs in imported WXR files to avoid broken navigation links (white screen, errors, nested Playground) #1780
Comments
When I navigate around the site without using the links in the navigation block, the pages load consistently.
Screen.Recording.2024-09-18.at.14.51.42.mov
At the end of the video, you can see me click the Home link in the navigation; it actually loads another Playground instance inside the site.
I'm wondering if these two could be related: #349. This seems like a caching bug to me.
Thank you @adamziel for setting me straight... Glad it was so easy to transfer.
The default instance of Playground uses a URL like https://playground.wordpress.net/scope:0.5198681762892301/?page_id=2 (TT4, Sample page in Header).
On my site it only has the URL
Is there a way for me to modify the URL in the Navigation space of my .xml file from relative links
This is definitely related to scope. After the first load, the page is
I think that there is an underlying problem because
To something like https://playground.wordpress.net/scope:{somestring}/about-us?
Great research @bph! You are right about the root cause being imported URLs that aren't rewritten. I'm not sure what the best way to address this is, and I will need to work with @adamziel and @brandonpayton on finding possible next steps.
It looks like we are attempting to add the scope to the URL if it doesn't exist, but that scope isn't used later by the browser or our code (I still don't know).
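To make the target URL shape from this comment concrete, here is a minimal sketch of mapping a link from the exported site onto a scoped Playground URL. The `toScopedUrl` helper, the `https://example.com` source site, and the idea of passing the scoped site URL in as a parameter are all assumptions for illustration. This is not how Playground or any WXR importer currently does it.

```typescript
// Hypothetical helper, not part of the Playground API.
// Maps a link that belongs to the exported site onto the scoped Playground site.
function toScopedUrl(href: string, sourceSiteUrl: string, scopedSiteUrl: string): string {
  const source = new URL(sourceSiteUrl);
  const target = new URL(scopedSiteUrl);

  let url: URL;
  try {
    // Relative links such as "/about-us" are resolved against the source site.
    url = new URL(href, source.href);
  } catch {
    // Leave anything that cannot be parsed untouched.
    return href;
  }

  // Links to other sites (or mailto:, tel:, ...) are left as they are.
  if (url.origin !== source.origin) {
    return href;
  }

  // Strip the source site's base path, if it has one.
  const sourceBase = source.pathname.replace(/\/$/, "");
  const insideSource =
    url.pathname === sourceBase || url.pathname.startsWith(sourceBase + "/");
  if (!insideSource) {
    return href;
  }
  const rest = url.pathname.slice(sourceBase.length) || "/";

  // Prepend the target site's base path, which carries the scope segment.
  const targetBase = target.pathname.replace(/\/$/, "");
  return target.origin + targetBase + rest + url.search + url.hash;
}

// Example values only: the scope string is copied from the URL quoted in this thread.
const scopedSite = "https://playground.wordpress.net/scope:0.5198681762892301/";
console.log(toScopedUrl("https://example.com/about-us", "https://example.com", scopedSite));
console.log(toScopedUrl("/?page_id=2", "https://example.com", scopedSite));
```

The sketch only covers a single URL. Finding every URL inside imported post content and block attributes is the harder part of the problem discussed here.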
I see a few directions here, but I'm not sure what to do.
I'm moving this to blocked until I get some feedback from @WordPress/playground-maintainers.
@bgrgicak thank you so much for pushing this forward. This is actually also a problem when migrating sites to other servers, as absolute links need a search/replace function. If Playground can do it out of the box, there wouldn't be a need for me to modify the original site export file for images and links, and two sections of my tutorial could be cut. 🤔 Seems you have enough information to tackle this. I just want to mention that this is not only a hiccup in relation to the navigation block; it also happens with normal on-page links, as can be seen on the Templates page. Those also don't work.
@bph A proper resolution will take a few months. Is there a way you could ship that block without an absolute URL in the
Longer answer: The imported WXR file contains this code:
Which is not rewritten by the WXR importer we're currently using. I'm not aware of a tool that we could use in Playground that could also correctly handle that today. I'm planning to fork/build a WXR importer and bake in the URL rewriting using the plumbing we've been exploring for the past year [1] [2]. Once it matures, I'll want to propose it for WordPress core.
[1] https://github.com/adamziel/site-transfer-protocol
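For context, the search/replace approach mentioned earlier in this thread can be sketched as a plain string substitution over the imported content. Everything in the example is an assumption for illustration: the `replaceSiteUrl` helper is hypothetical, and the sample `wp:navigation-link` markup is made up, because the code from the actual WXR file is not reproduced in this thread.

```typescript
// Hypothetical helper: naive base-URL replacement in a chunk of post content.
// It does not parse block markup or HTML, and it will miss URLs that are
// stored in an escaped form (for example "https:\/\/example.com").
function replaceSiteUrl(content: string, oldBase: string, newBase: string): string {
  const from = oldBase.replace(/\/+$/, "");
  const to = newBase.replace(/\/+$/, "");

  // Escape regex metacharacters in the old base URL.
  const escaped = from.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

  // Only match when the base is followed by a URL boundary, so that
  // https://example.com does not match inside https://example.com.au.
  const pattern = new RegExp(escaped + "(?=[/?#\"'\\s<]|$)", "g");

  // A replacer function avoids special handling of "$" in the new base.
  return content.replace(pattern, () => to);
}

// Made-up sample content, not taken from the WXR file discussed here.
const sampleContent =
  '<!-- wp:navigation-link {"label":"About","url":"https://example.com/about-us"} /-->';
console.log(
  replaceSiteUrl(
    sampleContent,
    "https://example.com",
    "https://playground.wordpress.net/scope:0.5198681762892301"
  )
);
```

A substitution like this has no notion of where URLs can legitimately appear, which is one reason a structure-aware importer is the direction proposed in this comment.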
@adamziel thanks for looking into this again. I am a bit confused as to what you see as an absolute link and a relative link
So for the header navigation, the examples of how the theme Twenty-Twenty-Four works out of the box got me thinking. If I added all the pages and was deliberate with the page parent selection, the theme's default navigation would probably work with the page list and create the submenus through some voodoo that is built into it (voodoo = not entirely clear how it works). So with the v2 blueprint and v2 content, I was able to get this part working.
Screen.Recording.2024-09-27.at.17.37.35.mov
In the video you can see that all links from the top navigation have a scope assigned and load pages from a virtual directory (or whatever you want to call it). It works because I didn't create a custom navigation block. The automatism built into WordPress takes care of it. But it seems Playground already rewrites links and adds scope to the URLs.
Next steps:
…2058)

## Description

Adds the Data Liberation WXR importer as an option in the `importWxr` step. The new importer is turned on by including the `"importer": "data-liberation"` option:

```json
{
  "steps": [
    {
      "step": "importWxr",
      "file": {
        "resource": "url",
        "url": "https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml"
      },
      "importer": "data-liberation"
    }
  ]
}
```

When the `importer` option is missing or set to "default," nothing changes in the behavior of the step and it continues using the https://github.com/humanmade/WordPress-Importer importer.

The new importer:

* Rewrites links in the imported content
* Downloads assets through Playground's CORS proxy
* Parallelizes the downloads
* Communicates progress

This PR is a part of #1894

## Implementation details

This `importWxr` step fetches and includes the `data-liberation-core.phar` file. The phar file is built with [Box](https://box-project.github.io/box/configuration/) and contains the importer library with its dependencies, which is a subset of the Data Liberation library, a subset of the Blueprints library, and a few vendor libraries.

This, unfortunately, means that any changes in the PHP files require rebuilding the .phar file. Here's how you can do it:

```bash
nx build:phar playground-data-liberation
```

You can also build the entire Data Liberation package as a WordPress plugin complete with a wp-admin page:

```bash
nx build:plugin playground-data-liberation
```

Both commands will output the built files to `packages/playground/data-liberation/dist`

The progress updates are a first-class feature of the new importer. The updated `importer` step receives them in real-time via a `post_message_to_js()` call running after every import step. Then, it passes them on to the progress bar UI.
### Other changes

* **TLS traffic now goes through the CORS proxy.** Since the new importer uses `AsyncHTTP\Client` which deals with raw sockets, Playground's [TLS-based network bridge](#1926) runs the outbound traffic through a CORS proxy. Technically, `TCPOverFetchWebsocket` gets the `corsProxy` URL passed to the `playground.boot()` call.
* A few composer dependencies were forked, downgraded to PHP 7.2 using Rector, and bundled with this PR to keep the Data Liberation importer working.

## Remaining work

- [x] PHP 7.2 compatibility. Done by forking and Rector-downgrading dependencies that were incompatible with PHP 7.2.
- [x] Report the importer's progress on the overall Blueprint progress bar
- [x] Enqueue the data liberation plugin files for downloading at the blueprint compilation stage
- [x] Don't eagerly rewrite attachments URLs in `WP_Stream_Importer`. Exposing this information to the API consumer requires an explicit decision. Do we rewrite it? Or do we ignore it?
- [x] Fix the TLS errors at the intersection of Playground network transport and the async HTTP client library
- [x] Separate the markdown importer and its dependencies (md parser, frontmatter parser, Symfony libraries) from the core plugin
- [x] Ship the importer and its tree-shaken deps (URL parser) as a minified zip/phar

## Follow-up work

- [ ] Reconsider the `WP_Import_Session` API – do we need so many verbosely named methods? Can we achieve the same outcomes with fewer methods?
- [ ] Investigate why there's a significant delay before media downloads start on PHP 7.2 – 7.4. It's likely a PHP.wasm issue.
## Testing instructions

* Default importer – [Open this link](http://localhost:5400/website-server/#{%20%22plugins%22:%20[],%20%22steps%22:%20[%20{%20%22step%22:%20%22importWxr%22,%20%22file%22:%20{%20%22resource%22:%20%22url%22,%20%22url%22:%20%22https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml%22%20}%20}%20],%20%22preferredVersions%22:%20{%20%22php%22:%20%228.3%22,%20%22wp%22:%20%226.7%22%20},%20%22features%22:%20{%20%22networking%22:%20true%20},%20%22login%22:%20true%20}) and confirm it does what the current `importWxr` step does, that is, it stays at "Importing content" for a moment, fails to fetch media files (CORS issues in network tools), but inserts posts and pages.
* Data Liberation – [Open this link](http://localhost:5400/website-server/#{%20%22plugins%22:%20[],%20%22steps%22:%20[%20{%20%22step%22:%20%22importWxr%22,%20%22importer%22:%20%22data-liberation%22,%20%22file%22:%20{%20%22resource%22:%20%22url%22,%20%22url%22:%20%22https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml%22%20}%20}%20],%20%22preferredVersions%22:%20{%20%22php%22:%20%228.3%22,%20%22wp%22:%20%226.7%22%20},%20%22features%22:%20{%20%22networking%22:%20true%20},%20%22login%22:%20true%20}), confirm the import progress is visible and that the content and media indeed get imported:

![CleanShot 2024-12-08 at 14 54 49@2x](https://github.com/user-attachments/assets/a7da3244-a10f-43d2-8e94-43d305220a7e)

## Related issues

* #1211
* #2012
* #1477
* #1250
* #1780
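The testing links in this PR description carry the Blueprint as percent-encoded JSON in the URL fragment. As a rough sketch of how such a link could be assembled, the snippet below builds one from a Blueprint object. The `blueprintLink` helper is hypothetical, and it assumes the site percent-decodes the fragment before parsing it as JSON. It uses `encodeURIComponent`, which encodes more characters than the hand-written links in the testing instructions do.

```typescript
// Hypothetical helper: put a Blueprint object into the URL fragment of a Playground site.
function blueprintLink(siteUrl: string, blueprint: object): string {
  return siteUrl + "#" + encodeURIComponent(JSON.stringify(blueprint));
}

// The Blueprint from the "Data Liberation" testing link, written as an object.
const dataLiberationBlueprint = {
  plugins: [],
  steps: [
    {
      step: "importWxr",
      importer: "data-liberation",
      file: {
        resource: "url",
        url: "https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml",
      },
    },
  ],
  preferredVersions: { php: "8.3", wp: "6.7" },
  features: { networking: true },
  login: true,
};

console.log(blueprintLink("http://localhost:5400/website-server/", dataLiberationBlueprint));
```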
On this Playground site, I get intermittent success when using the navigation menu. One page load works; subsequent page loads show a white screen.
The content is all working when I go to WP-admin > Pages and use the View link of each page. But sometimes one link works, and then the next one doesn't.
The content and blueprint can be viewed in this repo.
Here is a video of my clicking around on the site.
Screen.Recording.2024-09-18.at.13.16.13.mov