Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove duplicate xmlns property from svg creation #338

Closed
wants to merge 1 commit into from

Conversation

hubgit
Copy link

@hubgit hubgit commented Sep 6, 2019

This removes the duplicate xmlns property set when creating an svg element, because:

a) The SVG namespace is already set when creating the svg element: https://github.com/mathjax/MathJax-src/blob/master/ts/output/svg.ts#L311
b) Setting an attribute called xmlns without the http://www.w3.org/2000/xmlns/ namespace leads to a duplicate xmlns attribute being created when the node is serialized to XML.

Copy link
Member

@zorkow zorkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

@dpvc
Copy link
Member

dpvc commented Sep 7, 2019

I have an issue with this PR, but can't write it up just now. I will post later as soon as I get the chance.

@zorkow zorkow requested review from zorkow and dpvc September 12, 2019 13:29
@dpvc
Copy link
Member

dpvc commented Sep 15, 2019

Sorry for taking so long to get to this. (I've just started by to teaching for the fall term, and things are always a bit hectic the first week.)

Here are my concerns.

First, I am not able to reproduce the issue that you describe. If I take one of MathJax's svg elements and use (new XMLSerializer()).serializeToString(svg), I do not get duplicate xmlns attributes. I have checking in Firefox, Chrome, and Safari on Mac OS X. I haven't tried Edge or IE. If this is not what you mean by serializing to XML, can you say what your code is doing differently?

Second, if the xmlns attribute is removed, then when the node is serialized as HTML, the attribute is not produced. Since we want to have this attribute even when produced through outerHTML, removing the attribute presents a problem. If that were to be done, that would have to be compensated for elsewhere.

Here is a code snippet that shows the output of outerHTML and the XMLSerializer:

<!DOCTYPE html>
<head> 
  <title>Testing MathJax v3 xmlns attributes</title>
  <script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
  <script>
  MathJax = {
options: {
    typesetError(doc, math, err) {
        console.log(err);
    }
},
    show(id, svg) {
        svg = svg.replace(/>.*/, '>');
        document.getElementById(id).appendChild(document.createTextNode(svg));
    },
    startup: {
      pageReady() {
          MathJax.typeset();

          const svg = document.getElementsByTagName('svg')[0];
          MathJax.config.show('outerHTML', svg.outerHTML);

          const xml = (new XMLSerializer()).serializeToString(svg);
          MathJax.config.show('serializer', xml);
      }
    }
  };
  </script>
  <script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-svg.js"></script>
</head>
<body>

<p style="visibility:hidden">\(x\)</p>

<p>
outerHTML:
<pre id="outerHTML"></pre>
</p>

<p>
XMLSerializer.serializToString():
<pre id="serializer"></pre>
</p>

</body>
</html>

The results for me are:

outerHTML:

<svg style="vertical-align: -0.025ex;" xmlns="http://www.w3.org/2000/svg" width="1.294ex" height="1.025ex" role="img" focusable="false" viewBox="0 -442 572 453" xmlns:xlink="http://www.w3.org/1999/xlink">

XMLSerializer.serializToString():

<svg style="vertical-align: -0.025ex;" xmlns="http://www.w3.org/2000/svg" width="1.294ex" height="1.025ex" role="img" focusable="false" viewBox="0 -442 572 453" xmlns:xlink="http://www.w3.org/1999/xlink">

Note that both are the same, with no duplicate xmlns. With your change, the results are

outerHTML:

<svg style="vertical-align: -0.025ex;" width="1.294ex" height="1.025ex" role="img" focusable="false" viewBox="0 -442 572 453" xmlns:xlink="http://www.w3.org/1999/xlink">

XMLSerializer.serializToString():

<svg style="vertical-align: -0.025ex;" xmlns="http://www.w3.org/2000/svg" width="1.294ex" height="1.025ex" role="img" focusable="false" viewBox="0 -442 572 453" xmlns:xlink="http://www.w3.org/1999/xlink">

where the outerHTML output is without the xmlns attribute.

Setting an attribute called xmlns without the http://www.w3.org/2000/xmlns/ namespace leads to a duplicate xmlns attribute being created when the node is serialized to XML.

I'm not sure if you are suggesting that one should set the xmlns attribute using setAttributeNS() rather than setAttribute(), as it is now. Doing so leads to an error. For example, in Firefox, if I do

svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
svg.setAttributeNS('http://www.w3.org/2000/svg', 'xmlns', 'http://www.w3.org/2000/svg');

I get the error

NamespaceError: An attempt was made to create or change an object in a way which is incorrect with regard to namespaces

So I'm not sure how to make the attribute available for outerHTML any other way, and can't reproduce the duplicate attribute that you are seeing, so can't really test out any solutions in any case.

If you could give a more explicit description of what you are doing to get the duplicates, that might help. Thanks.

@dpvc
Copy link
Member

dpvc commented Sep 15, 2019

Also, as an aside for the future, it would be best to start an issue for the problem that you have seen, and then make a pull request that resolves that issue. Then we can use the issue tracker to discuss the problem itself (how to reproduce, potential alternative solutions), and save the comments here for actual review of your code that resolves the problem.

@hubgit
Copy link
Author

hubgit commented Sep 15, 2019

Firstly apologies for not creating a full reproduction of the issue, I'll do that week.

I did think that this would be a straightforward fix, as it's quite clear that xmlns is a serialization of the node's namespace URI rather than a regular attribute, and that the SVG namespace is already being set by the use of createElementNS.

The examples below aren't using MathJax, but might be useful as an illustration:

  1. Setting the namespace with createElementNS, serialized as the xmlns attribute:
const svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
new XMLSerializer().serializeToString(svg);

// <svg xmlns="http://www.w3.org/2000/svg"/>
  1. An xmlns attribute set manually is serialized as a duplicate attribute (in Chrome) rather than setting the node's namespace:
const svg = document.createElement('svg');
svg.setAttribute('xmlns', 'http://www.w3.org/2000/svg');
new XMLSerializer().serializeToString(svg);

// Chrome: <svg xmlns="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/2000/svg"></svg>
// Firefox: <a0:svg xmlns:a0="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/2000/svg"></a0:svg>
// Safari: <svg xmlns="http://www.w3.org/2000/svg"></svg>

(the latter example isn't showing everything, because manually setting the xmlns attribute to the same URI as the node's namespace doesn't result in a duplicate attribute, at least in Chrome).

@dpvc
Copy link
Member

dpvc commented Sep 15, 2019

Thanks for the new information. But note the following:

  1. I agree that this example serializes as XML properly. But it doesn't do what we need it to for HTML serialization:

    svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
    
    // <svg></svg>
    

    Here the xmlns attribute is missing, and we want to have it in the output, since this is a route that is being used to generate serialized SVG.

  2. I agree that this does what you indicate, but this is not what MathJax does. It does the following:

     const svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
     svg.setAttribute('xmlns', 'http://www.w3.org/2000/svg');
     new XMLSerializer().serializeToString(svg);
     
     // <svg xmlns="http://www.w3.org/2000/svg"/>
    
     svg.outerHTML;
     
     // <svg xmlns="http://www.w3.org/2000/svg"/>
    

    which produces this same output (no duplicate attributes) in Chrome, Firefox, and Safari for me, for both XML serialization and HTML serialization. That is the ideal outcome.

If you are getting different results, can you say what OS and browser versions you are using? If you are able to reproduce this with actual MathJax output, can you give a working example of that? What do you get for the example document I included above (which does use MathJax output directly)?

@hubgit
Copy link
Author

hubgit commented Sep 15, 2019

It seems that the issue appears when using w3c-xmlserializer (as used in JSDOM) to serialize the node, rather than outerHTML - here's some code to reproduce it: https://codesandbox.io/s/mathjax-xmlns-example-ko3wr

The output of serializing the TeX input 1 to SVG is this:

<svg xmlns="http://www.w3.org/2000/svg" xmlns="http://www.w3.org/2000/svg" width="1.131ex" height="1.507ex" role="img" focusable="false" viewBox="0 -666 500 666" style="vertical-align: 0px;"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="matrix(1 0 0 -1 0 0)"><g data-mml-node="math"><g data-mml-node="mn"><path data-c="31" d="M213 578L200 573Q186 568 160 563T102 556H83V602H102Q149 604 189 617T245 641T273 663Q275 666 285 666Q294 666 302 660V361L303 61Q310 54 315 52T339 48T401 46H427V0H416Q395 3 257 3Q121 3 100 0H88V46H114Q136 46 152 46T177 47T193 50T201 52T207 57T213 61V578Z"/></g></g></g></svg>

Setting the requireWellFormed option when serializing in this way will result in w3c-xmlserializer throwing a "InvalidStateError: Failed to serialize XML: Invalid attribute localName value" error, because it explicitly forbids attributes named xmlns that aren't in the http://www.w3.org/2000/xmlns/ namespace.

If there's a bug in the serialization, I'll file an issue with the w3c-xmlserializer module, but it does seem, still, that MathJax shouldn't be setting the xmlns attribute twice.

@hubgit
Copy link
Author

hubgit commented Sep 15, 2019

I agree that this example serializes as XML properly. But it doesn't do what we need it to for HTML serialization:

svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');

// <svg></svg>

Here the xmlns attribute is missing, and we want to have it in the output, since this is a route that is being used to generate serialized SVG.

I think this serialization is correct, as the namespace is optional on the HTML5 serialization of SVG:

https://developer.mozilla.org/en-US/docs/Web/SVG/Element/svg

Note: The xmlns attribute is only required on the outermost svg element of SVG documents. It is unnecessary for inner svg elements or inside HTML documents.

@hubgit
Copy link
Author

hubgit commented Sep 15, 2019

Another route to serialize SVG with the xmlns attribute would be to create it as a standalone SVG document:

const svg = document.implementation.createDocument('http://www.w3.org/2000/svg', 'svg');
svg.documentElement.outerHTML;

// <svg xmlns="http://www.w3.org/2000/svg"/>

@hubgit
Copy link
Author

hubgit commented Sep 15, 2019

I've created an issue on w3c-xmlserializer, as maybe it should match browsers' behaviour and combine duplicate attributes, which would make this change unnecessary.

@hubgit
Copy link
Author

hubgit commented Sep 15, 2019

At least one other DOM library (slimdom) throws an error when trying to set the xmlns attribute:

import { Document, serializeToWellFormedString } from 'slimdom'

const document = new Document()
const element = document.createElementNS('http://www.w3.org/2000/svg', 'svg')
element.setAttribute('xmlns', 'http://www.w3.org/2000/svg') // "InvalidStateError: The serialization of this attr would not be a well-formed attribute"

console.log(serializeToWellFormedString(element))

Which isn't to say that setting an xmlns attribute shouldn't be allowed in an HTML context, but it does point towards it being problematic.

@dpvc
Copy link
Member

dpvc commented Sep 16, 2019

Thanks for the additional comments and examples. It has turned out to be more subtle than I originally thought. I will make a few comments in response to some of yours, and then suggest a potential resolution.

it does seem, still, that MathJax shouldn't be setting the xmlns attribute twice.

I think there is a technicality that is confusing the issue a bit. Creating a node in a namespace does not actually set the xmlns attribute on the node. It sets its namespace, and the serializer adds an xmlns attribute during serialization. But the node itself doesn't have one. This can be seen by the fact that the HTML serialization doesn't include one, and if you do

svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');

the both svg.getAttribute('xmlns'); and svg.getAttributeNS('http://www.w3.org/2000/svg','xmlns'); produce null. So there is no attribute on the element. The serializer adds xmlns during serialization in order to preserve the namespace of the element. That's not the same as having the actual attribute, and it is the fact that XML adds this (non-)attribute and HTML doesn't that is central to the problem.

I think this serialization is correct, as the namespace is optional on the HTML5 serialization of SVG

I agree. That is because the xmlns attribute is not set on the element (but its namespace is). That is why MathJax adds xmlns. You are right that HTML5 doesn't require the xmlns attribute, but one of the use cases for MathJax is to generate svg images for use as sources for <img> tags, or to use as standalone svg files, and those do need the xmlns attribute. That is why MathJax adds it here.

Another route to serialize SVG with the xmlns attribute would be to create it as a standalone SVG document

Thanks, I was unaware of that. On the other hand, the handling of the created elements in MathJax is all mediated through MathJax's DOM adaptors, so that MathJax can work with structures that are not full (our even actual) DOM implementations (like its own liteDOM). This abstracts the actions needed to create and manipulate DOM elements, and all MathJax's actions on the DOM are mediated through the adaptor. MathJax doesn't manipulate DOM elements directory, except through the HTMLadaptor (and also the menu code, which we assume will only run in a browser with an actual DOM). We have tried to keep the DOMadaptor as generic as possible, so access to things like document.implementation are not available. (Any feature we try to access must be made available in all DOM adaptors, so we want that kept to a minimum.)

At least one other DOM library (slimdom) throws an error when trying to set the xmlns attribute

For me the error comes when I try to serialize, not when the attribute is set, but, yes, I do see that too. The documentation for it does reference the DOM-parsing draft recommendation, and although it is only a draft currently, it does seem to indicate (in the topic called "XML serialization of the attributes", item 3, subitem 8) that when the 'require well-formed' flag is set, xmlns is not allowed as an attribute name without an attribute namespace. So I guess this library is doing the right thing by reporting the error it does.


So that leads me to the suggestion for how to proceed. I think the best solution for MathJax would be to have an option for the SVG output jax that tells it whether to add the xmlns attribute or not. That way, if you are doing HTML serialization, it can be added, and if you are doing XML serialization, it can be turned off.

An alternative would be to subclass the HTMLadaptor and modify its to skip the xmlns attribute when there is a namespace. This is a little less straight-forward, but would also work. Then you would make your choice about whether the attribute is used or not by selecting the proper adaptor.

In any case, I think we can come to a resolution that will work for both your needs and the needs of others, either with an option or a different adaptor.

@hubgit
Copy link
Author

hubgit commented Sep 16, 2019

It seems like there are two possible "correct" solutions:

  1. Creating an SVG element, with const svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg'), then using svg.outerHTML to get the un-namespaced HTML serialization or new XMLSerializer().serializeToString(svg) to get the namespaced XML serialization.
  2. Creating the SVG document with const svg = document.implementation.createDocument('http://www.w3.org/2000/svg', 'svg') and serializing it with either of the above methods to get the namespaced output.

I can see why both of those are potentially tricky given the reasoning you've outlined above, so I'll have a look at subclassing HTMLAdaptor to avoid adding the xmlns attribute instead.

@dpvc
Copy link
Member

dpvc commented Sep 16, 2019

While your number 2 looks attractive from the standpoint of serialization, the results also have to be able to be inserted into an HTML document, which this can't (at least not without extra work). So I think that won't be practical.

The current DOMadaptor interface has two methods for serialization, innerHTML and outerHTML. Currently the HTMLadaptor uses node.innerHTML and node.outerHTML for these. It would be possible, however, to use new XMLSerializer().serializeToString(node) for outerHTML. That would mean the xmlns attribute doesn't have to be added, provided people go through the adaptor, like they are supposed to (they won't but we can't help that).


If you want to subclass the HTMLadaptor, to remove there are two potential places to do it. Either in the node() method, which could be something like

node(kind, def = {}, children = [], ns = null) {
  delete def.xmlns;
  return super.node(kind, def, children, ns);
}

or in the setAttribute() method:

setAttribute(node, name, value, ns = null) {
  if (name !== 'xmlns') {
    super.setAttrubute(node, name, value, ns);
  }
}

The originals are in Typescript, but I remove the types here. See here and here.

Since your code sandbox example used browserAdaptor(), I just wanted to point out that it simply returns an instance of HTMLadaptor using the global window element (see here), so you can replace browserAdaptor() with a similar function that generates an instance of your subclass instead.

@hubgit
Copy link
Author

hubgit commented Sep 19, 2019

Subclassing HTMLAdaptor to exclude the xmlns attribute in setAttribute has the desired effect, so I'll close this pull request.

@hubgit hubgit closed this Sep 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants