Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore NULL byte following TXXX frame? #135

Open
randoragon opened this issue Apr 9, 2024 · 2 comments
Open

Ignore NULL byte following TXXX frame? #135

randoragon opened this issue Apr 9, 2024 · 2 comments

Comments

@randoragon
Copy link

Consider a sample mp3 file created in the following way:

ffmpeg -f lavfi -i anullsrc=r=44100:cl=mono -t 5 -q:a 9 -acodec libmp3lame -id3v2_version 0 sample.mp3
mid3v2 --TXXX Key:Value sample.mp3

The first command creates an empty, 5 seconds long mp3 file with no ID3v2 tag.
The second utilizes the mid3v2 utility based on the Python Mutagen module to add an ID3v2.4 TXXX frame with description "Key" and value "Value".

If we now print the tag using id3...

use id3::{Tag, TagLike};

fn main() {
    let tag = Tag::read_from_path("../sample.mp3").unwrap();
    println!("{:#?}", tag);
}

...this is the output:

Tag {
    frames: [
        Frame {
            id: Valid(
                "TXXX",
            ),
            content: ExtendedText(
                ExtendedText {
                    description: "Key",
                    value: "Value\0",
                },
            ),
            tag_alter_preservation: false,
            file_alter_preservation: false,
            encoding: Some(
                UTF8,
            ),
        },
    ],
    version: Id3v24,
}

As you can see, there is a '\0' at the end of the TXXX value. This '\0' is not reported by any other ID3 tool I've tried (mid3v2, eyeD3, id3ted).

I've investigated this issue a lot since yesterday, I've read the ID3 Informal specification a few times, looked at issues in Mutagen (particularly quodlibet/mutagen#379 is interesting). It seems this inconsistency between tooling is sadly a result of the ID3 specification itself being pretty vague on the subject.
For instance, my not-very-educated stance on this is that TXXX is not a text frame and the optional NULL termination rules do not apply to it (justification: dhamaniasad/mutagen#179 (comment)). However, a maintainer over at Mutagen's closed issue had a different opinion (quodlibet/mutagen#379 (comment)), claiming that:

The v2.3 spec explicitly states that text frames (TXXX) can have optional termination, (...)

...which I strongly disagree with, unless I see where this "explicitness" is coming from (I was unable to find it).

Still, as an author of a command-line ID3 tool (rsid3) based on rust-id3, and especially as a user, I do care about compliance between tools. Since Mutagen is probably the most used and popular ID3 library at the moment, I think it's especially beneficial to be compliant with it, so that users can smoothly transition and expect the same results. This is all assuming that Mutagen is compliant with the standard and the differences only arise from varying interpretations of the standard.

I guess my suggestion is to keep some kind of list of differences with Mutagen (and possibly other popular ID3 libraries, if they exist), maybe even include some remarks in the documentation. And I'd love to hear what you think about this, should this project aim for compliance with Mutagen (within the ramifications of the standard), or do we not care about it at all? Is this a bug, a feature, or neither?

@randoragon
Copy link
Author

randoragon commented Apr 9, 2024

A small correction:
The problem is mid3v2 specifically, not Mutagen. It produces the NULL termination, which is a known issue (quodlibet/mutagen#384, quodlibet/mutagen#379). I ran the same experiment with eyeD3 and id3ted, and the NULL byte was not added (and eyeD3 is based on Mutagen, so this is not a Mutagen problem).

The important difference is that all other tools ignore the NULL when reading TXXX, while rust-id3 includes it as part of the data.

@polyfloyd
Copy link
Owner

Thank you for your research on this matter!

When the spec is vague about something, I think it is important to adhere to whatever all other popular tooling does. Mutagen is certainly a commonly found implementation of the id3 format, so if it strips off the null terminator then so should this library.

randoragon added a commit to randoragon/rsid3 that referenced this issue Apr 10, 2024
The samples were created with mid3v2, but as it turns out, mid3v2 has a
quirk that causes it to terminate TXXX and COMM frames with a NULL byte.
To get rid of these extra bytes, I reencoded the samples with eyeD3.

This is not well-defined by the standard and the id3 crate currently is
not able to parse the extra NULL byte "correctly" (this is not
necessarily an error according to the standard, but other popular ID3
tools omit the trailing NULL byte).
See polyfloyd/rust-id3#135 for more info.
randoragon added a commit to randoragon/rsid3 that referenced this issue Apr 24, 2024
The samples were created with mid3v2, but as it turns out, mid3v2 has a
quirk that causes it to terminate TXXX and COMM frames with a NULL byte.
To get rid of these extra bytes, I reencoded the samples with eyeD3.

This is not well-defined by the standard and the id3 crate currently is
not able to parse the extra NULL byte "correctly" (this is not
necessarily an error according to the standard, but other popular ID3
tools omit the trailing NULL byte).
See polyfloyd/rust-id3#135 for more info.

Issue: #1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants