PDF/A-1b test corpus

Dmitry Remezow edited this page Jun 1, 2015 · 27 revisions

6.1 - File structure

6.1.2 - File header

6-1-2-t01-fail-a: File header not compliant with PDF/A: the file does not start with % character

6-1-2-t01-fail-b: File header not compliant with PDF/A: the file header is incorrect

6-1-2-t02-fail-a: File header not compliant with PDF/A: the comment following the file header has fewer than 4 characters

6-1-2-t02-fail-b: File header not compliant with PDF/A: the line following the file header is not a comment

6-1-2-t02-fail-c: File header not compliant with PDF/A: the comment following the file header contains ANSI characters in the first four bytes

6-1-2-t02-pass-a: File is a valid PDF/A-1b document: the comment following the file header contains ANSI characters after the first four binary bytes
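The header cases above can be sketched as a small check over the first two lines of the file. This is a minimal illustration, not the validator's actual logic: the function name `check_header`, the accepted version range in the regular expression, and the message strings are assumptions made for the example. In line with 6-1-2-t02-pass-a, only the first four bytes of the comment are required to be above 127.

```python
import re

# Assumption: any header of the form %PDF-1.0 .. %PDF-1.7 is treated as well formed.
HEADER_RE = re.compile(rb"%PDF-1\.[0-7]")


def check_header(data: bytes) -> list[str]:
    """Return the problems found in the first two lines of a PDF file."""
    problems = []
    # Split on any PDF end-of-line marker: CR LF, CR, or LF.
    lines = re.split(rb"\r\n|\r|\n", data, maxsplit=2)
    first = lines[0]
    second = lines[1] if len(lines) > 1 else b""

    if not first.startswith(b"%"):
        problems.append("file does not start with %")
    elif not HEADER_RE.fullmatch(first):
        problems.append("file header is incorrect")

    if not second.startswith(b"%"):
        problems.append("line after header is not a comment")
    elif len(second) < 5:
        problems.append("comment has fewer than 4 characters")
    elif any(b <= 127 for b in second[1:5]):
        problems.append("first four comment bytes must be > 127")
    return problems
```

An empty list means that none of the 6.1.2 failure conditions listed above were detected.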

6.1.3 - File trailer

6-1-3-t01-fail-a: ID is missing in the last trailer

6-1-3-t01-pass-a: File is a valid PDF/A-1b document: ID is missing in all trailers except for the last one

6-1-3-t02-fail-a: Linearized file: ID is missing in the first page trailer

6-1-3-t02-pass-a: File is a valid PDF/A-1b document: Linearized file: ID is missing in the last trailer dictionary, but is present in the first page trailer

Note to 6-1-3-t02-pass-a: ISO 19005-1:2005/Cor.2:2011: In a linearized PDF, if the ID keyword is present in both the first page trailer dictionary and the last trailer dictionary, the value to both instances of the ID keyword shall be identical.

6.1.4 - Cross reference table

6-1-4-t01-fail-a: Cross reference subsection is incorrectly formatted: the first subsection has extra spaces between the starting object number and the range

6-1-4-t01-fail-b: Cross reference subsection is incorrectly formatted: the last subsection has extra spaces between the starting object number and the range

6-1-4-t02-fail-a: Cross reference subsection is incorrectly formatted: the xref keyword and the following cross reference subsection header are separated by the SPACE and the EOL marker

6-1-4-t02-fail-b: Cross reference subsection is incorrectly formatted: the xref keyword and the following cross reference subsection header are separated by two LF characters

6-1-4-t03-pass-a: File is a valid PDF/A-1b document: Object, whose offset is not referenced in the cross reference table, contains Length entry with wrong value

6-1-4-t03-pass-b: File is a valid PDF/A-1b document: Object, whose offset is not referenced in the cross reference table, contains invalid hexadecimal string
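The formatting rules exercised by the t01 and t02 cases (a single EOL marker after the xref keyword, and a single SPACE between the starting object number and the range) can be illustrated with the sketch below. It assumes a classic cross reference table with 20-byte entries, and the function name `xref_section_is_strict` is hypothetical. It does not validate the entries themselves.

```python
import re

# An EOL marker is CR LF, a lone CR, or a lone LF.
EOL = rb"(?:\r\n|\r|\n)"
KEYWORD_RE = re.compile(rb"xref" + EOL)
SUBSECTION_RE = re.compile(rb"(\d+) (\d+)" + EOL)


def xref_section_is_strict(data: bytes, offset: int) -> bool:
    """Check the xref keyword and every subsection header starting at offset."""
    m = KEYWORD_RE.match(data, offset)
    if m is None:
        return False  # e.g. "xref" followed by SPACE and EOL
    pos = m.end()
    seen = 0
    while not data.startswith(b"trailer", pos):
        m = SUBSECTION_RE.match(data, pos)
        if m is None:
            return False  # e.g. a second EOL, or extra spaces in the header
        # Skip the entries of this subsection: each one is exactly 20 bytes.
        pos = m.end() + 20 * int(m.group(2))
        seen += 1
    return seen > 0
```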

6.1.5 - Document information dictionary

6-1-5-t01-fail-a: Document information dictionary is not consistent with the document XMP metadata: /Title mismatch between Info dictionary and XMP metadata.

6-1-5-t01-fail-b: Document information dictionary is not consistent with the document XMP metadata: /Author mismatch between Info dictionary and XMP metadata.

6-1-5-t01-fail-c: Document information dictionary is not consistent with the document XMP metadata: /Subject mismatch between Info dictionary and XMP metadata.

Note to 6-1-5-t01-fail-c: ISO 19005-1:2005/Cor.1:2007: Subject => dc:description["x-default"]

6-1-5-t01-fail-d: Document information dictionary is not consistent with the document XMP metadata: /Keywords mismatch between Info dictionary and XMP metadata.

6-1-5-t01-fail-e: Document information dictionary is not consistent with the document XMP metadata: /Creator mismatch between Info dictionary and XMP metadata.

6-1-5-t01-fail-f: Document information dictionary is not consistent with the document XMP metadata: /Producer mismatch between Info dictionary and XMP metadata.

6-1-5-t01-fail-g: Document information dictionary is not consistent with the document XMP metadata: /CreationDate mismatch between Info dictionary and XMP metadata.

6-1-5-t01-fail-h: Document information dictionary is not consistent with the document XMP metadata: /ModDate mismatch between Info dictionary and XMP metadata.

6-1-5-t01-fail-i: Document information dictionary is not consistent with the document XMP metadata: value of the /Title entry in the document information dictionary is an indirect object with a string value different from dc:title in XMP metadata

6-1-5-t01-fail-j: Document information dictionary is not consistent with the document XMP metadata: value of /Title entry in the document information dictionary is an indirect object with non-string value

6-1-5-t02-pass-a: File is a valid PDF/A-1b document: the document information dictionary contains the /Description entry, which does not have analogs in predefined XMP schemas

6-1-5-t02-pass-b: File is a valid PDF/A-1b document: the document information dictionary contains the /Title entry with a value in hexadecimal format, which matches the value of the dc:title property in the document XMP metadata

6-1-5-t02-pass-c: File is a valid PDF/A-1b document: value of the /Title key in the document information dictionary is an indirect object with a string value equivalent to dc:title property in the document XMP metadata

6-1-5-t02-pass-d: File is a valid PDF/A-1b document: the /Title entry in the document information dictionary is an indirect object whose value is null, which is equivalent to the absence of this entry
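The consistency checks in this group compare each Info dictionary entry with its XMP counterpart. The sketch below shows the idea under several assumptions: both inputs are plain dictionaries that the caller has already extracted, decoded, and normalized into comparable strings (including dates and the single-entry dc:creator array), and the function name `info_mismatches` is hypothetical. The property names follow the crosswalk in ISO 19005-1.

```python
# Info dictionary key -> XMP property (crosswalk from ISO 19005-1).
INFO_TO_XMP = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Subject": "dc:description",
    "Keywords": "pdf:Keywords",
    "Creator": "xmp:CreatorTool",
    "Producer": "pdf:Producer",
    "CreationDate": "xmp:CreateDate",
    "ModDate": "xmp:ModifyDate",
}


def info_mismatches(info: dict, xmp: dict) -> list[str]:
    """Return the Info keys whose value differs from the mapped XMP property."""
    mismatches = []
    for key, prop in INFO_TO_XMP.items():
        value = info.get(key)
        if value is None:
            # Absent entry or null value: nothing to compare (see 6-1-5-t02-pass-d).
            continue
        if xmp.get(prop) != value:
            mismatches.append(key)
    return mismatches
```

Keys without a mapped XMP property, such as /Description in 6-1-5-t02-pass-a, are ignored by this sketch.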

6.1.6 - String objects

6-1-6-t01-fail-a: A String object in hexadecimal format contains an odd number of non-white-space characters

6-1-6-t01-fail-b: Document contains a hexadecimal string with an even number of non-white-space characters, one of which is not in the ranges 0 to 9, A to F or a to f

6-1-6-t01-pass-a: File is a valid PDF/A-1b document: All String objects in hexadecimal format have an even number of non-white-space characters in the range 0 to 9, A to F or a to f
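Both conditions tested here (an even count of non-white-space characters, all of them hexadecimal digits) fit in a few lines. This is an illustrative sketch: `hex_string_is_valid` is a hypothetical name, and the input is assumed to be the raw bytes between the `<` and `>` delimiters.

```python
import string

# Byte values of 0-9, a-f, A-F.
HEX_DIGITS = set(string.hexdigits.encode("ascii"))
# PDF white-space characters: NUL, TAB, LF, FF, CR, SPACE.
PDF_WHITESPACE = set(b"\x00\t\n\x0c\r ")


def hex_string_is_valid(body: bytes) -> bool:
    """Check the bytes between '<' and '>' of a hexadecimal string."""
    digits = [b for b in body if b not in PDF_WHITESPACE]
    return len(digits) % 2 == 0 and all(b in HEX_DIGITS for b in digits)
```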

6.1.10 - Filters

6-1-10-t01-fail-a: LZW compression used in content stream

6-1-10-t01-fail-b: LZW compression used in ICC profile

6-1-10-t01-pass-a: File is a valid PDF/A-1b document: LZW compression used for private data (a stream referenced from the document Catalog by the /XXData key)

6.1.12 - Implementation limits

6-1-12-t01-fail-a: Integer value out of range. Contains -2157483648 value in /Widths entry

6-1-12-t02-fail-a: Number value out of range. Contains 60000 Tz in page content stream

6-1-12-t02-fail-b: Number value out of range. Contains /Font [8 0 R 60000] in ExtGState dictionary

6-1-12-t02-fail-c: Number value out of range. Contains 60000.1 Tz in page content stream

6-1-12-t02-fail-d: Number value out of range. Contains /Font [8 0 R 60000.1] in ExtGState dictionary

6-1-12-t02-fail-e: Number value out of range. Contains 60000.0 Tz in page content stream

6-1-12-t02-fail-f: Number value out of range. Contains /Font [8 0 R 60000.0] in ExtGState dictionary

6-1-12-t02-fail-l: Number value out of range. Contains -32676.9 50 Td in content stream

6-1-12-t02-pass-g: Contains 30000 Tz in page content stream

6-1-12-t02-pass-h: Contains /Font [8 0 R 30000] in ExtGState dictionary

6-1-12-t02-pass-i: Contains 30000.1 Tz in page content stream

6-1-12-t02-pass-j: Contains /Font [8 0 R 30000.1] in ExtGState dictionary

6-1-12-t02-pass-k: Td operator contains 65568 digits in fractional part.

6-1-12-t03-fail-a: Maximum length of a string (65535) is exceeded. The last bookmark contains 65827 bytes

[6-1-12-t03-fail-b](https://github.com/veraPDF/veraPDF-corpus-PDFA-1b/raw/master/PDF_A-1b/6.1%20File%20structure/6.1.12%20Implementation%20limits/veraPDF%20test%20suite%206-1-12-t03-fail-b.pdf): Info dictionary and the embedded XMP package contain 65550 bytes in "Keywords" entry.

[6-1-12-t03-fail-c](https://github.com/veraPDF/veraPDF-corpus-PDFA-1b/raw/master/PDF_A-1b/6.1%20File%20structure/6.1.12%20Implementation%20limits/veraPDF%20test%20suite%206-1-12-t03-fail-c.pdf): Content stream contains a string with 65538 bytes.

6-1-12-t03-pass-a: File is a valid PDF/A-1b document: Bookmark contains more than 65535 bytes in octal representation, but has less than maximum allowed length (65535) after decoding from octal format

6-1-12-t03-pass-b: File is a valid PDF/A-1b document: Bookmark contains more than 65535 bytes in hexadecimal representation, but has less than maximum allowed length (65535) after decoding from hexadecimal format

[6-1-12-t03-pass-d](https://github.com/veraPDF/veraPDF-corpus-PDFA-1b/raw/master/PDF_A-1b/6.1%20File%20structure/6.1.12%20Implementation%20limits/veraPDF%20test%20suite%206-1-12-t03-pass-d.pdf): Catalog contains a custom entry with 65540 bytes.

6-1-12-t04-pass-a: File is a valid PDF/A-1b document: Name object contains more than maximum allowed length (127) bytes in unescaped form, but has less than 127 characters after all escape sequences are decoded

6-1-12-t06-fail-a: Maximum number of entries in dictionary (4095) is exceeded. Resources entry in Page dictionary contains 4096 external graphic states.

6-1-12-t07-fail-a: Maximum number of indirect objects (8,388,607) in PDF file is exceeded (the file is zipped and is about 40 MB).

6-1-12-t08-fail-a: Maximum number of nested graphic states (28) is exceeded. Nesting level of q/Q operators is 29 in page content stream.

6-1-12-t08-pass-a: File is a valid PDF/A-1b document: Maximum number of nested graphic states is reached but not exceeded. Nesting level of q/Q operators is 28 in page content stream

6-1-12-t09-fail-a: Maximum number of colorants (8) for DeviceN color space is exceeded. DeviceN colorspace contains 9 colorants.

6-1-12-t10-fail-a: Maximum value of a CID (65535) is exceeded. An embedded CMap refers to character identifier with value 65536.

6-1-12-t10-pass-a: File is a valid PDF/A-1b document: Maximum value of a CID (65535) is exceeded, but CID is not referenced
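Two of the limits above lend themselves to a short illustration: the name length measured after escape decoding (t04) and the q/Q nesting depth (t08). The sketch below is a rough approximation, not the validator's implementation. The function names are hypothetical, the name is assumed to be passed without its leading slash, and the content stream is assumed to be already tokenized into operator strings.

```python
import re

MAX_NAME_BYTES = 127  # limit quoted in 6-1-12-t04-pass-a
MAX_Q_NESTING = 28    # limit quoted in 6-1-12-t08-fail-a


def decoded_name_length(raw: bytes) -> int:
    """Length of a name after every #xx escape is replaced by one byte."""
    decoded = re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        raw,
    )
    return len(decoded)


def max_q_nesting(operators) -> int:
    """Deepest q/Q nesting level reached in a sequence of operator tokens."""
    depth = deepest = 0
    for op in operators:
        if op == "q":
            depth += 1
            deepest = max(deepest, depth)
        elif op == "Q":
            # An unbalanced Q is ignored here rather than reported.
            depth = max(depth - 1, 0)
    return deepest
```

A name passes when `decoded_name_length(raw) <= MAX_NAME_BYTES` even if `len(raw)` is larger, and a content stream passes when `max_q_nesting(ops) <= MAX_Q_NESTING`.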
