While reviewing penetration testing reports, I stumbled upon an odd case. The pentester had submitted a generated PDF that used AcroJS to run JavaScript through an app.alert call.
Although all incoming files are scanned by ClamAV with the Potentially Unwanted Application (PUA) flag enabled, this one was not caught.
$ clamscan -v --detect-pua -a --stdout -d /tmp/test example.pdf
Loading: 16s, ETA: 0s [========================>] 8.72M/8.72M sigs
Compiling: 4s, ETA: 0s [========================>] 41/41 tasks
Scanning example.pdf
example.pdf: OK
----------- SCAN SUMMARY -----------
Known viruses: 8719530
Engine version: 1.4.2
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 0.00:1)
Time: 20.456 sec (0 m 20 s)
Start Date: 2025:02:09 14:11:43
End Date: 2025:02:09 14:12:03
Leaving aside whether a PDF running JavaScript is a valid finding, the interesting part is that a PDF containing JavaScript is not flagged even though PUA detection is on.
The obvious next step is to compare the PUA PDF signatures downloaded by freshclam with the content of the generated PDF.
Running freshclam fetches the up-to-date definitions; sigtool then extracts the signatures packaged in the daily.cvd file:
$ freshclam --datadir=/tmp/test
$ ls /tmp/test
daily.cvd freshclam.dat main.cvd
$ sigtool -u /tmp/test/daily.cvd
$ ls
COPYING daily.crb daily.ftm daily.hsb daily.ign daily.ldb daily.mdu daily.ndb daily.sfp main.cvd
daily.cdb daily.cvd daily.hdb daily.hsu daily.ign2 daily.ldu daily.msb daily.ndu daily.wdb
daily.cfg daily.fp daily.hdu daily.idb daily.info daily.mdb daily.msu daily.pdb freshclam.dat
Finally, we grep for the PUA PDF signatures:
$ grep PUA.Pdf *
daily.ldu:PUA.Pdf.Trojan.EmbeddedFile-1;Engine:51-255,Target:10;0&1&(2|3|4);2f46696c746572205b2f{-12}4465636f6465202f{-12}4465636f6465202f{-12}4465636f6465;2F456D62656464656446696C65;2F54797065202F46696C6573706563202F462028{-100}2E70646629;2F54797065202F46696C6573706563202F462028{-100}2E65786529;2F54797065202F46696C6573706563202F462028{-100}2E646c6c29
daily.ldu:PUA.Pdf.Exploit.CVE_2013_0624-4255860-2;Engine:51-255,HandlerType:CL_TYPE_PDF,Target:0;(0|1|2|3);0:474946383961{-1024}255044462d;0:89504e470d0a1a0a{-1024}255044462d;0:ffd8ffe0{-800}faffda000c{-1024}255044462d;0:d0cf11e0{-1024}255044462d
daily.ndu:PUA.Pdf.Trojan.OpenActionObjectwithJavascript-1:0:0:255044462d*6f626a{-2}3c3c{-100}2f4f70656e416374696f6e{-100}2f4a617661536372697074
daily.ndu:PUA.Pdf.Trojan.OpenActionObjectwithJS-1:0:0:255044462d*6f626a{-2}3c3c{-100}2f4f70656e416374696f6e{-100}2f4a53
daily.ndu:PUA.Pdf.Trojan.CVE_2013_0622-1:0:*:255044462d*6f626a{-4}3c3c{-100}2e6f70656e446f6328{-25}63506174683a{-10}5c5c5c5c
The two interesting rules are in the daily.ndu file:
- PUA.Pdf.Trojan.OpenActionObjectwithJavascript-1
- PUA.Pdf.Trojan.OpenActionObjectwithJS-1
Let's show their content in a friendlier way
$ cat extract/daily.ndu | sigtool --decode-sigs | grep -aA 4 'PUA.Pdf.Trojan.OpenActionObject'
VIRUS NAME: PUA.Pdf.Trojan.OpenActionObjectwithJavascript-1
TARGET TYPE: ANY FILE
OFFSET: 0
DECODED SIGNATURE:
%PDF-{WILDCARD_ANY_STRING}obj{WILDCARD_ANY_STRING(LENGTH<=2)}<<{WILDCARD_ANY_STRING(LENGTH<=100)}/OpenAction{WILDCARD_ANY_STRING(LENGTH<=100)}/JavaScript
VIRUS NAME: PUA.Pdf.Trojan.OpenActionObjectwithJS-1
TARGET TYPE: ANY FILE
OFFSET: 0
DECODED SIGNATURE:
%PDF-{WILDCARD_ANY_STRING}obj{WILDCARD_ANY_STRING(LENGTH<=2)}<<{WILDCARD_ANY_STRING(LENGTH<=100)}/OpenAction{WILDCARD_ANY_STRING(LENGTH<=100)}/JS
The pattern first looks for the PDF magic bytes %PDF- (with any version number), followed somewhere later by obj and the opening << of a dictionary. After that, it looks for /OpenAction and then /JavaScript (or /JS). The important part is the {WILDCARD_ANY_STRING(LENGTH<=100)} between the two: if more than 100 bytes separate /OpenAction from /JavaScript, the signature does not match and the file is reported as free of PUA.
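To make the matching logic concrete, here is a minimal Python sketch that translates the body of the first rule into a regular expression. The helper name clamav_body_to_regex is hypothetical, and it only understands plain hex bytes, `*` and `{-n}`, which is an assumption that covers just the two rules discussed here. It illustrates the bounded gap; it is not how ClamAV matches internally.

```python
import re

# Body of PUA.Pdf.Trojan.OpenActionObjectwithJavascript-1, copied from daily.ndu.
SIGNATURE = (
    "255044462d*6f626a{-2}3c3c{-100}"
    "2f4f70656e416374696f6e{-100}2f4a617661536372697074"
)


def clamav_body_to_regex(body: str) -> re.Pattern:
    """Translate a small subset of ClamAV body syntax into a bytes regex.

    Assumption: only hex runs, `*` (any number of bytes) and `{-n}`
    (at most n bytes) appear in the signature.
    """
    parts = []
    for token in re.findall(r"\*|\{-\d+\}|[0-9a-fA-F]+", body):
        if token == "*":
            parts.append(b".*?")
        elif token.startswith("{-"):
            parts.append(b".{0,%d}?" % int(token[2:-1]))
        else:
            parts.append(re.escape(bytes.fromhex(token)))
    # DOTALL so that `.` also matches newlines, as raw bytes would.
    return re.compile(b"".join(parts), re.DOTALL)


pattern = clamav_body_to_regex(SIGNATURE)

# /JavaScript sits a few bytes after /OpenAction.
near = (b"%PDF-1.5\n1 0 obj\n<<\n/Type /Catalog\n"
        b"/OpenAction << /S /JavaScript /JS (app.alert(1)) >>\n>>")

# More than 100 bytes of padding sit between /OpenAction and /JavaScript.
far = (b"%PDF-1.5\n1 0 obj\n<<\n/OpenAction (5 0 R)\n/Producer ("
       + b"A" * 100
       + b")\n>>\nendobj\n3 0 obj\n<<\n/Names << /JavaScript << >> >>\n>>")

# The rule has offset 0, so the match is anchored at the start of the data.
print(pattern.match(near) is not None)  # True
print(pattern.match(far) is not None)   # False
```

The second sample contains both tokens, yet the bounded gap alone is enough to make the pattern miss it.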
What if we generate a file that PDF readers still parse, but in which /OpenAction and /JavaScript (or /JS) are more than 100 characters apart?
To reproduce the example, we need:
- a valid PDF
- using /OpenAction, with more than 100 characters before the first occurrence of /JavaScript
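The second requirement can be checked mechanically. The following is a small sketch, with a hypothetical helper name gap_after_open_action, that measures how many bytes separate the first /OpenAction from the first /JavaScript that follows it. It works on raw bytes and does not parse the PDF structure, so treat the result as an approximation of what a byte-level signature sees.

```python
from typing import Optional


def gap_after_open_action(data: bytes) -> Optional[int]:
    """Return the number of bytes between the first /OpenAction and the
    first /JavaScript after it, or None if either token is missing."""
    start = data.find(b"/OpenAction")
    if start == -1:
        return None
    after = start + len(b"/OpenAction")
    target = data.find(b"/JavaScript", after)
    if target == -1:
        return None
    return target - after


# Synthetic sample: 100 bytes of padding in a /Producer string.
sample = (b"<<\n/OpenAction (5 0 R)\n/Producer (" + b"A" * 100
          + b")\n>>\n/JavaScript")
gap = gap_after_open_action(sample)
print(gap, gap is not None and gap > 100)
```

To check a real file, read it in binary mode (for example `open(path, "rb").read()`) and pass the bytes to the function; a result above 100 means the bounded gap in the stock rule cannot span the two tokens.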
Using pypdf, the following program generates such a file:
$ cat example.py
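from pypdf import PdfWriter

writer = PdfWriter()
writer.pdf_header = '%PDF-1.5'

# The metadata ends up in the Info dictionary; the 100-character
# /Producer value is the padding that follows /OpenAction.
writer.metadata = {
    '/OpenAction': '5 0 R',
    '/Producer': 'A'*100
}
writer.add_blank_page(width=1, height=1)

# Document-level JavaScript, stored under /Names /JavaScript in the catalog.
writer.add_js('app.alert("test");')

with open("example.pdf", "wb") as fp:
    writer.write(fp)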
The result is the following
$ cat example.pdf
%PDF-1.5
%����
1 0 obj
<<
/OpenAction (5 0 R)
/Producer (AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
>>
endobj
2 0 obj
<<
/Type /Pages
/Count 1
/Kids [ 4 0 R ]
>>
endobj
3 0 obj
<<
/Type /Catalog
/Pages 2 0 R
/Names <<
/JavaScript <<
/Names [ (b0d741f5\0556913\05540fa\05594a3\055ae0e45c1be53) 5 0 R ]
>>
>>
>>
endobj
4 0 obj
<<
/Type /Page
/Resources <<
>>
/MediaBox [ 0.0 0.0 1 1 ]
/Parent 2 0 R
>>
endobj
5 0 obj
<<
/Type /Action
/S /JavaScript
/JS (app\056alert\050\042test\042\051\073)
>>
endobj
xref
0 6
0000000000 65535 f
0000000015 00000 n
0000000169 00000 n
0000000228 00000 n
0000000376 00000 n
0000000466 00000 n
trailer
<<
/Size 6
/Root 3 0 R
/Info 1 0 R
>>
startxref
559
%%EOF
With the default rules, the file is indeed not flagged, while opening it in any browser does show the JavaScript alert box.
$ clamscan -v --detect-pua -a --stdout -d /tmp/test example.pdf
Loading: 16s, ETA: 0s [========================>] 8.72M/8.72M sigs
Compiling: 4s, ETA: 0s [========================>] 41/41 tasks
Scanning example.pdf
example.pdf: OK
----------- SCAN SUMMARY -----------
Known viruses: 8719530
Engine version: 1.4.2
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 0.00:1)
Time: 20.456 sec (0 m 20 s)
Start Date: 2025:02:09 14:11:43
End Date: 2025:02:09 14:12:03
A PDF that runs JavaScript on open can also be generated in Java with OpenPDF:
$ git clone https://github.com/LibrePDF/OpenPDF.git
$ cd OpenPDF
$ mvn -q compile
$ emacs -nw ./openpdf/src/test/java/org/librepdf/openpdf/independent/NumberOfPagesTest.java
$ mvn test -q -pl openpdf -Dtest=NumberOfPagesTest
The content of NumberOfPagesTest.java is the following:
package org.librepdf.openpdf.independent;

import com.lowagie.text.Document;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.PdfWriter;
import java.io.IOException;
import com.lowagie.text.pdf.PdfAction;
import java.io.FileOutputStream;
import org.junit.jupiter.api.Test;

public class NumberOfPagesTest {

    @Test
    void whenWritingHelloWorld_thenOnlyOnePageShouldBeCreated() throws IOException {
        Document document = new Document();
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("Annotations.pdf"));
        writer.setPdfVersion(PdfWriter.VERSION_1_6);
        writer.setOpenAction(PdfAction.javaScript("app.alert(\"Hello\");", writer));
        document.open();
        document.add(new Paragraph("Hello World"));
        document.close();
    }
}
And the resulting file
$ cat ./openpdf/Annotations.pdf
%PDF-1.6
%����
2 0 obj
<</Filter/FlateDecode/Length 64>>stream
x�+�r
�26S�00I�2P�5�1��
�BҸ4<Rsr���rR4C��J
@
\C����
endstream
endobj
4 0 obj
<</Contents 2 0 R/Type/Page/Resources<</Font<</F1 1 0 R>>>>/Parent 3 0 R/MediaBox[0 0 595 842]>>
endobj
1 0 obj
<</Subtype/Type1/Type/Font/BaseFont/Helvetica/Encoding/WinAnsiEncoding>>
endobj
3 0 obj
<</Kids[4 0 R]/Type/Pages/Count 1>>
endobj
5 0 obj
<</OpenAction<</S/JavaScript/JS(app.alert\("Hello"\);)>>/Type/Catalog/Pages 3 0 R>>
endobj
6 0 obj
<</CreationDate(D:20250209171824+08'00')/Producer(OpenPDF 2.0.4-SNAPSHOT)>>
endobj
xref
0 7
0000000000 65535 f
0000000257 00000 n
0000000015 00000 n
0000000345 00000 n
0000000145 00000 n
0000000396 00000 n
0000000495 00000 n
trailer
<</Info 6 0 R/ID [<8749eff95f7932a96f47ec00fffce9e4><8749eff95f7932a96f47ec00fffce9e4>]/Root 5 0 R/Size 7>>
startxref
586
%%EOF
What if we still want to alert on such a PDF, whether the finding is legitimate or not? We need a rule that allows more than 100 characters between /OpenAction and /JavaScript.
Starting from the rules in ClamAV's daily.ndu, extracted from daily.cvd, the change is quite simple:
$ cat rules.ndu
PUA.Pdf.Trojan.OpenActionObjectwithJavascript-1:0:0:255044462d*6f626a{-2}3c3c{-100}2f4f70656e416374696f6e*2f4a617661536372697074
We replace the second occurrence of {-100} with the wildcard *. In ClamAV's own words:
$ cat rules.ndu | sigtool --decode-sigs
VIRUS NAME: PUA.Pdf.Trojan.OpenActionObjectwithJavascript-1
TARGET TYPE: ANY FILE
OFFSET: 0
DECODED SIGNATURE:
%PDF-{WILDCARD_ANY_STRING}obj{WILDCARD_ANY_STRING(LENGTH<=2)}<<{WILDCARD_ANY_STRING(LENGTH<=100)}/OpenAction{WILDCARD_ANY_STRING}/JavaScript
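The same edit can be scripted rather than done by hand. This is a minimal sketch with a hypothetical helper, relax_last_gap, that swaps the last bounded gap of a rule line for an unbounded `*`. It treats the rule as plain text and assumes the gap to relax is written exactly as {-100}.

```python
def relax_last_gap(rule: str, bounded: str = "{-100}") -> str:
    """Replace the last occurrence of a bounded gap in a rule line with `*`."""
    head, sep, tail = rule.rpartition(bounded)
    if not sep:
        raise ValueError("no bounded gap found in rule")
    return head + "*" + tail


# Original rule line, copied from daily.ndu.
original = (
    "PUA.Pdf.Trojan.OpenActionObjectwithJavascript-1:0:0:"
    "255044462d*6f626a{-2}3c3c{-100}"
    "2f4f70656e416374696f6e{-100}2f4a617661536372697074"
)

relaxed = relax_last_gap(original)
print(relaxed)
```

The printed line can be saved as rules.ndu. Presumably the same transformation applies to the OpenActionObjectwithJS-1 rule, although only the /JavaScript variant is tested here.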
Let's try to detect our sample
$ cp rules.ndu /tmp/test/
$ clamscan -v --detect-pua -a --stdout -d /tmp/test example.pdf
Loading: 17s, ETA: 0s [========================>] 8.72M/8.72M sigs
Compiling: 4s, ETA: 0s [========================>] 41/41 tasks
Scanning example.pdf
example.pdf: PUA.Pdf.Trojan.OpenActionObjectwithJavascript-1.UNOFFICIAL FOUND
example.pdf!(1): PUA.Pdf.Trojan.OpenActionObjectwithJavascript-1.UNOFFICIAL FOUND
----------- SCAN SUMMARY -----------
Known viruses: 8719531
Engine version: 1.4.2
Scanned directories: 0
Scanned files: 1
Infected files: 1
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 0.00:1)
Time: 20.967 sec (0 m 20 s)
Start Date: 2025:02:09 14:21:35
End Date: 2025:02:09 14:21:56
There we go: we can now detect occurrences of JavaScript within a PDF, no matter the distance from /OpenAction.