You may have heard something in the news about PDF recently… By the
power of Google!
What's all this then?
I gave a presentation at the 27th Chaos Computer Congress in Berlin.
For some reason, the slides never made it from Pentabarf to the Fahrplan.
You can view them here:
(I have had so many requests for these.)
The talk was approximately a twenty minute summary of about 2,500
pages of dense technical documentation, twenty minutes of I wonder
what Acrobat does if I feed it this., and a little bit of (incomplete)
stuff about A/V programs —
because every talk like this has to mention A/V because everyone
asks about it. Speaking of which: if anyone would like to perform a
rigorous test of how well A/V detects obfuscated PDFs, please go right
ahead; I have neither the time nor interest.
(That 2,500 is just for the ISO spec, PDF32000_2008.pdf; the
the XML Forms Architecture Specification 2.6, xfa_spec_2_6.pdf. I'm
not counting the 3D Annotation specification, XMP specification, nor
any of the font specifications.)
I was inspired by the 2009 work of Meredith
L. Patterson, Len
Kaminsky, Sergey Bratus, and any one else I may have forgotten, on
ambiguities in ASN.1 and X.509 parsing.
As I read through the PDF ISO specification two years ago, certain
oddities jumped out at me. I can't possibly be the first person to
have read the specification and noticed this stuff, right? Right?
Can you really just execute arbitrary programs from a PDF? [Answer:
(And more than one PDF reader does this!)]
While reading the ISO
32000-1 [PDF] document - or really any technical specification -
what you really need to pay the most attention to, is what is
not said. Not only is ISO 32000-1 absent of any formal
language definition (BNF,
etc.) but many possible glosses which can be formed that are not
defined. (As it says right at the very beginning of the ISO 32000-1,
there's nothing in this document that defines whether or not a PDF
file is well-formed or not.
It's called Adobe Acrobat because it'll bend over
Since I started presenting this, several unrelated people have told
me that they were using foo-trick, or bar-trick for
years, but I seem to be the first person to stand in front of a room,
and tell people about it for an hour. (Well, that and the first to
make hybridized PDF files.)
Considering the existence of PDF/A, a committee
somewhere must have had the same ideas about tightening up the PDF specification.
Part of the original brainstorm for this talk, was to create a chart
of features/quirks across several different PDF readers and versions.
That could have taken way more time, than I could have invested at
If anyone would like to actually perform these tests, I encourage
you to please do so, and publish your results. I'll be posting the
test files I used for my talk soon.
Live in Person
iSec Open Forum
I'm speaking at the iSEC Open Forum Bay Area next week. This is the
info I have on it:
iSEC Open Forum Bay Area
DATE: Thursday, February 3, 2011
LOCATION: Intuit Building 9, Cook Conference Room
2600 Casey Ave
View, CA 94043
Please visit http://www.meetup.com/iSECOpenForums/
or RSVP to rsvp ＠ isecpartners.com if you wish to attend!
***technical managers and engineers only please***
and beverage provided***
I'm also speaking at TROOPERS
in Heidelberg, Germany on Mar 28-Apr 1, 2011. It will be about PDF,
but not the same stuff as 27C3.
Adobe Reader X
I wrote the bulk of this talk back in May for PH-Neutral
0x7DA. Adobe Acrobat X
even been announced yet. The CCC submission deadline was three
A month before my talk, Adobe
Reader X is released.
So, I should have updated my talk to mention Acrobat X more prominently,
rather than in passing at the end. However, I've done no testing with
it at all.
If you haven't heard yet, Adobe Acrobat X is running (most) of
itself within a sandbox.
Which is probably the only feasible way for Adobe to secure Acrobat. I
worked just like 9.0, and that's about all I know currently about its parser.
About PDF Printer Engines
So, I did not say that you can scan a network from a printer
with a PDF. I was speculating about just how much of the PDF spec a
printer may implement.
If it did implement the whole thing, then crazy stuff like scanning
a network via a printer would be possible, and I'd be very surprised
if any printer maker would ever do that.
I've been informed by someone familiar with HP printers, that the HP
about any other printer.
I reread the ISO specification, and found that it says that 3D data
is encoded in the Universal 3D
format (ECMA-363); Not OpenGL like I said.
Error in Presentation Tests
Paul Baccas has a pretty good summary of most public research on
this kind of thing here: http://nakedsecurity.sophos.com/2011/01/24/review-omg-wtf-pdf/
He's quite possibly the only other person on Earth who's actually
read through my test files, and actually found an error with one.
Most of my tests were written in one day. I think I spent maybe five
minutes on the duplicate object tests, so messing up the
startxref is no surprise.
This means that Slide 123, and Slide 124 in the
27C3 [PDF] talk
are the exact opposite of what they should say. The first Object
will be used if
xref points to it, otherwise if ref is
broken, the last Object defined is used.
(This actually seems much more consitant with Acrobat's other behaviors.
Stuff I forgot to mention
PDF syntax seems to have been influenced by TeX
Office, OpenOffice, and iWork documents are ZIP files; Java ARchives
are ZIP files. Use your imagination.
I can barely fit the information I've got into 50 minutes. Acrobat
has a lot of code, I mean A LOT, A LOT. Load it in
gdb sometime and just list the symbols. It'll take about
half an hour on a 2GHz Macbook.
That said, I should have at least mentioned PDF/A, and it's cousin PDF/X.
Short summary (what I'd say in my talk): PDF/A is a stripped down
version of the PDF-1.4 spec, with mandatory font embedding, and
and tighter requirements on the PDF syntax itself. For example, the
first byte of "%PDF-" must be at file offset zero. I
don't know how many readers enforce these things in practice.
The official ISO documentation is available for sale here: http://www.iso.org/iso/catalogue_detail?csnumber=38920
Currently, it costs about US$125 if you'd like to buy a copy. There
are actually several documents, but I think this is the main one. (I
haven't read it.)
What of PDF/A?
Attackers won't use it. Currently, an attacker can buy advertising
which points to some HTML/JS which launches Acrobat in the browser to
display a malicious PDF. And then the user gets 0wned. Happens
thousands of times a day. Well, that and you don't need
PDF/X is a stripped
for the publishing industry. So there's lots of stuff about CMYK color
space, embedded fonts, and trim area verses bleed area. There are
several ISO documents which describe PDF/X
PDF/UA is for
Universal Accessibility, screen readers and the like. It should also
make it possible to easily convert a PDF into a plain text document.
PDF/E PDF for
Engineers, just read the Wiki page on it.
One More Thing
While Googling around to write this blog post, I discovered this: http://hul.harvard.edu/jhove/pdf-hul.html
It's a PDF Syntax Validator. I don't know anything else about it other
than what it says on that web page.
Frequently Asked Questions
I've got this dialogue going on in my head about many of the
comments I've heard…
The Firſt Dialogue
Sagredo, and Simplicio.
IT was our yeﬅerdayes reſolution, and
that we ſhould to day diſcourſe the moﬅ
and particularly we
could poſſible of the natural reaſons, and
their eﬃcacy that have been hitherto alledged on the one or
other part, by the
maintainers of the Positions, Ariﬆotelian, and Ptolomaique;
and by the followers of the Portable Document Syſteem .
Sagredo: It has
been evidently demonﬆrated that PDF sekurite is not good; What
may be done in time to come? And
forgive me if I continue in Modern English in order to save
will be safe long as you keep patching your PDF software. And
besides it will take a long time to learn how to use a new PDF
will not be safe by just patching. Acrobat and Flash have
had an actively-used-in-the-wild to install malware, zero-day,
about every two months for the last three years. Sure, all PDF
readers have vulnerabilities, and switching software is just
exchanging one set of problems for another. However, most PDF
surface is much much smaller. And really, a PDF reader is not
hard to use. (Open file, read stuff, quit.)
should only open documents from people they
attacks performed via PDFs are drive-by attacks, that is, a
malicious web page (sometimes a modified legitimate one, or a
paid advertisement) instructed your web browser to open a PDF
file in Acrobat. You don't have a choice in opening the
malicious PDF file.
Everyone should switch to using PDF/A!
let's have all the malware authors use PDF/A from now
This doesn't help mitigate vulnerabilities like integer
overflows in embedded image handlers.
You could get everyone to reject old-style PDF documents, and
only accept PDF/A…
You'd need to convert every single old PDF document on earth,
but keep in mind that documents that accept form input won't
function as a PDF/A. And you need everyone to throw out their
old vulnerable PDF readers.
Sagredo: What's a
good PDF reader to use that's not Adobe Acrobat?
This is a list of every version of my talk presented to more than 20
people at a time.
It's mostly the same talk each time, except with some new stuff
added from the last time I presented it.
May 29, 2010 ph-neutral 0x7da PH-Neutral2010.pdf
Sep 09, 2010 SEC-T 2010 Sec-T_2010_Julia_Wolf_final.pdf
Oct 24, 2010 ToorCon 12 San Diego Julia_Wolf_ToorCon12_OMG_WTF.pdf
Oct 27, 2010 SecTor 2010 Julia_Wolf_SecTor_OMG_WTF.pdf
Dec 11, 2010 BayThreat 2010 BayThreat2010.pdf
Dec 30, 2010 27th Chaos Communication Congress 27C3_Julia_Wolf_OMG-WTF-PDF.pd
[Part 1/4] http://www.youtube.com/watch?v=4F2xMw3987
[Part 2/4] http://www.youtube.com/watch?v=2zbWTUkrfJM
[Part 3/4] http://www.youtube.com/watch?v=fvUB3xcMavw
[Part 4/4] http://www.youtube.com/watch?v=k9gyaYebjyQ
On Sep 9, 2010, just after my presentation at Sec-T, I was
interviewed by a reporter. This was the result:
Included here by permission of the copyright holder.
Adobe Reader is a tremendous risk
by Frida Sundkvist
Do you have Adobe Reader installed? Then neither your documents
nor passwords are protected completely. Expert advice is to
replace the PDF reader while Adobe solves the problems.
The last year has highlighted many security problems with PDF
files. Often it's enough that a user has a PDF reader installed to
be targeted by an attack.
– PDF is the biggest threat today.
You can do almost anything without being detected. Adobe Reader is a
tremendous risk, says Julia Wolf, security researcher at Fire
Julia Wolf is in Stockholm to speak at the Sec-T Security
Conference. She says that despite that, it is good that people are
discovering the problems with Adobe Reader or other PDF readers.
– My advice is to uninstall Adobe Reader and select another PDF
There are major weaknesses in the pdf
format which means that infected
files are difficult to
detect and there are also vulnerabilities in Adobe
Since PDF reader is by far the most popular, hackers have put
of their resources into targeting Adobe.
pdf-attack often goes unnoticed. If the user has a PDF reader
installed, it may be enough to accidentally go to the wrong website.
Seconds later, your personal documents uploaded to a waiting third
party. Your keystrokes are recorded and thus more people than you
may know your password.
– I have heard many different amounts
on how much money companies lose. The conclusion is that there is a
lot, she says.
In May Julia Wolf experimented with rewriting
a pdf file to see if antivirus software detected the infected file.
Only nine of 41 anti-virus detected it. The remaining 32 anti-virus
programs, didn't detect it. No surprise, says Julia Wolf.
Thomas Kristensen, security expert at Secunia, is on the same
track. If users in a company do not need very advanced PDF files,
the company should
consider changing pdf readers.
you are a criminal and want to cause as much harm as possible, which
route do you choose? In almost all cases, you choose Adobe Reader
before for example Foxit, he says.
Two things will help if
you still want to keep Adobe Reader. The first is according to
Thomas Kristensen to disable two things: the Flash Plugin and
only open files they need in their work.
security expert at Symantec, does not think that the solution is to
uninstall Adobe Reader, but he understands that line of thinking. He
argues instead that it's most important to patch, that is to say to
update the security settings in Reader.
– All PDF readers
have security holes and it can take a long time to train users to
use a new program, he says.
The only way to be protected from
PDF attacks is to not have a PDF reader installed. Towards the end
of the year, comes the next update for Adobe
version will have isolated the program itself within a
"sandbox" (see article above).
Photo Caption: [Some kind of Swedish idiom about poking fun
at security, or saws, or something] Problems with
Adobe Reader have been known for a
long time. "You can
do almost anything without being detected," said Julia Wolf,
security researcher at Fire Eye, which calls on companies to use
different PDF reader until Adobe sorts out its program.
I can read French better than Google Translate, so this is the
relevant excerpt from http://blog.scrt.ch/2011/01/07/27c3-we-come-in-peace/.
The vulnerabilities of PDF
This last decade, the Portable
Document Format, or PDF, is now clearly the most common format for
the publication of electronic documents.
It is used in the
publishing industry as a reference format as much for printing as
for the publication of ebooks.
And the PDF reference
implementation is still and always the popular Adobe Acrobat, in
both versions, Writer or simply "Reader".
Unfortunately, as Julia Wolf has mentioned in her presentation,
PDF is a standard that has more or less been agglomerated over the
needs of it's original version, and Acrobat is the result of this
organic evolution: more than 15 million lines of code, Acrobat is
bigger than Mozilla firefox, bigger than the Linux kernel and the
majority of the code was written in the '90s, in Adobe's secret
The result is that it's possible to easily fool
The syntax of PDF and along with other reasons that
the standard has no explicitly described method for validation of
PDF documents allows the creation of PDF documents that are at the
same time executables, which contain malicious code or that pose as
any of a number of data formats.
In a PDF, one can call
system commands, execute arbitrary programs, form documents that
display different contents depending on the PDF reader program used,
and so on.
Julia Wolf is focused on Acrobat, because it is
still today the most common PDF client, but the standard on on which
these programs are based is sufficiently confusing and complex that
the exploitation of vulnerabilities is possible with each
Stuff that should be here, but isn't:
- Testing Samples
- PDF Portmanteau technical
- PDF Quine, with explanation
- Misc. stuff,
like that forwards-backwards-pages PDF file (It's in the Test
Once I've organized it all, I'll publish it on this blog. But I'm
rushing to get this one [the blog post you're now reading] published.