Exploiting Digital Signatures on PDF Documents

Three years ago, information circulated that a class of exploits called "shadow attack" could change the contents of a digitally signed PDF document. This is especially relevant in this New Normal era, in which the use of electronic and digital signatures is increasing, particularly in Indonesia.

The impact of successfully exploiting digital signatures is certainly large. As explained later in this blog post, digital signatures are legally recognised in many countries, so if they are successfully exploited and forged, many parties will be harmed.

In Indonesia, there are at least 9 official companies at the moment that have been recognised by the Ministry of Communication and Information as electronic certification providers.

Figure 1. Electronic Certification Providers in Indonesia

Each company will have an application that is used to validate digital signatures. Hopefully, the relevant parties can verify the application based on the information provided here.

In this article, I discuss in more detail the types of attacks against digital signatures on PDF documents described by researchers at the Ruhr University in Germany, who discovered and first published these vulnerability classes.

Understanding these vulnerability classes is useful not only for users of digital signatures, but also for providers of digital signatures, since the process of validating digital signatures can basically be done in two ways: desktop applications and online applications.

Electronic & Digital Signatures

On 30 June 2000, the President of the United States - Bill Clinton - enacted a federal law that facilitates the use of electronic and digital signatures in interstate and foreign commerce by ensuring the validity and legal effect of contracts.

This means that important government documents can be legally valid if they are electronically or digitally signed, even if they are not wet-signed (manually signed by a human). Since then, the use of electronic and digital signatures has expanded, not only in government environments, but also in non-government environments, such as for commercial needs.

Countries outside the United States have also begun to enact electronic and digital signature legislation, including Indonesia. A complete list of countries that allow the use of electronic and digital signatures can be found at the following link.

In Indonesia, the law is regulated in Article 12 of Law No. 11 of 2008 on Electronic Information and Transactions.

💡

Did anyone notice that at the beginning of the article I wrote in full: Electronic and Digital Signatures, and in the title of the article it is not the use of electronic signatures but the use of digital signatures? The reason is that there is a difference between the terms electronic signature and digital signature. It is important to understand this distinction before getting into understanding the structure of PDF documents and also understanding the class of vulnerabilities discovered by the Ruhr University research group.

Electronic Signature

Electronic signatures are used to sign electronic records. The record (or document) is usually prepared in advance and then sent using a specialist e-signature solution provider, and the recipient can then easily sign it at the touch of a button using any method (e.g. email).

The requester creates a document (e.g. PDF) and sends it to the server running the e-signature application.
The e-Signature application receives the document and extracts the information of the person who needs to sign the document.
Having identified the person required to sign, the application instructs the person to apply an electronic signature.
This signing process can vary and depends on the implementation of the application. Essentially, the individual electronically signs the document (possibly using a digital pen facility).
The application will record when the document was signed, who signed it and whether the signature is valid (according to the pre-entered data).

The electronic signature process is similar to a wet (manual) signature, but is done electronically using an application. The form of the signature can be a wet signature, a fingerprint or a stamp.

Here is an illustration of an electronic signature using the Adobe e-sign application.

Figure 3. Illustration of the use of electronic signatures with the adobe e-sign application

Digital signature

If in electronic signatures there is a visualisation of the signature form that can be validated by humans or applications (for example: validation of the signature form to prevent fraud), then in digital signatures there is no visualisation that can be displayed other than a validation confirmation that the signature entered in the document is valid.

How is this possible?

Because digital signatures involve cryptography and an infrastructure known as PKI (Public Key Infrastructure).

If the implementation of electronic signatures is dependent on a device/application/infrastructure (for example: an application made by a product vendor), then the implementation of digital signatures is broader in nature: the signature can be validated by anyone, without depending on a single vendor.

To illustrate this, consider the following figure. I have adapted the following illustration from David W Youd's website, slightly modified to make it easier to understand.

Budi - cryptographically, will create two types of keys. A public key and a private key. The private key will be kept by himself, while the public key will be distributed to his friends.

As shown above, if Anton wants to send a secret message to Budi, he will use Budi's public key to create a secret message and send it to Budi.

To open Anton's secret message, Budi will use his private key so that the message can be read. Another illustration looks something like this.

Encryption with a public and private key pair is also known as asymmetric encryption: data encrypted with the public key can only be decrypted with the matching private key and, conversely, data encrypted with the private key can only be decrypted with the public key.

This concept is very important because the asymmetric nature is then used in the concept of digital signatures.
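The asymmetric property described above can be tried out with a few lines of Python. This is only a minimal sketch of "textbook" RSA: the primes, the exponent and the names `budi_public`, `budi_private` and `apply_key` are my own assumptions for illustration, there is no padding, and the key is far too small to be secure. Real applications use a vetted cryptographic library.

```python
# Toy "textbook" RSA for illustration only: no padding, tiny key, NOT secure.
# All values and names here are illustrative assumptions.
P = 2**61 - 1                        # a known Mersenne prime
Q = 2**89 - 1                        # another known Mersenne prime
N = P * Q                            # modulus shared by both keys
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

budi_public = (E, N)     # distributed to Budi's friends
budi_private = (D, N)    # kept by Budi


def apply_key(key, number):
    """Raise number to the key's exponent modulo N (the core RSA operation)."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = int.from_bytes(b"Hello Budi", "big")
assert message < N  # textbook RSA only works for numbers below the modulus

# Anton encrypts with Budi's public key; Budi decrypts with his private key.
ciphertext = apply_key(budi_public, message)
recovered = apply_key(budi_private, ciphertext)
print(recovered.to_bytes(10, "big"))

# The reverse direction also round-trips, which is what signatures rely on.
round_trip = apply_key(budi_public, apply_key(budi_private, message))
print(round_trip == message)
```

The same modular exponentiation is used in both directions; only the exponent differs, which is why one key undoes what the other did.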

In addition to decrypting messages encrypted with a public key, Budi's private key can also be used to create a digital signature to be applied to a document.

Here is an illustration,

Figure 7. Digital Signature by Utilising the Private Key

Here is the explanation,

The document to be digitally signed is hashed. A hash function maps input data of any size to a short, fixed-size value. Hashing is a common activity in the computer world and is used for various purposes including cryptography, compression, checksums and data indexing. The result is effectively unique to the input: if even one bit of the document changes, the hash result (message digest) changes as well. The message digest also cannot be turned back into the original data, which is why a hash function is called a one-way function.
Budi then uses his private key to encrypt the message digest to create a digital signature. At this stage we can see that the digital signature is specific to the document being signed and is also unique because it uses Budi's private key.
The resulting digital signature is then applied to the original document.
From the above diagram, we can already see the difference between electronic signatures and digital signatures, where digital signatures involve a cryptographic process (hashing) using an identity created using PKI (private key and public key).
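The three steps above can be sketched in Python as follows. This is a simplified illustration, not how a production signer works: the toy RSA key, the function names (`digest_as_int`, `sign`, `verify`) and the sample contract text are assumptions of mine, and reducing the digest modulo N is only needed because the toy key is so small. Real signatures use padded RSA or ECDSA from a vetted library.

```python
import hashlib

# Toy RSA key pair (illustrative assumption, NOT secure).
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))    # Budi's private exponent (Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Step 1: hash the document, then reduce it to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Step 2: transform the message digest with the private exponent."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Recover the digest with the public exponent and compare it
    with a freshly computed digest of the document."""
    return pow(signature, E, N) == digest_as_int(document)


contract = b"Budi agrees to pay Anton 100"
signature = sign(contract)           # Step 3: this value travels with the document

print(verify(contract, signature))                          # unchanged document
print(verify(b"Budi agrees to pay Anton 900", signature))   # tampered document
```

Because the signature is derived from the digest, it is tied to this exact document: changing a single character changes the digest, and the old signature no longer matches.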

Certificate Authority

To complete the understanding of digital signatures, we will add a discussion of another component in the PKI infrastructure, namely what is known as a "Certificate Authority (CA)".

Returning to the basic private and public key encryption scenario above, there is a problem that can arise, namely the question: if Budi's friends (Anton, Joko, Irma) receive Budi's public key so that they can send secret messages to Budi, how can they be sure that what they receive is really Budi's public key?

What if there is a friend of theirs named Jack who wants to do bad things to Budi by pretending to be Budi and then distributing the public key using Budi's name so that in the future when Budi's friends make a secret message, Jack can read the message?

Well, to prevent the above from happening, a trusted party must be appointed to validate the authenticity and validity of Budi's public key. So, if Budi (or an individual on Budi's behalf) used to be able to distribute public keys directly to his friends, for the sake of mutual security everyone agreed that public keys could only be trusted if there was a certificate "stamp" from one party.

This party is called a "Certificate Authority (CA)".

Figure 8. The role of CA (Certificate Authority) in PKI infrastructure

In the figure above, Budi sends his public key to the CA. The CA will then ask for additional information related to Budi, such as name information, department in the company, location where he works, etc. In addition to Budi's information, there is also information about the certificate issued by the CA, such as its expiration date, serial number, etc. The information on the certificate also serves as a control mechanism for the CA: if it is ever needed, the CA can revoke the certificate, so that anyone who later checks the validity of the certificate issued to Budi is told that it is no longer trusted.

The CA hashes the certificate data together with Budi's public key to create a message digest. The message digest is then encrypted using the CA's private key to create a digital signature. The digital signature is attached to the digital certificate, which is then distributed to Budi's friends.

Figure 9. CA Digital Signature Validation - Performed by Budi's friends

What will happen when Budi's friends receive Budi's digital certificate from the CA? Budi's friends like Anton, Joko and Irma will do the following:

Budi's friends need to verify that Budi's digital certificate sent by the CA really comes from the CA (authentication) and that the information in it has not changed (integrity). The cool term for this activity is that the digital signature mechanism guarantees the authentication of the data owner and the integrity of the data sent. To perform the validation, Budi's friends take the public key of the CA (the public key is freely available) and decrypt the digital signature of the CA, which is attached to the digital certificate. The result is a message digest (hash result of Budi's certificate + Budi's public key).
Budi's friends will then perform a hash function on Budi's certificate + Budi's public key data to get the message digest. If the two message digests are exactly the same, then we can be sure that Budi's digital certificate was indeed validated by the CA and has not changed. With this information, Budi's friends can be confident that the public key really belongs to Budi and not to someone else like Jack. They believe this because the CA is the party that does a kind of KYC (Know Your Customer), so the CA is believed to have directly validated Budi. Now that the public key is trusted to belong to Budi, his friends can use it to send secret messages to Budi.
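The issuing and validation flow can be sketched in Python as below. Everything here is a simplified assumption for illustration: real certificates use the X.509 format rather than JSON, the toy RSA keys are not secure, and the field names, serial number, expiry date and function names are hypothetical.

```python
import hashlib
import json


def make_keypair(p, q, e=65537):
    """Build a toy RSA key pair from two primes (NOT secure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # Python 3.8+
    return (e, n), (d, n)


def digest(data: bytes, modulus: int) -> int:
    """Hash the data and reduce it to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


ca_public, ca_private = make_keypair(2**61 - 1, 2**89 - 1)
budi_public, _budi_private = make_keypair(2**107 - 1, 2**127 - 1)


def issue_certificate(subject, subject_public_key):
    """CA side: hash (certificate data + public key), sign with the CA private key."""
    body = {"subject": subject, "public_key": list(subject_public_key),
            "serial": 1001, "expires": "2030-01-01"}   # hypothetical fields
    encoded = json.dumps(body, sort_keys=True).encode()
    d, n = ca_private
    return {"body": body, "signature": pow(digest(encoded, n), d, n)}


def validate_certificate(cert):
    """Friend side: recover the digest with the CA public key and compare it
    with a fresh hash of the certificate body."""
    encoded = json.dumps(cert["body"], sort_keys=True).encode()
    e, n = ca_public
    return pow(cert["signature"], e, n) == digest(encoded, n)


cert = issue_certificate("Budi", budi_public)
print(validate_certificate(cert))

# Jack swaps in his own key but cannot produce a matching CA signature.
forged = {"body": dict(cert["body"], public_key=[65537, 12345]),
          "signature": cert["signature"]}
print(validate_certificate(forged))
```

Note that the friends only need the CA's public key to run the check; the CA's private key never leaves the CA.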

This is how CAs participate in the PKI infrastructure. Understanding how the CA digital signature validation process works in the above flow is very helpful in understanding the types of attacks that can be applied to digital signature provider applications.

Are digital signatures the only application that can use the private key and public key schemes in the PKI infrastructure? The answer, of course, is no. There are many other applications, such as implementing SSL to secure HTTPS communications. Or, more interestingly, the implementation of the secure boot mechanism on the iPhone (later exploited by checkm8), as shown in the following figure.

When the iPhone is manufactured at the factory, Apple embeds the public key (Apple Public Key) in the BootROM so that the chain of trust mechanism applies, i.e. the iPhone will not be able to power on (boot) if the images inserted into the iPhone do not originate from Apple. How does Apple ensure that a secure image must come from Apple and not someone else's crack? By encrypting the image (the image here contains iBoot) with Apple's private key. Back to the asymmetric nature of PKI, where data such as iPhone images encrypted with a private key can be opened by the public key planted in the BootROM. Simple, right? 😊

What is Chain of Trust? As shown above, the next step in the boot process after running the iBoot image is to run the kernel. The kernel is also encrypted, with the key placed on iBoot, so that only kernels encrypted with the correct key can be run. And so on, hence the chain of trust.

The same principle can be applied to other devices, such as IoT devices, where an iPhone-style secure boot using PKI is used to ensure that the device only runs software from the official owner or official company.

So much for our discussion of PKI and its key components. Next we will discuss the structure of PDF documents.

Portable Document Format

Is PDF the only document that can be digitally signed? Of course not. There are many other document types. However, the vulnerability found by the Ruhr University research group is specific to PDFs (in fact, PDF documents are so common that it is interesting to study them).

To understand the nature of the vulnerability, it is first necessary to understand the structure of PDF files.

PDF files are actually very simple. PDF itself is a specification that was created by Adobe and then released to the public (it has been published as the ISO 32000 standard since 2008). By following this specification, application developers can create desktop applications or online (web) applications that can read files in the PDF format.

In our discussion this time, because it relates to digital signatures, then by following the PDF specification, the application can validate the authenticity of digital signatures.

Before we go any further, let's take a look at the following simple structure of a PDF file.

Figure 11. Simple Structure of PDF File Created Using a Text Editor

The contents of the file above were created using a normal text editor, for example: notepad. In the example above I used vim.

When opened using a PDF file reader application, the results are as follows.

Figure 12. File hello.pdf Opened With a PDF Reader Application

The following explains some parts of the PDF file structure above.

Figure 13. Simple Structure of a PDF File

The header section defines the PDF specification used by the file, in the example above the PDF 1.7 specification.

This is followed by the body section, which consists of several objects. The first object (1 0 obj) is of type Catalog, which means that it provides an initial mapping of the content structure of the PDF file. For example, in the PDF file above, the Catalog object contains information that the second object is an object with the category Pages. The third object is a Page, which is a page ready to be displayed by the PDF file reader application. The third object refers to the resources and also the content of the page. The Resources object (4th object) defines the font used on the page, while the 5th object defines the content of the page, for example the sentence "Hello, this is a sample PDF file", wrapped in a stream.

The next section contains the cross-reference table, which records the location of each object in the file as a byte offset. For example, the first object starts at byte offset 10 (the 11th byte of the file), so its entry in the cross-reference table is written as:

0000000010 00000 n
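Calculating these offsets by hand is error-prone, so here is a minimal Python sketch that assembles the same kind of five-object file and fills in the cross-reference table automatically. The function name `build_pdf` and the exact object contents are my own assumptions modelled on the structure described above. The offsets it produces depend on the exact bytes written (in this sketch the header is 9 bytes long, so the first object starts at offset 9 rather than 10).

```python
def build_pdf(objects):
    """Assemble header, body, cross-reference table and trailer.

    objects: list of bytes, the source of objects 1..n in order.
    """
    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))                 # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"              # entry 0 is the head of the free list
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset      # 10-digit offset, 5-digit generation
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


text = b"BT /F1 24 Tf 72 720 Td (Hello, this is a sample PDF file) Tj ET"
objects = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
    b"/Resources 4 0 R /Contents 5 0 R >>",
    b"<< /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Helvetica >> >> >>",
    b"<< /Length %d >>\nstream\n" % len(text) + text + b"\nendstream",
]
pdf = build_pdf(objects)
print(pdf.decode("latin-1"))
```

Writing the returned bytes to a file in binary mode should give a document that a PDF reader can open, and editing the objects while letting the script recompute the table is a convenient way to experiment with the format.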

To better understand the structure of PDF files, I suggest you try it directly by following the instructions at the following youtube link,

Now, a PDF file is very "rich" (it can define everything from JavaScript and multimedia video to text with different fonts), and, according to the PDF specification, it can also include a section that defines a digital signature.

It is this section that is used, and also exploited.

PDF Digital Signature

The digital signature feature in PDF files is implemented using another feature called Incremental Saving (or Incremental Update), which allows a PDF file to be modified by appending new data to the end of the file, without changing the bytes that are already there.

Figure 14. Incremental Update on PDF File

The figure above shows that PDF files can be updated. Earlier we saw the simple structure of a PDF file and how the cross-reference table defines the location of objects in the PDF file. PDF file reader applications such as Adobe Acrobat Reader only follow the information contained in the PDF file structure to display the contents of the PDF file or to do other things such as verify digital signatures.

Note that, after reading the header, PDF reader applications parse the file from the bottom up. This is why the startxref keyword, followed by the byte offset of the cross-reference table, is placed at the very end of the file, just before the %%EOF marker.
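To make the bottom-up reading concrete, the following Python sketch looks at the tail of a file, picks up the last startxref value and jumps to that offset. The function name `find_last_xref_offset`, the 1024-byte window and the tiny sample file are assumptions for illustration; real readers are far more tolerant of malformed files.

```python
import re


def find_last_xref_offset(pdf: bytes) -> int:
    """Return the byte offset announced by the final startxref keyword."""
    tail = pdf[-1024:]   # assumption: the keyword sits near the end of the file
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not matches:
        raise ValueError("no startxref found near the end of the file")
    return int(matches[-1])


# A tiny hand-made sample with a single object.
sample = (b"%PDF-1.7\n"
          b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
          b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
          b"trailer\n<< /Size 2 /Root 1 0 R >>\n")
xref_position = sample.index(b"xref")
sample += b"startxref\n%d\n%%%%EOF\n" % xref_position

offset = find_last_xref_offset(sample)
print(offset, sample[offset:offset + 4])
```

Taking the last match matters: after an incremental update a file contains several startxref keywords, and the reader starts from the most recent one.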

The digital signature feature uses the incremental update feature, as shown in the figure below. This allows a digital signature to be added to a PDF document after it has been created.

Figure 16. Simplified Illustration of Adding a Digital Signature to a PDF Document

The figure above shows that the PDF file is updated by adding data related to the digital signature, namely by adding a new Catalog section in which a new object called Signature is defined. Inside the Signature object there are entries such as the contents that will hold the digital signature itself, and other information such as the byte range.
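As a rough sketch of what such an update looks like at the byte level, the Python code below appends one object, a new cross-reference section and a new trailer to an existing file. The function name `append_incremental_update`, the simplified trailer (with `/Root 1 0 R` and `/Size` hard-wired) and the placeholder signature dictionary are assumptions of mine; a real signer writes a much richer update.

```python
import re


def append_incremental_update(pdf: bytes, obj_number: int, obj_body: bytes) -> bytes:
    """Append one new object without touching any existing byte."""
    prev_xref = int(re.findall(rb"startxref\s+(\d+)", pdf)[-1])
    out = bytearray(pdf)
    obj_offset = len(out)
    out += b"%d 0 obj\n" % obj_number + obj_body + b"\nendobj\n"
    xref_offset = len(out)
    # New xref section with a single subsection for the added object.
    out += b"xref\n%d 1\n%010d 00000 n \n" % (obj_number, obj_offset)
    # /Prev links back to the previous cross-reference table.
    out += b"trailer\n<< /Size %d /Root 1 0 R /Prev %d >>\n" % (obj_number + 1, prev_xref)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)


# Minimal one-object base file, built so that its startxref value is correct.
body = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
xref = b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
trailer = b"trailer\n<< /Size 2 /Root 1 0 R >>\n"
base = body + xref + trailer + b"startxref\n%d\n%%%%EOF\n" % len(body)

# Placeholder signature dictionary (hypothetical values, not a real signature).
placeholder_sig = b"<< /Type /Sig /ByteRange [0 0 0 0] /Contents <00> >>"
updated = append_incremental_update(base, 2, placeholder_sig)
print(updated[len(base):].decode("latin-1"))
```

The original file survives as an untouched prefix of the updated one, which is exactly what lets a signature cover the earlier bytes while later revisions are stacked on top.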

💡

PDF file reading applications, such as the Adobe PDF Reader desktop application, or online validation applications will attempt to verify digital signatures attached to PDF files using various implementation methods.

Why?

Because the PDF specification does not prescribe exactly how digital signature verification must be implemented, each developer has their own implementation method and their own logic. If the application uses a library, the result depends on the library's implementation. If that logic can be exploited, the PDF file is reported as having a valid digital signature when in fact it does not.

Attack on PDF signatures

A research group from the Ruhr University published their findings in early 2019. They performed a number of logical tests based on how digital signature verification is implemented by various desktop and online applications, and then attempted to exploit these implementations.

Let's take a look at some of the exploitation techniques they published. Basically, the activities performed are similar to fuzzing mechanisms, but performed at a high (logical) level.

Universal Signature Forgery (USF)

The main idea of USF is to defeat verification by providing invalid content in the signature object or removing references to the signature object.

Thus, despite the fact that the signature object is provided, the validation logic cannot apply the correct cryptographic operation.

As mentioned earlier, technically each digital signature on a PDF file is defined in a PDF signature object, e.g. 5 0 obj.

This object contains all the information needed to validate the signature. The most important part to exploit is the /ByteRange entry, which defines the byte ranges of the file over which the signature hash is calculated. The signature itself is typically stored in the /Contents entry as a PKCS#7 blob.
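The role of /ByteRange is easier to see in code. A /ByteRange array holds two (offset, length) pairs, and the gap between them is where the signature value in /Contents sits, so the signature does not have to sign itself. The sketch below is a simplification under my own assumptions: the helper names `signed_bytes` and `make_sample` are hypothetical, the numbers are zero-padded only to keep the sample's offsets stable, and the hex strings are placeholders rather than real PKCS#7 data.

```python
import hashlib
import re


def signed_bytes(pdf: bytes) -> bytes:
    """Return the concatenation of the two regions listed in /ByteRange."""
    match = re.search(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]", pdf)
    if match is None:
        raise ValueError("no /ByteRange entry found")
    a, b, c, d = (int(group) for group in match.groups())
    return pdf[a:a + b] + pdf[c:c + d]


def make_sample(contents_hex: bytes) -> bytes:
    """Build a fake signed file whose /ByteRange skips the /Contents value."""
    template = b"%%PDF-1.7\n5 0 obj\n<< /ByteRange [%010d %010d %010d %010d] /Contents "
    after = b" >>\nendobj\n%%EOF\n"
    gap = b"<" + contents_hex + b">"
    first_len = len(template % (0, 0, 0, 0))   # fixed width, so the length is stable
    return template % (0, first_len, first_len + len(gap), len(after)) + gap + after


sample_a = make_sample(b"00" * 8)
sample_b = make_sample(b"ab" * 8)

# The hash that gets signed is the same no matter what is stored in /Contents.
print(hashlib.sha256(signed_bytes(sample_a)).hexdigest())
print(hashlib.sha256(signed_bytes(sample_b)).hexdigest())
```

Everything a validator trusts about the document hinges on these four numbers, which is why the attacks in this article concentrate on them.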

USF attacks manipulate the entries in the signature object to confuse the signature validation logic, as shown in the figure below.

Figure 17. Variations on the exploitation of the Signature object to confuse the implementation logic of an application that verifies digital signatures of PDF files

If the exploit is successful, the desktop application (or online validation) will display panel-like information that the signature in the PDF is valid and belongs to a specific person or entity.

Here is an example of what it looks like when the signature from Amazon (invoicing@amazon.de) is valid, but the content of the PDF document has been replaced with $1 trillion in refund information.

Figure 18. Illustration of Successful Exploitation
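From the defender's side, the idea behind USF suggests a simple rule: if the signature object is malformed, the verdict must be "invalid", never "nothing to check". The Python sketch below illustrates that rule with a few sanity checks on a signature dictionary. The function name `usf_red_flags` and the specific checks are my own assumptions about the kind of manipulation described above; this is not the researchers' tooling and not a complete validator.

```python
import re


def usf_red_flags(sig_dict: bytes) -> list:
    """Return the reasons why a signature dictionary should be rejected outright."""
    problems = []

    byte_range = re.search(rb"/ByteRange\s*\[([^\]]*)\]", sig_dict)
    if byte_range is None:
        problems.append("/ByteRange missing or not an array")
    elif (len(re.findall(rb"-?\d+", byte_range.group(1))) != 4
          or b"-" in byte_range.group(1)):
        problems.append("/ByteRange is not four non-negative integers")

    contents = re.search(rb"/Contents\s*<([0-9A-Fa-f\s]*)>", sig_dict)
    if contents is None:
        problems.append("/Contents missing or not a hex string")
    elif not contents.group(1).strip(b"0 \r\n\t"):
        problems.append("/Contents is empty or all zeros")

    return problems


well_formed = b"<< /Type /Sig /ByteRange [0 100 200 50] /Contents <3082abcd> >>"
no_range = b"<< /Type /Sig /Contents <3082abcd> >>"
nulled = b"<< /Type /Sig /ByteRange null /Contents null >>"
zeroed = b"<< /Type /Sig /ByteRange [0 100 200 50] /Contents <00> >>"

for candidate in (well_formed, no_range, nulled, zeroed):
    print(usf_red_flags(candidate))
```

A validator built along these lines would refuse to show a "valid signature" panel whenever the list is non-empty, instead of silently skipping the cryptographic check.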

Incremental Saving Attack (ISA)

Under normal circumstances, incremental saving is used to add annotations to a PDF file, for example. The annotations themselves are incrementally saved after the original content of the PDF as part of the body of the new PDF.

In the ISA exploit class, the attacker takes a PDF that has been signed by the original person. They then add new content (pages, annotations, etc.) and save it at the end of the file using incremental updates.

This mechanism is essentially not a type of attack, but a feature of the PDF itself. However, the vulnerability arises when the signature validation logic does not recognise that the file content has been updated, i.e. new unsigned content has been added to the file.

The following are some variations that can be tested to see if the implementation of a PDF reader application is vulnerable or not,

If the PDF reader or validation application is vulnerable, the new, unsigned content is displayed and the application does not notice that the document has been changed or updated.

Again, the bug in this category is due to the implementation of the PDF file reader application.
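The most basic check against this class is to ask whether the signed byte range reaches the end of the file. The Python sketch below counts the bytes that come after the last signed region. It is a simplified illustration under my own assumptions: the function name `bytes_outside_signature` and the sample files are hypothetical, only the first /ByteRange in the file is inspected, and a check this simple would not by itself cover every variation of the attack.

```python
import re


def bytes_outside_signature(pdf: bytes) -> int:
    """Count the bytes that follow the last region covered by /ByteRange."""
    match = re.search(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]", pdf)
    if match is None:
        raise ValueError("no /ByteRange entry found")
    _, _, second_offset, second_length = (int(group) for group in match.groups())
    return len(pdf) - (second_offset + second_length)


# Fake signed file whose /ByteRange covers everything except the /Contents value.
template = b"%%PDF-1.7\n<< /ByteRange [%010d %010d %010d %010d] /Contents "
gap = b"<" + b"00" * 8 + b">"
tail = b" >>\n%%EOF\n"
first_len = len(template % (0, 0, 0, 0))
signed = template % (0, first_len, first_len + len(gap), len(tail)) + gap + tail

# The attacker appends an incremental update after the signed region.
extra = b"6 0 obj\n<< /Note (unsigned content) >>\nendobj\n%%EOF\n"
attacked = signed + extra

print(bytes_outside_signature(signed))
print(bytes_outside_signature(attacked))
```

A non-zero result does not always mean an attack, because legitimate updates after signing exist, but it does mean the application must tell the user that the document changed after it was signed.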

Final Thoughts

For more detailed information on the different types of digital signature attacks or exploits on PDF files, please visit the pdf-insecurity website.

Of course, new types of vulnerabilities can still be found because, once again, the success of exploitation is highly dependent on the implementation of each application.

Hopefully the information presented in this article can serve as a reference for colleagues who work as security auditors, for penetration testers tasked with auditing digital signature validation applications, or for bug hunters who have their eyes on bounties from digital signature vendors around the world, so that the report is not just about XSS ;).

The auditing process usually requires a PoC (Proof Of Concept). The Ruhr University research team has provided several exploits as PoCs. We can test them and also modify them if necessary. This is one of the reasons why I explained the basic structure of a PDF file and how to create one manually: that knowledge can then be used to create custom exploits. Readers who are developing digital signature applications, or who have access to electronic certificate providers, can create their own exploits as PoCs for testing.

As the Ruhr University team stated, they are coordinating with the German CERT agency to inform various application developers of this type of vulnerability, both desktop applications and online applications.

For desktop applications, the responsibility for updating will be put back on the individual user, but for online applications it can be coordinated with the application developer.

In this new normal era, the need for digital signatures will increase, so providers need to pay attention to security threats that can undermine the digital signature validation process, because a broken validation process will have a negative impact on many parties.

-- MRS