Establishing Correspondence Between an Application and its Source Code

How Combining Two Completely Separate Open Source Projects Can Make Us All More Secure

When you run an application, how can you verify that what you are running was actually built from the code that a trusted developer wrote?

Imagine that an open source developer writes the code for a piece of software and publishes it on GitHub or another website. You, the end user, can download the software right from that location, compile it, and run it. You believe that what you’re running is what the developer wrote for one of two reasons: you cloned the repo via a secured channel, or you verified the source package that you downloaded from the project website by comparing its hash against the value that the developer published and signed with a PGP key. (Despite what security folk want to believe, it is often claimed, hyperbolically, that nobody actually does this.)

When you get the application from a distribution rather than from the developer, you are trusting that the distribution performed the same validation that you would have done (or better). Now you, as the end-user, get the installable package from the distribution. You install it on your system and you run the executable contained within the package. You don’t have to compile it because that is one of the value-added functions of the distributor. The package management system handles the verification of the package before it is installed.

Now, as the end-user, how do you know that the executable that you are running is what the open source developer originally wrote? Many distributions alter the code before they ship it in order to apply bug patches and security fixes. How does the distributor know that the code in the package was compiled from the developer’s code plus their own fixes? In most cases, distributions have automated build systems that take the code and emit packages. But what if one of the machines in the build system is corrupted? How do the developer, the distributor, and you figure this out?

The Reproducible Builds Project intends to solve that problem. The goal of the Reproducible Builds Project is to ensure that if a package is built on one system and then again on a different but similar system, the outputs are directly comparable. This can be verified using the diffoscope tool produced by the project or, in many cases, by comparing cryptographic hashes of the two packages to ensure that they are identical.
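
The hash comparison mentioned above can be sketched in a few lines of Python. This is only an illustration, not the project's tooling: the function names `sha256_of` and `builds_match` are made up for this example, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """True when two independently built packages are bit-for-bit identical."""
    return sha256_of(first) == sha256_of(second)
```

A matching digest says the two builds are identical. A mismatch only says that they differ, which is where a tool such as diffoscope helps by showing what differs.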

It is harder to achieve reproducibility than it is to verify it, because many packages are not reproducibly buildable for a variety of reasons. These reasons include timestamps embedded somewhere in the package, differing build directories, differences in the versions of the build tools, variable directory inclusion, and other small details of the build system that end up in the final package.
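
To make the timestamp problem concrete, here is a minimal sketch that packs files into a tar archive with the file order, timestamp, and ownership pinned, so that the same inputs always produce the same bytes. The helper name `build_archive` is hypothetical, and a tar archive merely stands in for a real package format. The pinned timestamp could come from the `SOURCE_DATE_EPOCH` environment variable, a convention published by the Reproducible Builds project.

```python
import io
import tarfile


def build_archive(files: dict[str, bytes], mtime: int) -> bytes:
    """Pack files into an uncompressed tar archive with normalized metadata."""
    buffer = io.BytesIO()
    # USTAR keeps to fixed-size headers with no extended records.
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):       # fixed order, independent of dict order
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = mtime           # pinned timestamp instead of "now"
            info.uid = info.gid = 0      # no trace of the building user
            info.uname = info.gname = ""
            info.mode = 0o644
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()
```

Calling the function twice with the same files and the same `mtime` yields identical output, while changing only `mtime` changes the archive. That is the kind of incidental difference that breaks a hash comparison.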

The Reproducible Builds project is writing tools to facilitate comparison of two separately generated packages, fixing toolchain issues, and working with the upstream developers to fix any problems which cause the source not to be reproducibly buildable. Subtle corruptions of individual machines in a build system become detectable when previously reproducibly buildable packages suddenly start failing verification.

In summary, the Reproducible Builds project is working to ensure the package will be identical to packages built on another similar system using the same source code so that the integrity of the package can be validated. Packages that can be reproducibly built include deb and RPM packages, which carry within them the cryptographic hashes of the files that are included in the packages. This includes the executable as well.
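
As a rough illustration of how per-file hashes carried in a package can be used, the sketch below checks installed files against a manifest. The manifest layout (one `digest  path` pair per line, SHA-256) and the name `verify_manifest` are assumptions made for this example. They are not the actual deb or RPM formats.

```python
import hashlib
from pathlib import Path


def verify_manifest(manifest_text: str, root: Path) -> list[str]:
    """Return the relative paths whose contents do not match the manifest."""
    mismatches = []
    for line in manifest_text.splitlines():
        if not line.strip():
            continue
        expected, relative = line.strip().split(maxsplit=1)
        target = root / relative
        # A missing file counts as a mismatch.
        actual = (
            hashlib.sha256(target.read_bytes()).hexdigest()
            if target.is_file()
            else None
        )
        if actual != expected:
            mismatches.append(relative)
    return mismatches
```

An empty result means that every listed file matches its recorded hash. Note that such a check is only as trustworthy as the manifest itself, which is why the signature on the package matters.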

The Integrity Measurement Architecture (IMA-appraisal) component of the Linux kernel has the capability of validating a file’s integrity based on the file’s signature stored as an extended attribute, before allowing the file to be accessed (for example, before a file can be executed or a library loaded). For the signature validation to succeed, the file signature’s public key must be on the IMA keyring. Only trusted keys, those keys signed by a key on the system keyring, may be added to the IMA keyring.
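
On Linux, the extended attribute that IMA-appraisal consults is named `security.ima`. The following sketch only reads that attribute to see whether a file carries one. It does not validate the signature, which is the kernel's job. The function name `read_ima_xattr` is hypothetical, and the code assumes a platform where Python exposes `os.getxattr`.

```python
import os
from typing import Optional


def read_ima_xattr(path: str) -> Optional[bytes]:
    """Return the raw security.ima attribute of a file, or None if absent."""
    if not hasattr(os, "getxattr"):      # os.getxattr is only available on Linux
        return None
    try:
        return os.getxattr(path, "security.ima")
    except OSError:
        # No such attribute, no xattr support on this filesystem, or no access.
        return None
```

A `None` result on a system that is expected to enforce IMA suggests that the file was never labeled, for example because it was not installed through the package manager.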

Soon it will be possible to enroll the signed hashes from the package management system as IMA attributes during the installation process. Then, if you configure your system to be IMA enforcing, you will know that every running application came from your trusted distribution.

If your trusted distribution uses reproducible builds, then you will be able to directly trace the chain of integrity of the executing process back to the original code and know that the code has not been subverted during delivery. Of course, you still have to protect your system keyring, IMA keyring and policy, trust the compiler, and trust (or validate) that the developer is producing code without backdoors. What you gain is verifiable evidence of correspondence between the executing process and the source code, which is a new level of integrity.

This model of integrity is not completely realized today. Some steps are not yet complete: IMA is still considered experimental; patches to the Debian packaging system to include signed file hashes have been submitted but not yet accepted; and the Reproducible Builds Project has made amazing strides, but its work is not yet finished.

This vision for a full chain of integrity from developer to executing process is taking shape and is tantalizingly close to realization. I look forward to the day when we can run our applications with the full knowledge that they come to us intact and as intended by the developer or the distribution.
