TL;DR: Badges are not magic. They are just image hotlinks, and therefore you need to be able to trust the third party who serves them.
Badges are those cute little colourful rectangles that you see at the top of many READMEs. They often display things like the current version of the software, number of downloads, code coverage, results of vulnerability scans, build status, etc.
How do badges work?
Badges are simply images, nothing more. When you add a badge to your README, you add some code that looks like this:
[![pypi](http://img.shields.io/pypi/v/pyhocon.png)](https://pypi.python.org/pypi/pyhocon)
which is rendered as HTML code that looks like:
<a href="https://pypi.python.org/pypi/pyhocon" rel="nofollow"> <img src="http://img.shields.io/pypi/v/pyhocon.png" alt="pypi"> </a>
(I know that’s not exactly how it’s rendered on GitHub. For example, GitHub proxies the images through camo. I’ve simplified the example for clarity.)
What happens when a user visits your README? Their browser sees these image tags and makes additional requests to fetch those images from the badge provider. Then the badge provider serves the images and the badges appear.
What’s the risk?
What happens when a badge provider:
- goes out of business and someone else buys the domain?
- decides they don’t want to pay for the bandwidth anymore?
- injects ads into badges to pay for the bandwidth?
- sells tracking data about your users to pay for the bandwidth?
- gets hacked to inject malware into badges?
In all of these cases, your users could be exposed to “bad stuff”. We’ve actually seen examples of similar scenarios in the past:
pypip.in went out of business and all their badges broke. Eventually a benevolent developer bought the domain and is now serving redirects to the shields.io badge service. The domain could just as easily have been bought by someone with malicious intent.
A photographer was tired of paying bandwidth costs and swapped out a politician’s banner image with a photoshopped image of the politician having sex. Any time you ‘hotlink’ to an image hosted by a third party, you are fundamentally trusting them not to swap out the contents.
CircleCI recently changed their badges to include an advertisement that their 1.0 API was being sunset. While this case was well intentioned, it could easily have been something less charitable. To their credit, they realised their mistake and reverted to the regular badges.
There have been many instances of hackers injecting malware as part of XSS attacks on advertising or other images.
I’m concerned… what should I do?
- Don’t panic! There’s probably no need to be worried.
- Take a look at the badges that your projects are currently using and see who you are implicitly trusting today.
- Make a conscious decision about whether the value provided by the badge outweighs the potential risks of allowing third-parties to host them.
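As a starting point for that audit, here is a minimal Python sketch that lists which hosts a README's images are loaded from. It assumes the README is Markdown and only looks at inline image syntax (`![alt](url)`); raw HTML `<img>` tags and reference-style images are not covered. The function name `badge_hosts` and the sample text are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches inline Markdown images: ![alt](url) or ![alt](url "title").
IMAGE_PATTERN = re.compile(r'!\[[^\]]*\]\(\s*([^)\s]+)')


def badge_hosts(readme_text):
    """Count remote image URLs in a Markdown README, grouped by host."""
    hosts = Counter()
    for url in IMAGE_PATTERN.findall(readme_text):
        host = urlparse(url).hostname
        if host:  # relative paths (images stored in the repo) have no host
            hosts[host] += 1
    return hosts


# Hypothetical README content, reusing the badge from earlier in the post.
sample = (
    "[![pypi](http://img.shields.io/pypi/v/pyhocon.png)]"
    "(https://pypi.python.org/pypi/pyhocon)\n"
    "![logo](docs/logo.png)\n"
)

for host, count in badge_hosts(sample).most_common():
    print(f"{host}: {count} image(s)")
```

Every host in the output is a third party you are implicitly trusting. Images stored in the repository itself are skipped because they have no host.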
In some cases, you might simply decide to drop the badge from your README.
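In cases where the number of unique badge values is finite (e.g. build passed or failed), you might be able to apply Subresource Integrity (SRI) values to the badge. This would prevent badge providers from returning anything other than the expected badges. Note that browsers currently enforce the integrity attribute only on script and link elements, not on images, so the check would have to run somewhere else, such as a proxy. I also suspect that this would prove too brittle/infeasible in practice (let me know).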
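To make the idea concrete, here is a rough Python sketch of an integrity check against a fixed set of expected badges. It assumes the check runs somewhere you control (for example a proxy or a CI job) and that you have already saved the exact bytes of each acceptable badge. The badge bodies and the function names are hypothetical.

```python
import base64
import hashlib


def integrity_value(content: bytes) -> str:
    """Return an SRI-style string: 'sha384-' plus the base64 SHA-384 digest."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def is_expected_badge(content: bytes, allowed: set) -> bool:
    """True only if the served bytes hash to one of the known-good values."""
    return integrity_value(content) in allowed


# Hypothetical badge bodies standing in for the real "passed" and "failed" images.
passing = b"<svg><!-- build: passing --></svg>"
failing = b"<svg><!-- build: failing --></svg>"
allowed = {integrity_value(passing), integrity_value(failing)}

print(is_expected_badge(passing, allowed))                    # True
print(is_expected_badge(b"<svg><!-- ad --></svg>", allowed))  # False
```

This also shows why the approach is brittle: if the provider changes a single byte of the image, such as a font tweak or a different compression setting, the hash no longer matches and a perfectly legitimate badge gets rejected.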
Who can I trust?
This will depend on your project’s needs.
In my mind, the large badge providers are more trustworthy, as they usually have a source of revenue that relies on continued customer/user trust.
shields.io is probably the largest badge provider today. It’s an open-source community project, funded through donations, that doesn’t do any tracking of user requests. Personally, I consider it to be trustworthy for my projects. Of course, being the largest also makes it the most valuable target for hackers.
(DISCLAIMER: I have previously donated to shields.io)
If you simply don’t like the idea of trusting any third party badge providers, you can always host your own instance of the open-source shields.io server.
Badges are a security vulnerability, and maintainers of projects should frame their thinking around them in that way. You shouldn’t necessarily be worried, but you should definitely be aware of the risks they pose to your project.
The open-source community at large is becoming more aware of security risks throughout the ecosystem, from dependencies to reproducible builds. Treating README badge providers as a calculated risk is just another part of the overall hardening process.
Thanks to Kevin Ulug for helping edit this post.