url - Why is Spring de-coding + (the plus character) on application/json get requests? and what should I do about it?


asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have a Spring application that receives a request like http://localhost/[email protected]. This triggers a controller that roughly looks like this:

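import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/foo")
public class FooController extends Controller {
    @GetMapping
    public void foo(@RequestParam("email") String email) {
        // Prints the value after the query string has already been decoded
        System.out.println(email);
    }
}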

By the time I can access email, it has been converted to foo [email protected] instead of the original [email protected]. According to "When to encode space to plus (+) or %20?", this should only happen in requests whose content type is application/x-www-form-urlencoded. My request has a content type of application/json. The full MIME headers of the request look like this:

=== MimeHeaders ===
accept = application/json
content-type = application/json
user-agent = Dashman Configurator/0.0.0-dev
content-length = 0
host = localhost:8080
connection = keep-alive

Why is Spring then decoding the plus as a space? And if this is the way it should work, why isn't it encoding pluses as %2B when making requests?

I found this bug report about it: https://jira.spring.io/browse/SPR-6291, which may imply that this was fixed in version 3.0.5, and I'm using Spring > 5.0.0. It is possible that I am misinterpreting something about the bug report.

I also found this discussion about RestTemplate treatment of these values: https://jira.spring.io/browse/SPR-5516 (my client is using RestTemplate).

So, my questions are: why is Spring doing this? How can I disable it? Should I disable it, or should I encode pluses on the client, even if the requests are JSON?

Just to clarify, I'm using neither HTML nor JavaScript anywhere here. There's a Spring REST controller, and the client is Spring's RestTemplate with UriTemplate or UriComponentsBuilder, neither of which encodes the plus sign the way Spring decodes it.


1 Answer

深蓝 · Answer 1 · 2021-10-23T19:22:52+0000

Original Answer

You are mixing two things. A + in the body of a request means a space when the Content-Type header is application/x-www-form-urlencoded. How the body is interpreted depends on the headers, but a request can consist of just a URL, with no headers and no body.

So the encoding of a URI cannot be controlled by any headers.

See the URL Encoding section in https://en.wikipedia.org/wiki/Query_string

Some characters cannot be part of a URL (for example, the space) and some other characters have a special meaning in a URL: for example, the character # can be used to further specify a subsection (or fragment) of a document. In HTML forms, the character = is used to separate a name from a value. The URI generic syntax uses URL encoding to deal with this problem, while HTML forms make some additional substitutions rather than applying percent encoding for all such characters. SPACE is encoded as '+' or "%20".[10]

HTML 5 specifies the following transformation for submitting HTML forms with the "get" method to a web server.[1] The following is a brief summary of the algorithm:

- Characters that cannot be converted to the correct charset are replaced with HTML numeric character references[11]
- SPACE is encoded as '+' or '%20'
- Letters (A–Z and a–z), numbers (0–9) and the characters '*','-','.' and '_' are left as-is
- All other characters are encoded as %HH hex representation with any non-ASCII characters first encoded as UTF-8 (or other specified encoding)

The octet corresponding to the tilde ("~") is permitted in query strings by RFC3986 but required to be percent-encoded in HTML forms to "%7E".

The encoding of SPACE as '+' and the selection of "as-is" characters distinguishes this encoding from RFC 3986.
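
As a rough illustration of the rules quoted above, the JDK's java.net.URLEncoder and java.net.URLDecoder implement the application/x-www-form-urlencoded conventions, so they can be used to see the substitutions in isolation. The class name FormEncodingDemo and the sample address foo+bar@example.com are hypothetical and only serve the example.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class FormEncodingDemo {
    // Encode a value using the HTML form (x-www-form-urlencoded) rules.
    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Decode a value using the same rules: '+' becomes a space, %HH becomes a byte.
    static String dec(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(enc("a b"));    // space is written as '+'
        System.out.println(enc("a+b"));    // a literal plus has to become %2B
        System.out.println(enc("*-._"));   // the "as-is" characters stay unchanged
        System.out.println(enc("~"));      // tilde is percent-encoded under form rules
        // Hypothetical address: an unencoded '+' is read back as a space.
        System.out.println(dec("foo+bar%40example.com"));
        System.out.println(dec("foo%2Bbar%40example.com"));
    }
}
```

The last two lines show the asymmetry behind the question: a plus that was sent unencoded comes back as a space, while %2B comes back as a plus.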

The same behaviour can be observed on google.com.

Other frameworks, such as Python Flask, behave the same way.

So what you are seeing is correct. You are just comparing it with a document that refers to the body content of a request, not the URL.
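
Given that the server reads a + in the query string as a space, one option on the client is to percent-encode the parameter value before it goes into the URL, so that a literal plus travels as %2B. The following is only a sketch using JDK classes: the class PlusSafeQuery, the method buildUri, and the address foo+bar@example.com are hypothetical, and the host and path are taken from the headers and mapping shown in the question.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class PlusSafeQuery {
    // Builds a URI whose single query parameter value is percent-encoded up front,
    // so '+' is sent as %2B and '@' as %40.
    static URI buildUri(String base, String name, String value) {
        String encoded = URLEncoder.encode(value, StandardCharsets.UTF_8);
        // URI.create parses the string as-is and keeps existing %HH escapes.
        return URI.create(base + "?" + name + "=" + encoded);
    }

    public static void main(String[] args) {
        // Hypothetical address; host and path follow the question.
        URI uri = buildUri("http://localhost:8080/foo", "email", "foo+bar@example.com");
        System.out.println(uri);
        System.out.println(uri.getRawQuery());
    }
}
```

When the client is RestTemplate, handing it a ready-made java.net.URI rather than a String template is the usual way to avoid the value being encoded a second time, although that is worth verifying against the Spring version in use.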

Edit-1: 22nd May

After debugging, it seems the decoding doesn't even happen in Spring. It happens in the UDecoder class in the org.apache.tomcat.util.buf package:

/**
 * URLDecode, will modify the source.
 * @param mb The URL encoded bytes
 * @param query <code>true</code> if this is a query string
 * @throws IOException Invalid %xx URL encoding
 */
public void convert( ByteChunk mb, boolean query )
    throws IOException
{
    int start=mb.getOffset();

And below is where the conversion stuff actually happens

    if( buff[ j ] == '+' && query) {
        buff[idx]= (byte)' ' ;
    } else if( buff[ j ] != '%' ) {

This means that the embedded Tomcat server performs this translation, and Spring doesn't participate in it at all. Judging by the class code, there is no configuration option to change this behaviour, so you have to live with it.
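
To make the quoted branch easier to follow, here is a minimal standalone sketch of the same idea: a + becomes a space only when the input is flagged as a query string, and %HH sequences are turned back into bytes. This is not Tomcat's implementation; the class PlusDecodeSketch and its decode method are hypothetical, and the sketch assumes ASCII input.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class PlusDecodeSketch {
    // Simplified URL decoding: '+' is mapped to ' ' only when query is true.
    static String decode(String s, boolean query) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '+' && query) {
                out.write(' ');
            } else if (c != '%') {
                out.write(c); // assumption: input is plain ASCII
            } else {
                // A '%' must be followed by exactly two hex digits.
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Truncated %xx sequence");
                }
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid %xx sequence");
                }
                out.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical inputs: same text, decoded as a query string and as a path.
        System.out.println(decode("foo+bar%40example.com", true));
        System.out.println(decode("foo+bar%40example.com", false));
    }
}
```

With query set to false the plus survives, which matches the query flag in the excerpt: the substitution applies to query strings only.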
