> The QUERY method provides a solution that spans the gap between the use of GET and POST. As with POST, the input to the query operation is passed along within the content of the request rather than as part of the request URI. Unlike POST, however, the method is explicitly safe and idempotent, allowing functions like caching and automatic retries to operate.
The practical limits (i.e., not standard-specified limits) on query param size seem fair, but it's worth mentioning that there are practical limits on body size as well, introduced by web servers, proxies, cloud API gateways, etc., or ultimately by hardware. From what I can see this is something like 2 KB for query params at the low end (the old 2,083-character limit in Internet Explorer, for example) and 10-100 MB for body size (but highly variable, and potentially only hardware-bound).
In both cases it's worth stating that the spec is being informed by practical realities outside of the spec. In terms of the spec itself, there is no limit on either body size or query param length. How much should the spec be determined by particular implementation details of browsers, cloud services, etc.?
With a request body you can also rely on all the other standard semantics that request bodies afford, such as different content types and content encodings. Query parameters are also often considered less secure for things like PII, and (less important these days) they don't really define a character encoding.
But generally the most important reason is that you can take advantage of the full range of MIME types, and yes, practically speaking, there's a limit on how much you should stuff into a query parameter.
This has resulted in tons of protocols using POST instead, such as GraphQL. QUERY is a nice middle ground.
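For concreteness, this is roughly what that middle ground could look like from a client, sketched with fetch(); the endpoint and payload shape here are hypothetical, and the server has to implement QUERY for any of this to work:

```typescript
// Sketch only: QUERY is still an Internet-Draft, and this endpoint and
// payload shape are made up. The server must implement QUERY.
async function searchProducts() {
  const response = await fetch("https://api.example.com/products/search", {
    method: "QUERY", // declared safe and idempotent, unlike POST
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      // Structured input that would be painful to url-encode into a query string.
      filter: { category: "books", price: { lt: 20 } },
      sort: ["-published", "title"],
      limit: 100,
    }),
  });
  return response.json();
}
```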
- URI length limits (not just browsers, but also things like Jersey)
- q-params are ugly and limiting by comparison to an arbitrarily large and complex request body with a MIME type
However, q-params are super convenient because the query is then part of the URI, which you can then cut-n-paste around. Of course, a convenient middle of the road is to use a QUERY which returns not just query results but also a URI with the query expressed as q-params so you can cut-n-paste it.
This avoids changing the definition of GET. Who knows how many middleboxes would mess with things if you did that, because they “know” what GET means and so their thing is “safe”.
Until GET changes.
People aren’t using QUERY yet so that problem doesn’t exist.
> Unlike POST, however, the method is explicitly safe and idempotent, allowing functions like caching and automatic retries to operate.
A shitload of answers to GET requests are stale, though, even when they're cacheable. If you issue a GET for a page that contains stuff like "Number of views: xxx" or "Number of users logged in: yyy" or "Last updated on: yyyy-mm-dd", there goes idempotency.
Some GET requests are actually idempotent ("Give me the value of AAPL at close on 2021-07-21") but many aren't.
Stale data won't break much but there's a world between "it's an idempotent call" and "I'm actually likely to be seeing stale data if I'm using a cached value".
I mean... enter "https://example.org" in your browser; that's a GET. And that's definitely not idempotent for most sites we're all using on a daily basis.
Idempotence does not mean immutability, it means that two or more identical operations have the same effect on the resource as a single one. Since GET operations, by virtue of also being safe, generally have no effect at all, this is almost always trivially true. Just because the resource's content changed for some other reason doesn't mean GET is not idempotent.
Idempotency in the context of HTTP requests isn't about the response you receive, but about the state of the resource on the server. You're supposed to be able to call GET /{resource}/{id} any number of times without the resource {id} being different at the end. You can't do the same for POST, as POST /{resource} could create a new entity with every call.
A view counter also doesn't break this, as the view counter isn't the resource you're interacting with. As long as you're not modifying the resource on a `GET` call, the request is idempotent.
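A minimal sketch of that distinction, using Node's built-in http module with a hypothetical in-memory store:

```typescript
import { createServer } from "node:http";

// Hypothetical in-memory store, just to illustrate the distinction above.
const articles = new Map<string, { title: string }>();
const viewCounts = new Map<string, number>();

createServer((req, res) => {
  const match = /^\/articles\/(\w+)$/.exec(req.url ?? "");
  if (req.method === "GET" && match) {
    // Safe and idempotent with respect to the resource: the article itself is
    // never modified here. Bumping a separate view counter is server-side
    // bookkeeping, not a change to the resource that was requested.
    viewCounts.set(match[1], (viewCounts.get(match[1]) ?? 0) + 1);
    const article = articles.get(match[1]);
    res.writeHead(article ? 200 : 404, { "Content-Type": "application/json" });
    res.end(JSON.stringify(article ?? { error: "not found" }));
  } else if (req.method === "POST" && req.url === "/articles") {
    // Not idempotent: every call creates a new entity.
    const newId = String(articles.size + 1);
    articles.set(newId, { title: "untitled" }); // body parsing elided
    res.writeHead(201, { Location: `/articles/${newId}` });
    res.end();
  } else {
    res.writeHead(405);
    res.end();
  }
}).listen(8080);
```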
I suppose the counterargument is that "users logged in" or "views" is stale all the time, because it could have changed while the response was in flight.
If you really need live data for "views" or what have you, then perhaps the front end should be querying the backend repeatedly via a separate POST.
I was just working with OpenSearch's API recently (https://opensearch.org/docs/latest/api-reference/search/), which sort of abuses the semantics of the HTTP spec by using GET with a body to perform search queries, largely to solve a similar problem. A "QUERY" message type could easily replace the GET-with-a-body used by OpenSearch and others. I'd go as far as to argue that's largely what this new QUERY type is: official recognition of a "GET" request with a body.
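For illustration, the swap looks roughly like this; the index name and query DSL are just an example, and whether OpenSearch itself would ever adopt QUERY is of course an open question:

```typescript
// Today: an OpenSearch-style search is a GET (or POST) carrying a JSON body.
const searchBody = JSON.stringify({
  query: { match: { title: "http query method" } },
  size: 10,
});

// Hypothetically, the same operation with the draft QUERY method: same body,
// same endpoint, but the method itself now declares the request safe and
// idempotent, so caches and retry logic can treat it accordingly.
const res = await fetch("https://search.example.com/articles/_search", {
  method: "QUERY",
  headers: { "Content-Type": "application/json" },
  body: searchBody,
});
console.log(await res.json());
```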
It's not abuse of the spec. The spec says that GET is allowed to have a body. Whether an endpoint must honor it is left undefined, and this has led people to believe, wrongly, that that means endpoints are not allowed to expect it. But certainly the endpoint itself defines what it does with your requests, and that's what actually matters.
QUERY is just GET once more with feeling, because people are worried about existing software that made the wrong decision about denying or stripping GET bodies and would need to be updated to allow them through, and detecting an unimplemented verb is I guess maybe simpler than detecting some middle layer being a dick and dropping your data.
This is not a correct interpretation of the spec, as far as I can tell. I did some more research on this, plus primary sources, on my blog if you're interested in why, and why people are confused about this: https://evertpot.com/get-request-bodies/
I don't like using the body. GETs are easily shareable, bookmarkable, even editable by humans, thanks to query strings. I'd rather have a GET with better (shorter) serialization for parameters and a standardized max length of 16k or something like that.
>A server MAY create or locate a resource that identifies the query operation for future use. If the server does so, the URI of the resource can be included in the Location header field of the response (see Section 10.2.2 of [HTTP]). This represents a claim that a client can send a GET request to the indicated URI to repeat the query operation just performed without resending the query parameters.
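A minimal sketch of how a client might use that optional behaviour, assuming a server that implements it (the endpoint and query shape are hypothetical):

```typescript
// 1. Run the query once, with the parameters in the request content.
const first = await fetch("https://api.example.com/contacts", {
  method: "QUERY",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ family_name: "Smith" }), // made-up query shape
});

// 2. If the server created a resource identifying this query, it MAY
//    advertise it via Location; that URI can then be bookmarked, shared,
//    and re-fetched with a plain GET, no body required.
const savedQuery = first.headers.get("Location");
if (savedQuery) {
  const repeat = await fetch(new URL(savedQuery, "https://api.example.com"), {
    method: "GET",
  });
  console.log(await repeat.json());
}
```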
I can't stop thinking about how HTTP REST is just an abuse of the original design of HTTP, and how the whole thing of using HTTP verbs to add meaning to requests is not a very good abstraction for RPC calls.
The web evolved from being a tool to access documents on directories to this whole apps in the cloud thing and we kept using the same tree abstraction to sync state to the server, which doesn't make a lot of sense in lots of places.
Maybe we need a better abstraction to begin with, something like discoverable native RPCs using a protocol designed for it like thrift or grpc.
HTTP as a pure transport protocol keeps coming back as the default, because it works. Its superpower is that it pushes stateless (and secure) design from end to end. You have to fight a losing battle to get around that, so if you play along, you can end up with better power efficiency, resiliency and scalability.
REST is just very simple to understand and easy to prototype with. There's better abstractions on top of HTTP, like GraphQL and gRPC (as you mentioned), but you can layer those on after you have a working solution and are looking for more performance.
HTTP/3 is on the way this decade and I'm excited about its promises. Given how long it took HTTP/2 to standardize, I'm not optimistic it will be soon, but it does mean we have a path forward.
HTTP/2 has already been mostly replaced by HTTP/3, IIRC. We now mostly have a split between HTTP/1.1 and HTTP/3, if I'm remembering an article I read correctly.
> The web evolved from being a tool to access documents on directories to this whole apps in the cloud thing and we kept using the same tree abstraction to sync state to the server, which doesn't make a lot of sense in lots of places.
The first part is correct, but hard disagree on the rest. HTTP makes a lot of sense for RPC-ish things because a) it can do those things better than RPC, b) HTTP can do things that RPC typically can't (like: content type negotiations and conversions, caching, online / indefinite length content transmission, etc).
HTTP is basically a filesystem access protocol with extra semantics made possible by a) headers, b) MIME types, and if you think of some "files" as active/dynamic resources rather than static resources, then presto, you have a pretty good understanding of HTTP. ("Dynamic" means code processes a request and produces response content, possibly altering state, that a plain filesystem can't possibly do. An RDBMS is an example of "dynamic", while a filesystem is an example of "static".)
REST is quite fine. It's very nice actually, and much nicer than RPC. And everything that's nice about REST and not nice about RPC is to do with those extensible headers and MIME types, and especially semantics and cache controls.
But devs always want to generate APIs from IDLs, so RPC is always so tempting.
As for RPC, there's nothing terribly wrong with it as long as one can generate async-capable APIs from IDLs. The thing everyone hates about RPC is that typically it gets synchronous interfaces for distributed computations, which is a mistake. But RPC protocols do not imply synchronous interfaces -- that's just a mistake in the codegen tools designs.
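As an illustration of what "async-capable from an IDL" could mean, here is an invented example; it is not any particular RPC framework's codegen output:

```typescript
// Invented service shape, for illustration only.
interface Balance {
  accountId: string;
  amount: number;
  currency: string;
}

// The usual mistake: a generated client that hides the network round trip
// behind a blocking-looking call.
interface AccountServiceSync {
  getBalance(accountId: string): Balance;
}

// An async-capable generator can surface the distributed nature of the call
// in the type itself, so callers can compose, time out, or cancel it.
interface AccountServiceAsync {
  getBalance(accountId: string, opts?: { signal?: AbortSignal }): Promise<Balance>;
}
```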
Ultimately, the non-RESTful things about RPCs that suck are:
- no URIs (see the response below to your point about discovery)
- nothing is exposed about RPC idempotence, which makes "routing" fraught
- lack of 3xx redirects, which makes "routing" hard
- lack of cache controls
- lack of online streaming ("chunked" encoding with indefinite content-length)
Conversely, the things that make HTTP/REST good are:
- URIs!!
- idempotence is explicitly part of the interface because it is part of the protocol
- generic status codes, including redirects
- content type negotiation
- conditional requests
- byte range requests
- request/response body streaming
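Two items from that list -- conditional requests and byte range requests -- are easy to show concretely with nothing but headers (the URLs below are placeholders):

```typescript
// Conditional request: only transfer the body again if the resource changed.
const first = await fetch("https://example.com/report.json");
const etag = first.headers.get("ETag");

const revalidated = await fetch("https://example.com/report.json", {
  headers: etag ? { "If-None-Match": etag } : {},
});
// 304 Not Modified means the previously fetched copy is still good.
console.log(revalidated.status);

// Byte range request: resume or sample a large resource.
const tail = await fetch("https://example.com/big-file.bin", {
  headers: { Range: "bytes=1048576-" }, // everything after the first 1 MiB
});
console.log(tail.status); // 206 Partial Content if the server honours ranges
```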
> Maybe we need a better abstraction to begin with, something like discoverable native RPCs using a protocol designed for it like thrift or grpc.
That's been tried. ONC RPC and DCE RPC for example had a service discovery system. It's not enough, and it can't be enough. You really need URIs. And you really need URIs to be embeddable in contents like HTML, XML, JSON, etc. -- by convention if need be (e.g., in JSON it can only be by convention / schema). You also need to be able to send extra metadata in request/response headers, including URIs.
(Re: URIs and URIs in headers: HATEOAS really depends on very smart user-agents, which basically haven't materialized, because it turns out that HTML+JS is enough to make UIs good; so URIs in headers are not useful for UIs, but they are useful for APIs.)
It took me a long time to understand all of this, that REST is right and typical RPCs are lame. Many of the points are very subtle, and you might have to build a RESTful application that uses many of these features of HTTP/REST in order to come around -- that's a lot to ask for!
The industry seems to be constantly vacillating between REST and RPC. SOAP came and went; no one misses it. gRPC is the RPC of the day, but I think that, in the end, the only nice things about it are the binary encoding and the schema, and that it won't survive in the long run.
Looks interesting. Sadly it will likely be hamstrung by CORS rules. When designing an API, I frequently send all non-GET requests as POSTs with content type text/plain so that they classify as simple requests and avoid CORS preflights, which add an entire round trip per request. Obviously this is only safe if you're doing proper authorization with a token or something. Another fun bit is that you have to put the token in the query string, going against best practices for things like OAuth2, because the Authorization header isn't CORS-approved. CORS enforcement is an abomination.
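A sketch of the trick being described; the endpoint and the token handling are illustrative only, not a recommendation:

```typescript
const token = "example-token"; // illustrative only

// This stays a CORS "simple request" (POST + text/plain + no non-safelisted
// headers), so the browser sends it directly with no OPTIONS preflight.
await fetch(
  "https://api.example.com/rpc?access_token=" + encodeURIComponent(token),
  {
    method: "POST",
    headers: { "Content-Type": "text/plain" },
    body: JSON.stringify({ op: "search", term: "widgets" }),
  },
);

// Any of these would trigger a preflight round trip instead:
//   - Content-Type: application/json
//   - an Authorization header
//   - a non-safelisted method such as QUERY
```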
Because of the semantics of QUERY, CORS rules should apply to it as they do to GETs -- one would think, anyway. The Internet-Draft ought to say something about CORS, though; that's for sure.
It looks great. It covers a few pain points in modelling APIs where you need to retrieve data idempotently but the URL does not identify a specific resource, i.e., "The content returned in response to a QUERY cannot be assumed to be a representation of the resource identified by the target URI".
This is a "work in progress". Is there any estimate of when it will be finalized? Something like during 2025, with frameworks/libraries starting to support it by something like 2026?
Just to have a reference, anyone remember how long it took for PATCH?
This appears to be a Working Group item of the HTTPbis WG [0] which has a Zulip stream [1], a mailing list [2][3], a site [4], and a GitHub organization [5], thus lots of ways to send feedback. The Internet-Draft itself has a GitHub repository [6]. For this I think a GitHub issue/issues [7] would be best; some already exist now for issues in this thread.
Many things never need explicit support, because good HTTP citizens allow unknown HTTP methods (and treat them basically as POST). This is true of PHP and fetch(), for example.
Node.js needs explicit support, this got added a few months ago. There's a good chance that as long as your server supports it, you can start using it today.
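Assuming your runtime's HTTP parser accepts the method at all (the comment above says recent Node.js does), handling QUERY is just another branch on req.method. A minimal sketch, with a made-up JSON echo standing in for the actual query logic:

```typescript
import { createServer } from "node:http";

createServer((req, res) => {
  if (req.method === "QUERY") {
    let body = "";
    req.setEncoding("utf8");
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => {
      // Treat the content as the query input, much like a POST handler would,
      // but keep the QUERY contract: no state changes on the server.
      const query = body ? JSON.parse(body) : {};
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ received: query, results: [] }));
    });
  } else {
    res.writeHead(405, { Allow: "QUERY" });
    res.end();
  }
}).listen(8080);
```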
> When doing so, caches SHOULD first normalize request content to remove semantically insignificant differences, thereby improving cache efficiency, by: [...]
That part sounds like it's asking for trouble. I'm curious if this will make it to the final draft. If the client mis-identifies which parts of the request body are semantically insignificant, the result would be immediate cache poisoning and fun hard-to-debug bugs.
If it's meant as a "MAY", then that seems kind of meaningless: If the client for some reason knows that one particular aspect of the request body is insignificant, it could just generate request bodies that are normalized in the first place..?
Request body normalization is not really feasible, not without having the normalization function be specified by the request body's MIME type, and MIME types specify no such thing. Besides, normalization of things like JSON is fiendishly tricky, and may not be feasible in the case of many MIME types. IMO this should be removed from the spec.
Instead the server should normalize if it can and wants to, and the resulting URI should be used by the cache. The 3xx approach might work well for this, whereas having the server immediately assign a Content-Location: URI as I propose elsewhere in this thread would not allow for server-side normalization in time for a cache.
Yeah, that’s nuts and is obviously flawed behaviour that can interact poorly with any number of things - not least of all any kind of checksums within the response.
I’m surprised to see that in an RFC.
Edit: it’s only for the cache key:
> Note that any such normalization is performed solely for the purpose of generating a cache key; it does not change the request itself.
Still super dangerous.
Edit edit: I just typed out a long message on the GitHub issue tracker for this, but submitting it errored and I’ve lost all the content. Urgh
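To make the earlier suggestion concrete -- that the origin server, which knows its own media type, should do any normalizing rather than a shared cache -- here is a minimal sketch for JSON query bodies; the cache-key format and the choice of SHA-256 are arbitrary:

```typescript
import { createHash } from "node:crypto";

// Recursively sort object keys so that {"a":1,"b":2} and {"b":2,"a":1}
// produce the same canonical form. Array order is kept as-is, since it is
// usually significant.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>)
        .sort(([a], [b]) => a.localeCompare(b))
        .map(([k, v]) => [k, canonicalize(v)]),
    );
  }
  return value;
}

// Derive a stable cache key (or a Location/Content-Location URI) from the body.
function queryCacheKey(targetUri: string, jsonBody: string): string {
  const canonical = JSON.stringify(canonicalize(JSON.parse(jsonBody)));
  const digest = createHash("sha256").update(canonical).digest("hex").slice(0, 16);
  return `${targetUri}#${digest}`; // illustrative key format only
}
```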
I'm not necessarily against it - I've had the urge to send a body on a GET request before (can't recall or justify the use case, however).
Reasons I can think of:
- browsers limit URL (and therefore query string) length
- general dev ergonomics
Long query strings are also just impossible to read, and URL encoding is full of parser alignment issues. I have to imagine that QUERY would support a JSON request body.
Not showing the URL in various logs can be a concern if your query parameters are sensitive.
OPTIONS and PROPFIND don't get treated that way, though.
There should be an HTTP method (or a .well-known URL path prefix) to query and list every Content-Type available for a given URL.
From https://x.com/westurner/status/1111000098174050304 :
> So, given the current web standards, it's still not possible to determine what's lurking behind a URL given the correct headers? Seems like that could've been the first task for structured data on the internet
You could also use the Accept header in response to an HTTP OPTIONS request, or as part of a 415 error response.
https://httpwg.org/specs/rfc9110.html#field.accept (yes, Accept can be used in responses)
.well-known is not a good place for this kind of thing. That discovery mechanism should only be used in situations where only a domain name is known and a feature or endpoint needs to be discovered for a full domain. In most cases you want a link.
The building blocks for most of this stuff are certainly there. There's a lot of wheels being reinvented all the time.
I think that would solve it.
HTTP servers SHOULD send one or more Accept: {content_type} HTTP headers in the HTTP Response to an HTTP OPTIONS Request in order to indicate which content types can be requested from that path.
https://github.com/schemaorg/schemaorg/issues/1423#issuecomm... :
> Examples of how to represent GraphQL, SPARQL, LDP, and SOLID APIs [as properties of one or more rdfs:subClassOf schema:WebAPI (as RDFa in HTML or JSON-LD)]
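Going back to the Accept-in-response-to-OPTIONS suggestion above, here is a minimal sketch of what a handler could send; the path, the 204 status, and the advertised media types are made up, and note that RFC 9110 frames Accept in responses as advisory rather than mandating this exact usage:

```typescript
import { createServer } from "node:http";

createServer((req, res) => {
  if (req.method === "OPTIONS" && req.url === "/reports") {
    res.writeHead(204, {
      Allow: "GET, HEAD, OPTIONS, QUERY",
      // Advertise which media types this path can serve, per the suggestion above.
      Accept: "application/json, text/csv, text/html",
    });
    res.end();
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);
```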