How to (and how not to) design REST APIs

This falls down as soon as it makes a fundamental misunderstanding of what makes a REST api into a REST api.

It gives this as a ‘bad’ example:

   GET /v3/application/shops/{shop_id}/listings/{listing_id}/properties

With the justification that “The {listing_id} is globally unique; there's no reason for {shop_id} to be part of the URL.”

No, the point of the API is that /v3/application/shops/{shop_id}/listings/{listing_id}/properties is a globally unique identifier. Your belief that parts of that id have global meaning outside the context of that identifier is irrelevant - that path is the identifier for the resource.

And having hierarchical paths is useful because you can do things like manage permissions on parts of the hierarchy - users might have permission to check listings in certain shops and we can characterize that as them having permission on /v3/application/shops/{shop_id}/listings/*.
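A minimal sketch of what a path-based permission check could look like, assuming grants are stored as glob patterns over resource paths. The `GRANTS` table, the user name, and the shop id 42 are made up for illustration; nothing here is Etsy's actual mechanism.

```python
from fnmatch import fnmatchcase

# Hypothetical grants: user name -> path patterns that user may read.
GRANTS = {
    "alice": ["/v3/application/shops/42/listings/*"],
}


def may_read(user: str, path: str) -> bool:
    """Return True if any pattern granted to the user matches the path."""
    return any(fnmatchcase(path, pattern) for pattern in GRANTS.get(user, []))
```

This only decides whether a path is covered by a grant. The server still has to answer "not found" if the listing does not actually exist under that shop.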

Directory structures of resource identifiers are good and logical and not a ‘bad’ API design practice at all. You might as well argue the UNIX file system is a bad design because all the files have a unique inode id so paths are completely unnecessary.

stickfigure · 2 years ago

You apparently have not actually used Etsy's API.

No, the shop id is not in fact part of the globally unique identifier of an Etsy listing, and the properties are not dependent on the shop. Etsy listings have a 1:N relationship with Etsy shops.

The API was a mistake, which they are slowly correcting - they've already changed:

    GET /v3/application/shops/{shop_id}/listings/{listing_id}

to:

    GET /v3/application/listings/{listing_id}

...and I presume they will eventually change the rest of the listing-related endpoints over time.

Managing permissions using the hierarchy of a URL is silly at best, dangerous at worst. The first thing any attacker will do is plug in an alternative shop id and see if it grants access to the non-permitted listing. If permissions are attached to the shop (and for Etsy, they are) the server needs to load the listing, figure out the associated shop, and then check permissions. The client cannot be trusted to provide the correct shop id, so there's no point in asking for it.
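A rough sketch of the check described above, where the server looks up the owning shop itself instead of trusting a shop id from the client. The in-memory tables and ids are hypothetical stand-ins for a database.

```python
# Hypothetical data; a real server would query its database instead.
LISTING_SHOP = {"L1": "shop-a", "L2": "shop-b"}  # listing id -> owning shop
USER_SHOPS = {"alice": {"shop-a"}}               # user -> shops they may access


def can_access_listing(user: str, listing_id: str) -> bool:
    """Load the listing, find its shop, then check the user's shop permissions."""
    shop_id = LISTING_SHOP.get(listing_id)
    if shop_id is None:
        return False  # unknown listing: nothing to grant access to
    return shop_id in USER_SHOPS.get(user, set())
```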

jameshart · 2 years ago

That’s a critique of Etsy’s API not of good REST resource identification.

No, plugging in a shop you have permission to doesn’t work if your resources are hierarchical, any more than plugging in ~/passwd lets you read /etc/passwd because you have read access to your home directory. Those are different resources: one of them exists and is locked down, and the other one doesn’t exist.

mekoka · 2 years ago

> Managing permissions using the hierarchy of a URL is silly at best, dangerous at worst.

Or perhaps what you call silly is just you being unaware of what you don't know. There are valid cases to handle permissions using the structure of URLs. As well, the danger you allude to comes from handling it naively. Even the hypothetical attack you suggest might be among the first things any non-tech-savvy person might think of trying.

The scenario you're describing above is simply one of dealing with redundant information in a situation where inferring the whole from the part is not detrimental (for the platform). A case can certainly be made that with that simplification, some optimization opportunities are also lost. Perhaps Etsy doesn't need them. Others might.

> The client cannot be trusted to provide the correct shop id.

The client cannot be trusted period. If I provide a signed cookie that contains a list of authorized shops and they return something else, good thing that cookie is signed. Also good thing the cookie contains the shops, no need to touch the disk if the URL doesn't match the list.
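A minimal sketch of the signed-cookie idea, assuming an HMAC-SHA256 signature over a JSON list of shop ids. The cookie layout, the `SECRET` value, and the function names are assumptions for illustration, not any particular framework's format.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # assumption: a server-side key that never reaches clients


def sign(shops: list) -> str:
    """Encode the shop list and append an HMAC so tampering is detectable."""
    payload = base64.urlsafe_b64encode(json.dumps(shops).encode())
    mac = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{mac}"


def authorized_shops(cookie: str) -> set:
    """Return the shops in the cookie, or an empty set if the signature is bad."""
    payload, _, mac = cookie.partition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return set()
    return set(json.loads(base64.urlsafe_b64decode(payload)))


def url_shop_allowed(cookie: str, shop_id: str) -> bool:
    """Reject a request from the URL alone, without touching the database."""
    return shop_id in authorized_shops(cookie)
```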

snapetom · 2 years ago

You apparently haven't read Fielding's paper. Etsy isn't doing it right. If you read Fielding's paper, OP's point is correct. The whole URL is a resource.

pmontra · 2 years ago

The author gives a reason for that recommendation

> [having shop_id in the URL] inevitably causes problems when your invariant changes down the road - say, a listing moves to a different store or can be listed in multiple stores.

Basically the choice is between having a perpetual unique URL to a listing or multiple ones, maybe valid at the same time and some of them maybe invalid in future, when a listing is removed from a shop.

A visitor with a valid unique listing id will always be able to look at the product. If there is a shop id in the URL, that URL might become invalid, and the visitor loses access to the product and has to search for it again, adding friction. With the globally unique id the visitor will discover that the product is offered by another shop (maybe a new one from the same tenant?), which is usually not important.

Permissions for the listing could be handled by matching the shops a user has access to with the shops the listing belongs to.

jameshart · 2 years ago

That’s what 302 responses are for.
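As a sketch of how that could work when a listing moves between shops: the server keeps a record of moved listings and redirects the old hierarchical URL to the new one. The `MOVED` table and the ids are hypothetical.

```python
# Hypothetical record of moved listings: (old shop, listing) -> new shop.
MOVED = {("shop-a", "L1"): "shop-b"}


def respond(shop_id: str, listing_id: str) -> tuple:
    """Return (status, headers) for GET /v3/application/shops/{shop}/listings/{listing}."""
    new_shop = MOVED.get((shop_id, listing_id))
    if new_shop is not None:
        location = f"/v3/application/shops/{new_shop}/listings/{listing_id}"
        return 302, {"Location": location}
    return 200, {}
```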

REST does not mean ‘parameters in the path not in the query string’.

mvdtnz · 2 years ago

You're sacrificing usefulness for purity. Never a good bet. I agree with the author.

jameshart · 2 years ago

I’m not arguing for purity I’m arguing that these guidelines are not good guidance for designing ‘REST APIs’.

If you are designing a ‘REST API’ you have already committed to ‘purity’. If you follow this guidance you are not designing a REST API you are designing a JSON over HTTP api with parameters in the query string.

    schema:
      oneOf:
        - $ref: '#/components/schemas/Object'
        - $ref: '#/components/schemas/ObjectWithMetadata'
        - $ref: '#/components/schemas/ObjectWithChildren'
        - $ref: '#/components/schemas/ObjectWithChildrenAndMetadata'

Some good points - particularly about not returning arrays (I've made that mistake!)

But I feel 410 instead of 404 is pretty controversial:

> There are many layers of software that can return 404 to a request

Anything in your stack can return any HTTP error code - I don't see why 404 is special.

> When calling (say) GET /things/{thing_id} for a thing that doesn't exist, the response should indicate that 1) the server understood your request, and 2) the thing wasn't found. Unfortunately, a 404 response does not guarantee #1.

The server is free to return other codes for other classes of problems. The server could return 400 for a bad request, and leave 404 for "thing wasn't found", indicating it understood the request but it wasn't found.

Also surprised not to see RFC 7807 / RFC 9457 (Problem Details) mentioned in the "structured error format" section.
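For reference, a small sketch of what a Problem Details body for a missing thing might look like. The member names (`type`, `title`, `status`, `detail`, `instance`) and the `application/problem+json` media type come from the RFC; the `type` URI and the helper name are made up.

```python
import json

PROBLEM_JSON = "application/problem+json"  # media type used for Problem Details


def not_found_problem(thing_id: str) -> tuple:
    """Build (status, headers, body) for a thing that does not exist."""
    body = {
        "type": "https://example.com/problems/thing-not-found",  # hypothetical URI
        "title": "Thing not found",
        "status": 404,
        "detail": f"No thing with id {thing_id!r} exists.",
        "instance": f"/things/{thing_id}",
    }
    return 404, {"Content-Type": PROBLEM_JSON}, json.dumps(body)
```

A client that sees this content type on a 404 can tell the application understood the request, which is the distinction the article is worried about.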

DSMan195276 · 2 years ago

> Anything in your stack can return any HTTP error code - I don't see why 404 is special.

I'm surprised you don't - in my experience 404s are by far the most common response to get when you haven't wired things up correctly. Sure, anything in the stack _can_ return any code and response it wants, but you're still much less likely to come across a 410 than a 404. If that unlikeliness saves you support calls down the line then that's pretty good.

layer8 · 2 years ago

With REST 404s due to missing resources (non-existing ID), you should generally get corresponding error information in the body (as also described in TFA), and clients should log/display that information. That should make enough of a difference. There’s a lot of “should” here, of course, but instead of teaching developers to not use 404, it would be better to teach them to create and handle error responses appropriately.

nunez · 2 years ago

There are two reasons behind 404 being a bad response code to use for empty results.

- Did I get a 404 on this endpoint because the endpoint doesn't exist? Or did I get that because the object I was looking for doesn't exist? Great, I need to dig into the response body to find out, indicate that I can either get a 200 or a 404 with this endpoint, and deal with the odd case where the API returns HTML regardless of the MIME type in the Accept header if the endpoint itself is not there because "fuck you, couldn't be bothered".

- Some HTTP libraries will consider anything that's not a 1/2/3xx an error. That can be annoying to deal with.

nedt · 2 years ago

If anything that's not 1/2/3xx is a problem then 410 won't be a solution. And I doubt that an HTTP library that has trouble handling 4xx can handle 1xx correctly.

theandrewbailey · 2 years ago

That rule is a hot take.

> You could use 404 but return a custom error body and demand that clients check for a correct error body. This is asking for trouble from lazy client programmers. It might or might not be "your fault" when clients see eventually inconsistent data, but the support calls they send you will be real.

Make sure that your 404 responses are always documented, then tell them to RTFM.

stickfigure · 2 years ago

The problem is that the support call comes in as "your DELETE call isn't actually deleting". Sure it's not your fault, but it imposes a cost on you to investigate. And of course the first time you go directly to RTFM without checking will be the time it actually is your bug.

404 is special because it's so incredibly common. Why take the risk? There are other perfectly good error codes that - in practice - don't have this issue.

noiv · 2 years ago

I prefer to consider 404 a protocol error and a missing thing a business error. That way 404 signals a wrong endpoint and 200 + error message + empty result set signals a wrong id.

Or even more simple: Anything other than 200 means check infrastructure docs and if you don't like the 200 check the business requirements.

stickfigure · 2 years ago

As a client I generally dislike APIs that use 200 for error conditions. The problem is that API implementors often change the structure of the response.

    GET /thing/THG123
    # on success:
    {"id":"THG123", "name":"thingie"}
    # on failure:
    {"error":"no such thing"}

Working in typed languages, this requires parsing the response, determining success or failure, then reparsing the response into the appropriate type. Annoying.

Of course it's not always like that, some APIs will put both the error and data in a wrapper object and one field or the other will always be null:

    {
        "error": null,
        "result": {"id":"THG123", "name":"thingie"}
    }

This is less annoying but it's still tedious. We could eliminate the wrapper if we only had an out-of-band signal to indicate whether the client should expect a success response or an error response... like maybe an HTTP status code? I mean, it's right there, why not use it?
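A minimal sketch of that approach in a typed client, assuming the two body shapes from the examples above. The status code picks the type, so the body is parsed exactly once; the class and function names are illustrative.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass
class Thing:
    id: str
    name: str


@dataclass
class ApiError:
    error: str


def parse_response(status: int, body: str) -> Union[Thing, ApiError]:
    """Use the HTTP status as the out-of-band signal for which type to build."""
    data = json.loads(body)
    if 200 <= status < 300:
        return Thing(**data)
    return ApiError(**data)
```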

esafak · 2 years ago

> Some good points - particularly about not returning arrays.

I don't get that one; why is an object with an array property more evolution friendly than an array of objects?

layer8 · 2 years ago

Because you can add new properties for response-level global information on an object, but not on an array.
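A small sketch of that point, assuming a hypothetical envelope with a `data` field and a later-added `next_cursor` field. A client written before the new field existed keeps working unchanged.

```python
import json
from typing import Optional


def list_things(things: list, next_cursor: Optional[str] = None) -> str:
    """Wrap the array in an object so response-level fields can be added later."""
    envelope = {"data": things}
    if next_cursor is not None:
        envelope["next_cursor"] = next_cursor  # added in a later API revision
    return json.dumps(envelope)


def read_things(body: str) -> list:
    """An older client that only knows about "data" and ignores everything else."""
    return json.loads(body)["data"]
```

With a bare top-level array there is nowhere to put `next_cursor` without changing the response type and breaking such clients.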

alxmng · 2 years ago

“RESTful” API design is mostly bike-shedding.

There’s no standard. Every REST API looks different. Clients have to refer to documentation anyway, so consistent URL patterns achieve nothing. People waste large amounts of time over totally inconsequential minutiae like whether to use singular or plural words in URLs.

Separating idempotent calls from non-idempotent calls is useful, but REST overcomplicates this. All that’s needed is read and write calls, yet REST has get, post, patch, put, delete…

REST is also inefficient. Clients could read the data they need in one HTTP request, but most “RESTful” APIs force clients to make many requests for the sake of what is essentially aesthetics.

tkiolp4 · 2 years ago

Agree. But I pick REST (or “json over http”) any day of the week instead of graphql, soap, grpc, etc.

nine_zeros · 2 years ago

Graphql just for the sake of graphql is a disaster for backend engineers.

marcosdumay · 2 years ago

You are supposed to use those 3 through some kind of heavy tool, while it's well understood that you do REST with just an HTTP library.

Rest is simpler, but comes at the cost of a lot of nice things like automatic endpoints generation and type verification. The problem is that the heavy tooling tends to not be there or not work correctly. But this is not a win for that kind of simplicity, that's a reason to improve the protocol design.

waynesonfire · 2 years ago

What's your view on this,

https://news.ycombinator.com/item?id=38103310#38104983

plugin-baby · 2 years ago

JUP - Just Use POST

recursivedoubts · 2 years ago

I’d just like to interject for a moment. What you’re referring to as REST, is in fact, JSON/RPC, or as I’ve recently taken to calling it, REST-less. JSON is not a hypermedia unto itself, but rather a plain data format made useful by out of band information as defined by swagger documentation or similar.

Many computer users work with a canonical version of REST every day, without realizing it. Through a peculiar turn of events, the version of REST which is widely used today is often called “The Web”, and many of its users are not aware that it is basically the REST-ful architecture, defined by Roy Fielding.

There really is a REST, and these people are using it, but it is just a part of The Web they use. REST is the network architecture: hypermedia encodes the state of resources for hypermedia clients. JSON is an essential part of Single Page Applications, but useless by itself; it can only function in the context of a complete API specification. JSON is normally used in combination with SPA libraries: the whole system is basically RPC with JSON added, or JSON/RPC. All these so-called “REST-ful” APIs are really JSON/RPC.

respectfully, https://htmx.org/essays/#hypermedia-and-rest

cxr · 2 years ago

I'm on your side, but every time you say "a hypermedia" it kills me.

I'm not a native English speaker so, honest question: isn't h a consonant with its own distinct sound, so a instead of an?

sorry, an hypermedia

Similarly if I ever heard someone say _an_ hotel...

AlexandrB · 2 years ago

Found a contradiction that I don't understand. From rule 1:

    # GOOD
    GET /products   # get all the products
    GET /products/{product_id} # get one product
    
    # BAD
    GET /product/{product_id}

But then in Rule 2:

    GET /shop/{shop_id}/listings              # normal, expected

Shouldn't that be "/shops/{shop_id}/listings"? Or is it plural only if you can actually GET the path (i.e. there's no GET for just "/shop") and otherwise it should be singular?

robertlagrant · 2 years ago

Plural or singular seems far too marginal to be good or bad. I use singular.

iterati · 2 years ago

Same. My table names are also singular.

prisonality · 2 years ago

I use singular too, but I always wonder about those sticking with plural - what's the convention for words whose plural and singular are the same?

ie - like 'staff' or 'species' or 'aircraft'.

Then I can add suffix to those singular ie - 'staffList', 'speciesList' etc

This is the Etsy API, but actually was a typo on my part. They use the plural /shops (as I showed in the other Etsy example). I've corrected the original article, sorry about that!

elevation · 2 years ago

Rule #1 is terrible advice.

Avoid plural nouns in English API endpoints because English is full of irregular plurals. For example:

    goose -> geese
    child -> children
    index -> indices
    vertex -> vertexes
    analysis -> analyses

This makes English plurals unpredictable, especially for non-native speakers, and hurts API consistency and discoverability.

Also consider that for a CRUD interface you may need the singular form anyway (POST api/student/create), and adding the plural means doubling the API route namespace.

It's cleaner and simpler to stick with singular nouns.

You use plurals anyway to fetch collections:

    GET /students

So you can't escape the problem unless you want `GET /child` to fetch multiple children.

Also, you should avoid verbs in URLs (IMHO, of course). You're adding to the students collection, so post to students:

    # BAD
    POST /student/create

    # GOOD
    POST /students

3cats-in-a-coat · 2 years ago

An API is not an essay, in OOP you write Array<Student> and not Array<Students> and yet you understand the type is about an array of students. Getting hung up on grammar in an API is probably the dumbest problem to have.

If you think `GET /student` is confusing, or more importantly, structurally restrictive as an API, you can think about it as `GET /student/filter` where the "filter" may be a specific student id, or a range of ids, or other conditions such as `GET /student/top` or `GET /student/graduated` and then all students will be just the filter "all" or: `GET /student/all`.

As for `POST /student/create`... it doesn't matter. To use one of Fielding's own examples from his blog, how'd you turn a lamp on and off via REST? Would you be like `POST /lamp`? No. It's unclear WTF is happening.

jordanrobinson · 2 years ago

While I do agree with this in almost all cases, I have found scenarios where there are actions that don't map easily to an HTTP verb and need something more explicit.

What I've generally done in these cases is pretty similar to https://cloud.google.com/apis/design/custom_methods which also explains the problem better than I can.

I'd be interested as to how you'd solve some of these problems without an explicit verb in the path.

bonzini · 2 years ago

"POST /students" is a create action, but verbs are fine for individual entities, for example "POST /students/ID/enroll".

how do you differentiate between plural vs singular of:

`GET /staff`

pavlov · 2 years ago

Does “GET /students” return all the students in the system? Probably not.

So in fact you’re fetching some subset of students anyway, and the size of the returned set might be one or zero depending on your query.

Given that, “GET /student” seems just as meaningful because neither the singular nor the plural can fix the ambiguity about what you’re actually getting.

marcellus23 · 2 years ago

> for a CRUD interface you may need the singular form anyway (POST api/student/create)

Why? What's wrong with api/students/create?

> Avoid plural nouns in English API endpoints because English is full of irregular plurals.

I don't buy this. I mean, yes, it's true, but how often do people really need to write these endpoints after initially writing the client code?

rswail · 2 years ago

Verbs in a URL are an API "smell" for me.

URLs refer to a resource that you can manipulate. What resource is /students/create referring to?

takinola · 2 years ago

You really should not include the action in the URL ie rather than

    GET api/student
    POST api/student/create
    DELETE api/student

it should be

    POST api/students
    GET api/students
    DELETE api/students

> index -> indices vertex -> vertexes

I guess it’s in support of your point, but if you’re going to pluralise index as “indices”, why wouldn’t you use “vertices” for vertex?

golergka · 2 years ago

It gets even more unpredictable for everyone involved when it's a non-native speaker who writes the API schema.

physicsguy · 2 years ago

I’d add:

* If you’re going to forbid people changing a parameter with a PUT or PATCH request, then the schema for these shouldn’t list them as parameters. This seems to creep into APIs constantly as people are lazy and will use the same serializer method as for POST with an additional check somewhere in the code that changes the response. Just don’t do it!

* Don’t change the response format based on query parameters. It makes it hard for typed languages to use the API because the client has to handle all of the weird response types you’ve got. Inevitably you end up with more and more getting added and any client becomes crazily complicated. 99% of the time it’s not worth the bandwidth saving - and if there’s lots of useless information that clients don’t want, it’s worth thinking about whether the API design is right in the first place.

* Stick to one mechanism for doing things. Pagination and sorting behaviour should be the same for all endpoints. The end user doesn’t care that you’re a hip microservices company where teams don’t talk to each other - if the APIs behave weirdly and inconsistently between themselves, it will be hard to use.

janfoeh · 2 years ago

I very much agree with your first and third point, from experience. As for the second one — if consuming dynamic data structures is hard in typed languages, maybe they are not the right tool for that particular job?

What I have seen is endpoints trying to corral their responses into one-size-fits-all schemas in the situation you're describing, with predictable outcomes. Lots of overhead in most situations, tricky documentation, lots of optionals.

Under that premise, I have to say that at least for generic APIs with many differing clients, the idiosyncrasies of typed-language clients would not rank too highly on my list of design considerations — not when they are in the way of simpler, easier to understand responses.

Joker_vD · 2 years ago

As for the second point, that's what Accept header is for. And I personally never had much trouble in Go with deserializing all those "weird response types" but it may depend on one's coding style.

> I have to say that at least for generic APIs with many differing clients, the idiosyncrasies of typed-language clients would not rank too highly on my list of design considerations

Hey, would you like to consume an exchange format that has meaningful distinction between strings and atoms? Those come from the dynamically-typed languages area!

So, for 2, what I mean is stuff like:

    GET /api/object/<id>?withAdditionalMetadata=1&expandChildren=1&.....

So then the OpenAPI schema has to be something like:

So inevitably the client ends up being quite complex to handle this.

switch007 · 2 years ago

fsaintjacques · 2 years ago

I highly recommend that everyone read Google's [AIP](https://google.aip.dev/). There's even a gRPC schema linter for it. Put more focus on the resource data design than on nitpicking transport details. I would consider the best lessons to be:

- Optional but supported user defined identifiers, it's so frustrating to work with an API that passes you back an identifier.

- String identifier (names) for resources, with some kind of type namespacing, i.e. the prefix in the author's document

- Consistent set of fields (create_time, update_time, annotations, ...)

- Avoid dynamic map (this is a JSON self-inflicted wound)

aleksiy123 · 2 years ago

Second this. Reading this while working at Google made me better at design. Some that stand out to me are:

Resource Oriented Design: https://google.aip.dev/121

Declarative Friendly APIs: https://google.aip.dev/128

Declarative-friendly design makes writing scripts and pipelines so much better because of idempotency. It also pairs very naturally with resource-oriented design.

Long Running Operations: https://google.aip.dev/151

LROs are applicable to any request that runs longer than a second or a couple of seconds. Having a unified interface can be very powerful for implementing offline task workers and pipelines.
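A rough sketch of the client side of such a unified interface: start the work, get back an operation name, and poll until it reports done. The `Operation` class here is a simplified stand-in loosely modeled on the idea, not the AIP's exact schema, and `get_operation` is a hypothetical callable that would wrap an HTTP GET.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Operation:
    name: str
    done: bool = False
    response: Optional[dict] = None  # set when the operation succeeded
    error: Optional[str] = None      # set when the operation failed


def wait_for(get_operation: Callable[[str], Operation], name: str,
             poll_seconds: float = 0.0, max_polls: int = 10) -> Operation:
    """Poll the operation until it is done or the poll budget runs out."""
    for _ in range(max_polls):
        op = get_operation(name)
        if op.done:
            return op
        time.sleep(poll_seconds)
    raise TimeoutError(f"{name} did not finish after {max_polls} polls")
```

Because every long-running endpoint returns the same shape, one helper like this can serve all of them.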

Filtering: https://google.aip.dev/160

This one is probably controversial as it makes implementing basic filtering quite a bit harder. I haven't quite seen the issues it's supposed to solve play out in practice but it's interesting nonetheless.