Web API patterns I dislike

2023/08/03

Over the past year and a half I've been working in the area of "developer experience" at my day job, specifically working on simplifying and abstracting telecommunications APIs and packaging them into a format that regular developers who cannot name all of the components of a 5G network architecture could understand.

Doing this, I've been exposed to various kinds of APIs, done a little bit of design of my own, and worked on creating SDKs (software libraries, essentially) which consume those APIs and expose them as programming language constructs: classes, methods, models, etc.

Because of this, I feel like I've developed some kind of an idea or at least a few strong opinions about how APIs should and shouldn't be designed. I'll be focusing on the API patterns I particularly dislike for this post, because pointing out annoyances is much easier than proposing good ideas.

Forgetful APIs

Sometimes I am presented with a CRUD (create, read, update, delete) API, which takes in an object with various parameters on creation. Then, when I fetch the very object that was created, I get back a different model with some of the fields stripped out.

Suppose you create the following resource with POST /new-resource:

  {
    "field1": "data",
    "field2": "data",
    "notifyMeEvery": "5min"
  }

And then a GET /resource/1 gives you back:

  {
    "field1": "data",
    "field2": "data"
  }

This is quite annoying, especially from an SDK development point of view, because now you either maintain two models, one for creating an object and one for retrieving it, or you end up with fields that are unset sometimes and set other times.

Optimally, an API remembers all of the object fields and passes them back to you when you ask for the object, so that you can fully recreate the object that went out. This may be helpful if you later decide to update some of the information of that object or you decide you want to create a small variation of an existing object in your information system. It also just generally gives you a better sense of security and confidence when you know that the data you put into that object actually made it into that object.
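To make the problem concrete, here is a minimal sketch of a check an SDK test suite could run to spot a forgetful API. The function name `dropped_fields` and the sample payloads are hypothetical; the payloads simply reuse the fields from the example above.

```python
def dropped_fields(sent: dict, received: dict) -> set[str]:
    """Return the names of fields that were sent on creation but are
    missing from, or changed in, the object the API hands back."""
    return {key for key, value in sent.items()
            if key not in received or received[key] != value}


# Payloads mirroring the example above (hypothetical data).
created = {"field1": "data", "field2": "data", "notifyMeEvery": "5min"}
fetched = {"field1": "data", "field2": "data"}

print(dropped_fields(created, fetched))  # {'notifyMeEvery'}
```

With an API that remembers everything, the returned set is empty and a single model can serve both creation and retrieval.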

Mixing results in body and headers

This one is just really weird to me, but it comes up every now and then.

A typical JSON API will have its response data in the response body in the form of a JSON object. It may contain things like resource IDs and the various fields of the relevant data object.

However, some APIs I've seen arbitrarily decide that the response body is not enough for them and they insist on putting necessary data into HTTP headers in addition to the content that is already in the body.

For example, I've seen APIs that omitted resource IDs from the body and instead put them into a LOCATION, RESOURCE or ID response header.

You might be able to guess how fun these APIs are to use unless you read the API documentation very carefully. You might make an API request and assume that some information is just missing unless you knew to look under the response headers.

Technically speaking, this doesn't make the API harder to use. API specification languages like the OpenAPI Specification allow you to be very clear about this, code generators may even produce code that handles it decently well, and even hand-rolled code can quickly and easily access the information from headers. But it's such a baffling pattern that I just don't understand why you'd ever opt for this over just adding one more field into your response body.
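For a sense of what that hand-rolled handling looks like, here is a minimal sketch. It assumes the response has already been split into a mapping of headers and a parsed JSON body. The function `resource_id_from_response` and the list of header names are illustrative, borrowed from the examples above rather than from any particular API.

```python
from typing import Mapping, Optional

# Header names that might carry the ID (assumption: the ones mentioned above).
ID_HEADERS = ("location", "resource", "id")


def resource_id_from_response(headers: Mapping[str, str], body: dict) -> Optional[str]:
    """Prefer an ID found in the body, then fall back to known headers."""
    if "id" in body:
        return str(body["id"])
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in ID_HEADERS:
        if name in lowered:
            return lowered[name]
    return None


print(resource_id_from_response({"Location": "/resource/1"}, {"field1": "data"}))
# prints /resource/1
```

It is not much code, but it is code that every consumer of the API has to know to write.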

Resource ID is not an ID

Speaking of resource identifiers, once upon a time some smart web API designer decided that it would be cool if APIs returned URIs instead of boring old resource IDs.

This means that the object you get back from the API might look like this:

  {
    "id": "https://this-is-the-full-url.com/resources/resource/1",
    "field": "data",
    "so": "on"
  }

Now, this can actually be quite handy in some cases, especially if your API follows RESTful patterns where every resource has GET, PUT/PATCH and DELETE endpoints to manage an individual instance of a resource.

However, this is less fun with APIs which aren't properly RESTful. These APIs will point you to the GET endpoint of the resource, but the PUT and DELETE might exist somewhere else entirely. Now when you are abstracting this API, you need to perform string manipulation every time you create the resource so that you can extract the resource ID from the field for use with PUT and DELETE.
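The string manipulation in question usually ends up looking something like the sketch below. It assumes the ID is the last path segment of the URI, which holds for the example above but is by no means guaranteed for every API. The helper name `id_from_uri` is made up for illustration.

```python
from urllib.parse import urlparse


def id_from_uri(resource_uri: str) -> str:
    """Return the last non-empty path segment of a resource URI."""
    path = urlparse(resource_uri).path
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no path segments in {resource_uri!r}")
    return segments[-1]


print(id_from_uri("https://this-is-the-full-url.com/resources/resource/1"))  # 1
```

Parsing the URI first, rather than splitting the raw string, keeps query strings and trailing slashes from leaking into the extracted ID.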

Sometimes it's honestly just easier to return a plain old UUID or whatever internal field ID you may be using to identify your resources rather than trying to be helpful in unhelpful ways.

Webhooks

Webhooks are a fairly common pattern and probably decently well-liked too. But in case you are not familiar with them, they are a method of asynchronous communication over HTTP: the client subscribes to notifications by supplying a URL, and the server later sends an HTTP request of its own to that URL whenever it has something to report.

At one point in time if you were just going to do your communication with HTTP, this was basically your only way of asynchronous messaging. And since there are a lot of things that require more time than an HTTP timeout period will allow, webhooks are pretty much everywhere. Telecommunications APIs in particular are absolutely full of them. However, webhooks come with some problems that can make them annoying.

Firstly, webhooks require that the client become a server at the same time. This is not so bad if you are developing specifically against this particular webhook or your application happens to be running an HTTP server anyway, but it is very annoying if you are just trying to play around with a web API casually. Webhooks also assume that your server is reachable from the outside, which means that using them requires you to open ports on your firewall or buy cloud resources to run your application on. Oh, and debugging why you are not receiving notifications from your webhook is a really fun experience, because it can go wrong in so many ways and you'll probably never have a log message to help you.

Because of this HTTP server element, webhooks also architecturally take over your application. Trying to shove a web server into an abstraction over a webhook takes some serious doing and often ends up pretty leaky, not to mention opening HTTP servers behind a user's back is a.) not going to work and b.) not very polite. So, you instead just let the user set up an HTTP server and now your application architecture is forced to be that of an HTTP server.

And if you ever encountered callback spaghetti in JavaScript, I have an entirely new kind of callback spaghetti for you in your webhook notification handler. If you were particularly smart, your webhook might allow multiple handlers to be set up for different purposes. If not, now the notification handler endpoint will automatically violate the Single Responsibility Principle and try to handle different cases and life-cycles using a soup of if-else-if or switch case statements.
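If you are stuck with a single notification endpoint, one common way to keep the if-else soup contained is a dispatch table. The sketch below assumes each notification carries a `type` field and a `payload` object. Both of those, along with the event names, are hypothetical and will differ between APIs.

```python
from typing import Callable

# Registry mapping an event type to the function that handles it.
HANDLERS: dict[str, Callable[[dict], str]] = {}


def handles(event_type: str):
    """Decorator that registers a handler for one event type."""
    def register(func):
        HANDLERS[event_type] = func
        return func
    return register


@handles("resource.created")
def on_created(payload: dict) -> str:
    return f"created {payload['id']}"


@handles("resource.deleted")
def on_deleted(payload: dict) -> str:
    return f"deleted {payload['id']}"


def dispatch(notification: dict) -> str:
    """Route a notification to its handler instead of branching inline."""
    handler = HANDLERS.get(notification.get("type"))
    if handler is None:
        return "ignored"
    return handler(notification.get("payload", {}))


print(dispatch({"type": "resource.created", "payload": {"id": 1}}))  # created 1
```

This doesn't fix the underlying design, but at least each case lives in its own function.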

I know that I probably cannot talk people out of using webhooks because they are so simple, but I really wish people considered things like Server-Sent Events or WebSocket streams more. Hell, sometimes I feel like it would be better to just provide an endpoint to poll against, since at least that can be hidden decently easily in an SDK without forcing the user to turn their app into a full-blown HTTP server with all the worries that come with it.
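As a rough idea of how little it takes to hide polling inside an SDK, here is a minimal sketch. The helper `poll_until` and its parameters are my own invention, and the `fetch` callable stands in for whatever GET request the real API would need.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(fetch: Callable[[], T],
               is_done: Callable[[T], bool],
               interval: float = 1.0,
               timeout: float = 30.0) -> T:
    """Call fetch() repeatedly until is_done(result) is true or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        # Give up if the next attempt would start after the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("resource did not reach the expected state in time")
        time.sleep(interval)


# Stand-in for a GET request to a status endpoint (hypothetical responses).
responses = iter([{"status": "pending"}, {"status": "pending"}, {"status": "done"}])
final = poll_until(lambda: next(responses),
                   lambda r: r["status"] == "done",
                   interval=0.01)
print(final)  # {'status': 'done'}
```

The caller sees one blocking function call and never has to open a port.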

Long PDF tables for API documentation

Not really an API pattern, per se, but a relevant annoyance that I run into fairly often is API documentation in the form of long lists or tables in PDF documents. This stuff tends to happen when there's a need to ship some documentation about a product to a customer and in the past all documentation has been PDFs.

PDF tables and lists just don't lend themselves to easy-to-consume API documentation. Sometimes the tech writers are nice about it and provide some example curl lines or something like that, but other times it's just long lists of parameters with barely a word of explanation of what each of them actually means. Combine this with poor naming conventions or something that originated from a telecommunications standards board and you are going to have a really tough time figuring out what you are actually reading.

If you are going to provide exhaustive documentation about your API, I really suggest just going for an IDL (interface description language), such as OpenAPI Specification, Smithy or a GraphQL schema. These often have the benefit of providing helper tools for you to explore and test the API and will help you do code generation if you need to get started quickly. You could even throw them onto some kind of an API portal tool to make them easy to browse.

SOAP

I have one positive thing to say about SOAP, which is that luckily I don't run into it very often. However, SOAP APIs do still exist, and the documentation that exists for them wastes so many PDF pages.

There are good reasons we are collectively moving away from SOAP.

Conclusion

Those are a few API patterns that I've come across and which I feel have caused me enough trouble to complain about.

Maybe this was informative to you, maybe it was not. Maybe one day I will do more posts about good web API design, maybe I will not.

But let us conclude here. Thank you for reading, and hopefully the APIs you build or interact with are helpfully designed!
