Modeling Errors in GraphQL
GraphQL excels in modeling data requirements. Modeling errors as schema types in GraphQL is required for certain kinds of errors. In this post, let's analyze some cases where errors contain structured data apart from the message and the location information.
GraphQL Errors
GraphQL is an excellent language for writing data requirements in a declarative fashion. It gives us a clear and well-defined concept of nullability constraints and error propagation. In this post, let's discuss where GraphQL falls short regarding errors and how we can model those errors to fit some of our use-cases.
Before we dive into the topic, let's understand how GraphQL currently treats and handles errors. The response to a GraphQL query has the following structure -
{
  "data": {
    "foo": null
  },
  "errors": [
    {
      "message": "Something happened",
      "path": ["foo", "bar"]
    }
  ]
}
Error extensions
The Schema we define for GraphQL is used only in the data field of the response. The errors field has a well-defined structure of its own - Array<{ message: string, path: (string | number)[] }> in its simplest form, where numeric path entries are list indices. The Schema we define does not affect this structure.
Let's say the client queries a field using an ID. How can the client know from the above error object whether the Error is due to an Internal Server Error or because the ID was not found? Parsing the message is a no-go because it is not reliable.
Luckily, GraphQL provides a way to extend the error structure - the extensions field. The error.extensions object can convey other information related to the Error - properties, metadata, or other clues from which the client can benefit. For the above example, we can model the response as -
const err = {
  data: {},
  errors: [
    {
      message: "Not Found",
      extensions: {
        code: "NOT_FOUND",
      },
    },
  ],
};
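On the client, reading that code could look like the following minimal sketch. The GraphQLResponseError type and the errorCode helper are names assumed here for illustration and are not part of any GraphQL library. Since extensions has no schema, the client has to check the shape itself.

```typescript
// Assumed minimal shape of one entry in `response.errors`.
interface GraphQLResponseError {
  message: string;
  path?: (string | number)[];
  extensions?: Record<string, unknown>;
}

// Returns the machine-readable code if the server sent one as a string,
// otherwise undefined. Nothing guarantees the shape of `extensions`,
// so the value is checked at runtime.
function errorCode(error: GraphQLResponseError): string | undefined {
  const code = error.extensions?.code;
  return typeof code === "string" ? code : undefined;
}

const notFound: GraphQLResponseError = {
  message: "Not Found",
  extensions: { code: "NOT_FOUND" },
};
const withoutCode: GraphQLResponseError = { message: "Something happened" };

const notFoundCode = errorCode(notFound);
const missingCode = errorCode(withoutCode);
```

Here notFoundCode holds the code sent by the server, while missingCode is undefined because the second error carries no extensions.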
Errors for Customers
When a GraphQL API delivers content to the end-user, i.e., the customers, we have two levels of users -
- The Developer or user of the API - UI/UX/front-end developer.
- The Customer or end-user - The one who does not see any technical layers but gets the product's experience in its most presentable format. The Front-end developer builds this experience using data from the GraphQL API.
Since using the word user might be confusing, from now on, Developer will refer to the front-end developer, and Customer will refer to the end-user.
When the data of an API is consumed at both levels - by the Developer and by the Customer - the two can have different error data requirements. For example, take a mutation where the Customer enters an invalid email address -
- The Developer who uses the GraphQL API needs to know that the Customer has entered an invalid email address in a parseable format - a boolean, an enum, or any other data structure will work, as long as it does not require parsing the error message.
- The Customer needs to see the error message in a nicely styled format close to the text box. Also, for different languages or locales, the error message needs to be shown as the corresponding translated text.
Let's try to model this using the error extensions discussed above -
{
  "data": {},
  "errors": [
    {
      "message": "Die E-Mail-Adresse ist ungültig",
      "extensions": {
        "code": "INVALID_EMAIL"
      }
    }
  ]
}
While this would work, we soon end up in a case where multiple input fields in a mutation can be invalid. What can we do here? Do we model them as different errors or fit everything into the same Error?
The Customer errors still need to be usable by the Developers, who propagate them. The front-end developers are the ones ultimately transforming our data structures into UI elements. So they need to understand the Error to highlight the affected input text-box with a red border. So, to make it easy, let's try modeling these as a single error with multiple validation messages -
{
  "data": {},
  "errors": [
    {
      "message": "Multiple inputs are invalid",
      "extensions": {
        "invalidInputs": [
          {
            "code": "INVALID_EMAIL",
            "message": "Die E-Mail-Adresse ist ungültig"
          },
          {
            "code": "INVALID_PASSWORD",
            "message": "Das Passwort erfüllt nicht die Sicherheitsstandards"
          }
        ]
      }
    }
  ]
}
The codes INVALID_EMAIL and INVALID_PASSWORD will help the front-end developer highlight the field in the UI, and the message will be displayed to the Customer right under that text-box.
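A front-end consuming this structure might do something like the sketch below. The fieldByCode mapping and the messagesByField helper are hypothetical names for illustration, and the InvalidInput type has to be written by hand because nothing in the Schema describes it.

```typescript
// One entry of `extensions.invalidInputs`, typed by hand on the client.
interface InvalidInput {
  code: string;
  message: string;
}

// Hypothetical mapping from error codes to form field names in the UI.
const fieldByCode = new Map<string, string>([
  ["INVALID_EMAIL", "email"],
  ["INVALID_PASSWORD", "password"],
]);

// Collects the translated message to show under each form field.
// Codes the UI does not know about are skipped.
function messagesByField(inputs: InvalidInput[]): Map<string, string> {
  const result = new Map<string, string>();
  for (const input of inputs) {
    const field = fieldByCode.get(input.code);
    if (field !== undefined) {
      result.set(field, input.message);
    }
  }
  return result;
}

const fieldMessages = messagesByField([
  { code: "INVALID_EMAIL", message: "Die E-Mail-Adresse ist ungültig" },
  { code: "SOMETHING_NEW", message: "Unbekannt" },
]);
```

In this run, fieldMessages contains an entry for the email field only, since the second code is unknown to the UI.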
All this leads to a complicated structure very soon and is not as friendly as the data modeled with a GraphQL schema.
Why you no Schema?
The biggest problem we face in modeling these in the extensions object is that it's not discoverable. We use a language as powerful as GraphQL to define each field in our data structure using Schemas, but when designing the errors, we went back to a loose mode of not using any of the ideas GraphQL brought us.
Maybe, in future extensions of the language, we can write schemas for Errors as we write for Queries and Mutations. The developers using the Schema get all the benefits of GraphQL even when handling errors. For now, let's concentrate on modeling this using the existing language specification.
Errors in Schema
We want to enjoy the power of GraphQL - the discoverability of fields of data, the tooling, and other aspects for errors. Why don't we put some of these errors in the Schema instead of capturing them in extensions?
For example, the mutation discussed previously can be modeled like this -
- The mutation returns a Result type.
- The Result type is a union of Success and Error.
- The Error schema contains the necessary error info - like translated messages, etc.
type Mutation {
  register(email: String!, password: String!): RegisterResult
}
union RegisterResult = RegisterSuccess | RegisterError
type RegisterSuccess {
  id: ID!
  email: String!
}
type RegisterError {
  invalidInputs: [RegisterInvalidInput]
}
type RegisterInvalidInput {
  field: RegisterInvalidInputField!
  message: String!
}
enum RegisterInvalidInputField {
  EMAIL
  PASSWORD
}
This structure looks exactly like the one we designed above inside error extensions. The advantage of modeling it like this would be that we are using the benefits of GraphQL for errors.
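On the client, such a union can be handled as a discriminated union on the __typename meta field, assuming the query selects it. The following is a sketch with hand-written types. In practice they could be generated from the Schema, and for brevity it assumes the invalidInputs list and its items are non-null. The summarize helper is a hypothetical name.

```typescript
type RegisterInvalidInputField = "EMAIL" | "PASSWORD";

interface RegisterSuccess {
  __typename: "RegisterSuccess";
  id: string;
  email: string;
}

interface RegisterError {
  __typename: "RegisterError";
  invalidInputs: { field: RegisterInvalidInputField; message: string }[];
}

type RegisterResult = RegisterSuccess | RegisterError;

// Hypothetical helper: turns the mutation result into one line of UI text.
// The compiler checks that every member of the union is handled.
function summarize(result: RegisterResult): string {
  switch (result.__typename) {
    case "RegisterSuccess":
      return `Registered ${result.email}`;
    case "RegisterError":
      return result.invalidInputs
        .map((input) => `${input.field}: ${input.message}`)
        .join("; ");
  }
}
```

Unlike with extensions, the field enum and the shape of each invalid input are visible to tooling, so a typo such as "EMIAL" is caught at compile time.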
When you have a hammer,
Now, with the idea of modeling errors as Schema types, we are left with more questions than answers -
- Should I model all errors as GraphQL types?
- How should I decide when to use error extensions and when to use GraphQL types for modeling errors?
- etc.
When we have multiple teams maintaining the platform, many people contribute and think about modeling different parts of the Schema. There should be clear definitions for the different aspects of the existing data structures and the idea behind how we reached such solutions. The design and the Schema are changed far less often than they are read or used.
GraphQL gave us the mindset of "Thinking in Graphs". If we suggest a new way of modeling errors, we need to talk about this mindset and its ideas. Not all errors fit into this modeling (error types in Schema), and it will make the GraphQL API less usable if we approach it by looking at all the errors as nails.
Classification
To model errors, let's try to find some analogies. I want to think about modeling these errors in terms of programming language errors. For example,
- Go: error vs. panic
- Java: Exception vs. Error
- Rust: Result vs. panic
Programming languages also model errors as two variants. In one variant (an error value in Go), we inform the Developer who calls the function, and the Developer decides either to handle it or to pass it on. In the other variant (a panic in Go), we skip everything and bring the program to a halt, informing the end-user of the program that something has gone wrong. This small distinction, captured as two different constructs, helps us understand the intention behind the data in errors.
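The same split can be sketched in TypeScript, with a returned value standing in for the first variant and a thrown exception for the second. The parsePort and requirePort functions are made-up examples, not taken from any library.

```typescript
// Variant 1: an expected failure, returned as a value. The caller decides
// whether to handle it or pass it on (comparable to an error value in Go).
type Parsed =
  | { ok: true; port: number }
  | { ok: false; reason: string };

function parsePort(input: string): Parsed {
  const port = Number(input);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    return { ok: false, reason: `"${input}" is not a valid port` };
  }
  return { ok: true, port };
}

// Variant 2: the caller treats a failure as a broken invariant. Throwing
// abandons the normal flow, and unless something catches the exception,
// the program stops (comparable to a panic).
function requirePort(input: string): number {
  const parsed = parsePort(input);
  if (!parsed.ok) {
    throw new Error(parsed.reason);
  }
  return parsed.port;
}
```

The first function leaves the decision to its caller, while the second makes the decision itself and interrupts everything above it.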
Part 1. Action-ables
What is an error? It tells us that something is wrong and gives us some information on what action can be taken. We can think of errors as containers of action-ables. When modeling them, we classify them into different groups depending on who can take that action.
In the GraphQL context, the front-end takes care of some errors - either by a fallback or a retry. For other errors, like invalid inputs, the front-end cannot take action; only the Customer who entered the invalid input can fix it.
Instead of modeling the errors loosely, we now have a concrete use-case - model it for whoever can take action.
Part 2. Bugs in the system
Errors convey information - either to the Developer or to the Customer. If the Error conveys a bug in the system, it should not be modeled as a schema error type. Here, the system means all the services and software involved in our entire product and not just the GraphQL service. This distinction is essential because it separates the end-user / Customer from the Developer who uses the API - the end-user looks at our product as one thing, not as many individual services.
In the 404 Not Found case, modeling the errors as schema types would make the Schema less usable. Let's take a product look-up use-case -
{
  product(id: "foo") {
    ... on ProductSuccess {
      success
    }
    ... on ProductError {
      error
    }
  }
  collection(id: "bar") {
    ... on CollectionSuccess {
      products {
        ... on ProductSuccess {
          success
        }
      }
    }
    ... on CollectionError {
      error
    }
  }
}
This way of handling errors at every level is not friendly for front-end developers. It's too much to type in a query and too many branches to handle in the code.
Part 3. Error propagation
We also have to remember not to disrupt the GraphQL semantics of error propagation. If an error occurs in one place in the query, it propagates upwards in the tree until it reaches the first nullable field. This propagation does not happen with error types in Schema, because to GraphQL they are ordinary data and not errors. It is therefore essential to model these schema error types only for specific use-cases. We go back to Part 1: Action-ables - we design these types for actions that the end-user or Customer can take.
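To make the propagation rule concrete, here is a small model of it. It is a simplified illustration and not how a GraphQL server is implemented. PathStep and nullLandsAt are hypothetical names, and the field names in the example are made up.

```typescript
// One step on the path from the query root down to the failing field.
interface PathStep {
  field: string;
  nullable: boolean;
}

// Given the path to a field whose resolver failed, returns the field that
// ends up as `null` in the response: the failing field itself if it is
// nullable, otherwise its closest nullable ancestor. Returns undefined
// when every step is non-null, in which case the whole `data` is null.
function nullLandsAt(path: PathStep[]): string | undefined {
  for (const step of [...path].reverse()) {
    if (step.nullable) {
      return step.field;
    }
  }
  return undefined;
}

// product (nullable) -> price (non-null) -> amount (non-null) fails.
const landed = nullLandsAt([
  { field: "product", nullable: true },
  { field: "price", nullable: false },
  { field: "amount", nullable: false },
]);
```

In this example landed is "product": the error in amount wipes out the whole product, and the front-end learns about it from response.errors. A field that returns a schema error type never triggers this mechanism, because its resolver completed successfully.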
The Problem type
Naming is half the battle in GraphQL. Since the name error is already taken by the GraphQL language (response.errors), it would be confusing to name our error types in Schema as Error. Looking for inspiration as we did before, there is a well-defined concept in RFC 7807 - Problem Details for HTTP APIs. So, we will call all our errors in Schema Problems and, as it has always been, all other errors errors.
The above register schema with the Problem type would look like this -
type Mutation {
  register(email: String!, password: String!): RegisterResult
}
union RegisterResult = RegisterSuccess | RegisterProblem
type RegisterSuccess {
  id: ID!
  email: String!
}
type RegisterProblem {
  "translated message encompassing all invalid inputs."
  title: String!
  invalidInputs: [RegisterInvalidInput]
}
type RegisterInvalidInput {
  field: RegisterInvalidInputField!
  "translated message."
  message: String!
}
enum RegisterInvalidInputField {
  EMAIL
  PASSWORD
}
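On the server, a resolver for this mutation could return a Problem as ordinary data instead of throwing. The sketch below is only an illustration under several assumptions: the validation rules are deliberately naive, the messages are hard-coded in English where a real service would translate them per locale, the id is a placeholder, and how the union member is resolved (here via a __typename property) depends on the server library in use.

```typescript
type RegisterInvalidInputField = "EMAIL" | "PASSWORD";

interface RegisterInvalidInput {
  field: RegisterInvalidInputField;
  message: string;
}

interface RegisterProblem {
  __typename: "RegisterProblem";
  title: string;
  invalidInputs: RegisterInvalidInput[];
}

interface RegisterSuccess {
  __typename: "RegisterSuccess";
  id: string;
  email: string;
}

// Naive checks for illustration only.
function validate(email: string, password: string): RegisterInvalidInput[] {
  const invalid: RegisterInvalidInput[] = [];
  if (!email.includes("@")) {
    invalid.push({ field: "EMAIL", message: "The email address is invalid" });
  }
  if (password.length < 8) {
    invalid.push({ field: "PASSWORD", message: "The password is too short" });
  }
  return invalid;
}

// Hypothetical resolver body: invalid Customer input becomes a Problem
// in `data`; nothing is thrown, so `response.errors` stays empty.
function register(email: string, password: string): RegisterSuccess | RegisterProblem {
  const invalidInputs = validate(email, password);
  if (invalidInputs.length > 0) {
    return { __typename: "RegisterProblem", title: "Some inputs are invalid", invalidInputs };
  }
  return { __typename: "RegisterSuccess", id: "placeholder-id", email };
}
```

Bugs and backend failures inside such a resolver would still be thrown and end up in response.errors, which keeps the two kinds of failure separate.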
Problem or Error
Problem refers to an error modeled as a Schema type. Error refers to an error that appears in the response.errors array with an error code at error.extensions.code.
Case 1: Resource Not Found
404s are bugs in the system in the case of navigation. If the user navigates from the home page to a product page and ends up on a 404 page, some service selected an id that leads to a 404 when resolved, and this was most likely already the case at the time of selection. It did not happen because the user entered some input. Also, these errors need to be propagated. So, this becomes an Error with the error code NOT_FOUND and not a Problem.
Case 2: Authorization
Authorization errors are of the Error type and do not fit a Problem type. Here, it looks like the action taker is the Customer, who needs to log in. But the UI can take action first and show a login dialog box to the Customer. In apps, the app decides to take the Customer to the login view. The action belongs to the front-end first and only then to the Customer. So, we model it for the Developer / front-end as an Error with the error code NOT_AUTHORIZED and not a Problem.
Case 3: Mutation Inputs
Mutation Inputs is the only case where it is crucial to construct Problem types. It contains inputs directly from the Customer, and only the Customer can take action for this. So, we model these errors as Problems and not Errors.
Case 4: All other bugs / errors
Any runtime exception in the code or Internal Server Error from any backend that the GraphQL layer connects to should be modeled as an Error and need not contain an error code. This way, it is easy for the front-end to treat all errors without an error code as Internal Server Errors and take action accordingly - to retry or to show the Customer an error page.
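Taken together, the Error cases above give the front-end a small decision table. A sketch of it follows. The Action names and the actionFor helper are assumptions made for illustration, and Problems are not covered because they arrive in data rather than in response.errors.

```typescript
// Hypothetical actions the front-end can take for an entry in `response.errors`.
type Action = "SHOW_NOT_FOUND_PAGE" | "SHOW_LOGIN" | "RETRY_OR_ERROR_PAGE";

interface ResponseError {
  message: string;
  extensions?: { code?: string };
}

function actionFor(error: ResponseError): Action {
  switch (error.extensions?.code) {
    case "NOT_FOUND":
      return "SHOW_NOT_FOUND_PAGE";
    case "NOT_AUTHORIZED":
      return "SHOW_LOGIN";
    default:
      // No code, or a code this client does not know: treat it like an
      // Internal Server Error.
      return "RETRY_OR_ERROR_PAGE";
  }
}
```

Treating unknown codes like a missing code is a design choice of this sketch. It means older clients keep working when the server introduces a new code.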
Conclusion
We have discussed the Problem type as a possible solution where the error object in the GraphQL response does not suffice for the use-case. But we have to be careful not to overuse it in the many use-cases where the error extensions already provide enough value.
We have to understand that a Problem type in unnecessary places makes the query and the front-end code complicated. Our GraphQL Schema should try to simplify things and provide a friendly interface.
Related posts
In case you are interested, here are further posts in the GraphQL series -
- Introduction to how we use GraphQL at Zalando
- Understanding GraphQL Directives: Practical Use-Cases at Zalando
- GraphQL persisted queries and Schema stability
- Optimize GraphQL Server with Lookaheads