087: The JSON API and Orbit.js with Dan Gebhardt
- 01:33 - The JSON API Spec and Pain Points it Solves
- 08:40 - Tradeoffs Between GraphQL and JSON API
- 19:33 - Orbit.js
- 26:30 - Orbit and Redux
- 32:22 - Using Orbit
- 37:24 - What’s coming in Orbit?
CHARLES: Hello everybody and welcome to The Frontside Podcast, Episode 87. My name is Charles Lowell, a developer here at the Frontside and your podcast host-in-training. Joining me today in hosting the podcast is Elrick Ryan. Hello, Elrick.
ELRICK: Hey, what's up, Charles?
CHARLES: How are you doing today?
ELRICK: I'm doing great.
CHARLES: Are you pretty excited?
ELRICK: I'm very excited for this podcast because this is a topic that I've heard a lot about but don't know much about and it just seems so awesome that I'm just very stoked to hear all the details today.
CHARLES: Yeah, me too, especially because of who's going to be giving us those details, he's one of the kindest, smartest, most humble and wonderful people that I've had the pleasure of meeting, Mr Dan Gebhardt. Hello, Dan.
DAN: Hey, Charles. Hey, Elrick. Thanks for having me on. I really enjoyed listening to this podcast. It's nice to be part of one.
CHARLES: It's good to have you finally on the show. We talked over chat and we talked over email and we meet every once in a while and conferences and it's great to get to share more widely some of the great conversations that always arise in all of those contexts. For those who don't know you, you are a founder at Cerebris and that is your company, which is involved very heavily in a lot of open source projects that people are probably familiar with. One of them that we're going to be talking about today is JSON API. I bet most people didn't know that you are one of the biggest driving factors behind both the specification and several of the implementations out there.
DAN: Yeah, that's been a pretty core focus of my open source work for the last few years. Actually JSON API Spec, which is perhaps a somewhat confusing name for those who aren't familiar with it. It was started by Yehuda Katz in almost three and a half years ago, I think now and it hit 1.0 a couple years ago and has stabilized since then and we've seen a lot of interesting implementations on top of it. There are some exciting stuff that's actually coming soon to this Spec that I'd like to share with you guys today.
CHARLES: To give us a little bit of context, why? What pain am I experiencing that JSON API is going to solve or it's going to address or give me tools to deal with?
DAN: One of its prime motivators is the elimination of bikeshedding. There's a lot of trivial decisions that are made with every implementation of an API and JSON API makes a lot of those decisions for you about how to structure your document, how to include relationships and lengths and metadata in a resource, how to represents relationships from hasOne/hasMany. Even polymorphic relationships have a type of that data. JSON API has opinions about all these things at the document structure level and it also has opinions about our protocol usage, how to use HTTP together with this media type to make requests and for servers to return responses, how to create a resource, how to add resources to relationships and things like that.
CHARLES: Also, it's not just this is a serialization format. It's very much like also delving into the individual interactions and how they should be structured, more about the conversation between client and server.
DAN: Yeah, in that way, it is somewhat unusual as a media type that covers both.
CHARLES: Can you dig into that a little bit because I'm very curious? Something made my ears prick up was when you said, it tells you how to, for example add relationships to a resource. What would that look like?
DAN: A lot of the influences behind JSON API are hypermedia-related. It's influenced by RESTful principles and includes a lot of hypermedia aspects. One aspect is how a resource represents relationships in terms of the data in document, the type and the ID that specify a linkage to that another resource in the same document but it can also include links to discover those relationships. There's a self-link for a relationship and a related link for relationship and the self-link will return the data for that relationship in the type/ID pairs.
The related link will return the related resources. The Spec doesn't have strong requirements or any requirements about URL usage but instead, it describes where to find resources through these hypermedia links. If you want to say, add records to a relationship, you'd follow the self-link for that relationship when that was returned with a resource. Then you would send a post to that endpoint and you would include the relationship data in the terms of type and ID pairs. It gets down to that specifications so that removes the ambiguity of how to interact with these resources and mutate them and retrieve them.
CHARLES: I see, so is there an idea then that you are going to explicitly model the relationships as individual resources? Or is that the recommendation or the requirement?
DAN: The link for a relationship would point to an endpoint, which would then model the relationships that are represented that endpoint, so to say just to speak a little more concretely because certainly, it makes some simple concept sound a lot more esoteric than they really need to be. Let's just talk about an example. Let's say, we're talking about articles and comments and maybe an author. Let's say, you've fetched a collection of articles from an article's endpoint and within the article resource, you would have a relationships number, which would include comments and then comments could have links, which one of the links would be a self-link and a related link. The related link could be followed to then retrieve all the comments for that particular article.
You could also, if you wanted to add a comment for that article, post to the self-link for that relationships. You post to that whatever endpoint that is specified. Maybe it's 'articles/1/comments.' It could be anything that you want. Now, the Spec does have some recommendations to make everything fit nicely in terms of the URL design patterns and such but those are not, by any means required but having those recommendations just eliminates more bikeshedding opportunities. We find it that people who are gravitate towards the Spec really appreciate having a lot of these trivial decisions made for them so even if we don't want to come down and be hard line about requiring those particular answers, we can at least provide some guidance for how things can work together nicely. There's a whole recommendation section on the site for things like URL design patterns.
CHARLES: Right, so things that aren't prescribed but these are best practices that are recognized.
DAN: Yeah, exactly.
CHARLES: A question then that comes to mind, it sounds like JSON API solves a lot of these bikesheds or just kind of comes in and takes one side or the other for modeling both the resources and the relationships between those resources so there's the... I don't want to call it a schema but the boundaries around which resource are very clear and where they live and how they connect together.
I was hoping we could maybe contrast that with some another approach, which is also become very popular and that's the GraphQL approach where you're essential assembling views at runtime for the client. It's very easy to marshal the data that you need to present to your view because you've got only one endpoint, as opposed to having to coordinate between them. I can understand the appeal of that and I was wondering if you have any insight into what the tradeoffs are between the systems and what are some of the capabilities that one can do that the other can't.
CHARLES: Yeah, sure. I'm glad that you brought that up because I feel like GraphQL has become a real juggernaut, at least because of its marketing. It's very effective in being marketed for its use to developers and it's capabilities versus REST, as if a RESTful system can't possibly achieve the same outcome or the same efficiency. I'm glad to compare and contrast the two. To be honest, one of our short term goals is to better tell the story on the JSON API site, which was always a kind of a more technical spec-y site and a marketing site. That hasn't really helped its uptake as much as it could as some of the GraphQL sites are very sleek and polished. Anyway, let's get down to it.
GraphQL allows you to basically define the data that you want for a particular view and that can bring together multiple related resources. It defines a way to specify exactly which fields you want in that graph of resources. We'll just stick with the articles, comments and authors example. You can specify that you want a collection of articles and perhaps the comments-related to that and the authors and you could have it all assembled in a single response.
JSON API also allows you to do just that. It allows you to make requests for multiple related resources to constrain the fields that are returned for each resource and to include all of these related resources in a single document. The main difference in the representation is that JSON API requires that resources only be represented once in a single document. GraphQL may have repetition of resources throughout the document that's returned. For an instance, your articles that may nest authors and those authors like Charles Lowell, may have written three of those articles and that representation of that author is going to be repeated in that JSON API compound document, which is a term for document, which has a primary dataset combined with related resources.
That single author would only be returned once as related resource and the linkage between the primary data and the related resources would be established to type/ID pairs. Instead of having the author represented three times, the same type/ID pairs would just be providing that linkage to the same author and that author resource would only be represented once.
This happens to be ideal for client-side applications that number one, basically want to minimize the size of a payload that sent. Number two, don't want to after-handle repetition of data by doing extra processing of pushing the same record multiple times into a memory store that is keeping that data. I think that GraphQL is well-suited to applications that request data and display that data pretty much as returned. There is no intermediate holding onto that data in, say a memory store for later access. Basically, it lines up well with a component library link React, which wants to display that data that's returned from the server. If it wants to display that collection again, it will simply request that collection again and pretty much throw away the data once it has been rendered.
CHARLES: I can see that. Dan, you and I might be some of the only folks who remember. I don't know if you ever did any Microsoft Access Program.
DAN: Yes, I did, believe it or not.
CHARLES: Doesn't it feel a little bit like the Access pattern all over again, where you have your components requesting data from basically, constructing a query and requesting it --
CHARLES: And then throwing it up on the screen.
DAN: You're going deep there but I do remember that. Definitely, there is that same paradigm.
CHARLES: It's really powerful.
DAN: It is and it's pretty accessible too because it's a direct representation of what you've requested and there's no intermediate processing. I guess the question is, whether that intermediate processing provides some value. Actually, holding onto that data provided some value because as far as I'm concerned, GraphQL is great for that rendering of DOM data, where the data has no meaning except outside of the rendering. But if you want to actually have models that have some intelligence about that data, then you want to use a store to keep those models in and you want to be able to reuse those models for other purposes.
CHARLES: What might be an example? What’s a concrete use case that we can ground this discussion?
DAN: I would say that the big one is offline. You simply can't have just DOM data that's useful in any way in an offline application or an optimistic application, where you are doing some things client-side and only say undoing them if a request fails. But if your data is DOM and only structured for a particular view, then all you can do with that is redisplay that view. But if you understand the schema of your data and that data is available in a store, then regardless of whether you have a network connection, you can actually display that data in different ways. If that same article shows up in a collection in a list, you could also display that article on its own in a different format with more fields. If you want to, say allow editing of that data, you could allow for an editor when your app is offline. Allow changes to be made to that data and then redisplay it because you understand the fields that are in that data.
CHARLES: Right and then at some point later, then spool those changes back to the server.
DAN: That's right.
CHARLES: It almost sounds like, ironically if a system like JSON API where you have very concrete boundaries around each of the underlying resources in your data model, it allows you to essentially do rich-querying on the client and not just the server.
DAN: Yes, that's absolutely true.
CHARLES: Because I feel like what you just described to me it's like, now we have some sort of store over which we can map all kinds of different queries to our own liking and there's no dependency on the server.
DAN: Yeah and if you just want more web app to be pretty much a view representation of what's on the server and without additional intelligence, then GraphQL really lines up well with your needs because any extra processing you're doing is just not valuable to you. But I think a lot of the really interesting things being done in client-side applications are where your client application is pretty loaded with a lot of intelligence and you're out there autonomous and able to make sense of data. In that case, then thinking about the data only as it pertains to views is not nearly as powerful.
CHARLES: Right, so you could do something like that with GraphQL but then you would have to, essentially structure your queries such that they drew the boundaries around the individual resources anyway, rather than composing them on the server. You'd have to query them discreetly into a store and then run your local operations. Then I guess at that point, it's like what are you doing?
DAN: Yeah, you're still doing the extra processing of handling the repetition of any nodes that repeat and such. That's just extra processing you have to do but I agree that you certainly could structure your GraphQL queries to return data that is then loaded, say to a store that really has awareness of the data types but I don't think that is --
CHARLES: But then you're defeating the purpose, right?
DAN: Yeah, it's not its selling point and it's not its strong suit.
CHARLES: You've done a lot of work on the JSON API Spec. JSON API allows you to fetch discrete resources and their relationships but still, keeping one representation of each resource in the payload so it's optimized for wanting to do client-side processing and have intelligence based on these entities, which are in a store. You actually maintain a fairly mature, at this point, framework called Orbit, which helps you do some of these things.
Now, what Orbit does today and I understand that you've got a lot of new features that are really exciting, that are coming down the pike. Before we get into those, what is Orbit and what do you use it for and how does it use JSON API?
DAN: Orbit is a data access and synchronization library, which sounds sufficiently vague because it has a lot of low-level primitives for structuring client-side. Also, actually isomorphic can be run on the server and nodes so it's not even only used for client-side purposes but that was its original purpose. The abstraction that it includes are allowed for synchronization of data changes across multiple sources of data. Source of data might be represented by, say a JSON API server, an in-memory store, an IndexedDB database in your browser, a local storage, all of these sources of data can support an Orbit interface, which provides their access to their data and also broadcasts changes to that data.
In order to coordinate the changes across multiple sources, say to back up all of your data that's in memory to IndexedDB source, you can observe the changes on one source and then sync those changes up with another. For instance, you want to structure an offline application which you have been in-memory store, which uses client-generated IDs, which then syncs up with a backend JSON API source and every change that gets made to the memory store needs to be backed up, you could configure multiple coordination strategies between the sources to make sure that the data flows so that every change that is made to the store is immediately backed up to IndexedDB. If it can't be backed up, then it fails.
You can add some error handling and then when you're online, you can then also sync those changes up with a backend so you're basically pushing those changes that are local to a remote store and you're not slowing down your offline app, which you're communicating with optimistically and then only handling, say synchronization failures when there is a problem. In order to handle those problems, Orbit sources are very deterministic about their tracking of changes and they provide get-like rollback capabilities so you can look at the history of changes to a particular source and reset the history to any point there and basically handle conflicts and merges in a very get-like way.
Often I use cases, the primary driver of Orbit's whole architecture, I realized that it needed to be able to give you the tools to handle any conflicts that happen when changes get sync up. Also, give you the different tools to model all the different places of data is kept in order to support the offline mode. That's a kind of a broad overview of Orbit.
There's a new guide site, OrbitJS.com for those who want to dig a little deeper into it. The data is structured in the JSON API format internally to the store and the standard operations are very much influenced by the standard JSON API protocols that are allowed in the base Spec over creating records and removing records and all that crud for both records and relationships. That's where JSON API comes into Orbit.
CHARLES: Right, I see. The primary use case for Orbit is offline. Is that fair to say?
DAN: Yeah, that was the primary driver, although it's just not the primary --
CHARLES: It seems like you could use this in a lot of places where I might use Redux or something like that, like on the server to model... I don't know, a chat app.
DAN: Yeah, definitely.
CHARLES: I have a bunch of different information streams coming together and how am I going to merge them and make sense of them.
DAN: Yeah, in fact, that it's primitive level. Orbit has essentially an async redux-like model for queuing up changes and applying those changes. The change sets are all immutable. There's actually a lot of immutability use here throughout the library. In order to ensure that the changes that are applied are tracked deterministically, we just can't have those changes mutating on us. There is definitely some overlap with Redux concepts in terms of the general tasks or action concepts in Redux but instead of Redux's synchronous approach, everything in Orbit is async.
CHARLES: What does that mean? Redux is synchronous in the sense that there's a natural order to all actions. For those of us familiar with Redux, are you saying it would be like a store where actions can be dispatched at any time or is it more like, I've got multiple stores happening and I need to resolve them somehow so each one is synchronous? How can I make sense of that?
DAN: In Redux, the actual application of an action is performed synchronously.
CHARLES: Right. You can have asynchronous processes but there is a natural order to the actions that those asynchronous processes yield and then those are applied synchronously to the Redux Store.
DAN: Yeah. To compare and contrast Orbit and Redux, I guess you'd first have to say there's a primary difference of --
CHARLES: I think there are a lot of people are familiar with Redux. I think it's not so much to compare and contrast it but just to use it is as an analogy of like, "Here's how it's the same here. Here's how it's different," because that's compare and contrast.
DAN: There you go.
CHARLES: But not in terms of evaluating them but it's like, "Maybe I should be using this instead."
DAN: Right, they are sort of on different levels, although there are some primitives in Orbit to it and it's shipped across multiple libraries. There are some primitives, I think that could be useful outside of the main Orbit data application. Anyway, the way that Redux state changes are applied, the function is synchronous is all I was getting at and on Orbit, every state change that applied to a source is asynchronous so the result is never applied immediately. You'll always get a promise back and you'll never have that application happen immediately. That's one clear distinction.
Another is that a redux has a big singleton global state for the entire application. Orbit very much has a model of state per source so there can be any number of sources in a particular application and the source might be an in-memory source or might represent a browser storage in XTP or might represented a socket that streaming data in. All of these have state at different, temporarily distinct state that even if they all converge to a common state, the Orbit models separately so that there's a set state per source. I'm just contrasting the global apps state that exists in Redux with the per source state in Orbit.
CHARLES: It sounds like there's nothing that would be fundamentally incompatible of using Orbit really in conjunction with Redux, where Redux is kind of a materialized view of all of your different data sources presented as what you're going to render off of, right?
DAN: Yeah. You could use it in a similar way to Redux Saga, where Orbit fills the role of Saga, where it's doing the asynchronous actions that results flow back into the Redux state.
CHARLES: I'm just imagining having one big global atom, which is your Redux store and now I'm saying, prescribing this is an optimal architecture but I'm saying, one way it could work is it picks and chooses and assembles off of the different sources as new data becomes available. As the states change for those sources, it can be integrated into a snapshot state, which is suitable for rendering or provides one view for rendering.
DAN: Yeah. You're basically talking about the in-memory source, perhaps merged with other applications state, which is not so resource-specific and that is possible to model.
CHARLES: What I think I might be hear you saying is you could also just use another source which is the merge itself.
DAN: Yeah. I'm not sure how much we want to continue this thought exercise because the architecture becomes almost not something I'd recommend. But I would actually like to explore how Orbit and Redux could be used together optimally. I played around a bit with Redux but I have not written a full-fledged application with it, other than a [inaudible] location. I definitely defer to you for Redux best practices and such and how people are using it in real world applications but I'd be really interested to talk that over again soon.
CHARLES: Well, I just certainly don't count myself a Redux expert, although we have developed some applications with it. We'll put that on the back burner or something to explore it later. I will say this, I find Redux to be both wonderful and terrible, kind of in the same way the Java is both wonderful and terrible. We’ll leave it at that.
ELRICK: That was going to be my question. This is why I was very excited to hear about today was Orbit because I've heard so much about it. In terms of the implementation of Orbit into an application, what would that look like from a high-level? Has anyone used Orbit in the production app? Have you built any apps using Orbit?
DAN: Yeah, definitely. There are people using Orbit with React, with Vue, with Angular and with Ember and there's an integration library called ember-orbit which makes Orbit usage really easy in Ember. In a lot of ways, working with ember-orbit feels a lot like working with Ember Data but it allows a lot more flexibility. I suppose one of its strengths and weaknesses is there's a lot of configuration that's possible because there's a lot of possibilities. The internals are exposed of how data get synchronized so you can define your strategies and sync up different sources.
In terms of how it's actually used in an application, you'd start by modeling your data in terms of the resources that are in the application. You'd have a schema that defines your articles, your comments and your authors, just to keep that example going. Then that schema would be shared among all the sources in your application. You would have one source, say that might be the in-memory source and another source that is the representing a browser storage so you could, say swap out either local storage source or an IndexedDB source and use either one to provide that backup roll.
You would declare those sources, you connect them to each other with strategies so that, say when memory storage changes, you would then sync that change to the browser storage source. Then you'd have back up and you'd be able to then, refresh your page and view the same data you were looking at before. Now then, if you probably want to wire up a remote source so that you're communicating with the servers so you bring in JSON API source and you would then set up a new strategy for working with that. You have to decide like, "When my memory storage changes, do I want this change to happen optimistically or pessimistically?" By that I mean, "Do I only want it to appear successful if it's been confirmed by the server."
Depending upon whether you want to be optimistic or pessimistic, you setup your strategies a little differently. If you handle this change pessimistically, you wanted to block success on the successful completion of pushing of that change to your remote server. You have the set of tragedies that define the behavior of your application and then doing your crud operation is probably pretty much directly with your memory source. Then if you wanted to, say do an edit in a form, you might fork the store, now the store keeps its data in an immutable data structures. That forking that store is very cheap so you don't have a bunch of data that's copied. You're just keeping a pointer to that and getting a new pointer to that same immutable data structures. Every time they get changed, there been new references.
There's an immutability under the hood but you're pretty well insulated from the annoyances of working with that immutable data structures. At that form, you make your changes, you then merge your changes back, you'd get a condensed change set of operations that then can flow through your strategies. It flow through to your backup source. It could flow through to the back to the server. I think it would feel pretty familiar for users of Ember Data because there are a lot of the API influences came from that library. But obviously, people are using just plain Orbit with other libraries, with other frameworks and finding it useful there but it definitely involves a little more configuration up front to do all that wiring that might be more implicit in library like Ember Data.
CHARLES: I understand that before we go, there is some pretty exciting new things coming in Orbit. Do you feel like you're ready to mention a couple of those things or they've been kind of mixed in with the conversation?
DAN: Let's see. I have the guides up, which I mentioned, which is pretty new in the last couple of months. In the last year, we did a rewrite and Orbit is now completely in TypeScript and there are no external dependencies. For a while there, I was using RxJS and observables internally and immutable JS so there's now an internal immutable library. It's lighter-weight with fewer dependencies now. I'm excited about that and finally feel like I can recommend people digging in with the guides that are up. I'm hoping to get up the API docs soon.
I will say I'm excited. I just got back from a retreat in Greece. Séb Grosjean who owns the company, BookingSync does this amazing thing with the Ember community, where for the group that's working on Ember Data, he invites them every year to come to his family's place in Greece. He grew up working with his family on his rental properties, which was the inspiration for his company, BookingSync and said, "This is a fantastic opportunity that for us to get together and collaborate in a really nice place," and I had a really productive time this last week.
This is the very first time I had gone. It was just fantastic and I worked with the Ember Data team. Igor Terzic and I spiked out some interesting collaboration between Orbit and Ember Data so I'm really looking for it to see where those go and hopefully, we'll see a little bit more Orbit, either directly or just through influence appearing in Ember Data. I'm looking forward in working more closely with the Ember Data team. We'll see what comes of that.
CHARLES: Yeah, I, for one am very excited to see it. I'm resolved now. I'm just looking at these guides. These look fantastic and I'm resolved to give Orbit, at least a try here, either in some of our applications or maybe trying to spin up some new ones and have it the basis for some of ideas I've been playing with.
DAN: That would be awesome and there's a [inaudible] channel, which I hang out into if you have any questions, if anyone out there does.
CHARLES: Before we go, if anyone is interested in JSON API, is interested in Orbit, is interested in Cerebris, we mentioned a lot of things that in one way or another, map back to you. How do we get in touch to find out more about these different entities/projects?
DAN: I'm at @DGeb on Twitter. My company site is Cerebris.com. Also check out, OrbitJS.com for the new guides. Reach out to me. I'm on the Ember Core Team so I'm also hanging out in the Ember community Slack, depending upon what you want to talk with me about. I'm in all these different places so I love to hear from you all.
CHARLES: All right. Fantastic. We'll make sure that we put those in the show notes and I guess that's about it. Do you have anything else you want to leave folks with, any talks, papers or big news coming around soon?
DAN: Something that we didn't really get a chance to talk about today, which I'm really excited about is JSON API operations, which is an extension to the base Spec, which I'll be proposing very soon. There's a future to the JSON API once it hit 1.0 a couple of years ago. It didn't just stop. We're looking at different ways to extend the base Spec and use it for different and interesting purposes. JSON API operations, I think one of the most interesting ones.
The idea is basically to allow for multiple requests that are specified in the base Spec to be requested in a batch and perform transactionally on the server so the Spec will define how would each request gets wrapped. Each operation very much confirms with the base Spec concept of a request. For implementations, there's a lot of opportunity to reuse existing code for how to handle each particular operation but to provide a whole new set of capabilities by allowing you to batch them together and process and transactional it because it just unlocks a ton of different things you can do, all based on the same base concepts from JSON API. I'm really excited to have something to announce soon about that.
CHARLES: That sounds like it might solve a lot of problems that are always associated with those things. It always comes up. What's our batch API look like? I don't think I've been on a project that didn't have a months-long discussion about that. I ended up like kicking it down the road and I'm just flumping something in place.
DAN: Yeah, all those messy edge cases where people figure out how do we create multiple related records altogether in a single request and people do it ad hoc and do it with embedding and such and want to standardize that in the same way, that we've standardized the base operations.
CHARLES: Well, that is really exciting, Dan. I wish you the best of luck and we'll be looking for it.
DAN: Thanks a lot. Thanks for having me on, guys.
CHARLES: It was our pleasure. Thanks. With that, we will say goodbye to everybody. Goodbye, Elrick. Goodbye, Dan. Goodbye everybody listening along at home. As always, you can get in touch with us. We're at @TheFrontside on Twitter or you can see our website at Frontside.io or just drop us a line at Contact@Frontside.io. Always love to hear from you with new podcast topics, anything that you might be interested in so look forward to hearing from you all and see you next week.