065: Data Loading Patterns with the JSON API with Balint Erdi
In this episode, we talk to Balint Erdi, author of Rock and Roll with Ember.js, about data loading with the JSON API.
- 01:58 - What is JSON API? Advantages
- 03:22 - Tooling and Libraries
- 05:49 - Relationship Loading
- 07:51 - Designing a Data Loading Strategy
- 11:23 - Pitfalls of Not Designing a Data Loading Strategy
- 13:53 - Ember Data
- 16:37 - Pagination & Sorting
- 23:06 - Writing a Book
- 25:48 - Implementing Searches with Filters
- 31:08 - What’s next for Balint?
- Balint Erdi: Data Loading Patterns with JSON API @ EmberConf 2017 (Talk)
- Balint Erdi: Data Loading Patterns @ EmberConf 2017 (Slides)
- JSON API By Example by Adolfo Builes
- ember-cli 101 by Adolfo Builes
- 33 Page Minibook + Coupon Code!
CHARLES: Hello, everybody and welcome to The Frontside Podcast, Episode 65. My name is Charles Lowell. I'm a developer here at The Frontside. With me, also from The Frontside is Elrick Ryan. Thank you for being with us, Elrick. I know this is your first podcast.
ELRICK: This is my first podcast. It's great to be here.
CHARLES: All right. Fantastic. Yes, we hired Elrick a little bit ago and it's been fantastic. I'm glad to get you on. With us today is a really awesome guest. His name is Balint Erdi. I actually like to tell a little bit of a story when I have an anecdote and I do have one about you that I think you might like, although you might not even remember it. But it was shortly after EmberConf. Last year you and I got on a pairing session remotely and I don't even remember what we were working on but I was struggling with this way to decorate objects without changing them, without touching them or mutating them in any way and you showed me this technique of actually decorating it by creating a new object with the old object as the prototype. Do you remember that?
BALINT: Yes. I totally knew. How could I forget?
BALINT: Wow. Amazing.
CHARLES: Thank you. I don't know if I ever said, "Thank you," but thank you, thank you, thank you.
BALINT: Yeah, no problem. I also learned a lot from this pairing session actually. I didn't know that my small contribution made such an impact. I'm glad to hear that.
CHARLES: Yeah, that was fantastic. We need to actually make that happen again. I don't know why we only did that once.
BALINT: Yeah, we should.
CHARLES: Anyway, we're here to talk about data loading, it’s something that is absolutely critical to building good frontend and building UI and yet, it's something that the users never really see. Sometimes, it feels like it's 90% of the problem.
BALINT: Exactly, yeah.
ELRICK: Yeah, that's so true.
CHARLES: We're going to talk about techniques that we use and you use and, in particular JSON API, what it is and what's so great about it. So, what is JSON API for folks who've never heard of it?
BALINT: JSON API is a standard way to build APIs. I think the specification has reached 1.0, I would say two years ago or three years ago. I remember it was in June, I'm not sure which year. It basically lays down everything that's usually consider when you build an API: how do fetch relationships, how to paginate data, how to sort, all of these things that I think developers tend to invent again and again.
I think probably the biggest advantage of JSON API is that it just declare a standard way to do that. It basically reduces the byte sharing going on at the start of the project. Well, not just at the start, later on too. In my talk at EmberConf, I coined a JSON API the conventional over configuration for APIs.
CHARLES: I see. Pagination is something that everybody does. Why need to byte share over the syntax, like the actual data from it?
BALINT: All of these things are things that everybody does. It's just that everybody does it differently. There's a lot of discussion going on which the best ways, for example when there's a team, at least every team that I was involved with had several discussions going on about what data formats to send data in and how to paginate and all these, I think details where it's more important to get to an agreement just to agree on something and move on than to get it perfectly, if at all there is a perfect way to do it.
CHARLES: I guess my question is if you have the standard way of doing of everything, what kind of tooling can you build, that can you kind of inherit for free? At what level, both from the low level and then up to the top level? When I say top level, I mean what the user is seeing.
BALINT: By low level, I guess you mean the actual libraries that implement JSON API in different frameworks, right?
CHARLES: Exactly. Are there now a lot of libraries out there so whatever I'm using, if I'm using JSON API, is it available in a lot of different ecosystems now?
BALINT: Yes. It definitely is. There is a full page on the JSONAPI.org, on the official JSON API page that just list all of the different libraries that are now implementing all these languages. I have experience with Rails, probably at Ruby too and there are three libraries and I think all of them are pretty good just for implementing JSON API.
The one I use is called JSON API resources and it's very telling. Well, it's a rather simple application but I basically didn't have to write a single line of code. I've only had to write very little code in the server to implement a JSON API specific feature. Most of the relationships could be implemented with just declaring JSON API resources and then the name of the resource in the Rails application. For all the other things, I really didn't have to do that much so in every time, it was just adding one or two lines or changing a configuration value, then it was just there.
CHARLES: Now, how do you choose then what relationship you want to load and which order? Is that controlled by JSON API?
BALINT: It's controlled by the frontend. It is a frontend application that's going to send these requests to the backend so that's where you should consciously think about what relationships you are fetching and how.
CHARLES: Right so part of the specification is a way of specifying which relationships you want to load. In my understanding, part of JSON API is an interface along with say, the user, "I want to load all of the posts that this user has made."
BALINT: Yes. JSON API indeed has a keyword called 'included', which you can implement on your backend which does this. If you specify 'include' and then the name of the relationship or relationships for several ones, the backend must comply with that request and also send back those related resources. That is called compound document in JSON API parlance.
ELRICK: Is that the reverse of what they doing in GraphQL because at GraphQL, I think you have to request the relationships you want on the frontend and then it kicks it off to the backend and it gives you the information back. Is it like JSON API that includes that same thing but in reverse?
BALINT: I'm not sure because I'm very familiar with GraphQL but all the things it does is that you are normally fetching a resource and then if you specify 'include' then you are telling the backend, "Please also include these related resources with a primary source."
CHARLES: I think it occupies a very similar concept that you want to have control on the frontend about which resources or what data gets fetched in addition.
ELRICK: Yeah, it sounds very similar.
CHARLES: Well, I'd love actually to do a comparison because I know there's a lot of overlap between GraphQL and this. Maybe we can get into that a little bit later. One of the things that you said back there is having this, gives you kind of fine-grain control over when and how you load your data because that always seems a pretty difficult problem to attack on your frontend because as you're rendering your application, you have to incrementally fetch little pieces of data here and there and make sure that it's already, at the time that you actually need to render something like a component and then it's got the right data at the right time.
It seems like it's this constant dance of whack-a-mole and like, "I'm loading too much here. It's taking too long," versus, "I've got too many loads happening. I've got 20 requests to run the single page." How can I back those up into a single thing? How do you go about thinking and designing that data loading strategy, as you're ready to render pages?
BALINT: That’s a really good question. I think the short answer is how you load data has to be part of the design process. You really have to spend time thinking about how you're doing that based on the needs of the UI. I think the way you need to render the UI will suggest the way to do data loading. Especially in Ember, I think do I really need ways to, for example block the page from rendering too early so anything that you want to render first, you can fetch in this non-blocking way into model hook of Ember. Then anything else, if you're okay with rendering later, you can fetch in other ways like to set up controller hook or from the templates or from controllers or whatever way but not in the model hook.
CHARLES: But if you are doing something outside of the model hook -- because this is something I feel like a pattern that comes up a lot -- and regardless of where you're operating, if you're using Ember, if you're using another framework, you have kind of this top level data loading. But then you have your nested components might need more data, how do you go about loading that data. I guess you have to think about that also. Upfront it's like, "I’ve got this component that might want to request more data," how do I actually design that and think about that of I've got this data that's going to come, who knows, maybe minutes after the initial route is rendered?
BALINT: That would be different but I think that you can apply the same principle. You can fetch some data in the model hook in this blocking way and pass it all it down to your component if you don't want to render the component before the page renderers or you can just even fetch it from the component while the component is rendered.
CHARLES: You're thinking maybe, in terms of streamlining, the rendering process so that you can begin rendering while your data is loading. Is that the use case?
BALINT: Yes. If you, for example fetch the data from your component, then it's just going to fetch the data as needed. You have the whole page load as render and then when the component is render and it fetches the data, then you're going to see other requests go out to the backend. The data comes back and then the component is going to be rerendered with the new data.
I think in most cases, that's totally fine but you might have a use case when you don't want to render the page before the component is all over the data that it needs. In that case, what you can do is once again, if the fetched data is needed in the model hook, it will just pass it all down to the component.
ELRICK: What are some of the pitfalls that you would run into by not thinking about your data loading strategy beforehand that you can pull out and explain?
BALINT: I think the classic one is what was known name in Rails and other framework is 'N + 1 problem' and it's when you fetch many related resources, then you might end up with doing end requests for a number of resources. In the case of Ember, you have for example a blog post as your model and then in your templates you write model.comments, then what's going to happen -- depending on actual library that you use on the backend but I ran into this myself -- is by default, if you are going to make those end requests. Ember solution is really, they just works. I mean, the default solution just works and you might not even run into this because you might not have the scenario. You might just have a few records. But if you have a great number of them, then it's going to be a bad experience.
ELRICK: So someone that just dealing with Ember and then they go and make a request and then see all these requests come back, that would be something that they would then have to turn around and asked or fix within the backend to say like, "Only give me a certain set of this."
BALINT: Yes, exactly.
ELRICK: Or is there something on the Ember side with Ember data that you can say, "Only fetch X amount."
BALINT: Right. I think both. I can speak about using Ember data and JSON API resources so what you can do to mitigate this case is to use links or cause relationship links instead of fetching the comments one by one, the backend can actually drive Ember data to fetch all the comments in one request from the link that it sends the frontend. You first ask for the blog post itself and then a JSON API compliant backend will send back the blog post resource but it's going to have a relationship link inside. An Ember data automatically records this so when it needs the comments for the blog post then it's going to fetch down from that provided URL. Actually, I haven't really talk about it in a lot of detail at EmberConf.
CHARLES: Yeah, I see. I'm going to go off on a little bit of a tangent because I feel like this is a pattern that's coming up more and more. To give a little bit of context, I feel like the way that our data loading strategies have evolved is we're used to the page loads, then we kind of analyze the URL, we decide what data needs to be loaded to render our components and then we pull that data from the server based on a decision and then we do our render. But it seems increasingly more prevalent than we are having a combination of both pulling the data from the server and having the server push data onto us. One is what are the strategies for dealing with that? Given kind of certainly, at least in the Ember world, routes aren't reactive. Then does JSON API actually help with that at all?
BALINT: Yeah, a good question. I don't think that JSON API specifies how [inaudible]. You are thinking about something like web sockets like pushing data from the server sockets. I don't think JSON API covers this summary.
CHARLES: Yeah. It definitely seems like beyond the scope but I'm wondering if there are any thoughts about general strategies, about how to handle this model state that sitting at the top of your render tree. In Ember for example, in your route and how do you handle the fact that the route requests data and how do you handle data coming in after the initial render?
BALINT: If you use Ember data, then you can push data coming in from a web socket, for example to the store. You probably do some massaging on the data that comes in and then you push it to the store and then depending on how you fetch the data, you might have a live collection. For example, if you do a store, find all notifications and then notification is coming in that is going to get displayed right away on your page because then your template is bound to a live collection. But not all of things in Ember data are live collections.
CHARLES: Okay, it's mostly library dependent but if you're using Ember data, then you just push those directly into the store, those live collections. It kind of like a real time query.
BALINT: Yeah, exactly. But what I'm saying is that not all Ember data query that you do are live collections. For example, relationships are probably not, depending on what method you use to fetch that data. You might have to do some additional footwork.
CHARLES: Okay. Now, getting back into the area in which JSON API really does shine at things that can really shave off a lot of time and consequently money from your work, let's talk about those a little bit. For example, you mentioned pagination. How is JSON API going to help me if I want to have paginated data? What are the scenarios on the client where I would need paginated data? Then maybe we can walk back from here's this user interaction that I need paginated data for and how is JSON API going to help me with that?
BALINT: Sure. I guess the typical scenario on the frontend is there is a long list of items and you don't want to overwhelm the user by showing them what it wants. You need to just show them page by page so JSON API recommends to use so it's agnostic about the exact paging strategy that you use. You can use the classic page-based approach or cursor-based one or start of pagination technique. It doesn't really force you to choose the way you want to do that.
It also mentions that, I think the way it frames it, you might want to use the page for your parameter. I think that the libraries, at least at JSON API resources for sure, you use that page parameter to send back paginated data. You have a page and a square brackets number and then page size. The request variables are page number and page size. Then the server knows that I just have to send back the second page if the page size is 25 and they just set that [inaudible].
CHARLES: Then if you're using a library on the server side, then you don't have to do any extra work.
BALINT: Yeah, exactly, depending on the library I guess. JSON API resources made this one really simple.
CHARLES: I see. Then in terms of library support on the client, I assume that there are libraries, like Ember data that automatically will support this so when you're creating these live queries, you can include information about the page.
BALINT: Yes. I'm very in to Ember so I'm not sure about the frameworks but I suppose that there are some ways that make this very easy for the developer in many of these frameworks like Angular and React.
CHARLES: Right. Something that has just bit me on the butt so many times is when I have paginated sorted data. Imagine you've got some infinite scrolling table or not infinite scrolling but you've got some a bunch of table rows that are maybe 300 of them or 3000 of them so you don't want to load them all at the same time. But at the same time, you've got complex sorting that's happening on each column. You might have seven layers of sort. You want to sort by name, followed by ID, followed by date. One of the biggest problems I've encountered is trying to reconcile and sort on the client versus trying to sort on the server. Are there any facilities to help you deal with that?
BALINT: Yes. I think the approach I usually take is if you do it on the server, then do it on the server. Do not mix the two things. In this case for example, if you are sorting and then you change just sort field, I would just send a request to the server to have the items returned according to the new sort criteria. I think that's the simplest approach because as you said, I've probably experienced things can get really messy, if you want to do that on the client.
CHARLES: Then you've got the third page but when you do the sort, the contents of the third page could be anything.
BALINT: Yeah, that's the other thing. I think if you change the sorting criteria, you'd probably want to go to the first page. I haven't thought through all of the scenarios but I think it's really rare that you want to stay on the fourth page while you change the sorting criteria to be created at descending. You probably want to see the first item the way it was created.
CHARLES: Someone should seriously write a book about sorting and pagination and loading these data sets because seriously, I feel there's this tribal knowledge of things that people have learned from screwing it up. There's not a written down way of this is how you build the data loading layer for an infinite dataset so that you can sort, you can paginate and here are the problems that you'll encounter.
BALINT: There's actually a book written by Adolfo Builes.
CHARLES: He wrote a book on Ember CLI, too, right?
BALINT: Exactly, Ember CLI 101. He’s the same guy and he wrote JSON API By Example. I have it on my mental to-do-list to buy that book. I'm not sure if it covers these exact scenarios but he must cover several in that book.
CHARLES: Well, I'll be sure to reach out from because certainly there are a couple of scenarios that have bit me too. The other one is where you have some collection that's paginated and sorted, then someone adds on the client side a thing to that collection. Say, you want to create a new row in that table, well then what do you do with that new row? chances are it's not going to be anywhere on the screen because who knows what the sort order is and the terms of the total sort, which is only the server knows and who knows what page it's going to be on? You get all these problems that compound and it would be great to have one place where people could reference them or have a little cookbook.
BALINT: Sure, absolutely. I think in that scenario, the simplest thing, which I think probably works best like 99% of the cases is again, just to reset the sorting and the pagination. If I create a new record, I really want to see that new record.
CHARLES: Yeah, maybe you put the new record up in a box up, at the top in a special new record place.
BALINT: Sure you can go fancy and do that. That would be a good solution too. But probably you can just reset the page number and show the first page with the new record.
CHARLES: Ah, yeah. I see.
BALINT: It's tricky. It's going to get more complicated than this.
CHARLES: You might have a problem where you don't want to lose that context to the records that you were looking at.
BALINT: That's right. Somebody should write the book in this. I know somebody who wrote a book [inaudible].
CHARLES: It could be you. You haven't written a book in over a year, right?
BALINT: It's been two years already.
CHARLES: It's been two years.
CHARLES: I'm never going to write a book again. I don't know. Do you think you might?
BALINT: I think I might. I actually have this urge.
CHARLES: If I recall correctly, you're one of the few people I've talked to who was like, "Yeah, you know, writing a book, it wasn't that bad. It was kind of an amazing experience." And literally, everyone else I talked to have been like, "Urgh!"
ELRICK: That's really interesting too because his book keeps up with the releases of Ember so that makes it even harder. That's surprising to hear that like, "Oh, it wasn't that bad."
BALINT: Exactly. Well, I kind of put off writing the second book for a while because if I just wrote a book, then I could just be done with it, I would be happy to start writing the second book. But if I have to maintain it for years, I don't have to do that but it's just so much extra work that I'd have to take this into consideration.
CHARLES: Yeah. Essentially, what you've done is you've rewritten the book. In terms of absolute content, given how much everything has changed, you probably flipped over the content of the book in the same way that people flip over the atoms in their body. I think there was something like you don't have a single atom in your body three years from now that you have today. It takes about three years and you have all the matters completely and totally exchanged. It's kind of like that.
BALINT: Yes, it's kind of like that. Well, I think maybe half of the book is still relatively as it was when I released it because since Ember 2 came out, Ember didn't change that much so in the 1.X series it did change a lot and I think my book originally came out when Ember was 1.10, I would say. There's a lot less work that require now to maintain that it was back then. But it had changed a lot for sure.
ELRICK: Yeah, I think I bought the book on its first release. It was 1.10. I guess you automatically get assigned to the GitHub repo so you just see a constant barrage of updates, updates, updates and I'm like, "Wow, Balint is really killing it in updating this book."
CHARLES: It is a good book and everybody should go buy it. The other thing that I want to cover to, as long as we're talking about scenarios that come up again and again, we talked about pagination, we talked about sorting, what about things like search? Is there a uniform mechanism to help you out there?
BALINT: You can implement searches by using filters. It's a JSON API concept of using filters. You can pass a parameter called filter to your query and then a square bracket. You have the name of the field that you want to search and then just pass the value, the search term basically and then backend should return the items that matched according to some criteria. That’s the simple case.
CHARLES: It's a simple case but clearly, it's up to the backend to implement that API.
CHARLES: So I'm wondering what libraries are available if I'm doing something in Elixir or I'm doing something in Sinatra or I'm doing something using Express, how seamless is it because I feel like a lot of times you can run into problems where these leaky abstractions about the fact that one thing is a Mongo backend, then one is based on Postgres. Maybe that's a better example than a different server technology but more of sticking with a single server technology -- let's use Ruby -- but one is I'm using a Postgres backend and when I'm using, say MongoDB or some other key value store. In your experience, if you’ve seen this, how much does the backing store leak into the frontend? Is JSON API a good protection and the ecosystem around it from those leaks?
BALINT: Yeah, that's a good question. I was about the say that the backend needs to be the abstraction that's use you from having to know what kind of persistence layer you use. The frontend shouldn't care about -- or any client of that backend -- whether you use MongoDB or Postgres. That's a responsibility of the API. You can still send, in this case for example, a filter query. However, the backend translates this to database queries. That's his job. I think the answer to your question is that JSON API does protect you from having to know the intricate details of the database.
CHARLES: You might have some work to do but it's possible.
BALINT: Sure. You might have a lot of work to do on the backend but it's possible. But it's not just JSON API that protects you. Any kind of API should protect you from this kind of knowledge.
CHARLES: Right, unless you're using GraphQL.
BALINT: Could be. I don't know exactly how GraphQL works.
CHARLES: Yeah. I think there would be actually nice to read something from somebody who's got a lot of good experience with all of these, like different technologies to make a comparison. I feel kind of in the dark. Unfortunately, the problem with any technology, I feel like most of the comparisons out there, if you're going to compare, most people have a huge implicit bias for one tool and a little bit of experience with the other, maybe dangerous of like, "I don't definitely want to render opinions on my GraphQL and certainly not versus the API that I'm used to because I feel like I can't make a good comparison."
BALINT: Yeah, absolutely. That's a good one.
CHARLES: But it is something that I'm so intrigued because I feel like there's a lot of overlap there but who knows.
ELRICK: They need that to do API like how they had to do MVC, they need to do API comparison.
CHARLES: Yeah. One thing we could do, we could implement JSON API over GraphQL and kind of just move your backend on to the frontend.
BALINT: I think you could but you just said that with GraphQL, you have to know the client on the frontend, like feels the database.
CHARLES: Right. You could technically have a JSON API on top of your GraphQL backend because I think the thing that kind of freaks me out -- this is a crazy idea that no one should ever do it but I hope someone does -- about GraphQL is being as old as I am, I saw so many projects ruined by the visual basic kind of mantra of just like, "Just query your database right inside your components," and that was literally the rope that hung ten thousand projects and just made people despise visual basic development because there was no shield and then literally every button was coupled to your database. But you could theoretically have the best of both worlds where you're sitting on an abstraction that's also on your client but just moving your query language from your server over to your client or something like that.
BALINT: Yes something like that could work, in theory.
CHARLES: In theory. Like I said, no one should ever do that unless they really want to.
CHARLES: Fantastic. The other thing I want to ask you is kind of what do you have cookin’. You got your book, you wrote but you keep updated, you recently have been evangelizing data loading patterns most recently at Ember Conf. What's next? What’s now? Or you're just kind of taking a break?
BALINT: I already have published book a mini-book about these data loading scenarios. Actually, I just cover the things I told them out at EmberConf then add some more but I might make this into a full-fledged book, providing I don't have to update it.
CHARLES: I'm going to write this book --
BALINT: But just once.
CHARLES: -- But just once.
BALINT: Yeah, I still have to find a way of doing that.
CHARLES: All right. We will look for all of those things. One thing that just occurred to me is just how much of actually building UI and building frontend really is about thinking about the structure and flow of how you load your data and how much the user doesn't see that but how important that is to provide a good experience to that user. That's one of the things that sometimes we, as UI engineers don't like to think about but I think it is absolutely true and crucial and foundational. Thank you so much, Balint for coming by and talking with us about these important topics. We will see everybody next week with that. I will bid everybody to do. Good bye, Elrick. Good bye, Balint.
BALINT: Yeah, thank you very much for having me. Goodbye.