Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Author: Martin Kleppmann

by weitzj   2019-01-09
I highly recommend reading "Designing Data-Intensive Applications" by Martin Kleppmann to get a thorough overview with lots of references. Reading this book is a timesaver compared to piecing all this information together from blog posts.

by collinf   2018-11-10
I haven't seen anyone touch on this, but I remember reading about it in Data Intensive Applications[1]. Twitter solved the celebrity feed issue by handling users with very large numbers of followers separately from normal users.

Here is a quick excerpt; this book is filled to the brim with these gems.

> The final twist of the Twitter anecdote: now that approach 2 is robustly implemented, Twitter is moving to a hybrid of both approaches. Most users’ tweets continue to be fanned out to home timelines at the time when they are posted, but a small number of users with a very large number of followers (i.e., celebrities) are excepted from this fan-out. Tweets from any celebrities that a user may follow are fetched separately and merged with that user’s home timeline when it is read, like in approach 1. This hybrid approach is able to deliver consistently good performance.

Approach 1 keeps a global collection of tweets: when a user requests their home timeline, the tweets of everyone they follow are looked up and merged in time order.

Approach 2 fans each tweet out at write time, inserting it into every follower's home timeline cache, much like delivering mail to a mailbox.
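
To make the hybrid concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the idea in the excerpt, not Twitter's actual implementation: the `Timelines` class, its method names, and the `celebrity_threshold` cutoff are all hypothetical, and a sequence counter stands in for timestamps.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Tweet:
    seq: int      # stand-in for a timestamp; higher means newer
    author: str
    text: str


class Timelines:
    """Toy hybrid of approach 1 (merge on read) and approach 2 (fan-out on write)."""

    def __init__(self, celebrity_threshold=1000):
        # Hypothetical cutoff: authors with at least this many followers
        # are treated as celebrities and skipped during fan-out.
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)          # author -> users following them
        self.following = defaultdict(set)          # user -> authors they follow
        self.tweets_by_author = defaultdict(list)  # approach 1: global collection
        self.home_cache = defaultdict(list)        # approach 2: per-user "mailbox"
        self._seq = count(1)

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author, text):
        tweet = Tweet(next(self._seq), author, text)
        self.tweets_by_author[author].append(tweet)
        # Ordinary users: pay the cost at write time (approach 2).
        if not self.is_celebrity(author):
            for follower in self.followers[author]:
                self.home_cache[follower].append(tweet)
        return tweet

    def home_timeline(self, user):
        # Start from the precomputed cache, keyed by seq to avoid duplicates
        # if an author crossed the threshold after earlier tweets were fanned out.
        merged = {t.seq: t for t in self.home_cache[user]}
        # Celebrities: fetch and merge at read time (approach 1).
        for author in self.following[user]:
            if self.is_celebrity(author):
                for t in self.tweets_by_author[author]:
                    merged[t.seq] = t
        return sorted(merged.values(), key=lambda t: t.seq, reverse=True)
```

For example, with `celebrity_threshold=2`, an author with two followers is never fanned out, yet their tweets still show up in each follower's `home_timeline` because they are merged in at read time. The sketch ignores unfollows and authors who later drop below the threshold.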


by davidcuddeback   2018-11-10
Another good resource is Designing Data-Intensive Applications [1]. Chapter 2 does a really good job explaining how different categories of databases relate to different data models, including examples of querying graph-like data models using `WITH RECURSIVE` compared to a query language for graph databases.
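
As a rough illustration of the `WITH RECURSIVE` style of graph query, here is a small sketch using Python's built-in `sqlite3` module. The `vertices`/`edges` schema, the sample places, and the `locations_within` helper are assumptions made up for this example rather than the book's exact code.

```python
import sqlite3

# In-memory database holding a tiny, made-up graph of places.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vertices (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE edges (tail INTEGER, head INTEGER, label TEXT);
INSERT INTO vertices VALUES
    (1, 'Idaho'), (2, 'United States'), (3, 'North America'),
    (4, 'London'), (5, 'England'), (6, 'Europe');
-- Each edge means "tail is within head".
INSERT INTO edges VALUES
    (1, 2, 'within'), (2, 3, 'within'),
    (4, 5, 'within'), (5, 6, 'within');
""")


def locations_within(conn, region):
    """Return names of all places transitively inside `region`."""
    query = """
    WITH RECURSIVE inside(id) AS (
        -- Base case: the region itself.
        SELECT id FROM vertices WHERE name = ?
        UNION
        -- Recursive step: anything with a 'within' edge into the set so far.
        SELECT edges.tail
        FROM edges JOIN inside ON edges.head = inside.id
        WHERE edges.label = 'within'
    )
    SELECT vertices.name
    FROM vertices JOIN inside ON vertices.id = inside.id
    WHERE vertices.name <> ?
    ORDER BY vertices.name
    """
    return [row[0] for row in conn.execute(query, (region, region))]


print(locations_within(conn, "North America"))
```

The recursive CTE has to spell out the traversal step by step, whereas a graph query language can typically express the same "follow `within` edges any number of times" idea as a single variable-length path pattern.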


by jpamata   2018-11-10
Designing Data-Intensive Applications[0] by Martin Kleppmann. There's a previous HN thread about it[1]. Helped me understand a bit more about databases and systems. The book is also very approachable and has the perfect blend of application and theory at a high level that anyone approaching the industry for the first time stands to gain a lot from reading it.

The Architecture of Open Source Applications[2] series is a good one for learning how to build production applications and you can read it online. The chapter on Scalable Web Architecture[3] is a must-read.





by otras   2018-11-07
I'd recommend the following:

Clean Code: A Handbook of Agile Software Craftsmanship [0] is a great book on writing and reading code.

Similarly, Clean Architecture: A Craftsman's Guide to Software Structure and Design [1] is, no surprise, a book on organizing and architecting software.

Designing Data-Intensive Applications [2] may be overkill for your situation, but it's a good read to get an idea about how large scale applications function.

The Architecture of Open Source Applications [3] is a fantastic free resource that walks through how many applications are built. As another comment mentioned, reading code and understanding how other programs are built are great ways to build your "how to do things" repertoire.

Finally, I'd also recommend taking some classes. I started as a self-taught developer, but I've since taken classes both in-person and online that have been a tremendous help. There are many available for free online, and if in-person classes work better for you (motivation, support, resources, etc), definitely go that route. They're a fantastic way to grow.





by cloakedarbiter   2018-10-04
Designing Data-Intensive Applications by Martin Kleppmann [0]


by sbmthakur   2018-08-27
I second Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. Rather than covering theoretical aspects in detail, it focuses on real-life problems that can be solved using various paradigms.

by chw9e   2018-07-25
As a self-taught developer, I used to think that some of the theoretical elements were overhyped. I can build iOS apps that work, and I did just that for the last 2-3 years. However, many of the programs that I wrote have not been as easy to maintain as I would like, and some difficult-to-fix bugs have popped up over time, both of which are due to a lack of deeper understanding of CS fundamentals. Last year I started interviewing and was ridiculed at one company in particular for a lack of CS knowledge. Afterwards I started exploring a lot of the CS concepts listed in this link, and I have since found numerous ways to improve my code quality and gained a better understanding of how CS best practices came to be. I also used to think that algorithms and data structures were relatively useless for an iOS developer, and I was able to do the job without them, which seemed to prove my point. However, after gaining a better understanding, it quickly became clear that things like view hierarchies are simply trees, and understanding ways to traverse these hierarchies can lead to much cleaner code. With the open sourcing of Swift, I also became more interested in understanding the language, but a lot of the language design decisions didn't make sense to me until I gained a better understanding of CS fundamentals. I have found the programming languages course on Coursera [1] to be particularly useful, and have also greatly enjoyed the book Designing Data Intensive Applications [2]. There's also a great video from this year's WWDC that really inspires algorithm study and use in everyday applications [3].



by dustingetz   2018-07-17
"CP/AP: a false dichotomy"

by throwawaypls   2018-05-19
I read this book titled "Designing Data Intensive Applications", which covers this and a lot of other stuff about designing applications in general.