<Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems> The main reason for wanting to partition data is scalability. For very large datasets, or very high query throughput, replication alone is not sufficient: we need to break the data up into partitions, also known as sharding. What we call a partition here is called…
Designing Data-Intensive Applications – Chapter 5: Replication
<Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems> The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair. —Douglas…
Designing Data-Intensive Applications – Chapter 4: Encoding and Evolution
<Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems> Everything changes and nothing stands still. —Heraclitus of Ephesus, as quoted by Plato in Cratylus (360 BCE) Applications inevitably change over time. Schema-on-read (“schemaless”) databases don’t enforce a schema, so the database can contain a mixture of older and newer data formats written…
Designing Data-Intensive Applications – Chapter 2: Data Models and Query Languages
《Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems》 https://amzn.to/2WYphy6 Relational DB: MS SQL, MySQL, IBM DB2, PostgreSQL, SQLite, etc. Document DB: MongoDB, RethinkDB, etc. Wide-column DB: Cassandra, HBase. Distributed relational DB: Google Spanner. Graph DB: Neo4j, Titan, InfiniteGraph, AllegroGraph. Graph query languages: Cypher, SPARQL, Gremlin. Graph processing model: Pregel. Data models: they affect not only how the software is written, but also how we think about the problem that we…
Designing Data-Intensive Applications – Chapter 1: Reliable, Scalable, and Maintainable Applications
《Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems》 https://amzn.to/2WYphy6 Reliability: continuing to work correctly, even when things go wrong; a.k.a. fault-tolerant or resilient. Fault vs. failure: a fault is one component deviating from its spec, whereas a failure is the system as a whole no longer providing the required service, so design the system to tolerate faults and keep them from causing failures. Hardware faults: redundancy is the key. Software errors: could be more…
Designing Data-Intensive Applications – Chapter 3: Storage and Retrieval
<Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems> (If you keep things tidily ordered, you’re just too lazy to go searching.) —German proverb A DB needs to do two things: when you give it some data, it should store the data; when you ask it again later, it should give the data…