Designing Data-Intensive Applications – Chapter 6: Partitioning

<Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems> The main reason for wanting to partition data is scalability. For very large datasets or very high query throughput, replication alone is not sufficient: we need to break the data up into partitions, a technique also known as sharding. What we call a partition here is called…

Designing Data-Intensive Applications – Chapter 5: Replication

<Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems> The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair. —Douglas…

Designing Data-Intensive Applications – Chapter 4: Encoding and Evolution

<Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems> Everything changes and nothing stands still. —Heraclitus of Ephesus, as quoted by Plato in Cratylus (360 BCE) Applications inevitably change over time. Schema-on-read (“schemaless”) databases don’t enforce a schema, so the database can contain a mixture of older and newer data formats written…

Designing Data-Intensive Applications – Chapter 2: Data Models and Query Languages

《Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems》https://amzn.to/2WYphy6 Relational DB: MS SQL, MySQL, IBM DB2, PostgreSQL, SQLite, etc. (Google Spanner is also relational, but distributed.) Document DB: MongoDB, RethinkDB, etc. Wide-column DB: Cassandra, HBase. Graph DB: Neo4j, Titan, InfiniteGraph, AllegroGraph; graph query languages: Cypher, SPARQL, Gremlin; graph processing model: Pregel. Data Models: they affect not only how the software is written, but also how we think about the problem that we…

Designing Data-Intensive Applications – Chapter 1: Reliable, Scalable, and Maintainable Applications

《Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems》https://amzn.to/2WYphy6 Reliability: continuing to work correctly, even when things go wrong; a.k.a. fault-tolerant or resilient. Fault vs. failure: a fault is one component deviating from its spec, while a failure is the system as a whole no longer providing the required service; design the system to tolerate faults so that they do not cause failures. Hardware faults: redundancy is the key. Software errors: could be more…

Designing Data-Intensive Applications – Chapter 3: Storage and Retrieval

<Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems> (If you keep things tidily ordered, you’re just too lazy to go searching.) —German proverb A DB needs to do two things: when you give it some data, it should store the data; when you ask it again later, it should give the data…