2 comments

  • mastermedo40 minutes ago
    &gt; Surprisingly, there are many more stories and publications about bugs caused by weak isolation levels than cases where stronger isolation levels caused impractically low performance.<p>I expected the article to substantiate the claim that serializable brings a large performance hit as in my experience it isn&#x27;t so. The article basically makes the same point.<p>With serializable, you need to be a little careful not to have hot rows. Avoid them by sharding commonly written values. Another way to improve performance is to use true time for ordering non read-then-write transactions. It&#x27;s a little finicky if the database doesn&#x27;t provide such guarantees out of the box. Take Google&#x27;s Spanner as an example. It offers the serializable isolation level and it&#x27;s pretty performant (as long as you account for hot spots).
    • mjb21 minutes ago
      &gt; With serializable, you need to be a little careful not to have hot rows. Avoid them by sharding commonly written values<p>Unfortunately, serializable isolation requires detecting or preventing read-write conflicts (i.e. one transaction writing a row that a concurrent transaction has read). This is the performance impact of serializability: you need to be very careful what you read, because if you read too many rows you prevent any concurrent transactions from updating those same rows. Read-only transactions are OK (because MVCC), and read-only tables are OK (because there&#x27;s no read-write conflict if a table is mostly read only), but tables that are both written and heavily read are where you get performance problems.<p>With snapshot isolation (e.g. Oracle&#x27;s serializable, Postgres repeatable read), only write-write conflicts matter. There it doesn&#x27;t matter what a transaction reads, and reads never need to block (or abort) writers. So what you say is true for snapshot, but not for serializable.<p>Interestingly, serializable&#x27;s lack of need to detect write-write conflicts means that (in some implementations) it can be faster than snapshot for blind writes (i.e. anything that&#x27;s not a read-modify-write under the covers).
  • lukas2211 hour ago
    I would argue that not using serialization isolation level by default is like not using a memory safe programming language by default.<p>Sure, sometimes it&#x27;s too slow, but it should be the default.<p>Very few people can write correct database code at the other serialization levels. Most think they can, but it&#x27;s harder than correct multi-threading, because databases do weird unintuitive things for performance.
    • jiggawatts8 minutes ago
      I recently encountered a query that deadlocked on itself because it used a parallel execution plan and updated multiple indexes in a manner that the different threads could conflict with each other.<p>Naively one would expect that no individual UPDATE or INSERT statement could deadlock in isolation… but there you go.<p>If <i>that</i> is possible, the possibilities across multiple concurrent data mutating queries are beyond human comprehension!<p>Serializable should absolutely be the default!<p>Similarly, all columns should be automatically indexed to at least some degree, like Postgres BRIN indexes at a minimum.<p>Time and experience have shown that the vast majority of developers are pathologically unable to properly define all required indexes ahead of time.