Concurrency control: MVCC
Traditional database use locks while writing, locking out other users/connections.
The postgres way: Multiple versions
- Every row has a version number attached to it.
- During reads, the read query will also query for the highest/latest version of that row.
- During a write, the existing version row is not touch but a duplicate row is created with the incremented version number.
- The next read query will query for the latest version row.
- This doesn’t use any locks, no deadlocks.
- Details: xmin and xmax
- Postgres mains two additional cols called xmin (the transaction ID at which the row was created) and xmax (the transaction ID at which the row was deleted)
- On insert, xmax is null
- On update, the old row xmax is populated with current tx id and new row is added with the current tx id.
- To check if row is visible, every query runs as a transaction, while fetching data, we need to fetch rows where xmin < current transaction (we don’t want to read rows of future transactions) and xmax should be null or less than current transaction (in order to not include future transaction running in parallel)
- We need to also check if xmin and xmax are also committed. Not commit xmin is not included and we can include non-committed xmax.
The problem of count(*)
- While counting rows, it is not straight forward, it depends on the current tx ID and if a row is visible based on the points mentioned above. We need to only count visible rows.
- Auto Vacuum process will delete older version, where
xmax < current tx id
andxmax is committed
(mentioned in Postgres Architecture) - https://wiki.postgresql.org/wiki/Slow_Counting
Each transaction ID is a sequential integer. There can be integer overflow. The transaction ID needs to be wrapped around and the auto vacuum process does a clean up.
https://blog.sentry.io/transaction-id-wraparound-in-postgres/
Drawbacks
- This leads to lots of duplicate rows with different version
- the Auto Vacuum processes (mentioned in Postgres Architecture) will clean the older versions