Database Performance at Scale Notes
source: Database performance at scale by ScyllaDB Team
- Make the driver/client timeout double the DB timeout. This avoids a massive rise in requests: otherwise the driver might time requests out and send retries while the DB is still processing the prior ones.
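A minimal sketch of the rule (names and values are illustrative, not from the book): the driver-side timeout is derived from the server-side one so the client never gives up and retries while the server is still working.

```python
# Keep the client/driver timeout at double the database-side timeout so
# that timed-out-and-retried requests don't pile on top of requests the
# DB is still processing (a retry storm).

def make_client_timeout(db_timeout_s: float) -> float:
    """Return a driver timeout that is double the database timeout."""
    return 2 * db_timeout_s

db_timeout_s = 5.0                           # timeout enforced by the DB server
client_timeout_s = make_client_timeout(db_timeout_s)
print(client_timeout_s)  # → 10.0
```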
- Every database system has a certain consistency model, and it’s crucial to take that into account when designing your project. There might be compromises to make. In some use cases (think financial systems), consistency is the key. In other ones, eventual consistency is acceptable, as long as it keeps the system highly available and responsive.
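One concrete knob behind this trade-off in quorum-based systems (e.g. Dynamo-style databases) is picking read/write consistency levels; a small sketch of the standard R + W > RF rule, assuming simple majority quorums:

```python
# Quorum arithmetic: with replication factor RF, a majority quorum is
# RF // 2 + 1 replicas. If reads touch R replicas and writes touch W
# replicas, then R + W > RF guarantees the read set intersects the
# write set, giving strong (read-your-writes) consistency.

def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1

def strongly_consistent(r: int, w: int, rf: int) -> bool:
    """True if read and write replica sets must overlap."""
    return r + w > rf

print(quorum(3))                     # → 2
print(strongly_consistent(2, 2, 3))  # → True  (QUORUM reads + QUORUM writes)
print(strongly_consistent(1, 1, 3))  # → False (eventual consistency)
```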
- Keep in mind that Little’s Law exists—it’s fundamental knowledge for anyone interested in distributed systems. Quoting it often also makes you appear exceptionally smart among peers.
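Little's Law itself is just L = λ × W: the average number of items in a system equals the arrival rate times the average time each item spends in the system. A worked example with illustrative numbers:

```python
# Little's Law: L = lambda * W
#   L      = average number of in-flight requests (concurrency)
#   lambda = arrival rate (requests per second)
#   W      = average time a request spends in the system (seconds)

def concurrency(arrival_rate_per_s: float, latency_s: float) -> float:
    """Average number of concurrent in-flight requests by Little's Law."""
    return arrival_rate_per_s * latency_s

# A node serving 10,000 req/s at 2 ms mean latency holds on average
# 20 requests in flight:
print(concurrency(10_000, 0.002))  # → 20.0
```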
- Whenever possible, schedule maintenance operations for times with expected low pressure on the system.
- If your database management system supports any kind of quality of service configuration, it’s a good idea to investigate such capabilities. For instance, it might be possible to set a strong priority for user requests over regular maintenance operations, especially during peak hours. Conversely, periods with low user-induced activity can be utilized to speed up background activities. In the database world, systems that use a variant of LSM trees for underlying storage need to perform quite a bit of compaction (a kind of maintenance operation on data) in order to keep the read/write performance predictable and steady.
- Running on the same cluster, such workloads would be competing for resources.
- As system utilization rises, the database must strictly prioritize which activities get what specific share of resources under contention. There are a few different ways you can handle this. Physical isolation, logical isolation, and scheduled isolation can all be acceptable choices under the right circumstances.
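The "shares"-based prioritization mentioned above can be sketched as proportional scheduling: under contention, each workload gets resources in proportion to its share weight (workload names and share values below are illustrative, not from the book):

```python
# Proportional-share allocation: when workloads compete for a resource,
# each gets capacity * (its shares / total shares). Idle workloads'
# shares would be redistributed in a real scheduler; this sketch only
# covers the fully-contended case.

def allocate(capacity: float, shares: dict[str, int]) -> dict[str, float]:
    """Split `capacity` among workloads in proportion to their shares."""
    total = sum(shares.values())
    return {name: capacity * s / total for name, s in shares.items()}

# User queries strongly prioritized over compaction during peak hours:
print(allocate(100.0, {"user_reads": 800, "compaction": 200}))
# → {'user_reads': 80.0, 'compaction': 20.0}
```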
Write/Read Ratio
- For use cases requiring a distinction between hot and cold data storage (for cost savings, different latency requirements, or both), solutions using tiered storage (a method of prioritizing data storage based on a range of requirements, such as performance and costs) are likely a good fit.
- Write-optimized databases can improve read latency via internal caching, so it’s not uncommon for a team with, say, 60 percent reads and 40 percent writes to opt for a write-optimized database. Another option is to reduce the latency of reads on a write-optimized database: if your database supports it, dedicate extra “shares” of resources to reads so that your read workload is prioritized when there is resource contention.
To read
Little's Law in distributed systems
Time series distributed queue