Some queries regarding update/insert rate limits for a database (SQL-based or NoSQL-based)
In an app, I need to update or insert a large number of records near-simultaneously in a database.
Is there any insert/update limit that will act as a bottleneck for my app? Is there something I can do to maximise the insert/update rates?
Also, regarding the updates: the same record will not be updated repeatedly; rather, many different records in one table (each clearly identifiable by a key) will be updated. Does this mean that the rate of updates can be significantly higher?
My questions above apply to relational as well as NoSQL databases.
I am currently leaning towards a relational database like SQL Server or MySQL; however, at this stage I am willing to consider NoSQL databases like MongoDB or DynamoDB as well. My final selection depends on which database offers higher insert/update rates.
One answer for "some queries regarding update/insert rate limits for a database (SQL-based or NoSQL-based)"
The maximum R/W capacity will highly depend on the hardware (HDD vs SSD) and the number of nodes. Also, I believe, it’s easier to scale NoSQL for write operations as ACID and consistency are not enforced.
For the DynamoDB part of the question
DynamoDB is a managed service. You reserve a given R/W capacity (the so-called 'units') and Amazon scales your database for you. Still, there are a couple of points to take into account:
- Maximum reserved throughput is 10,000 units for reads
- Maximum reserved throughput is 10,000 units for writes
Basically, you consume 1 unit per consistent read or write, per second, per indivisible kB. (In fact, the accounting is a bit trickier: one write unit covers a write of up to 1 kB, while one read unit covers a strongly consistent read of up to 4 kB, with item sizes rounded up.)
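The capacity arithmetic above can be sketched as a couple of helper functions, assuming DynamoDB's classic provisioned model (1 kB per write unit, 4 kB per strongly consistent read unit, eventually consistent reads costing half):

```python
import math

def write_units(item_kb: float) -> int:
    """Write units consumed by a single write of `item_kb` kilobytes (rounded up to 1 kB)."""
    return max(1, math.ceil(item_kb / 1.0))

def read_units(item_kb: float, consistent: bool = True) -> float:
    """Read units consumed by a single read; eventually consistent reads cost half."""
    units = max(1, math.ceil(item_kb / 4.0))
    return units if consistent else units / 2

print(write_units(0.5))   # 1 -- writes smaller than 1 kB still cost a full unit
print(write_units(2.3))   # 3
print(read_units(6))      # 2
print(read_units(6, consistent=False))  # 1.0
```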
BUT, in order to scale, your table is split among partitions, and so is the reserved throughput. To reach the maximum possible throughput, your requested keys need to be spread as evenly as possible.
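One common way to spread keys evenly is to prefix each natural key with a deterministic hash bucket, so that otherwise sequential keys scatter across partitions. A minimal sketch (the shard count and key format here are illustrative assumptions, not a DynamoDB API):

```python
import hashlib

NUM_SHARDS = 10  # assumption: tune to your table's partition count / write volume

def sharded_key(natural_key: str, num_shards: int = NUM_SHARDS) -> str:
    """Prefix a deterministic hash bucket so sequential keys spread across partitions."""
    bucket = int(hashlib.md5(natural_key.encode()).hexdigest(), 16) % num_shards
    return f"{bucket}#{natural_key}"

# Sequential keys now carry varying shard prefixes instead of landing on one partition:
print(sharded_key("order-000001"))
print(sharded_key("order-000002"))
```

Reads must then either know the bucket or fan out across all shard prefixes, so this trades some read complexity for write throughput.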
These were the database side considerations.
On the client side, there are multiple factors to take into account as well:
- TCP/HTTP latency
- maximum number of open sockets at a time
You may use batch_get_item and/or batch_write_item where grouping is possible, to reduce the latency impact.
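BatchWriteItem accepts at most 25 items per request, so client code has to chunk its workload; boto3's higher-level `batch_writer` does this (and retries unprocessed items) for you. A sketch of the equivalent chunking, with the boto3 call shown in comments since it needs live AWS credentials and an assumed table name:

```python
def chunks(items, size=25):
    """Yield successive `size`-item chunks (25 is the BatchWriteItem limit)."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

items = [{"pk": f"item-{i}"} for i in range(60)]
batches = list(chunks(items))
print(len(batches))      # 3 (25 + 25 + 10 items)
print(len(batches[-1]))  # 10

# With boto3 ("MyTable" is a placeholder for your own table):
#   import boto3
#   table = boto3.resource("dynamodb").Table("MyTable")
#   with table.batch_writer() as batch:  # wraps BatchWriteItem, handles retries
#       for item in items:
#           batch.put_item(Item=item)
```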
For example, I’ve never been able to reach the provisioned 1,000 writes per second from my office computer, so you’d better use multiple instances/servers to issue your database requests.
To sum up:
- you reserve throughput, Amazon scales to deliver it
- there are still some gotchas related to the internal implementation
- you need to optimize latency
- you need to carefully parallelize
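The latency and parallelism points above can be sketched with a thread pool: many requests in flight at once hide the per-request network latency that a single client cannot overcome. Here `write_record` is a stand-in for your real database call (a boto3 `put_item`, a SQL INSERT, etc.):

```python
from concurrent.futures import ThreadPoolExecutor

def write_record(record: dict) -> bool:
    # Stand-in: real code would issue the network request to the database here.
    return True

records = [{"id": i} for i in range(100)]

# 20 concurrent workers keep many requests in flight, amortizing round-trip latency.
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(write_record, records))

print(sum(results))  # 100 successful writes
```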
I’m not an expert on the other database types, so I can only guess that the client-side considerations will be somewhat similar, perhaps less significant for MySQL as it uses a binary protocol. Also, you will probably have to manage those databases yourself.
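The same round-trip argument applies on the SQL side: batching rows into a single `executemany()` (or one multi-row INSERT) beats issuing one statement per row. A sketch using the stdlib `sqlite3` module so it runs anywhere; MySQL DB-API drivers such as `mysql-connector-python` expose the same `executemany` interface:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

# One executemany call inserts all rows in a single batch inside one transaction,
# instead of paying a round trip (and, on some engines, a commit) per row.
rows = [(i, f"payload-{i}") for i in range(1000)]
conn.executemany("INSERT INTO events (id, payload) VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # 1000
```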