Reducing the database load
In the previous post we discussed the importance of a unique ID for every record. Still, we would update the records multiple times a day even if nothing has changed. Remember, we scrape the data. Assuming, for example, 1 million records from 10 different sources and a scrape interval of 5 minutes, we easily reach a database load of 1M * 10 sources every 5 minutes,
which equals 120M updates per hour,
or roughly 33K updates per second,
which has the potential to overload the database, depending on the technology.
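
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constant and function names are made up for illustration, and it assumes that every record is written once per source in each scrape cycle, whether or not its content changed.

```python
# Back-of-the-envelope write load for the scraping scenario above.
# Assumption: every record is written once per source in each scrape cycle.
RECORDS = 1_000_000          # number of records (figure from the text)
SOURCES = 10                 # number of scraped sources (figure from the text)
SCRAPE_INTERVAL_S = 5 * 60   # scrape interval of 5 minutes, in seconds


def write_load(records: int, sources: int, interval_s: int) -> tuple[int, float, float]:
    """Return (writes per scrape cycle, writes per hour, writes per second)."""
    per_cycle = records * sources
    per_hour = per_cycle * 3600 / interval_s
    per_second = per_cycle / interval_s
    return per_cycle, per_hour, per_second


per_cycle, per_hour, per_second = write_load(RECORDS, SOURCES, SCRAPE_INTERVAL_S)
print(f"{per_cycle:,} writes per cycle")      # 10,000,000 writes per cycle
print(f"{per_hour:,.0f} writes per hour")     # 120,000,000 writes per hour
print(f"{per_second:,.0f} writes per second") # 33,333 writes per second
```

Changing the three constants makes it easy to see how the load scales with more sources or a shorter scrape interval.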