
Background and Basics

Posted: Mon Jan 27, 2025 4:29 am
by Mitu9900
But let's first jump back to the time around the turn of the millennium: those of us who were already working with personal computers back then will remember the phenomenon generally attributed to Moore's Law: you had to buy a new device every two to three years, because the successor model had at least twice the clock speed and twice the memory, and only then would the latest applications (and games) run on it. While memory capacity in new devices is still growing at a similar rate today, processor clock speeds have stagnated at around 4 GHz since about 2005.

Around this time, the boom of the Internet and the emergence of online shops and social media applications led to a rapid increase in the amount of data, which we now like to refer to as "Big Data" [4]. Accordingly, a point had been reached where all the data often no longer fit on a single computer, nor could it be processed there fast enough. An obvious solution to this dilemma is multi-core computers with several processor cores, which are now just as standard as cloud systems that distribute their data and processing load across different (possibly virtual) machines (or so-called containers) in a way that is transparent to developers and users, and that often rely on NoSQL systems for this purpose. But why do NoSQL systems scale so much better than "conventional" relational database management systems (RDBMS)?
Let's start answering this question with a simple example data set that will also serve us later to work out the differences between relational and non-relational databases. An important feature of an RDBMS is its great flexibility in storage through normalization of the data model and the use of foreign keys to link entities (see [5]). The following example illustrates this using data for hotels and reviews for a travel guide à la Tripadvisor:

In the example, each rating is linked to a hotel and a rating user via foreign keys (columns Hotel and User) to record which user rated which hotel. This follows relational normalization principles, which in turn allow maximum flexibility in query design and avoid the well-known anomalies (with Insert, Update and Delete) [6].
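The normalized schema described above can be sketched in a few lines of SQL, here run through Python's built-in sqlite3 module. Apart from the Hotel and User foreign-key columns mentioned in the text, all table and column names (hotels, users, ratings, stars, and the sample data) are assumptions for illustration; the article's actual example may look different.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
con.executescript("""
CREATE TABLE hotels (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ratings (
    id    INTEGER PRIMARY KEY,
    hotel INTEGER REFERENCES hotels(id),  -- foreign key: column "Hotel"
    user  INTEGER REFERENCES users(id),   -- foreign key: column "User"
    stars INTEGER
);
""")
con.execute("INSERT INTO hotels VALUES (1, 'Hotel Adlon', 'Berlin')")
con.execute("INSERT INTO users  VALUES (1, 'Alice')")
con.execute("INSERT INTO ratings VALUES (1, 1, 1, 5)")

# A join reassembles the normalized data: which user rated which hotel?
row = con.execute("""
    SELECT u.name, h.name, r.stars
    FROM ratings r
    JOIN hotels h ON r.hotel = h.id
    JOIN users  u ON r.user  = u.id
""").fetchone()
print(row)  # ('Alice', 'Hotel Adlon', 5)
```

Because each fact is stored exactly once, an update (say, renaming the hotel) touches a single row in hotels, which is precisely how normalization avoids the update anomalies mentioned above.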

Such relational databases, which go back to an idea published over 50 years ago by Edgar F. "Ted" Codd [7], who later received the Turing Award for it, are primarily designed for OLTP (Online Transaction Processing), i.e. access to individual data records. Accordingly, individual records (the table rows) are stored one after the other, row by row (see Fig. 2, middle), so that they can be quickly loaded en bloc for editing and written back to the same place when changed.

Data warehouses, which are designed for the analysis (keyword: OLAP = Online Analytical Processing) of large, rarely changing data sets, usually store the data column by column in order to quickly access a particular attribute of all records (see Fig. 2, below). This way, aggregations such as sums or averages can be computed more efficiently.
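The difference between the two layouts can be made concrete with a toy sketch (not the article's own code): the same ratings once as a list of records (row store) and once as one array per attribute (column store). The field names and sample values are assumptions for illustration.

```python
# Row store: one complete record after the other, as in an OLTP system.
rows = [
    {"hotel": "Adlon",  "user": "Alice", "stars": 5},
    {"hotel": "Adlon",  "user": "Bob",   "stars": 4},
    {"hotel": "Hilton", "user": "Alice", "stars": 3},
]

# Column store: one array per attribute, as in a data warehouse.
columns = {
    "hotel": ["Adlon", "Adlon", "Hilton"],
    "user":  ["Alice", "Bob",   "Alice"],
    "stars": [5, 4, 3],
}

# OLTP-style access: fetch one record en bloc, with all its attributes.
print(rows[1])  # {'hotel': 'Adlon', 'user': 'Bob', 'stars': 4}

# OLAP-style aggregation: scan only the "stars" column,
# without touching the other attributes at all.
avg = sum(columns["stars"]) / len(columns["stars"])
print(avg)  # 4.0
```

The aggregation reads one contiguous array instead of skipping over the hotel and user fields of every row, which is exactly why columnar storage pays off for sums and averages over large data sets.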