What is the difference between SQL and HDFS?

The first thing to know is that HDFS and SQL databases both store data. However, the structure of the data and the way each one stores it are completely different.

The main difference is that "SQL" here refers to a relational database queried with SQL, while HDFS is the Hadoop Distributed File System. Therefore, in a SQL database the data is stored in tables with a structured format, whereas in HDFS we have many files that can have any structure.
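To make the contrast concrete, here is a minimal sketch rather than a real deployment. It assumes SQLite (from the Python standard library) as a stand-in for a SQL database and a local temporary directory as a stand-in for an HDFS directory; the table, file names and contents are hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Structured side: a relational table enforces a schema on every row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Ada"))

try:
    # This row violates NOT NULL, so the database rejects it.
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (2, None))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("SELECT id, name FROM users").fetchall()

# File side: a local temp directory stands in for an HDFS directory (assumption).
# A file system stores bytes and does not check their structure.
data_dir = Path(tempfile.mkdtemp())
(data_dir / "users.json").write_text(json.dumps({"id": 1, "name": "Ada"}))
(data_dir / "events.csv").write_text("ts,event\n2024-01-01,login\n")
(data_dir / "notes.txt").write_text("free-form text with no schema at all")

stored_files = sorted(p.name for p in data_dir.iterdir())
```

The table refuses the malformed row, while the directory accepts JSON, CSV and free text side by side. With files, any interpretation of the structure is left to whatever program reads them later.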

The second important difference, and the reason why HDFS is often chosen, is that it distributes the data across the nodes of a cluster so that it can be processed with what is known as MapReduce. MapReduce is a programming model for processing large volumes of data in parallel: because the data is already spread across different nodes, each node can work on its own portion at the same time.
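The idea can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle and reduce steps, not the Hadoop API: the `blocks` list is a hypothetical stand-in for pieces of a file stored on different nodes, and a thread pool stands in for the nodes working in parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical blocks of one file, as if each lived on a different node.
blocks = [
    "big data big cluster",
    "data nodes store data",
    "cluster nodes",
]

def map_phase(block):
    """Emit a (word, 1) pair for every word in one block."""
    return [(word, 1) for word in block.split()]

def shuffle(mapped_outputs):
    """Group all emitted values by key across every mapper's output."""
    grouped = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

# Each block is mapped independently, which is what makes the work parallel.
with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
    mapped = list(pool.map(map_phase, blocks))

word_counts = reduce_phase(shuffle(mapped))
```

Because `map_phase` only ever looks at one block, it can run wherever that block is stored. Only the much smaller intermediate pairs need to be moved and combined.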

There are some other differences as well: HDFS does not provide the ACID guarantees of SQL databases, data cache management is very different, and the implementation cost is much higher for HDFS than for a SQL database.
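As an illustration of what the ACID point means in practice, the following sketch shows atomicity on the SQL side. It assumes SQLite as the SQL database, and the `accounts` table is hypothetical. A transfer made of two updates either applies completely or not at all, a guarantee that a plain file system such as HDFS does not offer for a group of writes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

try:
    # The connection used as a context manager commits on success
    # and rolls back if an exception is raised inside the block.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + 200 WHERE name = 'bob'")
        # This update would make alice's balance negative and violates the CHECK.
        conn.execute("UPDATE accounts SET balance = balance - 200 WHERE name = 'alice'")
except sqlite3.IntegrityError:
    pass

# The first update was rolled back together with the failed one.
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer both balances are unchanged, even though the first update had already run.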

