What is difference between storing data in a blob, vs. storing a pointer to a file?

I read that the data type can be used to store files.

According to MySQL manual page on Blob, A BLOB is a binary large object that can hold a variable amount of data.

Since it’s a data type specific to store binary data it’s common to use it to store files in binary format, being storing image files a very common use on web applications.

For web applications this would mean that you would first need to convert your file into binary format and then store it, and every time you need to retrieve your file you would need to do the reverse process of converting them back to it’s original format.

Besides that, storing large amount of data in your db MAY slow it down. Specially in systems that are not dedicated only to host a database.

I also read that an alternative is to store the file on disk and include a pointer to its location in the database

Bearing in mind all above considerations a common practice for web applications is to store your files elsewhere than your MySQL and then simply store it’s path on your database. This approach MAY speed up your database when dealing with large amount of data.

But I’m a little confused because I’ve read that blob fields are not stored in-row and require a separate look-up to retrieve its contents.

In fact that would depend on what storage engine you are using since every engine treats data and stores it in different ways. For the InnoDB engine, which is suited for relational database you may want to read this article from MySQL Performance blog on how the blob is stored in MySQL.

But in abstract, on MySQL 5 and forward the blob is stored as following:

Innodb stores either whole blob on the row page or only 20 bytes BLOB pointer giving preference to smaller columns to be stored on the page, which is reasonable as you can store more of them.

So you are probably thinking now that the right way to go is to store them as separate file, but there are some advantages of using blob to store data, the first one (in my opinion) is the backup. I manage a small server and I had to create another subroutine only to copy my files stored as paths to another storage disk (We couldn’t afford to buy a decent tape backup system). If I had designed my application to use blobs a simple mysqldump would be everything that I needed to backup my whole database.

The advantage of storing blobs for backups are better discussed on this post where the person who answered had a similar problem than mine.

Another advantage is security and the easiness of managing permission and access. All the data inside your MySQL server is password protected and you can easily manage permissions for your users about who access what and who doesn’t.

In a application which relies on MySQL privileges system for authentication and use. It’s certain a plus since it would be a little harder for let’s say an invader to retrieve an image (or a binary file like a zipped one) from your disk or an user without access privileges to access it.

So I’d say that

If you gonna manage your MySQL and all the data you have in it and must do regular backups or intend to change or even consider a future change of OS, and have a decent hardware and optimized your MySQL to it, go for BLOB.

If you will not manage your MySQL (as in a web host for example) and doesn’t intend to change OS or make backups, stick with varchar columns pointing to your files.

I hope it helped. Cheers

Leave a Comment