Linux Memory: Buffer vs Cache

Do you really understand the differences between buffer and cache?

Assuming you already know the definitions of “Buffer” and “Cache”:

Buffers are temporary storage for raw disk blocks, that is, they cache data to be written to disk, and they are usually not very large (about 20 MB). This lets the kernel gather scattered writes and optimize them together; for example, several small writes can be merged into a single large one.
Cache is the page cache for files read from disk, that is, it caches data read from files. The next time that file data is accessed, it can be fetched directly from memory without going back to the slow disk.

But let me ask you: if Buffer is just a cache for data that will be written to disk, does it also cache data read from disk? And if Cache is a cache for data read from files, does it also cache data written to files?

If you can answer both questions, you can skip this article; you already have a good understanding of “buffer” and “cache”. If you can’t, please stay and let me explain further.

“free” Command

To check system memory usage, the first command that comes to mind is probably free, for example:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           1.9G        1.0G        394M        2.6M        491M        728M
Swap:            0B          0B          0B

This output shows the usage of physical memory (Mem) and swap (Swap): total, used, free, shared, buff/cache, and available memory. The buff/cache column is the sum of Buffer and Cache.

Let’s look at how the free man page defines buffers and cache:

buffers
              Memory used by kernel buffers (Buffers in /proc/meminfo)

cache         Memory used by the page cache and slabs (Cached and SReclaimable in /proc/meminfo)

buff/cache
              Sum of buffers and cache

We can see that the free command actually takes its data from the /proc/meminfo file. As I mentioned earlier, /proc is a special file system provided by the Linux kernel, which acts as an interface for users to interact with the kernel.
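As a rough cross-check, the buff/cache figure can be recomputed from /proc/meminfo by adding the three fields named in the man page. The sketch below is only an illustration: the function name buff_cache_kb is made up here, and it assumes a kernel that reports SReclaimable (2.6.19 or later).

```shell
# Sum Buffers + Cached + SReclaimable (all in kB) from a meminfo-style file.
# buff_cache_kb is a hypothetical helper name, not part of procps.
# $1: file to read (defaults to /proc/meminfo)
buff_cache_kb() {
    awk '$1 == "Buffers:" || $1 == "Cached:" || $1 == "SReclaimable:" { sum += $2 }
         END { print sum + 0 }' "${1:-/proc/meminfo}"
}

# Print the total only where /proc/meminfo is readable (i.e. on Linux).
if [ -r /proc/meminfo ]; then
    echo "buff/cache: $(buff_cache_kb) kB"
fi
```

The result should be close to the buff/cache column of free, though not necessarily identical, since the two are sampled at different moments.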

The /proc filesystem is also the ultimate source of data for many performance tools. In man proc, Buffers and Cached are defined as follows:

Buffers %lu
    Relatively temporary storage for raw disk blocks that shouldn't get tremendously large (20MB or so).

Cached %lu
    In-memory cache for files read from the disk (the page cache).  Doesn't include SwapCached.
...
SReclaimable %lu (since Linux 2.6.19)
    Part of Slab, that might be reclaimed, such as caches.
    
SUnreclaim %lu (since Linux 2.6.19)
    Part of Slab, that cannot be reclaimed on memory pressure.

At this point, you may think you have found the answers to my questions: “Buffer” is just a cache for data that will be written to disk, and “Cache” is just a cache for data read from files. In fact, “Buffer” can also be used for reads, and “Cache” can also be used for writes.

Experiments

We will do two experiments here: cache for writing and buffer for reading.

Cache for Writing

Let’s log into our Linux host and have two terminals ready. In terminal one, let’s clean up the cache first:
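A minimal sketch of this step, assuming a root shell (the write fails without root privileges). The helper name drop_caches and its optional second argument are illustrative, and the sync beforehand is an addition: the kernel only drops clean pages, so flushing dirty data first lets more of the cache be released.

```shell
# Flush dirty pages, then ask the kernel to drop its caches.
# drop_caches is a hypothetical helper name.
# $1: mode written to the control file (3 by default)
# $2: control file; overridable only so the helper can be tried without root
drop_caches() {
    sync
    echo "${1:-3}" > "${2:-/proc/sys/vm/drop_caches}"
}
```

Called as root with no arguments, drop_caches has the same effect as running sync followed by echo 3 > /proc/sys/vm/drop_caches.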

Here /proc/sys/vm/drop_caches is an example of changing kernel behavior through the proc file system. Writing 3 to it drops the page cache as well as reclaimable slab objects such as directory entries and inodes.

Still in terminal one, let’s start vmstat 2, which prints a new sample every two seconds:

buff and cache are the Buffers and Cache we saw earlier, in KB.
bi and bo are the amounts of data read from and written to block devices, respectively, in blocks per second. vmstat counts in 1 KB blocks, so this unit is equivalent to KB/s.
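To watch only these four columns, a small awk filter can be placed behind vmstat. This is a sketch that assumes the default procps column order (r b swpd free buff cache si so bi bo ...); the function name vmstat_buff_cache is made up here.

```shell
# Keep only the buff, cache, bi and bo columns of vmstat output.
# Header lines are skipped because their first field is not a number.
# Assumed layout: r b swpd free buff cache si so bi bo in cs ...
# Usage: vmstat 2 | vmstat_buff_cache
vmstat_buff_cache() {
    awk '$1 ~ /^[0-9]+$/ { printf "buff=%s cache=%s bi=%s bo=%s\n", $5, $6, $9, $10 }'
}
```

If your vmstat version prints columns in a different order, the field numbers in the awk program need to be adjusted.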

Next, go to terminal two and run the following command:
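A minimal sketch of such a write workload, assuming dd copies random data into a scratch file through the filesystem. The helper name write_test_file, the path /tmp/file and the 500 MB size mentioned below are illustrative assumptions.

```shell
# Write random data to a regular file through the filesystem.
# write_test_file is a hypothetical helper name.
# $1: destination file, $2: size in MB
write_test_file() {
    dd if=/dev/urandom of="$1" bs=1M count="$2" 2>/dev/null
}
```

For example, write_test_file /tmp/file 500 writes about 500 MB; remove the file once the experiment is done.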

Now switch back to terminal one and observe the changes in buff and cache:

In the vmstat output, we can see that while the dd command was running, Cache kept growing, whereas Buffer stayed basically unchanged. This means that data written to a file is cached in Cache.

Buffer for Reading

Now, let’s do the second experiment. Again, clear the cache in terminal one:

Fire up the vmstat 2 command again, also in terminal one:

You can see that at this point, buff is 0. Now in terminal two, run the following command:
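A minimal sketch of such a read, assuming dd pulls data straight from a block device and discards it. The helper name read_raw_device and the device /dev/sda1 mentioned below are assumptions; substitute a partition that exists on your host (lsblk lists them) and run it as root.

```shell
# Read data directly from a block device, without going through a filesystem,
# and throw it away.
# read_raw_device is a hypothetical helper name.
# $1: device node, $2: size in MB
read_raw_device() {
    dd if="$1" of=/dev/null bs=1M count="$2" 2>/dev/null
}
```

For example, read_raw_device /dev/sda1 1024 reads about 1 GB from the first partition of the first disk, if such a device exists.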

Then, go back to terminal one to observe:

In the vmstat output, you will find that while the disk is being read (that is, while bi is greater than 0), both Buffer and Cache grow, but Buffer clearly grows much faster. This means that data read directly from the disk is cached in Buffer.

Now we can almost conclude:

Data is cached in Cache when reading or writing a file, and in Buffer when reading the raw disk.
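To put numbers on this, Buffers and Cached can be sampled from /proc/meminfo before and after a workload. This is a rough sketch: the helper names meminfo_kb and buffer_cache_delta are made up here, and other activity on the host will also move these counters.

```shell
# Print the value (in kB) of one field from a meminfo-style file.
# $1: field name without the colon, $2: file (defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Run the given command and report how Buffers and Cached changed around it.
buffer_cache_delta() {
    b0=$(meminfo_kb Buffers); c0=$(meminfo_kb Cached)
    "$@" >/dev/null 2>&1
    b1=$(meminfo_kb Buffers); c1=$(meminfo_kb Cached)
    echo "Buffers: $((b1 - b0)) kB, Cached: $((c1 - c0)) kB"
}
```

For example, buffer_cache_delta dd if=/dev/urandom of=/tmp/file bs=1M count=500 wraps a file write (the path and size are again assumptions); going by the first experiment, Cached should grow far more than Buffers.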

Conclusion

By now you should see that although the documentation describes Buffer and Cache, it does not cover all the details. For example, here are the two things we learned today:

Buffers can be used either as a “cache for data to be written to disk” or as a “cache for data read from disk”.
Cache can be used either as a “page cache for reading data from files” or as a “page cache for writing files”.