ggm 8 hours ago

They tested very flat filesystems with a small number of directories containing astronomical numbers of files.

There's nothing wrong with that, but the balance of files per directory has a massive impact on some workloads. A modern fs may handle this better, but the VFS buffer cache, shared across all processes, suffered when large directory scans had to be done: they destroyed kernel LRU cache state.

Walking second- and third-level directory block chains impacts all kinds of things: stat() calls become slow. This is also why things like the noatime mount option happened (minimising writes helps with speed as well as fs longevity), all things I'd hope a modern fs addresses.
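To make that concrete, here's a rough sketch of the stat()-heavy scan I mean, in Python. The path is made up; point it at whatever directory you want to measure:

    # Rough sketch: time a full stat() pass over one directory.
    # The default path is hypothetical.
    import os, sys, time

    def time_stats(dirpath):
        """Scan one directory and stat() every entry, returning elapsed seconds."""
        t0 = time.monotonic()
        with os.scandir(dirpath) as it:
            for entry in it:
                os.stat(entry.path)  # force a fresh stat, don't rely on cached readdir data
        return time.monotonic() - t0

    if __name__ == "__main__":
        d = sys.argv[1] if len(sys.argv) > 1 else "/var/spool/test"
        print(f"stat() pass over {d}: {time_stats(d):.3f}s")

Run it against a flat directory with a few hundred thousand files and then against a sharded tree holding the same files, and the difference in elapsed time (and in what it does to the rest of the machine while running) is the point I'm making.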

Back in the day, after disastrous sendmail traffic jams, I'd shard the spooldir into subsets, write in chunks, and then mkdir a clean working directory to destroy the horrible chained directory blocks: the old directory was hundreds of megabytes big, indexing only 1-200 files after deleting thousands.
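The sharding itself was nothing clever; something along these lines (a sketch of the idea, not my actual scripts, with made-up paths and a two-hex-character fan-out):

    # Sketch: fan a flat spool directory out into hashed subdirectories,
    # then recreate the original directory to shed its bloated block chain.
    # Paths are hypothetical; assumes the spool contains only regular files.
    import hashlib, os, shutil

    SPOOL = "/var/spool/myqueue"      # hypothetical flat spool directory
    SHARDED = "/var/spool/myqueue.d"  # sharded replacement, a sibling path

    def shard_path(name):
        """Pick a subdirectory from the first two hex chars of the name's hash."""
        prefix = hashlib.sha1(name.encode()).hexdigest()[:2]
        return os.path.join(SHARDED, prefix)

    def reshard():
        for name in os.listdir(SPOOL):
            dest = shard_path(name)
            os.makedirs(dest, exist_ok=True)
            shutil.move(os.path.join(SPOOL, name), os.path.join(dest, name))
        # On the filesystems of that era the emptied directory kept its huge
        # block chain, so the fix was to replace it with a freshly made one.
        os.rmdir(SPOOL)
        os.mkdir(SPOOL)

    if __name__ == "__main__":
        reshard()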

I think a better test would model a directory tree of 4-5 subdir levels, each level capped at 1000 files and subdirectories, as a comparison to 100 directories with 100,000 files in them. Maybe they did and found it made no difference, but I'd imagine it changes the inode ratio as well.
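Roughly the two layouts I mean, as a Python sketch with deliberately small defaults; scale the parameters up to the real sizes if you have the disk and the patience:

    # Sketch: generate a capped nested tree and a flat many-files-per-dir
    # layout for comparison. Defaults are kept small so it finishes quickly.
    import os

    def build_nested(root, depth=3, dirs_per_level=5, files_per_dir=50):
        """Create a tree `depth` levels deep, each directory holding a capped
        number of files and subdirectories."""
        for i in range(files_per_dir):
            open(os.path.join(root, f"f{i}"), "w").close()
        if depth > 0:
            for i in range(dirs_per_level):
                sub = os.path.join(root, f"d{i}")
                os.mkdir(sub)
                build_nested(sub, depth - 1, dirs_per_level, files_per_dir)

    def build_flat(root, n_dirs=10, files_per_dir=2000):
        """Create the flat comparison: a few directories crammed with files."""
        for d in range(n_dirs):
            sub = os.path.join(root, f"d{d}")
            os.mkdir(sub)
            for i in range(files_per_dir):
                open(os.path.join(sub, f"f{i}"), "w").close()

    if __name__ == "__main__":
        os.mkdir("nested"); build_nested("nested")
        os.mkdir("flat"); build_flat("flat")

Then run the same benchmark over both trees and compare not just throughput but inode and directory-block consumption.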