filesaver
Very fast multi-threaded file size scanner utility written in C++ and Objective-C. Scans tens of thousands of files a second. It's able to scan my whole disk (which is 3.5+ million files) in close to a minute.
It doesn't try to limit memory usage at this point. It can generate a LevelDB index, which could address this in the future.
This is a macOS/CLI application that spawns several worker threads that keep scanning your disk to find files' sizes.
A single thread receives results and aggregates them to calculate directory sizes.
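The workers use two queues built on mutexes and condition variables (data::WorkQueue): the workQueue, which holds the paths still to be scanned, and the resultQueue, which carries scanned entries to the aggregating thread.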
If a worker thread finds a directory, it also enqueues each of the directory's children onto the workQueue, so other workers can pick up the recursive scan.
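The worker side can be sketched roughly as below. This is a minimal illustration, not the actual data::WorkQueue: `BlockingQueue`, `Node`, `scanTree` and the `pending` counter are hypothetical names, and an in-memory tree stands in for the real file system so the example stays self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical blocking queue built on a mutex and a condition variable.
template <typename T>
class BlockingQueue {
public:
  void push(T value) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(!closed_ && "push after close");
      items_.push(std::move(value));
    }
    ready_.notify_one();
  }

  // Blocks until an item arrives; returns false once closed and drained.
  bool pop(T& out) {
    std::unique_lock<std::mutex> lock(mutex_);
    ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
    if (items_.empty()) return false;
    out = std::move(items_.front());
    items_.pop();
    return true;
  }

  void close() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      closed_ = true;
    }
    ready_.notify_all();
  }

private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<T> items_;
  bool closed_ = false;
};

// Stand-in for a file system entry (assumption: directories have size 0).
struct Node {
  std::uint64_t size;
  std::vector<Node> children;
};

// Workers pop nodes, report sizes, and enqueue children for other workers.
std::uint64_t scanTree(const Node& root, unsigned workerCount) {
  BlockingQueue<const Node*> workQueue;
  BlockingQueue<std::uint64_t> resultQueue;
  std::atomic<std::size_t> pending(1);  // nodes queued or being processed

  workQueue.push(&root);
  std::vector<std::thread> workers;
  for (unsigned i = 0; i < workerCount; ++i) {
    workers.emplace_back([&] {
      const Node* node = nullptr;
      while (workQueue.pop(node)) {
        resultQueue.push(node->size);
        for (const Node& child : node->children) {
          pending.fetch_add(1);
          workQueue.push(&child);
        }
        // The last node to finish closes the queue so idle workers wake up.
        if (pending.fetch_sub(1) == 1) workQueue.close();
      }
    });
  }
  for (std::thread& worker : workers) worker.join();

  // For brevity results are drained after the workers finish; the tool
  // described here reads them concurrently on a separate thread.
  resultQueue.close();
  std::uint64_t total = 0, size = 0;
  while (resultQueue.pop(size)) total += size;
  return total;
}
```

The pending counter is one possible way to detect that the recursive scan has finished; the real implementation may handle termination differently.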
On the aggregation side, the consumer/reader thread FileSaver::entryReader reads entries from the resultQueue. It performs 3 operations on every entry:
When storage is enabled, another thread reads entries from a storage queue and writes them to LevelDB.
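As an illustration of the aggregation step only, here is a minimal sketch of how a single thread could fold file sizes into directory sizes using a hash map. It is not the code of FileSaver::entryReader: `DirectorySizes`, `addFile` and `sizeOf` are hypothetical names, and the sketch assumes absolute, `/`-separated paths.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical aggregator: adds each file's size to all ancestor directories.
class DirectorySizes {
public:
  void addFile(const std::string& path, std::uint64_t size) {
    assert(!path.empty() && path[0] == '/' && "expects an absolute path");
    std::string::size_type slash = path.rfind('/');
    while (slash != std::string::npos) {
      // Everything before the slash is an ancestor; position 0 is the root.
      const std::string dir =
          slash == 0 ? std::string("/") : path.substr(0, slash);
      sizes_[dir] += size;
      if (slash == 0) break;
      slash = path.rfind('/', slash - 1);
    }
  }

  // Returns the accumulated size, or 0 for a directory never seen.
  std::uint64_t sizeOf(const std::string& dir) const {
    const auto it = sizes_.find(dir);
    return it == sizes_.end() ? 0 : it->second;
  }

private:
  std::unordered_map<std::string, std::uint64_t> sizes_;
};
```

For example, after `addFile("/a/b/f.txt", 10)` and `addFile("/a/g.txt", 5)`, `sizeOf("/a")` is 15 and `sizeOf("/a/b")` is 10. Because only one thread touches the map, this sketch needs no locking.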
At this point there aren't good benchmarks for the tool. I've profiled and optimized different parts, and so far the performance seems stellar for its use case: scanning an entire SSD and providing an interactive view of it (not necessarily indexing it to a database).
It will be much slower than simple single-threaded scanners when the number of files is small.
In that case, the cost of spawning threads and putting them to sleep outweighs the benefits of parallelism, or at least that's my understanding.
The tool I tried to use before was duc, which is an awesome, simple C tool for scanning files. It's very fast, but it's single-threaded.
Scanning my MacBook's entire disk with filesaver gives:
Size scanned | Files scanned | Throughput in files (would accept queries) |
---|---|---|
221.5 GB | 5,336,323 | 51,389/s |
So it takes about 100 s to find the sizes of all files on my disk and put them in a hash map for lookup.
Build with CMake (or the Makefile wrapping it):
The Xcode project for the GUI is configured for macOS Catalina and newer.