DeepSeek’s 3FS FUSE hackery is super interesting
DeepSeek released 3FS some time ago, a filesystem built to blast tons of data at your GPU cluster for efficient training, since GPUs need to go BRRRRR and not wait on IO.
I was checking out their paper and docs and found this cool hack for FUSE that they employ. FUSE makes it possible to have a Filesystem in USErspace, so that us mere mortals don’t have to hack the kernel too much in order to pretend we can build filesystems.
This allows you to, for instance, access a database as a filesystem. Not saying you should, but you can, if you feel the urge.
What they want at DeepSeek when training is fast, fast, fast throughput: getting data from the cluster onto your GPU as quickly as possible.
In order to do this they do some tricks:
- Use FUSE for file listing and directory semantics when talking to 3FS
- For read(), sidestep the whole thing and open a door to a magic portal
The portal works like this:
- Get data from the NIC (InfiniBand) without going through the kernel, and put it in a shared memory region
- Read the shared data from that memory in another process via a ring buffer
- Let the process consume the data directly
- Celebrate! Look ma, no kernel!
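To make the portal a bit less magic, here is a minimal sketch of the shared-memory ring buffer idea in Python. This is not 3FS's actual API: `ShmRing`, `SLOT_SIZE` and `demo` are names I made up, the layout (two counters followed by fixed-size slots) is an assumption, and it only supports one producer and one consumer. A real implementation would also need proper atomics and memory barriers, which plain Python does not give you.

```python
import struct
from multiprocessing import shared_memory

HEADER = struct.Struct("<QQ")  # head (next slot to write), tail (next slot to read)
LEN = struct.Struct("<I")      # length prefix stored at the start of each slot
SLOT_SIZE = 64                 # hypothetical fixed slot size in bytes


class ShmRing:
    """Single-producer, single-consumer ring buffer in a shared memory block."""

    def __init__(self, name=None, slots=8, create=False):
        self.slots = slots
        if create:
            size = HEADER.size + slots * SLOT_SIZE
            self.shm = shared_memory.SharedMemory(name=name, create=True, size=size)
            HEADER.pack_into(self.shm.buf, 0, 0, 0)  # empty ring: head == tail
        else:
            # Attach to a block that some other process (or handle) created.
            self.shm = shared_memory.SharedMemory(name=name)
        self.name = self.shm.name

    def _offset(self, index):
        return HEADER.size + (index % self.slots) * SLOT_SIZE

    def push(self, payload: bytes) -> bool:
        """Producer side. Returns False when the ring is full."""
        if len(payload) > SLOT_SIZE - LEN.size:
            raise ValueError("payload does not fit in one slot")
        head, tail = HEADER.unpack_from(self.shm.buf, 0)
        if head - tail == self.slots:
            return False
        off = self._offset(head)
        LEN.pack_into(self.shm.buf, off, len(payload))
        start = off + LEN.size
        self.shm.buf[start:start + len(payload)] = payload
        # Publish the slot only after the data is written.
        struct.pack_into("<Q", self.shm.buf, 0, head + 1)
        return True

    def pop(self):
        """Consumer side. Returns None when the ring is empty."""
        head, tail = HEADER.unpack_from(self.shm.buf, 0)
        if head == tail:
            return None
        off = self._offset(tail)
        (length,) = LEN.unpack_from(self.shm.buf, off)
        start = off + LEN.size
        data = bytes(self.shm.buf[start:start + length])
        # Hand the slot back to the producer.
        struct.pack_into("<Q", self.shm.buf, 8, tail + 1)
        return data

    def close(self, unlink=False):
        self.shm.close()
        if unlink:
            self.shm.unlink()


def demo():
    # Two handles on the same block stand in for two separate processes.
    producer = ShmRing(create=True, slots=4)
    consumer = ShmRing(name=producer.name, slots=4)
    producer.push(b"chunk-0")
    producer.push(b"chunk-1")
    received = [consumer.pop(), consumer.pop(), consumer.pop()]
    consumer.close()
    producer.close(unlink=True)
    return received


if __name__ == "__main__":
    print(demo())
```

In the demo both handles live in one process to keep it short. The consumer could just as well be another process that is given `producer.name` and attaches to the same block, which is the whole point: the data is handed over through memory, not through read() calls.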
It also sent me down the rabbit hole of AF_XDP, where we can build this ourselves, since I don’t have an InfiniBand card. The upside: SUPERFAST PACKETS. The downside: rebuilding the whole TCP/IP network stack.
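To get a feel for that downside: with AF_XDP you receive raw Ethernet frames, so every header the kernel normally handles becomes your problem. Below is a rough sketch of just the parsing part, for UDP over IPv4 (TCP needs a whole state machine on top). It does not open an AF_XDP socket; it works on a hand-built frame, and `parse_udp_frame` and `build_udp_frame` are hypothetical helper names. Checksums, IP options, fragmentation and VLAN tags are all ignored.

```python
import socket
import struct

ETH = struct.Struct("!6s6sH")          # dst MAC, src MAC, EtherType
IPV4 = struct.Struct("!BBHHHBBH4s4s")  # fixed 20-byte IPv4 header, no options
UDP = struct.Struct("!HHHH")           # src port, dst port, length, checksum

ETHERTYPE_IPV4 = 0x0800
PROTO_UDP = 17


def parse_udp_frame(frame: bytes):
    """Return (src_ip, src_port, dst_ip, dst_port, payload), or None if not IPv4/UDP."""
    if len(frame) < ETH.size + IPV4.size:
        return None
    _dst_mac, _src_mac, ethertype = ETH.unpack_from(frame, 0)
    if ethertype != ETHERTYPE_IPV4:
        return None
    fields = IPV4.unpack_from(frame, ETH.size)
    ver_ihl, proto, src, dst = fields[0], fields[6], fields[8], fields[9]
    if ver_ihl >> 4 != 4 or proto != PROTO_UDP:
        return None
    ip_header_len = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    udp_off = ETH.size + ip_header_len
    if len(frame) < udp_off + UDP.size:
        return None
    sport, dport, udp_len, _checksum = UDP.unpack_from(frame, udp_off)
    payload = frame[udp_off + UDP.size:udp_off + udp_len]
    return socket.inet_ntoa(src), sport, socket.inet_ntoa(dst), dport, payload


def build_udp_frame(payload: bytes, sport=4000, dport=5000) -> bytes:
    """Build a test frame by hand. Checksums are left at 0 and never verified."""
    udp = UDP.pack(sport, dport, UDP.size + len(payload), 0)
    ip = IPV4.pack(0x45, 0, IPV4.size + UDP.size + len(payload), 0, 0, 64,
                   PROTO_UDP, 0,
                   socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"))
    eth = ETH.pack(b"\x02" * 6, b"\x02" * 6, ETHERTYPE_IPV4)
    return eth + ip + udp + payload


if __name__ == "__main__":
    print(parse_udp_frame(build_udp_frame(b"hello")))
```

And that is only reading one datagram. ARP, retransmits, congestion control and friends are all still waiting further down the rabbit hole.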