"Sparse files" are files that are not fully allocated by the filesystem they're on. In other words, they use less disk space than their "file size".
This is achieved by allocating disk blocks only as they are written to; seeking past regions that were never written leaves "holes" in the file.
The file size just becomes the current upper bound of file offsets. Since it can be set by seek() + truncate(), the end of a sparse file is not always allocated either.
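As a minimal sketch of the seek() + truncate() case described above, the following Python snippet sets a file's size without writing any data, then compares the apparent size with what the filesystem reports as allocated. The helper names `make_sparse` and `disk_usage` are illustrative, not part of any existing tool, and the allocated figure depends on the filesystem (it is unavailable on platforms whose stat result lacks `st_blocks`).

```python
import os
import tempfile


def make_sparse(path, size):
    """Create a file of `size` bytes without writing any data to it."""
    with open(path, "wb") as f:
        f.seek(size)   # move the file offset past the (empty) end of file
        f.truncate()   # set the file size to the current offset


def disk_usage(path):
    """Return (apparent size, allocated bytes or None if unknown)."""
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)  # counted in 512-byte units
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sparse.bin")
    make_sparse(path, 10 * 1024 * 1024)
    apparent, allocated = disk_usage(path)

print("apparent size:", apparent, "allocated:", allocated)
```

On a filesystem that supports sparse files, the allocated figure should come out far below the apparent size; on one that does not, the two will be close.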
$ ls -l D3A4898A4F0505D84351DB0B3D722639
-rw-r--r-- 1 mldonkey mldonkey 244505056 2005-01-09 22:24 D3A4898A4F0505D84351DB0B3D722639
$ du D3A4898A4F0505D84351DB0B3D722639
35548 D3A4898A4F0505D84351DB0B3D722639
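In this (hypothetical) example, only about 15% of the file is actually allocated by the filesystem, assuming du reports its figure in 1 KiB blocks (35548 KiB allocated out of an apparent size of 244505056 bytes).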
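The allocated fraction can be checked with a few lines of arithmetic. This sketch assumes du counts 1 KiB blocks (the usual GNU default; other systems may count 512-byte blocks, which would halve the result).

```python
apparent_bytes = 244505056   # file size reported by ls -l
du_blocks = 35548            # figure reported by du
DU_BLOCK_SIZE = 1024         # assumption: du counts 1 KiB blocks

allocated_bytes = du_blocks * DU_BLOCK_SIZE
fraction = allocated_bytes / apparent_bytes

print(f"allocated: {allocated_bytes} bytes ({fraction:.1%} of the file)")
```

Under that assumption the result is a little under 15%.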
"Unixy" filesystems: ext2, ext3, ufs, xfs, reiserfs, jfs, ...
Most, if not all, modern filesystems designed for Unix-like operating systems support sparse files. Usually, nothing special needs to be done to create them; programs just have to seek and write freely within files.
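To illustrate "seek and write freely", here is a small Python sketch that writes a few bytes at a 1 MiB offset in a new file. The region before the write is a hole: it reads back as zeros whether or not the filesystem actually stores it sparsely. The file name and sizes are arbitrary choices for the example.

```python
import os
import tempfile

HOLE_SIZE = 1024 * 1024  # arbitrary gap left before the first written byte

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "chunks.bin")

    # Write only at the end; nothing is ever written to the first 1 MiB.
    with open(path, "wb") as f:
        f.seek(HOLE_SIZE)
        f.write(b"tail")

    # Reading the hole yields zero bytes, then the data that was written.
    with open(path, "rb") as f:
        hole = f.read(HOLE_SIZE)
        tail = f.read()

    size = os.path.getsize(path)

print(size, hole == bytes(HOLE_SIZE), tail)
```

Whether the hole consumes disk space is decided by the filesystem, which is why a program that downloads chunks out of order needs no special code on these systems.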
Because of the File Allocation Table format, FAT filesystems cannot support sparse files. When data is written at an offset beyond the current file size, all disk space for the lower offsets is allocated first, then the file size is adjusted.
NTFS 5 and above support sparse files. The default behavior, however, is non-sparse (see above): applications must first set a "sparse flag" on each file.
Currently (January 2005), MLdonkey doesn't set those flags. So, under Microsoft Windows, MLdonkey uses non-sparse files, even on filesystems that support them.
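For reference, setting the sparse flag on Windows is done with the FSCTL_SET_SPARSE control code passed to DeviceIoControl. The following is an untested sketch in Python using ctypes, not MLdonkey code; the helper name `set_sparse` is hypothetical, and on non-Windows platforms it simply returns False because no flag is needed there.

```python
import ctypes
import sys
import tempfile

FSCTL_SET_SPARSE = 0x000900C4  # control code defined in winioctl.h


def set_sparse(fileobj):
    """Try to mark an open, writable file as sparse. Return True on success."""
    if sys.platform != "win32":
        return False  # Unix-like filesystems need no per-file flag

    import msvcrt
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.DeviceIoControl.argtypes = [
        wintypes.HANDLE, wintypes.DWORD,
        wintypes.LPVOID, wintypes.DWORD,   # input buffer (unused here)
        wintypes.LPVOID, wintypes.DWORD,   # output buffer (unused here)
        ctypes.POINTER(wintypes.DWORD), wintypes.LPVOID,
    ]
    kernel32.DeviceIoControl.restype = wintypes.BOOL

    handle = msvcrt.get_osfhandle(fileobj.fileno())
    returned = wintypes.DWORD(0)
    ok = kernel32.DeviceIoControl(
        handle, FSCTL_SET_SPARSE, None, 0, None, 0,
        ctypes.byref(returned), None,
    )
    return bool(ok)


with tempfile.TemporaryFile() as f:
    flagged = set_sparse(f)

print("sparse flag set:", flagged)
```

The flag has to be set before the holes are created; regions skipped by seeking before the flag is set are allocated in the usual non-sparse way.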
Linux: FIBMAP ioctl, frag (e2defrag), filefrag (e2fsprogs), zum, perforate,...