2008-04-23

Zaitcev, you can.

zaitcev: I know you're not entirely serious, but I don't know why Ted's post has put the dampers on anything. Fact is, it's still open source, you can still fix things, and you can still get them into the main codebase. Or fork. Even better, you've been through SMI code review, so you'd be super familiar :)

Ted's post is wrong in many ways of course. It didn't take 3 years to get a distributed SCM for the sources out there. I don't believe for an instant he's stupid enough to think that getting external committers set up is as simple as "setting up a Mercurial server" (I can assume you remember Teamware). The low rate of contributions is not primarily a result of the cumbersome request sponsor process.

You can't call something astroturfing when it's actually doing useful work, just because that rate is fairly low. Especially when you look at how DTrace and ZFS have been ported everywhere (except Linux).

(Update: I would have preferred to reply as a comment, but you have anonymous comments disabled, and LiveJournal is still broken when it comes to OpenID. I didn't use my work blog as that's for content, not chit-chat.)

7 comments:

Szabolcs Szakacsits said...

Some people do use ZFS on Linux: http://www.wizy.org/wiki/ZFS_on_FUSE

The developer was hired by Sun to help work on Lustre/ZFS on Linux.

John Levon said...

That's more of a FUSE port than a Linux one.

Szabolcs Szakacsits said...

I see, you mean a Linux kernel port, not a Linux port. Well, there isn't really much difference from a user's point of view.

Why would anybody want a kernel port of ZFS? I can't see major advantages, mostly drawbacks. And I don't mean the legal ones, but the technical and economic ones.

The most common arguments are performance related but those are either based on false assumptions or are fixable.

John Levon said...

I'm not exactly a filesystems guy, so I won't scoff too much at the idea that FUSE will ever be performant enough. Besides, there's about a billion other reasons to have ZFS in the kernel.

Just two reasons: ZFS root FS support. FMA.

What are the drawbacks?

Szabolcs Szakacsits said...

The dominant factors for performance seem to be the design, the quality of the implementation, and the lead time available to optimize for the (latest) hardware and environments.

On commodity hardware Linux can do a million context switches per second. File system workloads barely need, or can do, more than a few tens of thousands of file operations per second due to storage bottlenecks. That means at most about 5% extra CPU use for block-based FUSE file systems, which can be compensated for in several other ways.
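A back-of-the-envelope sketch of that estimate follows. It assumes each file operation costs two context switches (into the user space daemon and back) and a workload of 25,000 operations per second. Both numbers are illustrative assumptions rather than measurements, and treating the share of context-switch capacity as the share of CPU is a simplification.

```python
def fuse_switch_overhead(file_ops_per_sec, switches_per_op, max_switches_per_sec):
    """Fraction of the machine's context-switch capacity used by FUSE round trips."""
    return file_ops_per_sec * switches_per_op / max_switches_per_sec


# Assumed workload: 25,000 file ops/s, 2 switches per op (kernel -> daemon -> kernel),
# against the roughly 1,000,000 switches/s quoted for commodity hardware.
overhead = fuse_switch_overhead(25_000, 2, 1_000_000)
print(f"extra overhead: {overhead:.1%}")
```

With these assumed inputs the result is 5%, which matches the order of magnitude claimed; a workload with fewer operations per second would land proportionally lower.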

The FUSE based NTFS-3G already outperforms most main in-kernel file systems and it's still completely unoptimized.

NTFS-3G is also used as root file system. For example Ubuntu 8.04 uses this feature to install and run Linux off an unpartitioned NTFS volume.

I can't see any problem with FMA either.

FUSE file systems are not pure user space file systems but in fact hybrid ones. Functionality can be both in the kernel and in user space.

The major drawback is getting things done fast __and__ at the same time letting the result be deployed on millions of boxes in heterogeneous environments. Obviously this is not a priority for everybody but it fosters innovation and product maturity.

John Levon said...

Interesting, how can you use FUSE as a root file system? Are you keeping all the user-space parts in a ramdisk or something permanently?

What happens if I 'kill' the userspace part, or it dies otherwise? Is there special code to prevent it being touched?

How do you get decent performance when you've no direct access to the page cache? (I've never looked at the FUSE implementation BTW)

Szabolcs Szakacsits said...

Yes, using initrd, initramfs.

Yes, things can be configured, and some distros already do it, so that the fs can't be killed (by the OOM killer, by killall during shutdown, etc).

FUSE supports kernel caching (file attributes, names, page cache) through several options.

Block-based file systems can also be exclusively locked, which guarantees cache coherency. Often user space isn't even involved at all during fsops.

FUSE is fairly powerful and can be used in many different ways. Sometimes it's misunderstood, sometimes it's misused.