In my previous post I gave a (very) high level overview of ZFS and why I think it is a solid foundation for vCluster. What I did not say though, was why we chose OpenIndiana over the other operating systems offering ZFS.
Just before Sun was bought by Oracle, Solaris came in two flavours: Solaris and OpenSolaris. As the name suggests, OpenSolaris was open source and Solaris was the closed source variant. OpenSolaris was to become proper Solaris some day and development was to be made in the open. Because OpenSolaris was open source, many different projects were born out of it. Nexenta made a storage appliance. Belenix was a generic desktop with KDE. StormOS was a simple desktop with Xfce. FreeBSD, which had lost its edge to Linux over the years, took the chance and ported ZFS with great success. Joyent based their cloud 100% on OpenSolaris. They even offer their version, called SmartOS, free to download and use in production.
Alas, Sun was bought by Oracle, Oracle closed the Solaris source code and open development of OpenSolaris ended.
Or did it?
Just before Oracle closed the OpenSolaris source, a group of hardcore Solaris sysadmins and companies using OpenSolaris for their business decided to create a distro of OpenSolaris, called OpenIndiana. Their aim was to collaborate and move the platform forward and avoid Oracle and their heavy handed policies altogether. The result was Illumos, which is the kernel, and OpenIndiana which is the full generic server grade OS that OpenSolaris was.
So what choices did we have for ZFS storage?
One obvious one would be OpenIndiana. For all intents and purposes, it is the continuation of OpenSolaris. It is a stable platform and it has features beyond ZFS that are not found in any other system (I’ll talk about these later), but it has a flaw: familiarity with the platform is very low. Solaris in general, and by extension OpenIndiana, were nowhere near as popular as Linux or FreeBSD. For every one person familiar with Solaris you could find fifty familiar with Linux.
Another option was NexentaStor. NexentaStor is an Illumos distro that is commercially backed by a company called Nexenta. With NexentaStor one can make a storage appliance out of practically any PC less than 2 years old. Familiarity with the platform is not an issue because the user/admin interacts with the appliance through a web interface. It is a purpose built system just for creating storage appliances.
Another option was FreeNAS. FreeNAS is practically the same as NexentaStor, but using FreeBSD underneath instead of Illumos. Just like NexentaStor, FreeNAS is a purpose built operating system with a nice web interface on top.
The last option was FreeBSD. FreeBSD is the father of practically all UNIX and derivatives, going back to the late 70′s. Over the years it has gained a reputation of stability that any other platform would be envious of. Just like OpenIndiana, not many people are familiar with FreeBSD.
So why did we choose OpenIndiana in the end?
First, let’s compare the different solutions based on some general features:
| | Nexenta | OpenIndiana | FreeNAS (FreeBSD) | FreeBSD 9 |
|---|---|---|---|---|
| Cost | Free/commercial | FREE | FREE/commercial | FREE |
| Vendor Support | YES | NO | NO (in U.K.) | NO |
| Web GUI | YES | YES (napp-it) | YES | NO |
| Stable | YES | YES | YES | YES |
| HA | Yes Commercial | Yes difficult | Yes Commercial | YES (HAST) |
| ZFS Version | 28 | 28 | 15 | 28 |
| Snapshots | YES | YES | YES | YES |
| ZFS Send/Receive | Commercial only | YES | YES | YES |
| Encryption | NO | NO | NO | YES* |
| Virtualisation | NO | YES | NO | YES |
| xattr | YES | YES | YES | YES |
| iSCSI | YES | YES | YES | YES* |
| NFS | YES | YES | YES | YES |
As you can see, NexentaStor and FreeNAS are almost identical in features, especially if one takes into account commercial support.
One of the things we will use extensively in vCluster is ZFS send/receive. This is a ZFS feature where a snapshot of a filesystem can be sent locally or over the network to another ZFS server and have an identical replica of the data remotely. Note that this is not the same as rsync, because rsync syncs files, whereas ZFS syncs blocks. This is significant because ZFS will send the difference in blocks, which means that syncs are significantly faster than rsync, plus they are checksummed.
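As a rough illustration of the workflow described above, here is a minimal sketch of a full replication followed by an incremental one. It only prints the commands rather than running them, since it assumes nothing about your pools. The dataset names `tank/web` and `backup/web`, the host `backuphost` and the snapshot names are all hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints the replication commands instead of executing them.
# SRC, DST, REMOTE and the snapshot names are hypothetical placeholders.

SRC="tank/web"        # source filesystem (assumed)
DST="backup/web"      # destination filesystem on the remote pool (assumed)
REMOTE="backuphost"   # remote ZFS server reachable over ssh (assumed)

# First replication: snapshot the filesystem and send the whole snapshot.
full_send() {
    echo "zfs snapshot ${SRC}@$1"
    echo "zfs send ${SRC}@$1 | ssh ${REMOTE} zfs receive ${DST}"
}

# Later replications: send only the blocks that changed between two snapshots.
incremental_send() {
    echo "zfs snapshot ${SRC}@$2"
    echo "zfs send -i ${SRC}@$1 ${SRC}@$2 | ssh ${REMOTE} zfs receive ${DST}"
}

full_send monday
incremental_send monday tuesday
```

The incremental form is where the speed advantage over rsync comes from: only the blocks that differ between the two snapshots travel over the wire. If the destination has been modified since the last received snapshot, `zfs receive` may need `-F` to roll it back first.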
This rules out NexentaStor, for now at least, because we are not prepared to pay a license for such a basic feature of ZFS.
Having that in mind, I started evaluating FreeNAS. The system I used has a Quad Core Xeon Processor @ 2.5 GHz, 32GB RAM, 22 2.5″ 7200rpm 750GB SATA disks and 2 OCZ 32GB SSDs used as ZFS zil accelerators (or slogs or logzillas as some ZFS engineers call them). It also has 3 Adaptec RAID 5805 controllers.
One thing to note is that ZFS hates RAID controllers with a passion. If you have to use a RAID controller configure it so as to present the disks as JBOD or at a minimum configure each disk as a RAID0 array with a single member. Remember that ZFS is a volume manager combined with a filesystem. It also handles RAID, having single parity, double parity, triple parity, mirror and stripe modes of operation.
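To make the modes of operation a bit more concrete, here is a small sketch that maps each mode to the vdev keyword `zpool create` expects and prints the resulting command. It does not touch any disks. The pool name `tank` and the device names (`da0`, `da1`, ...) are hypothetical, and this is not necessarily the layout used for the test server.

```shell
#!/bin/sh
# Dry-run sketch: prints a zpool create command for a given redundancy mode.
# The pool name "tank" and the device names are hypothetical placeholders.

# Map a human-readable mode to the vdev keyword used by zpool.
vdev_keyword() {
    case "$1" in
        stripe)        echo "" ;;        # plain stripe: no keyword at all
        mirror)        echo "mirror" ;;
        single-parity) echo "raidz1" ;;
        double-parity) echo "raidz2" ;;
        triple-parity) echo "raidz3" ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Usage: pool_create_cmd MODE DEVICE...
pool_create_cmd() {
    mode=$1; shift
    kw=$(vdev_keyword "$mode") || return 1
    # ${kw:+$kw } expands to the keyword plus a space, or to nothing for a stripe.
    echo "zpool create tank ${kw:+$kw }$*"
}

pool_create_cmd double-parity da0 da1 da2 da3 da4 da5
pool_create_cmd mirror da0 da1
```

Because ZFS wants to see the raw disks, the devices passed here should be the individual drives exposed as JBOD, not a hardware RAID volume.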
So back to FreeNAS: I configured the system, set up the network and started benchmarking. We need the system to perform well as an NFS server for Linux clients. So I used a system in the lab that has 8GB RAM, 2 250GB SATA disks and a Quad Core Xeon @ 2.5 GHz, as an NFS client.
Initially I wanted to establish the speed of the FreeNAS server locally, so I ran iozone, which is included with FreeNAS, to see just how fast the system I built was. After some fiddling around with the various iozone options I ended up running the following test: iozone -az -g 2G /mnt/tank/test -b /mnt/tank/iozone.xls
This command runs iozone in automatic mode, trying block sizes from 4K up to 16384K and writing and reading a file that starts from 4K in size up to 2GB in size.
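For a feel of how many block sizes that sweep covers, the sketch below simply enumerates them, assuming the size doubles at every step from 4K to 16384K as described above. It does not call iozone. The helper name `record_sizes` is made up for this example.

```shell
#!/bin/sh
# Enumerate block sizes in KB, doubling from 4K up to 16384K.
# This only illustrates the sweep; it does not run iozone.
record_sizes() {
    size=4
    while [ "$size" -le 16384 ]; do
        echo "${size}K"
        size=$((size * 2))
    done
}

# Print the sizes on one line, then how many there are.
record_sizes | tr '\n' ' '; echo
echo "number of block sizes: $(record_sizes | wc -l | tr -d ' ')"
```

Each of those block sizes is then combined with a range of file sizes, which is why a full automatic run takes a while.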
The result was the following:
Sequential write:
Random write:
Hmmm. What is happening here? Sequential writing and random writing at 4GB/s?? How is it possible to write a file randomly at the same speed as writing it sequentially?!?
The answer is that ZFS caches in RAM so aggressively that, if not told otherwise, it will eat up *all* RAM minus 1GB by default. Yep, you read that right, all RAM minus 1GB. So in reality the graphs show the speed at which the ZFS caching subsystem works. Also note the jump in performance when the benchmark reaches 128K block size in the first writer test. This is because 128K is the default block size when you create a ZFS filesystem.
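To put a number on that, here is a small sketch that applies the "all RAM minus 1GB" rule to the 32GB test server and then prints a line that could go into `/boot/loader.conf` on FreeBSD to cap the cache through the `vfs.zfs.arc_max` tunable. The 8GiB cap is an arbitrary value picked for illustration, not a recommendation, and the helper names are made up.

```shell
#!/bin/sh
# Default cache ceiling under the "all RAM minus 1GB" rule described above.
arc_default_max_gb() {
    echo $(( $1 - 1 ))
}

# Convert a cap given in GiB to bytes for a loader.conf entry.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

echo "default cache ceiling on the 32GB test server: $(arc_default_max_gb 32)GB"

# Hypothetical 8GiB cap, printed as a /boot/loader.conf line (not applied here).
echo "vfs.zfs.arc_max=\"$(gib_to_bytes 8)\""
```

Capping the cache is mostly useful when the box has to run something else besides storage. For a dedicated storage server, letting ZFS use the RAM is usually what you want.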
Ok then. How can we eliminate caching so we can see how the system really performs?
Simple. Just add -o to iozone, which makes every write synchronous, so the system has to commit each write to stable storage before it continues. So what do we get now?
Sync writer
Random sync writer
What is this? Again sequential write and random write go at the same speed?!?
If you remember, I created the ZFS pool with 2 ZIL accelerators (slogs from now on). These are used by ZFS exactly when something wants to write to the system using sync. These SSDs can push 75000 IOPS, which translates to ~550MB/s reads and ~500MB/s writes. Not bad. And if you look at the graphs, the speed goes up almost linearly with the block size.
Here are some reader stats that go off the scale.
I used the SSDs as read caches now.
Reader
Random Reader
Note the speed when the file size is the same as the block size. 10GB/s!! This is straight from RAM.
This is all fine you say, but you want this to be an NFS server!
OK then here it is:
Linux NFS writer performance:
Linux NFS random writer performance:
Again this makes no sense!! 2.5GB/s from the network? This is impossible! The server uses one NIC at 1Gbit/s. This translates to a theoretical maximum of 125MB/s, or about 100MB/s in practice after protocol overhead, not 2.5GB/s!!!
Well you see, Linux caches too. To solve the conundrum look at the green line lower in the graphs. This is the 2GB file being transmitted through the network. You will notice that it is going constantly at 1/5th of 500MB/s, which is equal to 100MB/s! This is because the Linux computer does not cache the 2GB file, so it writes it directly to stable storage, in this case the NFS export from FreeNAS.
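A quick back-of-the-envelope check of those numbers: the raw line rate of a 1Gbit/s link works out to 125MB/s, and protocol overhead brings the usable figure down to roughly the 100MB/s seen in the graphs. The helper names below are made up for this sketch, and the arithmetic is integer-only.

```shell
#!/bin/sh
# Raw throughput of a link in MB/s: Gbit/s * 1000 Mbit, divided by 8 bits per byte.
link_mb_per_s() {
    echo $(( $1 * 1000 / 8 ))
}

# Seconds needed to move a file of N megabytes at a given MB/s (rounded down).
transfer_seconds() {
    echo $(( $1 / $2 ))
}

echo "1Gbit/s raw line rate: $(link_mb_per_s 1)MB/s"
echo "2GB file at 100MB/s: about $(transfer_seconds 2048 100)s"
```

Anything the client reports above the line rate has to be coming from its own page cache, not from the wire.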
So then, we have established that the server is more than adequate to cope with the load expected from vCluster and we have established that it can saturate a 1Gbit network link. So why did we not use FreeNAS?
The answer is xattrs. You see, in vCluster we use SELinux extensively, which relies on xattrs to do its job. It turns out that FreeBSD and by extension FreeNAS do not support xattrs at all!
How annoying.
In the next blog post I will continue with benchmarks and more on OpenIndiana.

Looking forward to the rest of your adventure/ success. We are just now diving into the ZFS/ Openindiana world as a storage solution so any tips and suggestions are welcome.
Nice article. One question though, in your features comparison chart you list “Virtualisation” as a row and that it is not supported on Nexentastor. What is meant by this? Thanks.
Thanks for the nice words.
By virtualization, I mean KVM, jails, zones etc.
NexentaStor is of course built on top of Nexenta, which supports zones, but it really is not meant to be used as a hypervisor.
On the other hand, OpenIndiana supports both zones (jails in FreeBSD) and KVM. This way you can run a Linux VM in KVM on OpenIndiana, backed by ZFS, so you can snapshot it, send/receive it etc.
Could you please guide me to where I can find the commands to configure storage if I use OpenIndiana? By the way, your article is excellent, but I think it would be perfect if you showed a demo of how to create a zpool and then storage (iSCSI and NFS) on OpenIndiana, so people can get started by themselves. I would appreciate it if you did this. Thanks
Can you add SMART Monitor on the compare list?
Is there a link to an installation document available?
Look forward to your Benchmark on OpenIndiana VS NexentaStor
plz do more on Setup and Bugs on NFS and CIFS
for HomeLAB withVMs and NAS for home
Great work.
FreeNAS 8.3 Beta 2 will update to ZFS v28 with xattr support. Do you think you’ll move or retest with it?
Nexenta CE does zfs send/recv thru the cli. That was a show stopper for me too until I dug deeper and saw that. Still not ideal tho with an 18TB raw limit..
Hi Alex,
an interesting article.
One correction, xattr are supported on FreeBSD ZFS for a while by now:
http://en.wikipedia.org/wiki/Extended_file_attributes#FreeBSD
“Since FreeBSD 8.0, extended attributes are also supported on ZFS filesystem.”
FreeBSD 8.0 was released in November 2009.
On my FreeBSD-9 (build from April 2012)
# zfs list -o name,xattr zpool/jails/data/samba/shares/accounts
NAME XATTR
zpool/jails/data/samba/shares/accounts on
Please correct it in your article because it is misleading for readers, and who bothers to scroll down to the comments?
Thank you
Peter
Petros – are you running a custom build, or perhaps one of the development branches? I’m getting different results here.
On a fresh install of FreeBSD 9.1-RC1, I’m still seeing xattr set to “xattr = off / temporary” and “zfs set xattr=on myPool/dataset” just spits out “property ‘xattr’ not supported on FreeBSD: permission denied”
Would you be willing to post the commands you used to create your pool and dataset hierarchy? Possibly against 9.1-RC1 as well?
Thanks for the ray of hope
Hi Thomas,
it is a “standard” FreeBSD 9-stable..
I am getting the same result using xattr on the command line (not supported)
However, it all seems to work:
# touch /zpool/test2/test
# getfacl /zpool/test2/test
# file: /zpool/test2/test
# owner: root
# group: wheel
owner@:rw-p--aARWcCos:------:allow
group@:r-----a-R-c--s:------:allow
everyone@:r-----a-R-c--s:------:allow
# setfacl -m u:petros:rwxcosW:allow /zpool/test2/test
# getfacl /zpool/test2/test
# file: /zpool/test2/test
# owner: root
# group: wheel
user:petros:rwx------Wc-os:------:allow
owner@:rw-p--aARWcCos:------:allow
group@:r-----a-R-c--s:------:allow
everyone@:r-----a-R-c--s:------:allow
# zfs list -o xattr zpool/test2
XATTR
off
Strangely, I have xattr=on on samba shares but I don’t know how that happened..
I believe somehow the user land tools are partially misrepresenting the truth but underlying everything seems to be fine.
I will investigate further if I have time.
Regards
Peter
Awesome write up, thanks dude. Having the internal battle with myself atm if I go OI or FreeNas… have to make a decision soon!
Hi:
What software you use or recommend for HA in OpenIndiana?
Thankz!
RSF-1 is a commercially available HA for OpenIndiana (among others)
Thank you for this great article. It really helps me.
Thanks for the write-up. I’ve been running NexentaStor for two years and I’m considering a jump to OpenIndiana. I’m thinking it’s time to lose that 18TB limit…
Interesting article, but I need to point out that you can do zfs send/receive in NexentaStor/Nexenta. It can be done from the command line by going down to the OpenSolaris/Illumos core.
option expert_mode=1
!bash
Then run any of the supported OpenSolaris commands… Including zfs send/receive….
“FreeBSD is the father of practically all UNIX and derivatives, going back to the late 70′s.”
You have your history and UNIX all wrong.
FreeBSD development began in 1993. BSD UNIX, and in particular the last version 4.4BSD-Lite, is what FreeBSD and NetBSD was based on. The MACH kernel, which Apple’s MacOS X is derived from, was based on the 4.2BSD kernel.
Most of the commercial UNIX flavours, on the other hand, which carry the official UNIX certification, were based on AT&T UNIX System V which was first released in 1983. This includes Solaris, although SunOS was based on BSD.