Tuesday, January 17, 2012

Microsoft returns customers to the past, ReFS, the Resilient File System

Microsoft has just announced the name and features of its newest filesystem, to be introduced with Windows Server 8.

You can read the announcement on their official blog.

Here I will introduce some concepts and explain why this new filesystem, the Resilient File System (or ReFS), is not a next-generation one but a return to the past. I will also compare the current and "next generation" offerings from Microsoft (that is, NTFS and ReFS) with competing filesystems (btrfs, ext4, HAMMER, HFS+, ZFS).

Metadata: Data about the data. It usually refers to timestamps, the filename and the location of the blocks where the associated data resides.

Alternate Data Streams (ADS), Extended Attributes (xattrs): This is the name given to arbitrary data associated with the main data, mainly used for describing information about the data. It is extensively used by Mac OS X clients, as well as by Windows Explorer. Some antivirus software also uses this associated data to store information about the last scan of a file. Please note that what UNIX/Linux and Macintosh call Extended Attributes are Alternate Data Streams in Microsoft nomenclature.

Extended Attributes (Microsoft nomenclature): These are a kind of alternate data streams limited to 64 Kbytes per file and consisting of name=value pairs. Introduced and extensively used by OS/2, they could be implemented as ADS or xattrs.

Resource Fork: Essential data for pre-Mac OS X applications and documents, contains data that is usually needed to describe the data part of the file (called Data Fork) or is the only data present for the file (like some document formats exclusive to Macintosh systems).

Finder Info: Essential metadata about files and folders for all Mac OS versions (up to and including X 10.7 Lion).

NOTE about the Resource Fork and the Finder Info as well as xattrs on Mac OS X: Traditionally, Windows Server systems used ADS to store the Resource Fork and the Finder Info on volumes shared with Macintosh clients. The same is done for the Resource Fork, Finder Info and xattrs by all NTFS solutions existing on Mac OS X (Paragon NTFS, Tuxera NTFS and Apple's NTFS). When Mac OS X encounters a filesystem that does not support storing this information (like FAT32 or ReFS), it creates separate files to store it: "filename" holds the data fork, "._filename" holds the Resource Fork and Finder Info for "filename", and ".DS_Store" holds the folder's Finder Info.

Object IDs, Catalog Node IDs, inode numbers: These are the names given in the Windows, Macintosh and UNIX (incl. Linux) worlds to a unique identifier inside a filesystem that corresponds to a single file. If the file is moved, it can still be accessed using the same identifier wherever its new location is (as long as it stays inside the same volume).

Transparent file compression: The filesystem automatically compresses the file data. Compressing is slow when writing, but reading can be a lot faster. For files that are mostly read and almost never written (that is, operating system and application files, but not documents) it can provide a speed-up and relieve space pressure: the file takes less space on disk, so the disk reads it faster, and decompressing it costs less time than reading the extra uncompressed data directly from disk would.

Transparent file encryption: This allows files to be transparently encrypted with a key known only to a single user, so nobody else can access their contents. This is especially useful on shared volumes and servers, where files can be stored in locations accessible to more people than those authorized to access a particular file.

Sparse files: A sparse file is a file whose disk space is allocated dynamically. For example, an application can anticipate that it will need 1 GB for a file and request that size from the filesystem. If the filesystem supports sparse files, the file takes on disk only the space that is really used, not the whole reservation. Sparse files can also contain holes: if a piece (cluster/block) of the file is never filled (it is always 0s), no real space is ever allocated for it.
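To make the idea of a hole concrete, here is a minimal Python sketch. It assumes a POSIX-like system and a filesystem that supports sparse files; the function name `make_sparse_file` is mine, only for illustration. It seeks past a large region without writing to it, writes one byte at the end, and then compares the logical size with the space actually allocated.

```python
import os
import tempfile


def make_sparse_file(path, logical_size):
    """Create a file of logical_size bytes that is almost entirely a hole."""
    with open(path, "wb") as f:
        f.seek(logical_size - 1)  # jump over the hole without writing it
        f.write(b"\0")            # one real byte at the very end
    st = os.stat(path)
    # On POSIX, st_blocks counts 512-byte units; it is absent on Windows,
    # in which case this sketch simply reports 0.
    allocated = getattr(st, "st_blocks", 0) * 512
    return st.st_size, allocated


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        size, allocated = make_sparse_file(os.path.join(d, "sparse.bin"), 64 * 1024 * 1024)
        print("logical size:", size, "bytes; allocated on disk:", allocated, "bytes")
```

On a filesystem with sparse file support the allocated figure should be far smaller than the logical size; on one without it, the two will be roughly equal.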

Symbolic links, Aliases, Shadows: A symbolic link is a file that contains the location of another file. Mac OS calls them Aliases and OS/2 called them Shadows.

Hard links: A hard link exists when two files are really, on disk, the same file, so changes made to one affect the other. The links can have different locations and names.

Differences between symbolic and hard links: Usually a symbolic link can point to a file that resides on another volume. A symbolic link needs not only the metadata blocks that a hard link needs, but also the blocks that store the location of the real file (usually 4 or 8 Kbytes). Also, if the real file is deleted, all the symbolic links associated with it become useless, while in the case of hard links every link is "the real file" and it only disappears when all of the links are deleted.
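The difference is easy to observe. The following is a small sketch, assuming a UNIX-like system where `os.link` and `os.symlink` are permitted; the function name `link_demo` and the file names are made up for the example. It creates one file, gives it a hard link and a symbolic link, deletes the original name and checks what is left.

```python
import os
import tempfile


def link_demo(directory):
    """Show what survives when the original name of a file is deleted."""
    original = os.path.join(directory, "original.txt")
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")

    with open(original, "w") as f:
        f.write("hello")

    os.link(original, hard)     # a second name for the same on-disk file
    os.symlink(original, soft)  # a small file that stores the path of the original

    # Both names of a hard-linked file share one inode number.
    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino

    os.remove(original)         # delete the first name only

    with open(hard) as f:
        hard_content = f.read()  # the data is still reachable through the hard link

    # os.path.exists follows symbolic links, so it reports a dangling link as missing.
    soft_broken = not os.path.exists(soft)
    return same_inode, hard_content, soft_broken


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(link_demo(d))
```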

Quotas: Quotas allow system administrators to specify that a user cannot exceed a given allocation. For example, if the system's disk is 2000 GB, the maximum each user can ever fill (his quota) might be 100 GB, with 200 GB reserved for the administrator. Quotas can usually be set per user or per group, and they are really important in big corporations with hundreds of users sharing the same disk.

Data deduplication: Data deduplication is a feature offered by some filesystems and by some third-party enhancements for filesystems (it can be more or less implemented using hard links), where files that contain the same data are stored only once.

Filesystem conversion: A feature almost never offered by any filesystem. It allows automatic conversion from one filesystem to another without losing any data in the process and without having to copy or back it up to another drive and then back again. It was offered from HPFS to NTFS, and it is currently offered from FAT32 to NTFS and from ext3/4 to btrfs.
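As an illustration only, here is a toy in-memory store that deduplicates whole files by their content digest. The class `DedupStore` and its methods are hypothetical and do not correspond to how any of the filesystems discussed here implement the feature; there is also no cleanup of contents that no name refers to any more.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical contents are kept only once."""

    def __init__(self):
        self.blobs = {}  # digest -> file contents
        self.names = {}  # filename -> digest

    def write(self, name, data: bytes):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # store the contents only if new
        self.names[name] = digest            # the name just points at the contents

    def read(self, name) -> bytes:
        return self.blobs[self.names[name]]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())
```

Writing the same 1000 bytes under two different names makes `stored_bytes()` grow by 1000 only once, while both names still read back the full contents.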

Checksum: A checksum is a number obtained mathematically from a calculation over the data (or the metadata). It changes whenever any single bit (the smallest possible piece of computer data) changes, and a strong enough checksum is, for practical purposes, unique to that data. It allows us to detect corruption and to find identical files for deduplication.

Journaled filesystem: A journaled filesystem is one where any change to the metadata is first written to the journal and then applied later to the real location of the metadata. If anything prevents the change from being written to the journal, the old metadata is still present; if anything prevents the change from being written to its real position, the journal still says the change must be applied. So we never lose anything (we get the change or we get nothing; no middle points, no corruption).

User defined transactions: Similar to how a journal works with metadata, a user defined transaction allows changes to the data to be committed only when an application explicitly marks them as finished.
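A quick sketch of how a checksum reveals a single flipped bit. It uses CRC-32 and SHA-256 from Python's standard library purely as examples; I am not claiming these are the algorithms any particular filesystem uses, and the helper names `checksums` and `flip_bit` are invented for the illustration.

```python
import hashlib
import zlib


def checksums(data: bytes):
    """Return a short (CRC-32) and a long (SHA-256) checksum of the data."""
    return zlib.crc32(data), hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    block = b"some file contents"
    damaged = flip_bit(block, 5)
    print("checksums match after corruption:", checksums(block) == checksums(damaged))
```

Comparing the stored checksum with a freshly computed one is all it takes to notice that the block is no longer what was written.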
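The "change or nothing" behaviour can be imitated with a toy model. This is only a sketch with invented names (`JournaledMetadata`, `log`, `apply`, `recover`); a real journal lives on disk and is far more involved. Changes go to the journal first, a crash is simulated in the middle of applying them, and recovery replays what is left.

```python
class JournaledMetadata:
    """Toy model of a metadata journal (everything kept in memory)."""

    def __init__(self):
        self.metadata = {}  # stands in for the real location of the metadata
        self.journal = []   # pending changes, always written here first

    def log(self, key, value):
        """Step 1: record the intended change in the journal."""
        self.journal.append((key, value))

    def apply(self, crash_after=None):
        """Step 2: copy journal entries to their real location.

        crash_after simulates a power loss after that many entries.
        """
        applied = 0
        while self.journal:
            if crash_after is not None and applied == crash_after:
                raise RuntimeError("simulated crash")
            key, value = self.journal[0]
            self.metadata[key] = value  # replaying the same entry twice is harmless
            self.journal.pop(0)         # an entry is retired only after it is applied
            applied += 1

    def recover(self):
        """After a crash, replay whatever is still in the journal."""
        self.apply()
```

After a simulated crash the entries that were not applied are still sitting in the journal, so calling `recover()` brings the metadata to the intended state.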

Bootable: Means that the operating system can be installed and boots from this filesystem, taking full advantage of its features.

 

What does ReFS offer, then?

ReFS adds copy-on-write allocation for metadata: new metadata is written to another location, and the old location is retired afterwards. This achieves the same as a journal, without the journal.

It adds mandatory checksums for metadata and optional checksums for user data. All of them are "scrubbed" to check that they are still valid; on a failure, the data is recovered from a correct copy (if one exists) or a failure event is sent to the system log (if none exists).

It integrates with Windows 8's upper-layer Storage Spaces to stripe data (split it into pieces across disks) for enhanced performance.
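Here is a toy illustration of the copy-on-write idea in general, not of ReFS's actual on-disk format (which I have not seen). The class `CowStore` is hypothetical: a new version is written to a fresh location and only becomes current when a single pointer is switched, so a crash before the switch leaves the old version untouched.

```python
class CowStore:
    """Toy copy-on-write store: old data is never overwritten in place."""

    def __init__(self, initial):
        self.blocks = [initial]  # append-only block storage
        self.root = 0            # index of the block that is currently valid

    def update(self, new_value, crash_before_switch=False):
        self.blocks.append(new_value)     # write the new copy somewhere else
        if crash_before_switch:
            raise RuntimeError("simulated crash")
        self.root = len(self.blocks) - 1  # one pointer switch makes it current

    def read(self):
        return self.blocks[self.root]
```

If `update` is interrupted before the pointer switch, `read()` keeps returning the previous version, which is the property a journal otherwise provides.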
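Scrubbing can be sketched as follows. This is a simplified model with made-up names (`scrub`, `copies`), using CRC-32 as a stand-in checksum; the returned string "failed" stands in for the event that a real system would send to its log.

```python
import zlib


def scrub(copies, expected_crc):
    """Verify every replica against the stored checksum and repair bad ones.

    copies is a list of bytearray replicas of the same block.
    Returns "ok", "repaired" or "failed".
    """
    good = next((bytes(c) for c in copies if zlib.crc32(c) == expected_crc), None)
    if good is None:
        return "failed"  # no correct copy exists; a real system would log this
    repaired = False
    for c in copies:
        if zlib.crc32(c) != expected_crc:
            c[:] = good  # overwrite the damaged replica with a correct one
            repaired = True
    return "repaired" if repaired else "ok"
```

With two replicas and one of them damaged, `scrub` restores the damaged one from the other; with both damaged, all it can do is report the failure.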

It removes support that was present in NTFS for: Alternate Data Streams (xattrs, Resource Forks and Finder Info) as well as OS/2's Extended Attributes, transparent file compression and encryption, sparse files, hard links, quotas, filesystem conversion and boot abilities.

Oh, and it removes support for storing separate "8.3" filenames for DOS network clients (no one cares).

It also allows for bigger volumes, files and directories.

 

Comparing with competing filesystems:

Please note that in this comparison "bootable" means only on the native system. While some features were added late (like transactions to NTFS and journaling to HFS+), the comparison counts all features implemented as of 17th January 2012 (stable or beta, but not just "planned"). Partially implemented features (NTFS sparse files, for example) are also counted as a Yes.

 

| | btrfs | ext4 | HAMMER | HFS+ | NTFS | ReFS | ZFS |
|---|---|---|---|---|---|---|---|
| Manufacturer | Oracle | Theodore Ts'o | Matthew Dillon | Apple | Microsoft | Microsoft | Sun (Oracle) |
| Native system | Linux | Linux | DragonFly BSD | Mac OS X | Windows | Windows | Solaris |
| Year of introduction | 2007 | 2006 | 2008 | 1998 | 1993 | 2012 | 2005 |
| ADS/xattrs | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Object IDs | Yes | Yes | Yes | Yes | Yes | No | Yes |
| File compression | Yes | No | No | Yes | Yes | No | Yes |
| File encryption | Planned | No | No | No | Yes | No | Yes |
| Sparse files | Yes | Yes | Yes | No | Yes | No | Yes |
| Symbolic links | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Hard links | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Quotas | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Deduplication | Yes | No | Yes | No | No | No | Yes |
| Conversion | Yes | Yes | No | No | Yes | No | No |
| Metadata checksum | Yes | No | Yes | No | No | Yes | Yes |
| Data checksum | Yes | No | Yes | No | No | Yes | Yes |
| Transactions | Yes | No | No | No | Yes | No | Yes |
| Journaled | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Bootable | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Integrated data striping and recovery from failure | Planned | No | No | With XSAN | No | With Storage Spaces | With Zpools |
| Max file size | 16 EiB | 16 TiB | 1 EiB | 8 EiB | 16 EiB | 16 EiB | 16 EiB |
| Max volume size | 16 EiB | 1 EiB | 1 EiB | 8 EiB | 16 EiB | 16 EiB | 16 EiB |


Conclusion:

Microsoft's new "Resilient File System" is nothing that cannot be implemented on top of NTFS. It is just a huge loss of features: it gives nothing new and takes away a lot of features that users, developers and administrators use, plan to use, love and want.

When I read the official MSDN blog post about it I thought "yeah Microsoft, good joke" while I checked the calendar to see if today was April Fools' Day. Seeing it's not, my conclusion is that I will stay with NTFS as long as possible, and if I somehow want the only interesting feature of ReFS (resilience) I will move to a serious filesystem like ZFS, btrfs or even HFS+.