Tuesday, January 17, 2012

Microsoft returns customers to the past, ReFS, the Resilient File System

Microsoft has just announced the name and features of its newest file system, to be introduced with Windows 8 Server.

You can read the announcement in their official blog.

Here I will introduce some concepts and explain why this new filesystem, the Resilient File System (or ReFS), is not a next-generation one but a return to the past. I will also compare the current and "next generation" offerings from Microsoft (that is, NTFS and ReFS) with competing filesystems (btrfs, ext4, HAMMER, HFS+ and ZFS).

Metadata: Data about the data. It usually refers to timestamps, the filename and the location of the blocks where the associated data resides.

Alternate Data Streams (ADS), Extended Attributes (xattrs): This is the name given to arbitrary data associated with the main data, mainly used to store information describing it. It is used extensively by Mac OS X clients, as well as by Windows Explorer. Some antivirus software also uses this associated data to store information about the last scan of a file. Please note that what UNIX/Linux and Macintosh call Extended Attributes are Alternate Data Streams in Microsoft nomenclature.

Extended Attributes (Microsoft nomenclature): These are a kind of alternate data streams limited to 64 Kbytes per file and consisting of name=value pairs. Introduced and extensively used by OS/2, they could be implemented as ADS or xattrs.

Resource Fork: Essential data for pre-Mac OS X applications and documents. It contains data that is usually needed to describe the data part of the file (called the Data Fork), or it is the only data present in the file (as in some document formats exclusive to Macintosh systems).

Finder Info: Essential metadata about files and folders for all Mac OS versions (up to and including X 10.7 Lion).

NOTE about the Resource Fork and the Finder Info as well as xattrs on Mac OS X: Traditionally Windows Server systems used ADS to store the Resource Fork and the Finder Info on volumes shared with Macintosh clients. The same is done for the Resource Fork, Finder Info and xattrs by all NTFS solutions existing on Mac OS X (Paragon NTFS, Tuxera NTFS and Apple's NTFS). When Mac OS X encounters a filesystem that does not support storing this information (like FAT32, or ReFS), it creates separate files to store it: "filename" holds the data fork, "._filename" holds the Resource Fork and Finder Info for "filename", and ".DS_Store" holds the folder's Finder Info.

Object IDs, Catalog Node IDs, inode numbers: These are the names given in the Windows, Macintosh and UNIX (incl. Linux) worlds to a unique identifier inside a filesystem that corresponds to a single file, so if the file is moved it can still be accessed using the same identifier wherever its new location is (as long as it stays inside the same volume).
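
As a rough illustration of the idea, here is a minimal Python sketch that checks whether a file keeps the same identifier after being renamed inside one volume. It assumes the platform exposes the identifier through `os.stat().st_ino`; the file names and the helper `inode_survives_rename` are made up for the example.

```python
import os
import tempfile


def inode_survives_rename() -> bool:
    """Return True if a file keeps its identifier after a rename."""
    with tempfile.TemporaryDirectory() as folder:
        old_path = os.path.join(folder, "report.txt")
        new_path = os.path.join(folder, "renamed.txt")
        with open(old_path, "w") as f:
            f.write("same file, new name")
        before = os.stat(old_path).st_ino
        # Same volume: only the directory entry changes, not the file itself.
        os.rename(old_path, new_path)
        after = os.stat(new_path).st_ino
        return before == after


print(inode_survives_rename())
```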

Transparent file compression: This means the filesystem automatically compresses the file data. Writing is slower, but reading can be a lot faster. For files that are mostly read and almost never written (that is, operating system and application files, but not documents) it can provide a speed-up and relieve space pressure: the file takes less space on disk, so the disk reads it faster, and the time spent decompressing is still smaller than the time saved by reading fewer bytes from disk.

Transparent file encryption: This allows files to be transparently encrypted with a key known only to a single user, so nobody else can access their contents. This is especially useful on shared volumes and servers, where files can be stored in locations accessible to more people than those authorized to access a given file.

Sparse files: A sparse file is a file whose space is allocated dynamically. For example, an application can foresee that it will need 1 GB for a file and request that much from the filesystem. If the filesystem supports sparse files, the file will only take up on disk the space that is really used, not the whole reservation. Sparse files can also contain holes: if a piece (cluster/block) of the file is never filled (it is always 0s), no real space is ever allocated for it.
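
To give a feeling for the space saving, here is a small Python sketch using `zlib` as a stand-in compressor. It is only an assumption for illustration and not necessarily the algorithm any of the filesystems discussed here uses; the sample data and the helper `compression_ratio` are invented.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    compressed = zlib.compress(data, 6)
    # Transparent compression must be lossless: reading gives the same bytes.
    assert zlib.decompress(compressed) == data
    return len(compressed) / len(data)


# Highly repetitive data, loosely imitating the redundancy found in
# program files; real ratios depend entirely on the content.
repetitive = b"kernel32.dll exports " * 5000
print(f"{compression_ratio(repetitive):.3f}")
```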
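
The difference between the size a file claims and the space it really occupies can be observed with a short Python sketch. It assumes a POSIX-like system where `os.stat()` reports `st_blocks` in 512-byte units; on other systems, or on filesystems without sparse file support, the allocated figure may be missing or equal to the full size. The helper name `sparse_sizes` is made up.

```python
import os
import tempfile
from typing import Optional, Tuple


def sparse_sizes(logical_size: int) -> Tuple[int, Optional[int]]:
    """Create a file with a hole; return (logical size, allocated bytes)."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sparse.bin")
        with open(path, "wb") as f:
            f.truncate(logical_size)   # extend the file without writing data
            f.seek(logical_size // 2)
            f.write(b"real data")      # only this region needs real blocks
        info = os.stat(path)
        blocks = getattr(info, "st_blocks", None)  # not available everywhere
        allocated = blocks * 512 if blocks is not None else None
        return info.st_size, allocated


logical, allocated = sparse_sizes(64 * 1024 * 1024)
print("logical size:", logical)
print("allocated on disk:", allocated)
```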

Symbolic links, Aliases, Shadows: A symbolic link is a file that contains the location of another file. Mac OS calls them Aliases and OS/2 called them Shadows.

Hard links: A hard link exists when two or more files are really, on disk, the same file, so changes made to one affect the others, which can have a different location, name, timestamps and so on.

Differences between symbolic and hard links: Usually a symbolic link can point to files that reside on another volume. It uses the same metadata blocks a hard link does, plus the blocks needed to store the location of the real file (usually 4 or 8 KB). Also, if the real file is deleted, all the symbolic links associated with it become useless, while in the case of hard links every link is "the real file", and it only disappears when all of the links are deleted.

Quotas: Quotas allow system administrators to specify that a user cannot exceed a given allocation. For example, on a 2000 GB system disk each user may be allowed to fill at most 100 GB (their quota), with 200 GB reserved for the administrator. This can usually be set per user or per group, and it is really important in big corporations with hundreds of users sharing the same disk.

Data deduplication: Data deduplication is a feature offered by some filesystems and by some third-party enhancements for filesystems (it can be more or less implemented using hard links) where files that contain the same data are stored only once.

Filesystem conversion: A feature almost never offered by any filesystem. It allows converting from one filesystem to another automatically, without losing any data in the process and without having to copy/back it up to another drive and then back again. It was offered from HPFS->NTFS and is currently offered from FAT32->NTFS and from ext3/4->btrfs.
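
The behaviour described above can be tried with a short Python sketch. It assumes a POSIX-like system where `os.link` and `os.symlink` work without special privileges (on Windows, creating symbolic links may require extra rights); the file names and the helper `link_demo` are invented for the example.

```python
import os
import tempfile


def link_demo() -> dict:
    """Delete the original file and see which kind of link survives."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "original.txt")
        hard = os.path.join(folder, "hard.txt")
        soft = os.path.join(folder, "soft.txt")
        with open(original, "w") as f:
            f.write("hello")
        os.link(original, hard)      # second name for the same file
        os.symlink(original, soft)   # small file holding the target's path
        same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
        os.remove(original)          # removes one name only
        with open(hard) as f:
            hard_content = f.read()
        return {
            "same_inode": same_inode,
            "hard_link_still_readable": hard_content == "hello",
            # The link itself still exists, but its target is gone.
            "symlink_dangling": os.path.islink(soft) and not os.path.exists(soft),
        }


print(link_demo())
```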
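
A very simplified sketch of the file-level idea in Python: group contents by a digest so that identical data can be found and stored once. Using SHA-256 and working on an in-memory dictionary are assumptions made only for illustration; real deduplication usually works on blocks and is part of the filesystem or storage layer. The helper `find_duplicates` is made up.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Return groups of file names whose contents are identical."""
    by_digest = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    # Only groups with more than one name are candidates for sharing storage.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


sample = {"a.doc": b"annual report", "copy of a.doc": b"annual report", "b.doc": b"notes"}
print(find_duplicates(sample))
```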

Checksum: A checksum is a number obtained mathematically from a calculation over the data (or the metadata). It changes whenever any single bit (the smallest possible piece of computer data) changes, and a good one is for practical purposes unique to that data. It allows us to detect corruption and to find identical files for deduplication.

Journaled filesystem: A journaled filesystem is one where any change to the metadata is first written to the journal and then applied, later, to the real location of the metadata. If anything prevents the change from being written to the journal, the old metadata is still present; if anything prevents the change from being written to its real position, the journal still says the change must be applied. So we never lose anything: we get the change or we get nothing, with no middle points and no corruption.

User defined transactions: Similar to how a journal works with metadata, a user defined transaction allows changes to the data to be finally committed only when an application specifically marks them as finished.
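
To make the "single bit" point concrete, here is a small Python sketch that flips one bit in a block and compares checksums. SHA-256 is an assumption picked for the example; the filesystems discussed here use their own checksum algorithms. The helpers `checksum` and `flip_bit` are made up.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum of a block (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


block = b"important user data"
stored = checksum(block)          # saved when the block was written
damaged = flip_bit(block, 3)      # simulated silent corruption

print(checksum(block) == stored)    # True: the block is intact
print(checksum(damaged) == stored)  # False: corruption detected
```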
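
The two-step logic can be sketched with a toy in-memory model in Python. This is not how any real filesystem lays out its journal; the class `ToyJournaledStore` and its methods are hypothetical and only show the order of operations.

```python
class ToyJournaledStore:
    """Toy model: metadata changes go to a journal before the real location."""

    def __init__(self):
        self.metadata = {}   # the "real location" of the metadata
        self.journal = []    # changes recorded but not yet applied

    def log_change(self, key, value):
        # Step 1: record the intended change in the journal only.
        self.journal.append((key, value))

    def apply_journal(self):
        # Step 2: replay the journal; an entry is dropped only after it
        # has been applied, so replaying again after a crash is harmless.
        while self.journal:
            key, value = self.journal[0]
            self.metadata[key] = value
            self.journal.pop(0)


store = ToyJournaledStore()
store.metadata["report.txt"] = {"size": 10}
store.log_change("report.txt", {"size": 25})
# Imagine a crash here: the old metadata is intact and the journal still
# remembers the pending change.
before_recovery = dict(store.metadata)
store.apply_journal()   # recovery replays the journal
print(before_recovery, store.metadata)
```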

Bootable: Means that the operating system can be installed and boots from this filesystem, taking full advantage of its features.

 

So what does ReFS offer?

ReFS adds copy-on-write allocation for metadata: updated metadata is written to a new location and, once that succeeds, the old location is retired. It achieves the same thing a journal does, without the journal.

It adds mandatory checksums for metadata and optional checksums for user data. All of them are "scrubbed" (periodically checked to see if they are still valid); on a failure, the data is recovered from a correct copy if one exists, or a failure event is sent to the system log if none does.

It integrates with Windows 8's upper-layer Storage Spaces to stripe data (split it into pieces spread across disks) for enhanced performance.
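
As a loose illustration of copy-on-write in general (not of the actual ReFS on-disk format, which this sketch does not try to reproduce), here is a toy Python model. The class `ToyCowStore` and its fields are hypothetical.

```python
class ToyCowStore:
    """Toy copy-on-write store: updates never overwrite the live block."""

    def __init__(self, initial):
        self.blocks = {0: dict(initial)}  # block id -> metadata contents
        self.root = 0                     # id of the currently valid block
        self._next_id = 1

    def update(self, key, value):
        new_block = dict(self.blocks[self.root])  # copy, never edit in place
        new_block[key] = value
        new_id = self._next_id
        self._next_id += 1
        self.blocks[new_id] = new_block   # written to a new location
        old_id = self.root
        self.root = new_id                # one pointer switch makes it live
        del self.blocks[old_id]           # only now is the old copy retired

    def read(self, key):
        return self.blocks[self.root][key]


store = ToyCowStore({"name": "a.txt"})
store.update("name", "b.txt")
print(store.read("name"), store.root)
```

If something fails before the pointer switch, the old block is still the valid one, which is why no separate journal is needed in this model.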
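
The scrubbing behaviour described above can be sketched in Python as follows. The use of SHA-256, the list of mirrored copies and the event strings are all assumptions made for the example; the helper `scrub` is made up and does not reflect the real ReFS or Storage Spaces interfaces.

```python
import hashlib
from typing import List, Tuple


def scrub(copies: List[bytes], expected_digest: str) -> Tuple[List[bytes], List[str]]:
    """Check every copy against the stored checksum and repair bad ones."""
    def digest(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    good = next((c for c in copies if digest(c) == expected_digest), None)
    if good is None:
        # Nothing to recover from: all we can do is report the failure.
        return copies, ["FAIL: no valid copy found"]
    events = []
    repaired = []
    for index, copy in enumerate(copies):
        if digest(copy) != expected_digest:
            events.append(f"repaired copy {index}")
            repaired.append(good)     # overwrite with a correct copy
        else:
            repaired.append(copy)
    return repaired, events


payload = b"payload"
stored_digest = hashlib.sha256(payload).hexdigest()
print(scrub([payload, b"pay1oad"], stored_digest))
```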

It removes support that was present in NTFS for: Alternate Data Streams (xattrs, Resource Forks and Finder Info) as well as OS/2's Extended Attributes, transparent file compression and encryption, sparse files, hard links, quotas, filesystem conversion and boot abilities.

Oh, and it removes support for storing separate "8.3" filenames for DOS network clients (no one cares).

It also allows for bigger volumes, files and directories.

 

Comparing with competing filesystems:

Please note that in this comparison "bootable" means bootable only on the native system. While some features were added late (like transactions in NTFS and journaling in HFS+), the comparison counts every feature implemented as of 17th January 2012 (stable or beta, but not merely "planned"). Partially implemented features (NTFS sparse files, for example) are also counted as a Yes.

| Feature | btrfs | ext4 | HAMMER | HFS+ | NTFS | ReFS | ZFS |
|---|---|---|---|---|---|---|---|
| Manufacturer | Oracle | Theodore Ts'o | Matthew Dillon | Apple | Microsoft | Microsoft | Sun (Oracle) |
| Native system | Linux | Linux | DragonFly BSD | Mac OS X | Windows | Windows | Solaris |
| Year of introduction | 2007 | 2006 | 2008 | 1998 | 1993 | 2012 | 2005 |
| ADS/xattrs | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Object IDs | Yes | Yes | Yes | Yes | Yes | No | Yes |
| File compression | Yes | No | No | Yes | Yes | No | Yes |
| File encryption | Planned | No | No | No | Yes | No | Yes |
| Sparse files | Yes | Yes | Yes | No | Yes | No | Yes |
| Symbolic links | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Hard links | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Quotas | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Deduplication | Yes | No | Yes | No | No | No | Yes |
| Conversion | Yes | Yes | No | No | Yes | No | No |
| Metadata checksum | Yes | No | Yes | No | No | Yes | Yes |
| Data checksum | Yes | No | Yes | No | No | Yes | Yes |
| Transactions | Yes | No | No | No | Yes | No | Yes |
| Journaled | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Bootable | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Integrated data striping and recovery from failure | Planned | No | No | With XSAN | No | With Storage Spaces | With Zpools |
| Max file size | 16 EiB | 16 TiB | 1 EiB | 8 EiB | 16 EiB | 16 EiB | 16 EiB |
| Max volume size | 16 EiB | 1 EiB | 1 EiB | 8 EiB | 16 EiB | 16 EiB | 16 EiB |

Conclusion:

Microsoft's new "Resilient File System" offers nothing that could not be implemented on top of NTFS. It is just a huge loss of features: it gives nothing new and takes away a lot of features that users, developers and administrators use, plan to use, love and want.

When I read the official MSDN blog post about it I thought "yeah, Microsoft, good joke" while I checked the calendar to see if today was April Fools' Day. Seeing that it's not, my conclusion is that I will stay with NTFS as long as possible, and if I ever want the only interesting feature of ReFS (resilience) I will move to a serious filesystem like ZFS, btrfs or even HFS+.

26 comments:

  1. typo: ext4 - year of introduction — 2066

  2. Umm what year "will" ext4 be introduced? (Typo alert)

    1. Microsoft travels to the past with filesystems and Linux does to the future!

      Being serious, typo correct :p

  3. Don't you think it's too early to judge ReFS by a single blog post? Windows 8 will include only its base features, so users have a chance to test it. Not a good idea to release the Resilient FS only to discover bugs later on, eh? (Windows is a commercial product and cannot include untested and beta features in its release; responsibility is the word.)
    Other features will be available later, probably in the next Windows version, or in one of the upcoming service packs.

    1. So commercial responsibility and common sense tell me the way to do things is to add new features to an already tested and stable filesystem (NTFS), not to create a new one missing a lot of features that will be slowly added "when tested" (they worked on NTFS).

      Also I've read and re-read the official blog post about three times and it seems quite clear those features are deprecated, that is, removed and never to be added again.

      Also from a customer point of view while migrating to Windows 8 with NTFS is an option, where the need for those features exist, migrating to Windows 8 with ReFS expecting those features to return in "Windows 9" or "Service Pack" or "R2" is not feasible, so other solutions should be chosen (migrating to UNIX/Linux/Mac).

    2. From what i've read:
      «we will implement ReFS in a staged evolution of the feature: first as a storage system for Windows Server, then as storage for clients, and then ultimately as a boot volume. This is the same approach we have used with new file systems in the past»
      So «bootable» definitely planned, other features unknown at this time. (unknown as in not announced, not deprecated)
      In the Q&A, "No, this is not implemented or supported" applies to the initial release only.
      Deduplication, caching, and other features will be initially available as third party solutions.

      NTFS seems too old, it may not scale well, or it does not support some features by design. It is still the FS to use, and ReFS is not meant to replace it in this release.

  4. No, it is definitely not too early to judge. We have been told what is missing and the implications of that immediately fall out.

    It is not the inevitable evolution of ReFS that is of concern to many, rather it is the potential breaking impact of it, imminently, to existing applications.

    ReFS appears to be mainly a regression of NTFS, with many sticks and few carrots.

    With all its missing features, it looks to be far, far too premature to launch in a commercial product as widely used as Windows is.

    (If it is not even possible to boot Windows from the file system, it should be seen as being clearly not yet fit for purpose or mature enough to release.)

    1. There were quite a few troll posts on the blog that assumed they were getting a crippled FS on Windows 8. The blog seems quite clear that they are implementing what is needed for the Server version first attempt and what is available years from now on the client OS will be quite a bit different.

      The doom and gloom / troll crowd typically don't bother to read before commenting.

      What I would be curious about, which may be off topic, would be what Microsoft is doing to prevent virus/trojan writers from seizing control of your computer, hijacking the file system, writing hidden files and preventing you from finding them, hiding processes, etc.

      I think MS provides far too much low-level access and I would be quite happy if they implemented some type of licensing component into Windows for vendors to secure their software/files licenses and then take away low-level access completely. If it breaks some old software so be it.

  5. All of windows 8 is sticks with carrot crumbs. Metro.... on a 30" monitor? MS needs to drop metro and develop a decent windowed OS.

  6. Sparse File support in NTFS is a major exaggeration. They were somewhat working up to Windows XP. In Vista and Win7 they are 100% inoperable.

  7. According to the blog post, in the first release ReFS is intended primarily for file server deployments, not as a general purpose file system (a lot of file servers don't need compression, encryption, etc). I expect they will add back the missing NTFS features over time, where it makes sense (i.e. everything except backwards compatibility stuff like 8.3 filenames). Of course they could keep adding stuff to NTFS, but they can't make major changes to the disk format without rewriting tons of code and if you're going to do that it's much better to just create a new filesystem (which is what they've done).

  8. ZFS is transactional. You have it marked as "no"

    http://docs.oracle.com/cd/E19963-01/html/821-1448/zfsover-2.html#gaypi

    1. Correct, I corrected the entry.

      I did not find specifics about user-data transactions when I wrote the article.

    2. Look at Transactional NTFS (TxF). Are you sure that file systems other than NTFS offer this?

      A journaling file system will ensure that individual metadata changes are atomic, such as creating directory entries, renaming files, etc. It only applies to one change. If a system crashes, the file structure itself will be correct but the data in its files might not be.

      A logging file system will ensure that both the metadata and the actual data are journaled for an individual operation. This allows it to make sure the contents of a file are correct, but it still only works on one operation.

      A transactional file system gives transactionality to both data and metadata over many modifications. So, many files can be created and modified atomically - either everything happens or nothing does.

  9. There are quite a few errors/mixups in the above article.

    Object IDs are not the same as inodes. They are a user-level concept where an "app"/"service" sets a GUID for its own tracking purposes. NTFS does have an inode equivalent; it is called the file ID.

    Transactions and TxF are being mixed up. TxF is available at the application level; AFAIK no other filesystem provides that.

    Alternate Data Streams exists, only "named" streams are not there.

    Regarding so called lack of features:

    Disk conserving technologies like sparse/compression/dedup are left to the storage stacks to deliver.

    Per-file encryption is tackled using volume level encryption.

    Most of the features that have been removed were there for a specific application/usage scenario. One can be pretty certain that Microsoft has good stats on what is actually being used out there.

    Shishir

    1. Oops, realized that ADS is gone period. I stand corrected.

      Shishir

    2. File IDs are not so unique, they can be reused while Object IDs and CNIDs cannot. Inode uniqueness depends on implementation but they should be unique.

      Transactions are the same that NTFS calls TxF, when user data is written in transactions. Not to be confused with transactional writing of metadata, that's called Journaling.

      The storage stacks don't know where the files are allocated, only sectors. While deduplication can be done at a block-level, not so the other solutions.

      If you sparse a file, the storage level will never know that the sector should be marked free. It has been used, it's not 0s, it's allocated.

      Compression applied at the storage level will also not know about specific files, so you cannot specify that only system files be compressed.

      Adding APIs so the filesystem can specify independent blocks will add extra overhead and complexity that has been supposedly removed from the filesystem layer.

      Per-file encryption has a completely different usage scenario than per-volume encryption. Once the volume has been decrypted (by a valid bootloader), any user with access authorization to that volume will be able to access all the files they are authorized for.

      However, with per-file encryption you can fine-grain access.

      For example, in a big enterprise you can be asked by your boss to create a secret document and store it in a shared folder. The shared folder forces all files created inside it to be readable by all members of your department, something your boss does not want to happen.

      With per-volume encryption all of your department will be able to access the file. With per-file encryption you can encrypt the file so only you and your boss will be able to access it.

      All of the features that have been removed are being used extensively by users, administrators and developers, including Microsoft itself (Windows Explorer is the application that creates the most ADS).

  10. This comment has been removed by a blog administrator.

  11. Simon, your comment has been deleted because of its offensive content and because it demonstrated that you have read neither the Microsoft blog, nor this post, nor the comments.

    Any other comment with insults or offensive content will similarly be deleted without further consideration.

  12. What? Offensive? I think I've read both posts. But anyway, this is your house, so your rules apply. Won't bother you anymore in the future.

    1. You started with "stupid", then continued with "who uses that?" when the post and the comments clearly specify WHO, then you followed with a comparison of two features that are completely different and whose differences were already explained in both the post and the comments... etc, etc, etc

  13. This comment has been removed by the author.

  14. HFS to HFS+ conversion:
    http://download.cnet.com/Alsoft-PlusMaker/3000-2094_4-234.html

    It's a third party tool, but it worked the one time I used it.

    Re: File Encryption...

    While I understand the usefulness of the granularity that per-file encryption provides in a multi-user scenario, complete volume encryption is more appropriate in other use models. While it's true that Microsoft is really limiting the cases in which it makes sense to use their new filesystem (i.e. departmental file servers, etc.), not listing complete disk encryption separately does not paint a completely accurate picture of how these compare, IMO.

    1. Interesting, I never knew about that tool.

      Seeing how people are unable to see the differences between per-file and per-volume encryption, I may write a complete article about them.

  15. Dear author:

    A couple of typos:

    "...prevents..." --> "...foresees..." in the Sparse files paragraph (in Spanish, "prever" means foresee and "prevenir" means prevent!).

    "...hard links happens when to files are really..." --> "hard links happen when two or more files are really..."
    Otherwise, very good!

    MS has a special gift of making things hard for its users. No wonder they do not generate a fanatical userbase, only a long-suffering captive audience. This is just one more of many examples.

    Cheers,
    jaf
