Introducing “EdgeFS”, a modern, cloud-era clustered data layer that solves complex issues of Edge and Fog Computing

We’ve talked a lot lately about ways to make data-intensive cloud workloads faster and more efficient. That can be achieved as long as the dataset is located fairly close to the application. By “fairly close” I mean close in terms of networking latency.

This is a legacy, centralized design, and its problems in the modern cloud era are what we are going to discuss in this article. Such a legacy design assumes that all the data is somehow “cloud-born” and somewhat magically appears in the cloud at a location hard-coded into the application.

This worked while an application operated on a few gigabytes of data and/or a few million records delivered per day. To optimize further, we employed asynchronous data copy or some off-the-shelf persistent message queuing technique.

It also worked while the network connecting us to the cloud was “reasonably fast”, so that an application could upload a few megabytes within seconds. We would batch the data and then copy it over in order to overcome latency problems. While not always possible, sometimes we could even modify our application to send data in batches using multithreaded parallel upload.

So, we are done and it all works great, yeah?

Not even close. It reminds me of the wonders of FIDO-Net, with batched packet messaging over an ADSL modem, and of Neo calling Trinity to get him out of the Matrix… How impressive and sophisticated that was! :-)

Let’s compare. While transfer speeds have increased 10x-100x, WAN (wide area network) RTT (round trip time) latency hasn’t changed as drastically and still depends on how far apart the connected sites are. In other words, over 20+ years WAN latency has improved only marginally, and even a direct coast-to-coast fiber connection will not do better than 40ms-100ms, depending on the vendor and price.
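As a rough sanity check on those latency numbers, here is a back-of-the-envelope sketch. It assumes that light travels through optical fiber at about two thirds of its speed in vacuum, and it uses a hypothetical 4,500 km fiber route between the coasts. Both figures are illustrative assumptions, not measurements, and real links add routing and equipment delays on top.

```python
# Rough lower bound on WAN round-trip time imposed by signal propagation alone.
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
FIBER_VELOCITY_FACTOR = 0.67        # assumed fraction of c inside optical fiber


def min_rtt_ms(route_km: float) -> float:
    """Best-case RTT in milliseconds for a fiber route of the given length."""
    one_way_s = route_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR)
    return 2 * one_way_s * 1000


def serial_round_trips_per_second(rtt_ms: float) -> float:
    """How many strictly sequential request/response exchanges fit in a second."""
    return 1000 / rtt_ms


if __name__ == "__main__":
    rtt = min_rtt_ms(4_500)   # hypothetical coast-to-coast route length
    print(f"best-case RTT: {rtt:.1f} ms")
    print(f"sequential round trips per second: {serial_round_trips_per_second(rtt):.0f}")
```

Under these assumptions the bound comes out at roughly 45 ms, and no amount of extra bandwidth lowers it. This is why an application that waits for each remote response before sending the next request stays slow no matter how fast the link is.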

And it is not just a latency problem. Humans keep getting better at gathering data. We now digitize everything, we are going paperless all the way through, and we will soon be collecting data at 10Gbps speeds, thanks to emerging 5G standards! What I’m trying to say here is that some of our year-old predictions on data growth at the cloud edges are likely underestimates.

With this in mind, our public cloud processing centers simply will not be able to keep up. The economics of putting all data into a central location are going to break. And that is why the new Fog Computing paradigm is emerging.

Now back to the data layer and connecting sites. Let’s re-evaluate.

Bandwidth or slow WAN I/O problem. It is going to be problematic to use the current model for storing the terabytes or petabytes of data that we can now gather with ease at the computing edges. And it is seriously impossible to keep up with the hundreds of millions of records that the explosion of sensors and 5G technology is now capable of generating.

Latency or slow WAN I/O problem. Even the most sophisticated fiber backbones cannot deliver networking latencies better than the laws of physics allow. Latencies increase as you add more kilometers between your WAN-connected locations. With today’s state of technology, things may work a little better for applications that can utilize batching and multi-threading techniques, but these introduce more overall complexity and the necessity to deal with eventual consistency.

Data Copy problem. After data is copied, it surely has to be kept somewhere at the destination in order to analyze and re-analyze it for some time. Ingress, cloud function, CPU or instance cloud costs are still going to be nothing compared to what one would pay for storing a few hundred terabytes in the cloud, even temporarily.

Data Management problem. Not only can copied data no longer be deduplicated or compressed efficiently, but it also creates an additional expense: the cost of ownership of managing ever-growing incoming arrays of raw or post-processed data.

Data “Silos” problem. What if you have more than two regions where, because of business requirements, data has to be kept due to regulations or other reasons? An uncontrolled aggregation of digital data creates “silos”. Over time, silos of data become harder to manage, more difficult to move and back up, and more difficult to analyze.

Multi-Cloud problem. Even in relatively small organizations, we end up using more than a single cloud. Simply because we can, and because of the economics of it, we are becoming users of many cloud offerings. This is not just another data “silo”. We are now dealing with a different set of technologies, protocols, and processes. Even a few terabytes with copies here and there present a management challenge. And thus, the underestimated impact on data layer complexity kills the benefits of going multi-cloud in the first place.

We need a new data layer.

The world needs new, innovative mechanisms to collect, distribute, observe and consume data easily.

A solution that scales across geographical regions as well as it scales locally, avoids excessive duplicate copies, greatly simplifies backup and DR policy setup, and provides multi-protocol access via standard File, Block, Object and NoSQL interfaces.

A solution that is open-sourced at the core, with community-driven design and development.

A solution that integrates well with the existing Cloud Native ecosystem.

An architecture that enables a new class of Fog Computing applications.

Introducing EdgeFS — a multi-cloud scalable distributed storage system

EdgeFS is a high-performance, low-latency object storage system developed in C/Go and released under the Apache License v2.0. It provides Kubernetes-integrated Multi-Head Scale-Out File access (NFS compliant, distributed RW access to files), an Amazon S3 compatible Object API with AI/ML S3X enhancements, iSCSI and NBD block interfaces, advanced global versioning with unlimited snapshots at file-level granularity, global data deduplication, and geo-transparent access to data from on-prem, private/public clouds or small-footprint edge (IoT) devices.

EdgeFS is capable of spanning an unlimited number of geographically distributed sites (Geo-sites), connected as one global namespace data fabric running on top of the Kubernetes or Docker platform. It provides persistent, fault-tolerant, high-performance storage through compatible protocols such as S3 Object API, File and Block CSI volumes for stateful Edge/IoT and Fog Computing applications.

At each Geo-site, EdgeFS segment nodes are deployed as containers (Kubernetes StatefulSet or Docker Compose) on physical or virtual nodes, pooling the available storage capacity and presenting it via compatible emulated storage protocols (S3, NFS, iSCSI, etc) to cloud-native applications running on the same or dedicated servers.

How does it work, in a nutshell?

If you are familiar with “git”, where all modifications are fully versioned and globally immutable, it is highly likely you already know how it works at its core. Think of it as a world-scale copy-on-write technique. To draw a parallel that makes it easier to understand: EdgeFS expands the “git” paradigm to object storage and makes Kubernetes Persistent Volumes accessible via emulated standard storage protocols, e.g. S3, NFS and even block devices such as iSCSI. With fully versioned modifications and fully immutable metadata and data, users’ data can be transparently replicated, distributed and dynamically pre-fetched across many Geo-sites.
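To make the “git” parallel a bit more concrete, here is a minimal sketch of a content-addressed, append-only object store. It is a toy illustration of the general copy-on-write idea, not EdgeFS code: the class name VersionedStore, the fixed 4-byte chunk size and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


def chunk_id(data: bytes) -> str:
    """Content address: the SHA-256 digest of the chunk payload."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class VersionedStore:
    """Toy git-like store: chunks and version manifests are never modified."""
    chunk_size: int = 4                             # tiny, for illustration only
    chunks: dict = field(default_factory=dict)      # chunk id -> bytes
    versions: dict = field(default_factory=dict)    # object name -> manifests

    def put(self, name: str, data: bytes) -> int:
        """Store a new version of an object and return its version number."""
        manifest = []
        for off in range(0, len(data), self.chunk_size):
            piece = data[off:off + self.chunk_size]
            cid = chunk_id(piece)
            self.chunks.setdefault(cid, piece)      # unchanged chunks are reused
            manifest.append(cid)
        history = self.versions.setdefault(name, [])
        history.append(tuple(manifest))             # append only, never overwrite
        return len(history) - 1

    def get(self, name: str, version: int = -1) -> bytes:
        """Read any version; older versions stay readable after updates."""
        manifest = self.versions[name][version]
        return b"".join(self.chunks[cid] for cid in manifest)


if __name__ == "__main__":
    store = VersionedStore()
    v0 = store.put("report", b"AAAABBBBCCCC")
    v1 = store.put("report", b"AAAAXXXXCCCC")       # only the middle chunk changes
    print(store.get("report", v0), store.get("report", v1), len(store.chunks))
```

Because a chunk’s identifier is derived from its content, an update only adds the chunks that actually changed, and every earlier version remains readable. That immutability is what makes it safe to copy chunks between sites in any order.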

How does EdgeFS solve the problem?

When we were designing EdgeFS, we looked at the escalating issues holistically. Not only did we want to create a solution that connects sites better, but we also wanted to solve the local performance and data management issues associated with traditional approaches such as Policy-Based or Event-Based data copy. We realized that if we solved the connectivity and data flow directing issues separately from the core data layer architecture, things would not get easier and we would end up back at square one pretty quickly. As such, EdgeFS presents a unique and feature-complete solution to the issues described above.

There are two buckets of issues that are most important to discuss in the context of the efficiency of Edge/IoT Computing, along with how they can be solved with EdgeFS:

Slow WAN I/O. When I/O is local, everything runs smoothly and fast, even on prehistoric storage systems. Local storage I/O is now as fast as local networking I/O, with latency measured in tens of microseconds. EdgeFS enables an “as much local I/O as possible” strategy, where both reads and writes are made local in the majority of use cases. EdgeFS achieves this with a Copy-On-Write data path design. In an EdgeFS system, all constructs are immutable. It is the same as in Git, really, except that we extend this paradigm to a real storage system that is capable of delivering local IOPS performance while preserving the globally distributed history of all modifications. In EdgeFS terms, a “version update” operation is not just fast, it is supersonic fast! EdgeFS writes are always local, and version reconciliation happens with transactional consistency. EdgeFS reads are mostly local; the behavior depends on the configuration, which enables dynamic fetch of missing data chunks as well as geo-transparent caching. Because of global (that is, across geo-sites) versioning and metadata immutability, we can now guarantee consistency of reads as well as enable intermediate metadata or data aggregations (think of Fog Computing data aggregator use cases here) in order to make I/O as local as possible.

High cost. Ultimately this is the result of excessive data copies, higher cost of ownership and increased data management overhead. In traditional systems, where the data mover (e.g. rsync, cloudsync, etc), data flow directors (iRODS, Robinhood, etc) and data storage (GPFS, Lustre, Ceph, Gluster, etc) are separately designed layers, the total solution suffers from the issues discussed above. The EdgeFS design is consistent across the mover, flow director, and storage data layers. For instance, benefits that are traditionally local to a site, such as deduplication, caching and snapshotting, can now be extended globally across geo-sites. EdgeFS does not transfer a data block if it detects that the block is a duplicate, which cuts the cost of networking bandwidth and storage. EdgeFS snapshots are “floating” across the stretched namespace, thus providing write consistency with ease. EdgeFS namespace data flows are pre-constructed and do not need to be event- or policy-driven. With geo-transparent replication (data is always online, even while replicating) one doesn’t need to set up excessive policies, logging, etc. The management of the global namespace is fully automatic. In addition, it can be configured to replicate only metadata, thus enabling pre-fetching of just the blocks of data that are currently needed. This also works in reverse, i.e. EdgeFS is capable of self-healing by fetching “missing” blocks from adjacent Geo-Sites, thus simplifying DR policy and improving availability.
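The deduplicating transfer, metadata-only replication and self-healing fetch described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the EdgeFS implementation: the Site class, its method names and the use of SHA-256 chunk identifiers are hypothetical, and manifests are copied between sites without any versioning or consistency protocol.

```python
import hashlib


class Site:
    """Toy geo-site holding content-addressed chunks and object manifests."""

    def __init__(self, name: str):
        self.name = name
        self.chunks = {}      # chunk id -> bytes
        self.manifests = {}   # object name -> list of chunk ids

    def write(self, obj: str, pieces: list) -> None:
        """Local write: store chunks by their hash and record the manifest."""
        ids = []
        for piece in pieces:
            cid = hashlib.sha256(piece).hexdigest()
            self.chunks[cid] = piece
            ids.append(cid)
        self.manifests[obj] = ids

    def replicate_to(self, other: "Site", obj: str, metadata_only: bool = False) -> int:
        """Send the manifest plus only the chunks the peer lacks.

        Returns the number of payload bytes that had to cross the WAN.
        """
        other.manifests[obj] = list(self.manifests[obj])
        sent = 0
        if not metadata_only:
            for cid in self.manifests[obj]:
                if cid not in other.chunks:         # duplicate chunks are skipped
                    other.chunks[cid] = self.chunks[cid]
                    sent += len(self.chunks[cid])
        return sent

    def read(self, obj: str, peers: list) -> bytes:
        """Read locally, fetching any missing chunk from a peer that has it.

        Raises StopIteration if no peer holds a missing chunk.
        """
        out = []
        for cid in self.manifests[obj]:
            if cid not in self.chunks:
                source = next(p for p in peers if cid in p.chunks)
                self.chunks[cid] = source.chunks[cid]   # cache locally, self-heal
            out.append(self.chunks[cid])
        return b"".join(out)


if __name__ == "__main__":
    edge, cloud = Site("edge"), Site("cloud")
    edge.write("a", [b"1111", b"2222"])
    edge.write("b", [b"1111", b"3333"])             # shares one chunk with "a"
    print(edge.replicate_to(cloud, "a"), edge.replicate_to(cloud, "b"))
```

In this model the second replication sends only the chunk the destination has not seen before. A site that received just the manifest, or that lost a chunk, can still serve a read by pulling the missing piece from a neighbor.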

Stateful Serverless Fog Computing

EdgeFS is now officially open to all Edge/IoT and Fog Computing enthusiasts. As a part of the great Kubernetes CNCF Rook community, EdgeFS takes on real problems that cannot be easily band-aided with existing solutions. Fog Computing is a relatively new paradigm. In my mind, Fog Computing assumes the processing of a large amount of data with computational power similar to or exceeding today’s cloud computing capabilities. With geo-transparency as explained above, EdgeFS opens up an opportunity for a fundamentally new class of applications: Stateful Serverless Fog Computing (SSFC). SSFC applications are capable of moving computational workloads without the need to worry about stateful dataset availability. A serverless cloud function written with the SSFC design in mind can safely assume that its dataset is always available, thus enabling stateful function mobility.

Learn more about the growing EdgeFS and Rook communities. Join us today!