Eternal RAM: the 3rd Generation Data Persistence

Alex Rogachevsky
Apr 28, 2018

Data persistence is the most fundamental concept in business software. The current technology is based on the second, “relational” generation of databases. Storing data “relationally” in indexed tables was a huge leap forward compared to mainframe-era flat files. The infamous Y2K crisis exposed the flaws of flat files: widening a single field, the year, from two digits to four became a painful and expensive process.

Relational or not, all databases share one serious flaw: they are disk-based. Even after the last mechanical HDD is replaced by fast flash storage, the “freight” block-based nature of disk I/O remains fundamentally different from frequent fine-grained RAM access. Hence the need to pack and unpack small data units into those “blocks” to minimize the number of expensive round trips to storage.
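
To make the packing and unpacking concrete, here is a minimal sketch, not taken from any real database engine. The 4096-byte block size, the 16-byte record layout (an 8-byte id plus an 8-byte balance), and the names `BlockPacking`, `pack`, and `readBalance` are all assumptions chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class BlockPacking {
    static final int BLOCK_SIZE = 4096;  // assumed block size, for illustration only
    static final int RECORD_SIZE = 16;   // assumed layout: 8-byte id + 8-byte balance

    /** Packs small (id, balance) records into fixed-size blocks. */
    static List<ByteBuffer> pack(long[] ids, long[] balances) {
        List<ByteBuffer> blocks = new ArrayList<>();
        ByteBuffer current = ByteBuffer.allocate(BLOCK_SIZE);
        for (int i = 0; i < ids.length; i++) {
            if (current.remaining() < RECORD_SIZE) {
                // Block is full: seal it and start a new one.
                current.flip();
                blocks.add(current);
                current = ByteBuffer.allocate(BLOCK_SIZE);
            }
            current.putLong(ids[i]).putLong(balances[i]);
        }
        if (current.position() > 0) {
            // Seal the last, partially filled block.
            current.flip();
            blocks.add(current);
        }
        return blocks;
    }

    /** Reading one small field still means locating and unpacking its whole block. */
    static long readBalance(List<ByteBuffer> blocks, int recordIndex) {
        int recordsPerBlock = BLOCK_SIZE / RECORD_SIZE;
        ByteBuffer block = blocks.get(recordIndex / recordsPerBlock);
        int offset = (recordIndex % recordsPerBlock) * RECORD_SIZE;
        return block.getLong(offset + 8);  // skip the 8-byte id
    }

    public static void main(String[] args) {
        int n = 1000;
        long[] ids = new long[n];
        long[] balances = new long[n];
        for (int i = 0; i < n; i++) {
            ids[i] = i;
            balances[i] = i * 10L;
        }
        List<ByteBuffer> blocks = pack(ids, balances);
        System.out.println(n + " records packed into " + blocks.size() + " blocks");
        System.out.println("balance of record 300: " + readBalance(blocks, 300));
    }
}
```

In RAM, the same lookup would be a single field access on an object. With block storage, even this toy version needs index arithmetic and a byte-level layout that both the writer and the reader must agree on.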

30+ years of perfecting data transportation logistics brought us indexing, partitioning, query optimization, batching, caching, and many more amazing improvements. ORMs hide all of that from programmers, making the database look “object-oriented”. That illusion, however, only goes so far.

Despite the most brilliant attempts to automate the four fundamental data transfer operations (Create, Retrieve, Update, and Delete, or CRUD), tedious CRUD plumbing still comprises 90% of conventional business software code. A much bigger problem is CRUD’s inherently procedural nature, which fundamentally conflicts with Object-Oriented Programming (OOP) used to model complex real-life objects and processes.

Both old-school relational databases and newer NoSQL ones require CRUD plumbing, severely handicapping OOP. Adequate for simple consumer websites, neither will ever be able to support the complexity of today’s intricate business processes, which require fine-grained object-oriented modeling. Only “instant” RAM-based storage can back true OOP. The sheer volume of inter-object communication can only be supported by fast, frequently accessed RAM, which is built fundamentally differently from slow disk storage.

Complex business processes can only be modeled with meticulous object-oriented design composed of thousands of pieces (classes) of various sizes and complexity. Database technology was never designed for variety, and while it can physically support thousands of tables, the required CRUD plumbing (4–10 programming units per table: DAOs, DTOs, “facades”, “delegates”, etc.) kills the programmer’s productivity.
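
As a rough illustration of what those per-table programming units look like, here is a minimal sketch for one hypothetical `Invoice` entity. Every name in it (`Invoice`, `InvoiceDto`, `InvoiceDao`, `InvoiceFacade`) is invented for this example, and an in-memory map stands in for the database table so that the code runs on its own. A real DAO would typically add SQL and connection handling on top.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class CrudPlumbing {

    // Domain object: the only class here that carries business meaning.
    static class Invoice {
        final long id;
        final long amountCents;
        boolean paid;
        Invoice(long id, long amountCents, boolean paid) {
            this.id = id;
            this.amountCents = amountCents;
            this.paid = paid;
        }
        void pay() { paid = true; }  // the actual business logic: one line
    }

    // DTO: a flat, behavior-free copy used to carry a row between layers.
    static final class InvoiceDto {
        final long id;
        final long amountCents;
        final boolean paid;
        InvoiceDto(long id, long amountCents, boolean paid) {
            this.id = id;
            this.amountCents = amountCents;
            this.paid = paid;
        }
    }

    // DAO: one method per CRUD operation.
    interface InvoiceDao {
        void create(InvoiceDto row);
        Optional<InvoiceDto> retrieve(long id);
        void update(InvoiceDto row);
        void delete(long id);
    }

    // Assumption: a HashMap stands in for the table; no real database is used.
    static class InMemoryInvoiceDao implements InvoiceDao {
        private final Map<Long, InvoiceDto> rows = new HashMap<>();
        public void create(InvoiceDto row) { rows.put(row.id, row); }
        public Optional<InvoiceDto> retrieve(long id) { return Optional.ofNullable(rows.get(id)); }
        public void update(InvoiceDto row) { rows.put(row.id, row); }
        public void delete(long id) { rows.remove(id); }
    }

    // Mappers: copy fields back and forth between the DTO and the domain object.
    static InvoiceDto toDto(Invoice i) { return new InvoiceDto(i.id, i.amountCents, i.paid); }
    static Invoice fromDto(InvoiceDto d) { return new Invoice(d.id, d.amountCents, d.paid); }

    // Facade: retrieve, map, call the one business method, map back, update.
    static class InvoiceFacade {
        private final InvoiceDao dao;
        InvoiceFacade(InvoiceDao dao) { this.dao = dao; }
        boolean payInvoice(long id) {
            Optional<InvoiceDto> row = dao.retrieve(id);
            if (!row.isPresent()) {
                return false;
            }
            Invoice invoice = fromDto(row.get());
            invoice.pay();
            dao.update(toDto(invoice));
            return true;
        }
    }

    public static void main(String[] args) {
        InvoiceDao dao = new InMemoryInvoiceDao();
        dao.create(new InvoiceDto(1L, 2500L, false));
        InvoiceFacade facade = new InvoiceFacade(dao);
        System.out.println("paid via facade: " + facade.payInvoice(1L));
        System.out.println("row is now paid: " + dao.retrieve(1L).get().paid);
    }
}
```

The only line with business meaning is the body of `pay()`. Everything else exists to move the same three fields between layers, and it would have to be repeated for each additional entity.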

Various cute development frameworks like Grails or Play compete in the art of automatically generating all that glue code, which can never be 100% trusted due to being inherently custom. Why have it in the first place?

What if the fast RAM was much larger and also persistent? Like… flash storage? Why can’t thousands of intricate object-oriented entities that live in RAM, frequently communicating through robust interfaces, live in that kind of “memory” forever, surviving server restarts?

There are still two logistical issues to solve: scaling/failover via clustering and concurrent transactional access with rollback support. Behold distributed data grids. Finally, after 30+ years of its reign, the relational database can be retired for good together with all interim on-disk NoSQL solutions. No one would miss CRUD either.

Have others thought of that? Sure, the industry pays attention now. IBM has learned its lesson after letting Larry Ellison profit from one of its unnoticed inventions, rejected to preserve older products. However, it is no longer a matter of a single idea. All single ideas have been exploited. It’s the never-before-seen combination of ideas, cleverly put together to create a revolutionary product.

Bulky PDA phones existed long before the iPhone, as did pathetic tablets before the iPad. When it comes to business software, simply storing any “stuff” in large modern RAM won’t do, whether that is mainframe-era files for SAP or even a complete database such as Oracle 12c. RAM’s purpose is to serve OOP, and OOP exists to meticulously model super-complex business processes, which in turn requires expert knowledge of common and domain-specific data processing workflows.

There are a few other tweaks that bring it all together, the same way the iPhone rose from the sea of mediocre PDA phones in 2007 by adding a few crucial ergonomic features like pinch zoom.

What does no CRUD glue code mean for you, an underpaid and abused developer?

Meritocracy. Fewer programmers doing and subsequently earning more. Programming no longer being a DAO and DTO kind of commodity performed by third-world code monkeys. Google Level 6 wages.
