File Based Applications to replace Database Systems within the Web Publishing World

Let's say you need to organize 2000 people on a football field. A relational database would create 2000 little boxes and make everyone stay in their little box. If someone needed to move around, they would first need to inform the administrator so that the administrator doesn't lose track of everyone. A file based system, on the other hand, would hand out a cell phone to everyone and tell them to have fun. If someone needs you, we'll give you a call. Just make sure you don't lose your cell phone. Beyond that, have a great day.

A relational database was a good system. It was also created in a time when searching a million files took more than milliseconds. It was a product of limitations. It wasn't necessarily the ideal solution, but it was a good solution given the tools at hand. Those limitations are gone. Those limitations are in the past. New technology and mind boggling search capabilities have opened the door for new options that weren't available 20 years ago.

File based applications are the next evolution for information management. Especially for the web. Why? Because they are easier to understand. It's not that you aren't smart enough to understand a database. It's that you don't have to understand a database. Especially when you already understand how to use a file based system.

Who is YOU?! That's the key to my argument. YOU is everyone! To run a database driven web application you'll need a DBA. If you don't know what that stands for, you aren't alone. It stands for Database Administrator. But if you want to work with a file based system, all you need is you. If you can find files on your computer, you can find the files within your file based web application.

First let me highlight two historical examples of emerging technologies that changed the face of data management, and then let me draw the parallel to the emergence of file-based systems.
Notice that each example required a few iterations to achieve true success, but the magnitude of their impact is enormous.

The first example is Windows. Windows crushed DOS as the operating system of choice because it focused on people. All of a sudden the common user could touch and feel files. Move them around. Drag and drop. Everything became accessible because you could wander around and find what you were looking for. This concept has revolutionized the way people interact with their digital information. Just imagine if you started your computer this morning and had to interact with your files using a command prompt. ACK!! The idea is unimaginable for the vast majority of the computer marketplace. The average computer user knows nothing about developing software and writing code. Keep in mind that the average person publishing to the web is the same average person who owns a personal computer. Most people publishing to the web aren't programmers.

My next example is Google. Google dominates the search engine marketplace because it focused on people. Google is like a person. A very reliable and friendly person who greets everyone with the same comforting approach, and whom 50% of the WHOLE WORLD trusts to find what they are looking for online. That's a heck of a lot of people and a heck of a lot of trust. Google gained our trust with a very simple approach. If Google were a person, Google would probably look like a friendly librarian who instantly makes you feel relaxed and less stupid. And we all hate feeling stupid. Google would say something like, "Hi. Just type what you are looking for in this magic little box and I'll go find it. Don't worry about being wrong either because I'll only take 0.22 seconds to find 34 000 000 options. I'll sort these options for you too, so what you are looking for will probably be one of the first ten options on the list. If I don't find the right thing then just ask me again. I'll keep looking. How much? *laughs quietly* Oh my dear boy, you don't have to pay me a red cent. I'll be here 24 hours a day, 7 days a week. If you need something, just ask." Incredible!

What has this got to do with file based web applications? Well, let me explain how you make a simple change within a file based system vs a relational database. Let's say we want to make a change to a file. For this example the file is a web page. Specifically, the web page is http://demo.openedit.org/folder/file.html

Now let's make a change to this file or web page.

File Based:
This is the location of the computer: http://demo.openedit.org
This is the location of the file on the computer: /folder/file.html
Now go ahead and change the file. Use any tool you like. OpenEdit's file manager is one option. FTP is another option. You can also use the operating system running on the host machine to locate and change the file. That's it! Go make a change to that file. No caveats. It's that easy.

Relational Database:
Umm.. I don't know how to make changes to files within a database. I know how they work, but if I want to figure out how to actually use one, I would have to start reading books about databases. But I don't want to learn about databases. I want to create web applications. I've been developing web applications for over 4 years and I know pretty much nothing about managing a database. I can create something that works like a database. I can organize over 100 000 files into an organized, fully indexed structure of files that can be searched much the same as Google. But I can't run a database. Why should I? The goal isn't to have a cool database. The goal is to have organized and easily accessible information. I want to put all my files somewhere, have them organized, and then be able to access them when I need them. No rules. Just let me find my files and give me access to them when I need them.

A relational database can't give you direct access to your files. Notice I didn't say won't.
A relational database would give you direct access if it could. But it can't. Here's why. In a relational database everything needs to be in a specific spot, and giving you direct access to the file can have dire consequences. A database doesn't understand that file.html is in a folder called /folder on a computer called http://demo.openedit.org. Instead, a database understands the location of file.html in relation to everything else around it. Recall the football field example? For the database to work you have to ask the database to go get the file for you. In fact, if you went and deleted the file without telling the database, you risk corrupting the entire thing. If box 1276 on our football field had the wrong person in it, then you probably can't be sure that the rest of your boxes are correct either. If file.html isn't where it's supposed to be, then the database often gets confused and can't find anything. When a database gets confused it can lose track of everything. And that's bad. Really bad. So bad they created a term for it. Corrupt.

Being corrupt is so bad that every relational database requires an administrator function that acts as the gatekeeper to ensure this very bad thing never happens. It is this requirement that causes so much grief for database management. Despite the enormous amount of effort that the industry has invested into making these administrator functions easy to work with, they are a REQUIREMENT. A relational database requires an administrator function and can never allow you direct access to the file system. It's part of the system's architecture.

Until now this was the best way to manage large amounts of information. New technology keeps changing the rules and we are in the middle of another big change. If you want to organize over a million files you no longer need a database. You can keep your files unlocked. You can say goodbye to database administrators. You can regain control.
This new reality is probably terrifying for the database industry, but it's proving to be nothing short of earth shattering for me!

Posted by Joel Halse, Fri, Mar 28 2008 2:58 PM
Copyright 2008 OpenEdit Inc. All rights reserved. last modified: Mar 28 2008

I gave up reading at "created in a time where searching a million files took more than milliseconds", because anyone that thinks you can search 'millions of little files' in milliseconds is smoking crack. Even on high end RAID, doing that is going to be slow.
Honestly, you guys are spreading so much FUD with these recent posts. Do you even understand any of the purposes of a database? What happens if someone uses the web UI at the same time someone changes the file? Where is your ACID? How do you control security? How do you scale horizontally?
Only cowboys promote approaches like you have without addressing other methods of indexing, access control, security and atomicity. Think: if you go against the entire industry, there's one of two options: everyone else is stupid or you are. What's more likely?
Hi Joel, From the comments it seems there is some resistance. Maybe a less controversial approach would be to focus on where we all agree.
We all agree that these things should always be stored on a disk drive:
1. Images.
2. HTML
3. XML
4. Application configuration files
5. Programming logic and classes
So really the question is where to store the Data parts of my web sites. Things like users, product metadata, categories. 99% of web sites use relational databases for this. We have found that XML can do the job just fine for most sites.
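To make the XML idea concrete, here is a minimal sketch of keeping user records in a single XML file with Python's standard library. The file name, the element names (users, user, name, email) and the helper functions are made up for illustration and are not OpenEdit's actual format. The sketch also leaves out locking, so it does not deal with two people writing the file at the same time.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def save_users(path, users):
    """Write a list of {'id', 'name', 'email'} dicts to one XML file."""
    root = ET.Element("users")  # hypothetical element names
    for user in users:
        node = ET.SubElement(root, "user", id=user["id"])
        ET.SubElement(node, "name").text = user["name"]
        ET.SubElement(node, "email").text = user["email"]
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def load_users(path):
    """Read the XML file back into a list of dicts."""
    root = ET.parse(path).getroot()
    return [
        {
            "id": node.get("id"),
            "name": node.findtext("name"),
            "email": node.findtext("email"),
        }
        for node in root.findall("user")
    ]


def find_user(path, user_id):
    """Linear scan over the file; returns None when the id is unknown."""
    for user in load_users(path):
        if user["id"] == user_id:
            return user
    return None


if __name__ == "__main__":
    # Demo in a throwaway folder so nothing real is touched.
    demo_path = os.path.join(tempfile.mkdtemp(), "users.xml")
    save_users(demo_path, [{"id": "u1", "name": "Ann", "email": "ann@example.com"}])
    print(find_user(demo_path, "u1"))
```

The resulting users.xml can be opened and edited with any text editor or file manager. The lookup reads the whole file on every call, which is fine for a small site but would need an index as the data grows.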
Bad analogies, false premises, foolish conclusions, pompous and ridiculous assertions.
Chris,
If we put packaging aside, I find these comments are very helpful and I would like to answer them. My bias towards a file based system is a product of enjoying the benefits of having the files I need accessible via a file manager and my goal is to highlight the benefits of an alternative approach. I expect there are many questions / criticisms about OpenEdit's architecture and considering that we are taking an approach that goes against most industry practices, I also understand that it is our responsibility to provide quantitative evidence to support our argument.
There are some valid questions here that need to be answered if the idea being presented is to be accepted and discussed seriously.
To further the discussion, here is an interesting article that asks the question, "Do you believe in a flatfile driven content management system?" written by John Conroy.
Just how do you think you can "search millions of little files in milliseconds"? Google can do this because THEY USE DATABASES. They index the contents of webpages in databases, and have complex algorithms to search through those databases.
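The indexing point can be illustrated with a small sketch. It builds an inverted index (a map from each word to the files that contain it) over a folder of plain files, so a search becomes a dictionary lookup instead of a scan of every file. This is only an assumption about how such a search could work in principle. It is not how Google or OpenEdit is implemented, and the function names are invented for the example.

```python
import os
import re
import tempfile
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")


def build_index(root_dir):
    """Walk a folder once and map each word to the set of files containing it."""
    index = defaultdict(set)
    for folder, _subdirs, names in os.walk(root_dir):
        for name in names:
            path = os.path.join(folder, name)
            with open(path, encoding="utf-8", errors="ignore") as handle:
                for word in WORD.findall(handle.read().lower()):
                    index[word].add(path)
    return index


def search(index, query):
    """Return the files that contain every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


if __name__ == "__main__":
    # Demo on a throwaway folder with a single page in it.
    site = tempfile.mkdtemp()
    with open(os.path.join(site, "file.html"), "w", encoding="utf-8") as page:
        page.write("2000 people on a football field")
    print(search(build_index(site), "football field"))
```

The slow part is building the index, which reads every file once. After that, a query only touches the index. The catch is that the index has to be refreshed whenever a file changes on disk, which is the same bookkeeping problem the database critics are pointing at.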