DidgetMaster (2739009) writes "I am developing a new general-purpose data management system that handles unstructured, semi-structured, and structured data well, so it has features found in file systems, relational databases, and NoSQL solutions. I am a file system expert, so it is very easy for me to see how my system outperforms traditional file systems (e.g. search is more than 1000x faster), but although I have moderate DB experience, it is tough to tell just how my database features compare to the likes of MySQL, PostgreSQL, Oracle, etc. I have tried to find simple performance metrics on sites that compare various database products, but none of them seem to give any basic information.
I realize that every setup is different and that you can tune most databases to make a particular product look good against the competition, but a simple reference point would help, something like "good performance today means you can insert 10-column rows into a table at a rate of 25,000 rows per second" or "a simple database view for finding all customer names that start with the letter 'B' on a 10-million-row table should take 3.5 seconds or less". Using my software on a desktop system (Intel i7), I can read, parse, and insert 5 million rows (10 columns each) into a table in 1 minute 6 seconds. Queries against that table (e.g. SELECT * FROM table WHERE customerName LIKE '%au%';) usually take less than 2 seconds. (My custom database is a column store that de-dupes all data and does not need any indexes.)
It seems fast to me, but is it really? I tried doing the same thing using MySQL Workbench and it always took much longer (sometimes 17 seconds or more for each query), but I can't tell if I am just not doing it right. How long should it take on a desktop machine to import a 5-million-row, 10-column .CSV file into a database table? How long should it take to execute simple views against that table? I don't need exact millisecond numbers, just ballpark figures."
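For anyone who wants a rough baseline to compare against, here is a minimal sketch of the kind of measurement described above: time a bulk insert of 10-column rows, then time an unindexed LIKE '%au%' scan. It uses Python's built-in sqlite3 module with an in-memory database, not MySQL, and the table name (customers), column names (c0 to c9), and synthetic data are all assumptions made up for illustration. The timings it produces say nothing about the submitter's system or about a tuned server database.

```python
import sqlite3
import time


def benchmark(num_rows=100_000, num_cols=10):
    """Time a bulk insert and an unindexed LIKE scan on an in-memory SQLite table.

    Returns (insert_seconds, query_seconds, match_count).
    """
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f"c{i} TEXT" for i in range(num_cols))
    conn.execute(f"CREATE TABLE customers ({cols})")

    # Synthetic rows: every 10th value in column c0 contains "au".
    rows = (
        (f"name{i}au" if i % 10 == 0 else f"name{i}",)
        + tuple(f"v{i}_{j}" for j in range(1, num_cols))
        for i in range(num_rows)
    )
    placeholders = ", ".join("?" * num_cols)

    start = time.perf_counter()
    with conn:  # one transaction for the whole batch
        conn.executemany(f"INSERT INTO customers VALUES ({placeholders})", rows)
    insert_s = time.perf_counter() - start

    # A leading wildcard forces a scan of every row in the table.
    start = time.perf_counter()
    matches = conn.execute(
        "SELECT * FROM customers WHERE c0 LIKE '%au%'"
    ).fetchall()
    query_s = time.perf_counter() - start

    conn.close()
    return insert_s, query_s, len(matches)


if __name__ == "__main__":
    ins, qry, n = benchmark()
    print(f"insert: {ins:.2f}s  query: {qry:.3f}s  matches: {n}")
```

Wrapping the whole batch in a single transaction matters a lot for insert rates in most databases, so a slow import is often a sign of per-row commits rather than of the engine itself.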
Ask Slashdot: Do you want a local object store (i.e. a flat file system)?
DidgetMaster writes "Object stores have been around for a while now. For example, Amazon S3 storage is a set of buckets in the cloud in which you can store millions of files as "objects". Local file systems have been around much longer. For nearly 50 years now, we have been stuck with the traditional hierarchical file system storage model (e.g. a tree-like structure of folders or directories). Both systems make it easy to store a ton of data and to find a single item very quickly if you know its unique ID (full path for files, key for objects). But both systems are terrible at searching for all data that have certain features. If you have 2 million files on a file system volume and you want to find all pictures (*.jpg, *.png, *.ico, etc.), then it takes forever to scan the whole system looking for them. Cloud systems are not any better at search. Adding extra metadata (e.g. tags or extended attributes) can help distinguish one file or object from another, but searching for things based on those is even slower. "Find all documents where Author=John" is only fast if all the metadata has been collected and stored in a separate database; otherwise, go to lunch while you wait for the results. The Didget Management System wants to change all that by introducing a new object storage model designed to replace file systems. A Didget (Data Widget) is like a file that can contain any unstructured data stream up to 16 TB in length, but it is also like a row in a NoSQL database table where lots of searchable structured tags can be attached to it. Structured and unstructured data can be stored side-by-side within the same container and both types can be returned as the result of a query. Note: this is NOT like other indexing systems like Spotlight or Windows Search. See the 5-minute video demo at http://screenr.com/XV17.
Is the ability to instantly find "All photographs taken in Hawaii in 2011", when there are 100 of them among 5 million other pieces of data, enough for you to want to replace your file system with something new?"
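To make the scan-versus-lookup contrast above concrete, here is a small sketch. It is not how the Didget system works internally (the submission does not say); it only illustrates why a query over attached tags can avoid touching every object. The TagIndex class, its method names, and the tag keys are hypothetical, invented for this example.

```python
import os
from collections import defaultdict

IMAGE_EXTS = {".jpg", ".png", ".ico"}


def scan_for_images(root):
    """Full-tree scan: the cost grows with the total number of files."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTS:
                hits.append(os.path.join(dirpath, name))
    return hits


class TagIndex:
    """Hypothetical in-memory index mapping (key, value) to object IDs."""

    def __init__(self):
        self._index = defaultdict(set)

    def tag(self, obj_id, key, value):
        # Record that obj_id carries the tag key=value.
        self._index[(key, value)].add(obj_id)

    def find(self, **criteria):
        # Intersect the ID sets for each requested tag; no full scan needed.
        sets = [self._index.get((k, v), set()) for k, v in criteria.items()]
        return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    idx = TagIndex()
    idx.tag(1, "Place", "Hawaii")
    idx.tag(1, "Year", 2011)
    idx.tag(2, "Place", "Hawaii")
    idx.tag(2, "Year", 2012)
    print(idx.find(Place="Hawaii", Year=2011))
```

The scan has to visit every directory entry no matter how few matches exist, while the index lookup only touches the ID sets for the tags named in the query.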
Ask Slashdot: Should I consider a Kickstarter (or other) campaign?
DidgetMaster writes "I have over 25 years of experience writing data management solutions: file system drivers (DOS, Windows, OS/2), disk utilities (PartitionMagic, Drive Image), custom file systems, and online backup to the cloud. I have invented a revolutionary new data management system that is built from the ground up (block-based management, I/O, and cache). It has a great feature set and it can replace existing file systems and many database solutions. The architecture has distributed properties that will enable it to compete with Hadoop, CouchDB, MongoDB, and other "Big Data" NoSQL solutions. A few friends and I are now two years into its development and we have a lot of the features working. It is blazingly fast and is designed to appeal to everyday consumers as well as large enterprises (it scales really well).
My problem: I have run out of seed money (self-funded) and I need to raise capital to get the rest of the features finished in a reasonable time. Most of the finished features are necessary for a consumer product, but the enterprise features need the most work. I am considering starting a Kickstarter (or other crowd-funding) campaign to raise the funds. That can be a good way to get cash without having to give up a huge chunk of equity. On the other hand, if we get a bunch of regular users who need to be supported, it may take our focus off the enterprise features (the most fun stuff). If we can find an angel investor, we can work undistracted and get a good enterprise product out in about a year. Anyone wanting to know more can find info and links to video demonstrations at http://didgetmaster.blogspot.com/"