"1. Riak is a key-value store (values are opaque to Riak). The only reason I haven't commented on that part is that I wanted to stay focused on the main topic: net splits. 2. MongoDB is a distributed database (replica sets, auto-sharding, etc.). But it is not part of this list."
- Alex Popescu
"So your hypothesis goes like this: "Those working on and using Hadoop are a bunch of idiots that do not realize its complexity. Its adoption is caused only by the fact that idiots reproduce faster than smart ones." I do not have any monetary interest in seeing Hadoop used or not. My only interest is in helping others form an educated picture of the market. And as a side note, I don't like the old marketing gimmick of throwing mud at competitors. I'll wait with interest to learn about your solution. Meanwhile there's so much happening in the Hadoop space that I need to pay attention too."
- Alex Popescu
""instantly increasing performance on the one hand and losing no data on the other" Unfortunately I don't think this can actually ever happen :-). But I could try to imagine some scenarios for collaborative loosely-coupled participants."
- Alex Popescu
"Ben, What you are saying is correct from the perspective of accessing data. But let's try to get on the same page :-). 1. What I wrote in the post and in the above comments refers to MapReduce implementations. And it's true and applies to MongoDB, CouchDB, Riak MapReduce implementations. 2. The fact that MongoDB offers two types of "queries" (native and MapReduce) while in CouchDB all "queries" are MapReduce is correct. But if we go to compare MongoDB queries with CouchDB views we will notice more differences than what you mention (and sound at a quick read as major benefits of MongoDB). Just to give you a quick example: in CouchDB views' results are cached and only new/updated data is re-processed. In MongoDB all queries are re-executed. Anyways, this is completely a different discussion."
- Alex Popescu
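The caching point in the comment above (only new/updated data is re-processed) can be illustrated with a toy sketch. This is not CouchDB's actual implementation: the `IncrementalView` class, the `(revision, document)` layout, and the `by_author` map function are all hypothetical names made up for illustration, and deleted documents are not handled.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Doc = Dict[str, str]
Row = Tuple[str, int]


class IncrementalView:
    """Toy cached view (hypothetical, not CouchDB code): the map function
    is re-run only for documents whose revision changed since last refresh."""

    def __init__(self, map_fn: Callable[[Doc], Iterable[Row]]) -> None:
        self.map_fn = map_fn
        self.rows_by_doc: Dict[str, List[Row]] = {}  # doc id -> cached rows
        self.seen_rev: Dict[str, int] = {}           # doc id -> revision already mapped
        self.map_calls = 0                           # counts how often map_fn ran

    def refresh(self, docs: Dict[str, Tuple[int, Doc]]) -> None:
        for doc_id, (rev, doc) in docs.items():
            if self.seen_rev.get(doc_id) == rev:
                continue  # unchanged document: keep the cached rows
            self.rows_by_doc[doc_id] = list(self.map_fn(doc))
            self.seen_rev[doc_id] = rev
            self.map_calls += 1

    def rows(self) -> List[Row]:
        return sorted(row for rows in self.rows_by_doc.values() for row in rows)


def by_author(doc: Doc) -> Iterable[Row]:
    # Hypothetical map function: emit one (author, 1) pair per document.
    yield (doc["author"], 1)


docs = {"a": (1, {"author": "alex"}), "b": (1, {"author": "ben"})}
view = IncrementalView(by_author)
view.refresh(docs)                        # first build: both documents are mapped
docs["b"] = (2, {"author": "klint"})      # update one document (new revision)
view.refresh(docs)                        # only the updated document is re-mapped
```

After the second `refresh`, the map function has run three times in total rather than four. A query that is re-executed from scratch on every request would map every document each time.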
"Ben, I'm not really sure what you are referring to. But I'm sure my comments in the posts are related to the MapReduce implementation used by most of the NoSQL databases (serializing objects to a JavaScript engine that passes back the emitted key-values) :-). And as far as I know none of these implementations are actually happy with their performance. They actually worked very hard for finding workarounds: 1. managing pools of JavaScript engines 2. providing a set of native map/reduce functions 3. performing pre-filtering at the node level"
- Alex Popescu
"No problem. I've actually tried to get a copy of the paper and I'm pretty sure I have one saved somewhere, but I'd still not have the rights to distribute it."
- Alex Popescu
"Klint, a different way to put it is that for many NoSQL databases security wasn't (yet?) a priority. But that doesn't change much for those putting at risk their apps and data. And I could agree that this issue is not limited to NoSQL, but this is the space I'm focusing on. Just to be clear about a couple of things: 1. I'm no security expert, so I might be missing a lot of other security related issues. But sometimes I recognize patterns that should be avoided and try to warn against them. Awareness is the first step towards knowledge. 2. I know it's wishful thinking but I'd like to see every NoSQL database and NoSQL hosted solution having a big link to their documentation on security."
- Alex Popescu
"The title is that of the linked article. And my commentary was meant to clarify that all his complains are either about the Ruby SDK or coming from ignoring the DynamoDB documentation."
- Alex Popescu
"A while ago I've posted a bookmarklet that provides the same enhanced functionality. Actually it works for capturing snippets from websites and works with GMail (in both Safari and Chrome). Here is the post: http://jots.mypopescu.com/post..."
- Alex Popescu
"It is less about the tools and more about the approach. Predictive models are based on historical data while this approach uses a single point in time."
- Alex Popescu
"MongoDB lacks the knobs allowing to configure how the system uses the available memory and that can lead to unpredictable behavior (take a look at MySQL and all the various memory configuration options)."
- Alex Popescu
"Thanks for bringing these up. Google's storage solution is part of the Google App Engine and thus not directly accessible. Secondly, while I've written that Megastore is probably the solution behind Google App Engine High Replication Datastore I don't think I've seen this confirmed 100%. Moreover according to Megastore: "The entity group defines the a priori grouping of data for fast operations." which makes me think it is not exactly auto-sharding. On the other hand, I think you are right about Azure Table storage."
- Alex Popescu
"You are absolutely right... I got caught in a weird formulation. But the update of the post is clear: "there are no maintenance windows". It would be great if you could pull together some answers. Meanwhile I'll be scanning other sources to try to gather all the questions in a single place."
- Alex Popescu
"Werner, thanks a lot for confirming the lack of maintenance windows. It would be great to get some details about DynamoDB SLA (durability, uptime, etc.)"
- Alex Popescu
Re: Amazon DynamoDB – a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications - All Things Distributed - http://www.allthingsdistributed.com/2012...
"After boosting innovation in both the data storage and service/cloud areas, I was really wondering if more of the Amazon internals would surface at some point. Indeed as someone following closely and evangelizing NoSQL, I was hoping to see what system was behind the Dynamo paper. The release today is even more interesting considering it On the other hand, during the last couple of years, a few very interesting NoSQL databases became available and have been used in production by some prominent services (Cassandra, HBase, Riak, to name just a few). So, it will be interesting to see both where DynamoDB fits better and how it compares to these NoSQL database. I've already started to put together references about both DynamoDB and some early comparisons: - Notes about DynamoDB: http://nosql.mypopescu.com/pos... - Cassandra and DynamoDB compared: http://nosql.mypopescu.com/pos... Hopefully more will come shortly."
- Alex Popescu
"Definitely not what I meant. Here is my point: would you really spend your money of something running "acceptable" software? Could you imagine for example a TV set that needs 1second to change the channels? Or a washing machine that would randomly change the program? Frankly, I would buy any of these. Or if I'd be tricked into getting something like this, I'd make sure to return it as soon as possible."
- Alex Popescu
"Andy, it would be appreciated if besides pointing out the drawbacks of a solution, the comment would include details about the alternative solution(s) . Otherwise it will be misunderstood and/or misinterpreted."
- Alex Popescu