- “And nobody expressed any concern that this might not be right?”
- “Bill, you are building a system, not a collection of parts.”
Steve Vinoski talking about reliability:
“To achieve reliability, you have to accept the fact that failure will occur. Once you accept that, other things fall into place: you need to be able to restart things quickly, and to do that, processes need to be cheap. If something fails, you don’t want it taking everything else with it, so you need to at least minimize, if not eliminate, sharing, which leads you to message passing. You also need monitoring capabilities that can detect failed processes and restart them.”
Amen to that!
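Vinoski’s recipe (cheap isolated processes, message passing instead of sharing, and a monitor that restarts whatever fails) can be sketched in a few lines of Go. This is my own toy illustration of the pattern, not code from any of the linked posts; the `runIsolated` and `supervise` names are invented for the example:

```go
package main

import "fmt"

// runIsolated runs work in its own goroutine so a crash (panic) there
// cannot take the rest of the program down; the outcome comes back as
// a message on a channel rather than through shared state.
func runIsolated(work func()) (failed bool) {
	done := make(chan bool)
	go func() {
		defer func() { done <- recover() != nil }()
		work()
	}()
	return <-done
}

// supervise plays the monitor: it restarts work each time it crashes
// and returns how many attempts were needed before it succeeded.
func supervise(work func(attempt int)) int {
	attempts := 0
	for {
		attempts++
		n := attempts
		if !runIsolated(func() { work(n) }) {
			return attempts
		}
		fmt.Println("worker crashed, restarting")
	}
}

func main() {
	// A worker that crashes on its first two runs, then succeeds.
	n := supervise(func(attempt int) {
		if attempt < 3 {
			panic("simulated crash")
		}
	})
	fmt.Println("succeeded after", n, "attempts")
}
```

The point is exactly the one in the quote: because the worker is cheap and shares nothing, the supervisor can treat failure as routine and just respawn it.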
- Towards Robust Distributed Systems – Eric Brewer’s slide deck that kicked off the whole CAP Theorem debate at PODC in 2000.
- Erlang Meeting – “How do you test large systems written in Erlang?” on Monday (I presume) Jan 21.
- Map Reduce a major step back – Michael Stonebraker and David J. DeWitt let forth on why they dislike Google’s MapReduce infrastructure and how it is a poor relation to standard relational database technology. Relational Database Experts Jump The MapReduce Shark – a reply to Stonebraker and DeWitt that I pretty broadly agree with.
- MapReducing 20 petabytes per day – Greg Linden commenting on the recent paper “MapReduce: Simplified Data Processing on Large Clusters” by Jeff Dean and Sanjay Ghemawat.
Werner has a new piece on
I am keen to see how the DB heads respond to this. I also want to see how long it takes for people to get their heads around the
- Interesting job spec from
My favourite line: “You have at least once tried to understand Paxos by reading the original paper.” It all sounds like ancient Greek to me!
It’s always nice to get an eBay perspective on things – especially when you totally agree with what is being said. For a long time I have felt that if we really want to embrace the power of large scale distributed systems we have to accept that we just do not know what is going on – life is non-deterministic, messy, chaotic and generally random.

Most tech orientated people are used to thinking in a single linear fashion: first the processor does this, then it does that. Nice – but not the way the world works. Things actually happen at the same time, just not necessarily on your processor or in your address space. For many people dealing with that is a problem and something that should be somehow hidden away or, at the very least, forced into a two-phase commit (2PC) harness. That might work in a small system but not when you reach internet-scale systems. Forget all about a linear world and move to thinking about multiple agents doing multiple things at the same time.
This is a thought process that is going to increasingly dominate the way we build systems.