Category Archives: Distributed Computing

Distributed Computing

Inaugural Next Net meeting at Betfair

We had the inaugural Next Net meeting, held at the offices in Hammersmith, last night. It was well attended event with about 40 people there. Dan Creswell gave an excellent talk on the work he has been doing with Amazons EC2 service (slides are : here).

The aim of the group is to try and get a group of people together who are interested in creating next generation distributed systems รขโ‚ฌโ€œ focussing on how you build, manage and develop on large scale, high performance, highly resilient, self healing distributed systems out in the real world not buzzword land.

We are looking at holding the next meeting at Brunel University on January 18th. If you are interested in coming along, or have an idea you want to talk about – then please drop me a mail.

Inaugural Next Net meeting

A while a go I put the forward the idea about setting up a Jini user group in London. There was pretty good feedback about the idea. Many people felt that it would be better to make the remit of the group slightly wider and focus on future developments in distributed computing. So I am pleased to announce the inaugural meeting of the Next Net user group.

We are going to hold an inaugural meeting at Betfair‘s offices in Hammersmith on Monday December 4th at 6-6.30 pm. If you are interested in future developments in distributed/grid/parallel/large scale/resilient systems then we would love to see you. Please sign up here.

Dan Creswell has agreed to give the first talk on some of the work he has been doing on Amazons EC2 (Elastic Compute Cloud) infrastructure.

The idea is that it should be a very informal group of like minded individuals. Beer and Pizza included!

London Jini User Group?

I have just posted a question on the Jini users mailing list to see if any one is interested in a Jini users group in London?

At the moment I don’t have any clear ideas on how this should be structured, set up, etc other than I reckon “that would be a pretty cool bunch of people to get together…”.

If anyone is interested reply on the mailing list or drop me a mail and lets see what we can set up !

Update: the initial response seems positive – which is good! Had several off list responses so keep them rolling in! Dan has put an initial post on his blog – so good work there!

Google Big Table paper

Google Labs have just released a paper about Big Table. Big Table is “a sparse, distributed, persistent multidimensional
sorted map” that provides other Google components with a scalable and reliable persistent data storage layer. This is one of several infrastructure components that Google apps are built from including: Map Reduce, GFS, etc.

The thought that is going through my head is whether Google might open up this system and allow the public to use it. Clearly you could use Big Table as a file storage medium. This could then become a direct competitor to Amazon‘s S3 file storage service. That would be an interesting battle ๐Ÿ™‚

Via Greg (as usual)

Amazon Elastic Compute Cloud

Now this is starting to get interesting! (via Greg)

Amazon have announced Amazon Elastic Compute Cloud (EC2). Built along the same economic principles of their S3 storage service it allows you to gain access to a cloud of compute servers that provide you with access to a virtual compute resource that is:

“You have complete control of your instances. You have root access to each one, and you can interact with them as you would any machine. Each instance predictably provides the equivalent of a system with a 1.7Ghz Xeon CPU, 1.75GB of RAM, 160GB of local disk, and 250Mb/s of network bandwidth.

Your “app” consists of a set of one or more “Amazon Machine Images” (AMIs) which are uploaded into their “cloud”. The AMIs can be whatever you want – webservers, app servers, databases, etc with backend file storage provided by S3 – which then run in the cloud.
As with any “grid” style compute service the idea is that you only pay for what you use. The pricing model seems pretty similar to S3s:

  • Pay only for what you use.
  • $0.10 per instance-hour consumed (or part of an hour consumed).
  • $0.20 per GB of data transferred outside of Amazon (i.e., Internet traffic).
  • $0.15 per GB-Month of Amazon S3 storage used for your images (charged by Amazon S3).

I have only skimmed the site very quickly but this is definitely somehting that needs to be played with!

Some documentation is here.

SOA == Football

Steve Jones has used the World Cup to come up with an interesting analogy as to why football is the perfect example of a Service Orientated Architecture. Apart from the fact that he missed the obvious conclusion that SOA is EXACTLY like the English team – promises so much and always fail to deliver – I thought it was interesting to see that he was starting to talk about the importance of agents operating within the system.

One of the key characteristics of an SOA (not that I actually know what that horrible term actually means….) for me, is that it should be goal orientated : what business problem are we actually trying to solve today?

Once you have actually decided what the underlying issue is then you can start to look at how you might solve that problem. You will generally have a set of actors (services? components?) that are available to help you to do some work. The classic (IT) approach is to define a process that is responsible for orchestrating some of those actors to achieve that end goal. The process instance is responsible for invoking and managing the interactions of all these actors. Irrespective of whether you are directly invoking the underlying actor (through some kind of RPC – web services?) or using a more abstract messaging approach (and thus invoking by proxy) there is a very high degree of coupling in this system. The use of an asynchronous messaging layer might help to alleviate some of the ill effects of temporal coupling, but the coupling remains through the fact that the orchestrating process needs to have a great deal of knowledge about the semantics and capabilities of the actors it is invoking. Change is difficult in this command and control approach because of the implicit interaction knowledge that is embedded within the process.

What happens if you invert that thinking and rather than assuming that the actors in your architecture are dumb, stupid and have to be told what to do; let them become fully fledged participants in the solution? Allow them to go looking for work and to carry out the tasks they know how to move forward. This “inverted” Agent based approach results in a much richer, more flexible and powerful architecture. There is no requirement for a process to explicitly orchestrate the interactions between services. The knowledge of what a service does and how and when it should be invoked is removed from the system. It is highly collaborative and it is unlikely that any single actor knows how to achieve the goals of the system – indeed it is actually undesirable to have that much knowledge in a single place. By making actors more specialised they are easier to write and test. The actors can be added and removed at will. As new problems arise – new actors with new specialisations can be added to the system without impacting anything that has gone on before. If existing actors can help on this new problem then they will do so – which leads to a great deal of actor (read service) reuse. Scaling and operational management might also become much easier – actors will not choose more work than they can comfortably manage. If you have outstanding work at the end of the day – simply add more workers to the system. Compare this to the “classic” approach where the runtime system must be aware of how much work an individual actor is carrying out and must know what to do if that actor starts to become overloaded. Does it matter where an agent actor actually runs? Probably not.

So how do you go about building a system like this?

A good place to start looking at how to build a system like this is to actually look at the place where SOA is supposed to be the “next” (yawn) Silver Bullet – inside a business. Very few businesses actually run using the classic “command and control” approach (indeed the best example of that mind set is Soviet Russia – what a great role model!). Business are far more likely to run using a collaborative approach. Problems are generally solved by one or more teams taking a problem (say pitching for an order) and dealing with those aspects they are able to manage. The sales guy might take the request for business and log the details of the order, the credit guy will check that the customer is good for the deal, the shipping guy might search for the best delivery quote, the warehouse guy will check it is in stock and the finance people will try to finance the deal. If there is a problem, say there is a credit issue, then another actor (say a senior manager) can be drawn into the discussion and make an executive decision based on how importsnt the client is. Once everyone is happy then the sales guy can take the final solution and pitch for the business. Nowhere in this description have I described HOW the actors are invoked – all I have described is what they do. No actor has to know about any other actors – yet they are still able to solve the problem. This means that it is the “problem” itself that is flowing through the system and the actors that can deal with it somehow “see” it.

Modern companies might use an Order Management System but the classic computer science approach is to use a Blackboard. Linda (created by David Gelertner at Yale in the 1980’s) is one approach that maps very well to this kind of problem space. There are a variety of Linda like systems out there in the real world. Jini JavaSpaces is one such implementation that I have personally used very effectively to build highly dynamic, collaborative systems with.

In a JavaSpace approach the “problem”can be modelled using fully fledged objects that flow through the system. Actors are able to write instances of those objects into the system, read them or remove them completely. They are able to modify instances of the object and return it to the space or even create new objects/problems that other actors can then help to solve. The space ensures that access to objects are concurrently safe – only one actor can remove an object (although many can read it simultanously) – and the use of transactions means that objects are not lost if an actor dies whilst working on its part of the problem. Actors monitor the space for objects that match their specified criteria – maybe it is of a certain class, or a field (or set of fields) are set to a certain value. Once they see an object that matches their criteria they can operate it. In the course of operating on the object they may well set its state such that another actor becomes interested in it and is now able to do some work – thus moving the problem on a little further. Eventually the problem will be solved and the goal scored – unlike the England team.

(Aside: many of these object flows through the space start to look like events – see Gregor Hohpe Programming with a Call Stack – Event Driven Architectures for an interesting alternative discussion)

Geoff Arnold goes to Amazon

Last year they hired Pat Helland from Microsoft, now Werner Vogels and Amazon have got a new team member and are clearly building a team of pretty bright people to work on the next step of distributed computing : dealing with SCALE. It is interesting to see that Geoff Arnold is now making the trip from Sun to Amazon to work with Werner and the team.

I think it is really cool when you can work on such large problems. Clearly there are only a few organisations that are facing these issues at the moment – Google and Amazon spring to mind – and I find it interesting that Geoff would move from Sun to Amazon. I think dealing with scale is going to become increasingly important for all businesses over time – and not just those web super giants that have to deal with it at the moment. It is going to require a total change in perspective to deal with it.

The other interesting thing is how scale – once you are faced with super massive scale – can become a friend rather than a foe. Its a bit like applying some kind of martial arts move (scale ju-jitsu?) to the massive problem and allowing its own inertia to let you handle things. At very large size I can see how applying some fundamentally different thought processes – ie looking biology and taking a probablistic rather than command and control approach – scale suddenly becomes alot less scary (although that is not to say it becomes easier!)

Interesting times – interesting problems!

Distributed Systems Engineering at Amazon

Interesting to note Amazon have set up a site for their Distributed Systems Engineering team. I would imagine that these guys report to Werner Vogels – shame he has stopped writing so much following his move to Amazon. Werner outlines some of the stuff that I would imagine this team is working on.

The site is fairly basic at the moment but has some job listings (hint: knowing this might help getting a job there….)

Shame there is no RSS feed …….