Cloud working

I thought it would be worth pulling out the "assumptions that are now invalid" from Steve Loughran and Julio Guijarro's presentation "Farms, Fabrics and Clouds".

  • Systems have a long lifespan
  • It is slow/expensive to create a new system
  • It is expensive to duplicate one
  • Systems can/should be managed by hand
  • Clocks proceed at the same rate
  • Physical RAM doesn't get swapped out
  • Running machines can't be moved/cloned
  • System failure is an unusual event
  • 100% availability can be achieved
  • Data is always near the server
  • You need physical access to the servers
  • Databases are the best storage form
  • You need millions of $/£/€ to play
  • Terabyte datasets are hard to work with
  • Code runs on a single machine
  • Sequential code is better than parallel code
  • RAID hardware is the best way to store data
  • Databases are better than filesystems
  • A single farm needs to scale to infinity
  • You need to provide 100% availability to 100% of users
  • You have to roll out simultaneous updates to the application, changes to the DB schema, globally

This is an HP Labs paper, though what's most interesting to me is how these clouds are being driven by consumer-facing companies (such as Amazon and Google) that claim the enterprise state of the art won't cut it at scale. This seems to me like the desktop computer revolution of the eighties or the email/web of the nineties: consumer tech that infects the enterprise. Or maybe that's selection bias from reading blogs and silly-valley articles; perhaps there's a ton of interesting stuff happening in the IT/datacenter sectors that doesn't get aired much.

Another interesting thing is that my thinking about capacity planning and rollout has to change. As Steve says, "you no longer have to estimate load in advance [...] and you have to embrace this dynamic world from the outset." That seems true to me. Legacy apps and containers that make strong design assumptions around N=1 datasets have been the main reason I see "just scaling out" taken off the table as an option: there is too much valley to cross to reach the next local maximum, so the path of least resistance is a bigger database.

Sometimes I wonder how a deployed n-tier scale-up monolith can be gradually refactored to a scale-out model or run on scale-out infrastructure. It's well documented that companies like Amazon and eBay have done just that, only how they did it tends to get left out of the slide decks. I suspect it involves thinking quite differently about what 'good' application code is; in that sense I have a lingering doubt about coding direct to an ORM, but less so about service layers.
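
To make the service-layer point a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `UserService` class, its `get`/`put` methods and the in-memory dicts standing in for database nodes are assumptions, not anything from the presentation or from how Amazon or eBay actually did it. The idea is only that callers who code against a service interface don't bake in the N=1 dataset assumption, so the same application code runs against one store or several.

```python
import hashlib


class UserService:
    """Hypothetical service layer: callers see get/put, never the storage layout."""

    def __init__(self, shard_count=1):
        # Each dict stands in for a separate database node (an assumption
        # for illustration; a real shard would sit behind a network call).
        self._shards = [{} for _ in range(shard_count)]

    def _shard_for(self, user_id):
        # Use a stable hash so the same id always maps to the same shard,
        # even across processes (Python's built-in hash() is salted for str).
        digest = hashlib.sha1(user_id.encode("utf-8")).hexdigest()
        return self._shards[int(digest, 16) % len(self._shards)]

    def put(self, user_id, profile):
        self._shard_for(user_id)[user_id] = profile

    def get(self, user_id):
        return self._shard_for(user_id).get(user_id)

    def shard_sizes(self):
        return [len(shard) for shard in self._shards]


def register_users(service, names):
    # Application code: identical whether the service has 1 shard or 8.
    for name in names:
        service.put(name, {"name": name})


if __name__ == "__main__":
    scale_up = UserService(shard_count=1)
    scale_out = UserService(shard_count=4)
    names = ["user%d" % i for i in range(100)]
    register_users(scale_up, names)
    register_users(scale_out, names)
    print(scale_up.shard_sizes(), scale_out.shard_sizes())
```

Simple modulo hashing like this reshuffles most keys when the shard count changes, so it only shows the shape of the interface, not a migration strategy; the organisational question of how to get an existing monolith there is untouched by it.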

I would have added one other invalidation:

  • virtual hosts are only for testing and development
but perhaps that one is already obvious :)

5 Comments


    I really think we'd be a lot further along in the world if every developer were forced to actually deploy and monitor their own code. Better yet, be a sysadmin for a year or two and do it for other people's code.

    Anyway.


    @Mark - well said!

    @dehora - you know where I'm coming from on that :)


    @Mark: yes!

    @Murf: you already *know* I am so maintenance!


    Hi Bill

    "It's well documented that companies like Amazon and eBay have done just that, only how they did it tends to get left out of the slide decks..."

    I happened to be at the last QCon event, at which eBay, Amazon and Yahoo presented how they scale their applications. You can find a summary of what I saw as the recurring patterns for scaling here: http://natishalom.typepad.com/nati_sh... I also happened to give a presentation during that same event on "how to transition existing tier based application into scale out model". The presentation is available online here: http://qcon.infoq.com/sanfrancisco/pr...

    Below is a short snippet from my summary of how Amazon, Yahoo and eBay scale their applications:

    - Asynchronous event-driven design: Avoid as much as possible any synchronous interaction with the data or business logic tier. Instead, use an event-driven approach and workflow

    - Partitioning/Shards: You need to design your data model so that it will fit the partitioning model

    - Parallel execution: Parallel execution should be used to get the most out of the available resources...

    - Replication (read-mostly): In read-mostly scenarios (LinkedIn seems to fall into this category well), database replication can help load-balance the read load by splitting the read requests among the replicated database nodes

    - Avoid the use of distributed transactions....

    - Move the database to the background: There was violent agreement that the database bottleneck can only be solved if database interactions happen in the background.

    Quoting Werner Vogels (Amazon): "To scale: No direct access to the database anymore. Instead data access is encapsulated in services (code and data together), with a stable, public interface."

    HTH

    Nati S.


    Nati, thanks for the links! I guess I should have been clearer: I'm aware of the what and the techniques, but I'm much, much fuzzier on the organisational/project processes of doing so.

