Scalability, Availability & Stability Patterns

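I. Questions a demanding reader should ask. (Items below are tagged "Study: n, mastery level: x". Mastery levels: unfamiliar (aware) -> comprehend (understand) -> proficient -> expert (knows why) -> fluent (fully integrated))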

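1.1 What are the Scalability, Availability & Stability Patterns?
1.2 What does each of these patterns cover?
1.2.1 What do the Scalability Patterns say about State and Behavior? Is it a brief introduction or does it go into some depth?
1.2.2 What do the Availability Patterns cover?
1.2.3 What do the Stability Patterns cover?
The slide deck gives a fairly comprehensive but shallow tour of current architectural ideas. It only broadens one's view of architecture design; to apply these ideas well, you have to pick one or two promising topics that interest you and study them in depth.
1.3 (3) Does this book make sense? All of it, or only part of it? Why?
The author's ultimate goal, plus the methods he proposes for reaching it: this depends on what you think is worth pursuing and what you consider the best way to pursue it.
When should these patterns be used? How? Which problems do they solve, and which do they not solve?
1.4 (4) Once you agree with a practical book, you really do need to act, in the way the author hopes you will. How?
Action: an activity carried out to achieve some purpose. Consider the goal, the method, the start and end times, who acts, where, and in what manner.
When designing an architecture, will taking these factors into account improve the system's Scalability, Availability & Stability?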

II. While studying, try hard to find answers to these questions; the deeper you think about them, the more you gain:

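2.1 What are the Scalability, Availability & Stability Patterns? http://www.jdon.com/jivejdon/thread/38928
2.1.1 Scalability (Study: 1, mastery level: understand) Scale up/Scale out
    Scalability is the ability to grow service capacity linearly (in the ideal case) by adding resources. The key trait of a scalable application is that extra load only requires more resources, not extensive changes to the application itself.
    In large systems it is very hard to predict the eventual number of users and their behavior; scalability is the system's ability to adapt to a growing number of users. The most direct way to raise the number of concurrent sessions is to add resources (CPU, memory, disk, etc.). Clustering is another approach: it groups a set of servers so that they share a heavy workload as if they were a single server.
    Although raw performance matters in determining how many users an application can support, scalability and performance are two separate things. In fact, performance gains can sometimes work against scalability.
      Designing scalable high-performance systems: http://www.jdon.com/jivejdon/thread/40668
      Scalability best practices: http://www.jdon.com/jivejdon/thread/37793
      The CAP theorem and Eventually Consistent explained: http://www.jdon.com/jivejdon/thread/37999
      BASE (Basically Available, Soft state, Eventually consistent)
      Do you really understand what scalability is? http://developer.51cto.com/art/200710/57496.htm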
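2.1.2 Availability (Study: 1, mastery level: understand)
    ISO 9241-11 defines usability (a related but distinct sense of the same Chinese term 可用性) as: the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.
    GB/T 3187-97 defines availability as: the ability of a product, provided the required external resources are guaranteed, to be in a state to perform a required function under stated conditions at a given instant or over a given time interval. It is a combined reflection of the product's reliability, maintainability and maintenance support.
    In practice Shackel's (1991) definition of usability is commonly used: a technology's "capability (in human functional terms) to be used easily and effectively by the specified range of users, given specified training and user support, to fulfil the specified range of tasks, within the specified range of environmental scenarios."
    A single-server solution is not robust because it is a single point of failure. Critical applications such as banking and billing cannot tolerate even a few minutes of downtime. They need services that are reachable at any time and that respond within a predictable, reasonable period. A cluster achieves high availability by adding redundant servers, so that service continues after one server fails.
       Availability: http://baike.baidu.com/view/1436.htm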
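2.1.3 Stability (Study: 1, mastery level: understand)
    Software stability refers to the probability that software fails during a period of continuous operation, for example failing once or a few times in a day. Whether that counts as stable depends on the specific requirements of the software.
    Software stability should be distinguished from software reliability. Stability is viewed from the development side and emphasizes a stable architecture: changes to requirements, code and so on should affect the system as little as possible, which is also the primary task of architecture design.
    Verifying this requires edge (boundary) testing, and defining and carrying out edge tests takes a lot of experience, which newcomers lack.
    An example of edge testing: in a stress test, pick values near the maximum and minimum load, and even consider values beyond the maximum and minimum.
    Metrics such as mean time between failures describe a system's reliability. Stability concerns a system's edge-case failures: the system normally runs well but occasionally shows odd problems with no obvious cause, and recovers after a restart or reinstall. Nothing fails without a reason; a problem always points to a defect somewhere, usually in the design. To ensure stability at design time, designers must think through the relationships between modules, reduce coupling, and isolate problems. Many problems arise in calls between modules. The same holds inside a module, where the biggest problems come from memory usage, though that is a coding matter. In short, a stable system needs experienced professional designers who partition the system sensibly and make the detailed design detailed enough to avoid problems during development.
       Stability: http://baike.baidu.com/view/251942.htm
       What is software stability: http://topic.csdn.net/t/20051220/22/4471364.html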

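2.2 What does each of these patterns cover?
2.2.1A What do the Scalability Patterns say about State? Is it a brief introduction or does it go into some depth?
    2.2.1.1 What problem does scalability solve? Managing Overload (Study: 1, mastery level: understand)
    2.2.1.2 What are the two ways to scale? Scale up vs Scale out (Study: 1, mastery level: understand)
    2.2.1.3 What is the general advice on scalability? (Study: 1, mastery level: understand)
           *Immutability as the default
           *Referential Transparency (FP)
           *Laziness
           *Think about your data:
              *Different data need different guarantees
    2.2.1.4 What trade-offs arise when scaling a system? Scalability Trade-offs (Study: 1, mastery level: understand)
           There is no free lunch: scaling a system has a cost.
       (1)Performance (single machine) vs Scalability (multiple machines)?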
           How do I know if I have a performance problem? If your system is slow for a single user.
           How do I know if I have a scalability problem? If your system is fast for a single user but slow under heavy load
       (2)Latency vs Throughput?
           You should strive for maximal throughput with acceptable latency
       (3)Availability vs Consistency ?
           Brewer's CAP theorem: CAP (Consistency, Availability, Partition tolerance). You can only pick 2
           *What are the CAP trade-offs for a centralized system?
             In a centralized system (RDBMS etc.) we don't have network partitions, i.e. no P in CAP
              So you get both: Availability and Consistency, e.g. ACID (Atomic, Consistent, Isolated, Durable)
           *What are the CAP trade-offs for a distributed system?
            In a distributed system we (will) have network partitions, i.e. P in CAP
            So you get to only pick one: Availability or Consistency
           *How can the CAP theorem guide practice?
            there are only two types of systems:
              1. CA == CP (they are equivalent)
              2. AP
            there is only one choice to make. In case of a network partition, what do you sacrifice?
              1. C:Consistency
              2. A:Availability
              The compromise: BASE (Basically Available, Soft state, Eventually consistent)
              Eventual Consistency is an interesting trade-off
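2.2.1.5 What do the Scalability Patterns say about State? State (persisted or in-memory data)
    2.2.1.5.1 Partitioning (Study: 1, mastery level: understand)
    2.2.1.5.2 HTTP Caching (Study: 1, mastery level: understand)
          Reverse Proxy software (each can also act as a load balancer): F5/BIG-IP
             *Varnish: a high-performance open-source HTTP accelerator, http://www.oschina.net/p/varnish/ . Verdens Gang (http://www.vg.no), Norway's largest online newspaper, replaced 12 Squid servers with 3 Varnish servers and reportedly got better performance.
             *Squid: a popular free-software (GNU GPL) proxy server and web cache. Besides HTTP, it also supports FTP and HTTPS well.
             *rack-cache
             *Pound: a reverse HTTP proxy, load balancer and SSL wrapper. It can proxy clients' HTTPS requests to HTTP back-end servers and distribute them, with session persistence and HTTP/1.1 support.
             *Nginx: Nginx ("engine x") is a high-performance HTTP and reverse proxy server, and also an IMAP/POP3/SMTP proxy server.
             *Apache mod_proxy
          CDN (Content Delivery Network): lets users fetch content from a nearby location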
          Generate Static Content:
            Precompute content:
             *Homegrown + cron or Quartz
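             *Spring Batch: a batch-processing framework; as a Spring component it uses Spring's dependency injection to set up batch jobs.
             *Gearman: a framework for distributing tasks, usable in many settings. Compared with Hadoop, Gearman focuses more on task dispatch. Distributing tasks is simple enough to be done with scripts alone.
             *Hadoop: distributed computing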
             *Google Data Protocol
             *Amazon Elastic MapReduce
    2.2.1.5.3 RDBMS Sharding (Study: 1, mastery level: understand)    (belongs to SoR: Service of Record): sharding is horizontal scaling (scale out). Sharding: http://baike.baidu.com/view/3126360.htm
          How to scale out RDBMS?
         2.2.1.5.3.1 Partitioning: for example, put User[A-C] in DB1, User[D-F] in DB2, ..., User[X-Z] in DBn
         2.2.1.5.3.2 Replication: put User[A-C] and User[D-F] in DB1, User[D-F] and User[A-C] in DB2, ..., User[N1-N2] and User[M1-M2] in DBn
         2.2.1.5.3.3 Anti-pattern: ORM + rich domain model. Attempt: read an object from the DB. Result: you sit with your whole database in your lap
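The range partitioning in 2.2.1.5.3.1 can be sketched as a small routing function. This is only an illustration: the range boundaries, the `DB1`..`DB9` names and the `shard_for` helper are assumptions made for the example, not something taken from the slides.

```python
from bisect import bisect_right

# Hypothetical shard map: SHARD_STARTS[i] is the first letter owned by SHARD_NAMES[i].
# The boundaries (A-C, D-F, ..., X-Z) are illustrative only.
SHARD_STARTS = ["A", "D", "G", "J", "M", "P", "S", "V", "X"]
SHARD_NAMES = [f"DB{i + 1}" for i in range(len(SHARD_STARTS))]


def shard_for(username: str) -> str:
    """Return the database that owns a user, chosen by the first letter of the name."""
    first = username[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError("username must start with an ASCII letter")
    # bisect_right counts the range starts that are <= first;
    # the last of those ranges is the one that contains the letter.
    index = bisect_right(SHARD_STARTS, first) - 1
    return SHARD_NAMES[index]
```

With this map, `shard_for("alice")` routes to `DB1` and `shard_for("Dave")` to `DB2`. A fixed range map is easy to reason about but can become unbalanced when names cluster on a few letters, which is one reason hash-based schemes are often used instead.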
    2.2.1.5.4 NOSQL: Not Only SQL (belongs to SoR: Service of Record) (Study: 1, mastery level: understand)
       2.2.1.5.4.1 Think about your data?
           When do you need ACID?
           When is Eventually Consistent a better fit?
           Different kinds of data have different needs
           When is a RDBMS not good enough?
           Scaling reads to a RDBMS is hard!
           Scaling writes to a RDBMS is impossible
           Do we really need a RDBMS? Many times we don't.
           Who's ACID?
             Relational DBs(MySQL,Oracle, Postgres)\Object DBs(Gemstone, db4o)\Clustering products(Coherence, Terracotta)\Most caching products(ehcache)
           Who's BASE?
             Distributed databases:
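                *Cassandra: a hybrid non-relational database similar to Google's BigTable. Developed at Facebook and later open-sourced
                *Riak: a Dynamo-like distributed key-value system developed by Basho, known for its distribution, horizontal scalability and fault tolerance.
                *Voldemort: a Dynamo-style distributed key-value store from LinkedIn.
                *Dynomite: a distributed key-value store written in Erlang.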
                *SimpleDB：Amazon SimpleDB
           NOSQL in the wild:
                *Google:Bigtable\Amazon:Dynamo \ Amazon:SimpleDB\Yahoo:HBase\Microsoft:Dynomite\Facebook:Cassandra\LinkedIn:Voldemort
       2.2.1.5.4.2 Chord & Pastry:
             Distributed Hash Tables(DHT) \ Scalable\ Partitioned\ Fault-tolerant\ Decentralized\ Peer to Peer\ Popularized(Node ring/Consistent Hashing)
           Bigtable
             How can we build a DB on top of Google File System?
             Paper:Bigtable: A distributed storage system for structured data, 2006
             Rich data-model, structured storage
             Clones: HBase|Hypertable|Neptune
           Dynamo:
             How can we build a distributed hash table for the data center?
             Paper:Dynamo:Amazon's highly available key-value store, 2007
             Focus:partitioning,replication and availability
             Eventually Consistent
             Clones:Voldemort|Dynomite
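The node ring / consistent hashing idea mentioned under Chord & Pastry, and used by Dynamo-style stores, can be sketched as follows. This is a minimal illustration under stated assumptions: the `HashRing` class, the `ring_position` helper, the choice of SHA-256 and the 64 virtual nodes per server are all choices made for this example, not details from the slides or from any of the products listed.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a 64-bit position on the ring (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), replicas=64):
        self._replicas = replicas   # virtual nodes per physical node
        self._positions = []        # sorted ring positions
        self._owner = {}            # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._replicas):
            pos = ring_position(f"{node}#{i}")
            if pos in self._owner:
                continue            # ignore the (very unlikely) position collision
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove_node(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key: str) -> str:
        if not self._positions:
            raise LookupError("ring is empty")
        # Walk clockwise to the first virtual node after the key, wrapping around.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

The property that makes the ring attractive for partitioning is that removing a node only reassigns the keys that node owned; keys on the remaining nodes stay where they were.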
       2.2.1.5.4.3 Types of NOSQL stores
             Key-Value databases(Voldemort,Dynomite)
             Column databases(Cassandra,Vertica)
             Document databases(MongoDB,CouchDB)
             Graph databases(Neo4J,AllegroGraph)
             Datastructure databases(Redis,Hazelcast)
      2.2.1.5.5 Distributed Caching (Study: 1, mastery level: understand)
           2.2.1.5.5.1 Write-through
           2.2.1.5.5.2 Write-behind
           2.2.1.5.5.3 Eviction Policies
              TTL(time to live)
              Bounded FIFO(first in first out)
              Bounded LIFO(last in first out)
              Explicit cache invalidation
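Three of the eviction policies above (TTL, bounded FIFO and explicit invalidation) can be combined in one small in-process cache. The sketch below is illustrative only: the `TtlFifoCache` class and its parameters are hypothetical names for this example, and a real distributed cache would add networking, locking and replication on top.

```python
import time
from collections import OrderedDict


class TtlFifoCache:
    """Hypothetical cache: bounded FIFO eviction plus a per-entry time to live."""

    def __init__(self, max_entries: int, ttl_seconds: float, clock=time.monotonic):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so the TTL can be tested
        self._items = OrderedDict()    # key -> (expires_at, value), in insertion order

    def put(self, key, value) -> None:
        if key in self._items:
            del self._items[key]               # re-inserting counts as a new entry
        elif len(self._items) >= self._max_entries:
            self._items.popitem(last=False)    # bounded FIFO: drop the oldest entry
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:        # TTL: the entry has expired
            del self._items[key]
            return default
        return value

    def invalidate(self, key) -> None:
        """Explicit cache invalidation."""
        self._items.pop(key, None)
```

Reads do not reorder entries here, which is what distinguishes FIFO from an LRU policy.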
           2.2.1.5.5.4 Replication
           2.2.1.5.5.5 Peer-To-Peer(P2P)
              Decentralized
              No "special" or "blessed" nodes
              Nodes can join and leave as they please
           2.2.1.5.5.6 Distributed Caching Products:
              EHCache
              JBoss Cache
              OSCache
              memcached: Simple\Key-Value(string->binary)\Clients for most languages\Distributed\Not replicated - so 1/N chance for local access in cluster
      2.2.1.5.6 Data Grids/Clustering: Parallel data storage (Study: 1, mastery level: aware)
              Data replication
              Data partitioning
              Continuous availability
              Data invalidation
              Fail-over
              C+A in CAP
           Data Grids/Clustering Products:
              Coherence/Terracotta/GigaSpaces/GemStone/Hazelcast/Infinispan
      2.2.1.5.7 Concurrency
         2.2.1.5.7.1 Shared-State Concurrency (Study: 1, mastery level: understand)
            Every one can access anything anytime
            Totally indeterministic
            Introduce determinism at well-defined places...
            ... using locks
          Problems with locks:
            Locks do not compose
            Taking too few locks
            Taking too many locks
            Taking the wrong locks
            Taking locks in the wrong order
            Error recovery is hard
          Please use java.util.concurrent.*:
            ConcurrentHashMap/BlockingQueue/ConcurrentLinkedQueue/ExecutorService/ReentrantReadWriteLock/ParallelArray/and much much more..
         2.2.1.5.7.2 Message-Passing Concurrency (Study: 1, mastery level: understand)
          *Actors: in Erlang everything is an actor, and actors communicate only by sending messages
             Implemented in Erlang, Occam,Oz
             Encapsulates state and behavior
             Closer to the definition of OO than classes
             Share NOTHING
             Isolated lightweight processes
             Communicates through messages
             Asynchronous and non-blocking
             No shared state   ... hence, nothing to synchronize.
             Each actor has a mailbox(message queue)
             Easier to reason about
             Raised abstraction level
             Easier to avoid: race conditions, deadlocks, starvation, live locks
          *Actor libs for the JVM:
             Akka(Java/Scala)/scalaz actors(Scala)/Lift Actors(Scala)/Scala Actors(Scala)/Kilim(Java)/Jetlang(Java)/Actors'Guild(Java)/Actorom(Java)/FunctionalJava(Java)/GPars(Groovy)
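The actor properties listed above (private state, a mailbox, asynchronous message sends, nothing shared) can be imitated with a thread and a queue. This is a toy sketch and not how Erlang or Akka implement actors; the `CounterActor` class and its message names are made up for the example.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state that is only touched by its own thread."""

    _STOP = object()  # internal sentinel message

    def __init__(self):
        self._mailbox = queue.Queue()   # each actor has a mailbox (message queue)
        self._count = 0                 # encapsulated state, never shared directly
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message, reply_to=None) -> None:
        """Fire-and-forget send: enqueue the message and return immediately."""
        self._mailbox.put((message, reply_to))

    def stop(self) -> None:
        self._mailbox.put((self._STOP, None))
        self._thread.join()

    def _run(self) -> None:
        # Messages are handled one at a time, so the state needs no locks.
        while True:
            message, reply_to = self._mailbox.get()
            if message is self._STOP:
                break
            if message == "increment":
                self._count += 1
            elif message == "get" and reply_to is not None:
                reply_to.put(self._count)   # the reply is itself a message
```

A caller that wants an answer passes its own `queue.Queue()` as `reply_to` and waits on it, which corresponds to the "async send + wait on Future" style.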
         2.2.1.5.7.3 Dataflow Concurrency (Study: 1, mastery level: aware)
             Declarative
             No observable non-determinism
             Data-driven - threads block until data is available
             On-demand, lazy
             No difference between:
                 Concurrent&Sequential code
             Limitations: can't have side-effects
         2.2.1.5.7.4 Software Transactional Memory (Study: 1, mastery level: aware)
             See the memory (heap and stack) as a transactional dataset
             Similar to a database: begin, commit, abort/rollback
             Transactions are retried automatically upon collision
             Rolls back the memory on abort
             Transactions can nest
             Transactions compose
         Transactions restrictions: All operations in scope of a transaction: Need to be idempotent
         STM libs for the JVM:
             Akka(Java/Scala)
             Multiverse(Java)
             Clojure STM(Clojure)
             CCSTM(Scala)
             Deuce STM(Java)

   2.2.1B What do the Scalability Patterns say about Behavior?
        2.2.1B.1 Event-Driven Architecture (Study: 1, mastery level: aware)
            2.2.1B.1.1 Domain Events
            2.2.1B.1.2 Event Sourcing
            2.2.1B.1.3 Command and Query Responsibility Segregation(CQRS) pattern
                    in a nutshell
            2.2.1B.1.4 Event Stream Processing
            2.2.1B.1.5 Messaging
                Publish-Subscribe
                Point-to-Point
                Store-forward
                Request-Reply
                Standards: AMQP (Advanced Message Queuing Protocol) and JMS (Java Message Service)
                Products: RabbitMQ(AMQP)/ActiveMQ(JMS)/Tibco/MQSeries/etc
            2.2.1B.1.6 Enterprise Service Bus
                Products: ServiceMix(Open Source)|Mule(Open Source)|Open ESB(Open Source)|Sonic ESB|WebSphere ESB|Oracle ESB|Tibco|BizTalk Server
            2.2.1B.1.7 Actors
                Fire-forget:Async send
                Fire-And-Receive-Eventually:Async send + wait on Future for reply
            2.2.1B.1.8 Enterprise Integration Architecture(EIA)
                Reference book: Enterprise Integration Patterns
                Apache Camel: More than 80 endpoints/XML(Spring) DSL/Scala DSL
        2.2.1B.2 Compute Grids (Study: 1, mastery level: aware)
            Parallel execution
            Automatic provisioning
            Load balancing
            Fail-over
            Topology resolution
            Products:
                Platform/DataSynapse/Google MapReduce/Hadoop/GigaSpaces/GridGain
        2.2.1B.3 Load-balancing (Study: 1, mastery level: understand)
            Random allocation
            Round robin allocation
            Weighted allocation
            Dynamic load balancing:
                Least connections
                Least server CPU
                etc.
            DNS Round Robin(simplest): Ask DNS for IP for host/Get a new IP every time
            Reverse Proxy(better)
            Hardware Load Balancing
            Load balancing products:
                Reverse Proxies: Apache mod_proxy(OSS)|HAProxy(OSS)|Squid(OSS)|Nginx(OSS)|LVS
                Hardware Load Balancers: BIG-IP|Cisco
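Two of the allocation strategies above, round robin and least connections, are simple enough to sketch in a few lines. The class names and server labels below are hypothetical and chosen for illustration; real balancers such as HAProxy or Nginx also handle health checks, weights and timeouts.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Dynamic balancing: pick the server with the fewest active connections."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}
        if not self._active:
            raise ValueError("at least one server is required")

    def acquire(self):
        # On a tie, min() returns the server that was listed first.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server) -> None:
        if self._active[server] > 0:
            self._active[server] -= 1
```

Round robin needs no feedback from the servers, while least connections requires the caller to report when a connection ends (`release`), which is the price of adapting to uneven request durations.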
        2.2.1B.4 Parallel Computing (Study: 1, mastery level: aware)
            SPMD Pattern:
                Single Program Multiple Data
                Very generic pattern, used in many other patterns
                Use a single program for all the UEs
                Use the UE's ID to select different pathways through the program, e.g.:
                    Branching on ID
                    Use ID in loop index to split loops
                Keep interactions between UEs explicit
            Master/Worker Pattern
                Good scalability
                Automatic load-balancing
                How to detect termination? Bag of tasks is empty/ Poison pill
                If we bottleneck on single queue? Use multiple work queues/ Work stealing
                What about fault tolerance? Use "in-progress" queue
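The Master/Worker pattern with poison-pill termination can be sketched with threads and a shared task queue. This is a minimal illustration: `run_master_worker`, `POISON_PILL` and the default of four workers are names and values assumed for the example, and it leaves out the multiple queues, work stealing and "in-progress" queue mentioned above.

```python
import queue
import threading

POISON_PILL = object()  # sentinel telling a worker to terminate


def run_master_worker(tasks, work, num_workers=4):
    """Run work(task) for every task on a pool of workers; return (task, result) pairs."""
    task_queue = queue.Queue()   # the shared "bag of tasks"
    results = queue.Queue()

    def worker():
        while True:
            task = task_queue.get()
            if task is POISON_PILL:
                return                        # termination detected
            results.put((task, work(task)))   # an exception here would kill this worker

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for task in tasks:
        task_queue.put(task)
    for _ in threads:
        task_queue.put(POISON_PILL)           # one pill per worker, after all real tasks
    for thread in threads:
        thread.join()

    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected
```

Load balancing is automatic in the sense that a fast worker simply pulls more tasks from the queue. Results arrive in completion order, so callers that need a stable order should sort them.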
            Loop Parallelism Pattern
            Fork/Join Pattern
            MapReduce Pattern
            UE: Unit of Execution: Process/Thread/Coroutine/Actor

2.2.2 What do the Availability Patterns cover?
      What do we mean with Availability ?
    2.2.2.1 Fail-over (Study: 1, mastery level: understand)
            simple Fail-over
            complex Fail-over
            Network fail-over

Fail-Fast

Wikipedia: http://en.wikipedia.org/wiki/Fail-fast

Fail-fast is a property of a system or module with respect to its response to failures. A fail-fast system is designed to immediately report at its interface any failure or condition that is likely to lead to failure. Fail-fast systems are usually designed to stop normal operation rather than attempt to continue a possibly flawed process. Such designs often check the system's state at several points in an operation, so any failures can be detected early. A fail-fast module passes the responsibility for handling errors, but not detecting them, to the next-higher system design level.

Literally "fail fast": surface possible errors as early as possible. The contrasting approach is "fault-tolerant". Take the fail-fast behavior of Java collections as an example: when multiple threads operate on the same collection, a fail-fast event can occur. For instance, if thread A is iterating over a collection through an iterator and another thread changes the collection's contents, thread A's next access throws ConcurrentModificationException, which is a fail-fast event. (The same exception can also occur within a single thread that modifies the collection while iterating over it, and detection is best-effort.)

Fail-Over

Wikipedia: http://en.wikipedia.org/wiki/Failover

In computing, failover is switching to a redundant or standby computer server, system, hardware component or network upon the failure or abnormal termination of the previously active application, server, system, hardware component, or network. Failover and switchover are essentially the same operation, except that failover is automatic and usually operates without warning, while switchover requires human intervention.

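Fail-over means switching over on failure: a backup mode of operation in which a primary component's function is transferred to a standby component when the primary fails. The key points are that there is a primary and a standby, and that the standby can be activated and promoted to primary when the primary fails. For example, in MySQL's dual-master setup, when the master in use fails, the standby master can take over as the master.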
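The primary/standby switch described above can be sketched as a thin client wrapper. This is a toy illustration under stated assumptions: `FailoverClient` is a hypothetical name, the two backends are plain callables, and a `ConnectionError` is taken as the only sign that the active backend has failed. Real failover also has to deal with health checks, split-brain and data synchronization.

```python
class FailoverClient:
    """Hypothetical client that promotes the standby when the active backend fails."""

    def __init__(self, primary, standby):
        self._active = primary     # backend currently serving requests
        self._standby = standby    # redundant backend kept in reserve

    def call(self, request):
        try:
            return self._active(request)
        except ConnectionError:
            # Automatic failover: promote the standby and retry the request once.
            self._active, self._standby = self._standby, self._active
            return self._active(request)
```

After the swap, later calls go straight to the promoted backend. If that backend also fails, the error from the retry propagates to the caller.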

Fail-Safe

Wikipedia: http://en.wikipedia.org/wiki/Fail-safe

A fail-safe or fail-secure device is one that, in the event of failure, responds in a way that will cause no harm, or at least a minimum of harm, to other devices or danger to personnel.

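Fail-safe means that even when a failure occurs, no harm is caused or harm is kept to a minimum. A vivid example from Wikipedia is the traffic-light "conflict monitor unit": when it detects faulty or conflicting signals, it switches the intersection to a flashing error mode instead of showing green in all directions.

Also, the term we have been misusing for "automatic feature degradation" would be better translated as "Auto-Degrade".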

     2.2.2.2 Replication (Study: 1, mastery level: understand)
        *Active replication - Push
        *Passive replication - Pull
           * Data not available, read from peer, then store it locally
             Works well with timeout-based caches
          Master-Slave
          Tree replication
          Master-Master
          Buddy Replication

2.2.3 What do the Stability Patterns cover? (Study: 1, mastery level: aware)
      2.2.3.1 Timeouts: Always use timeouts (if possible)
      2.2.3.2 Circuit Breaker
      2.2.3.3 Let-it-crash
      2.2.3.4 Fail fast
      2.2.3.5 Bulkheads
      2.2.3.6 Steady State
      2.2.3.7 Throttling
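Of the stability patterns listed, the Circuit Breaker is the one that benefits most from a concrete sketch. The version below is a minimal illustration: the class names, the default thresholds and the single-trial "half-open" behavior are assumptions made for this example, not the design of any particular library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after repeated failures, retries after a timeout."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock          # injectable so the timeout can be tested
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                # Open: fail fast without touching the struggling dependency.
                raise CircuitOpenError("circuit is open")
            # Timeout elapsed: half-open, let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()   # open, or re-open after a failed trial
            raise
        self._failures = 0
        self._opened_at = None       # success closes the circuit
        return result
```

The breaker combines two other items from the list: it fails fast while open, and the reset timeout plays a role similar to a timeout on the dependency as a whole.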

2.2.4 Extra material (Client-side consistency|Server-side consistency) (Study: 1, mastery level: understand)
     Client-side consistency
     Server-side consistency

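III. (3) Does this book make sense? All of it, or only part of it? Why?

   This question cannot be answered until you can answer the two questions above.
   The author's ultimate goal, plus the methods he proposes for reaching it: this depends on what you think is worth pursuing and what you consider the best way to pursue it.
   When should these patterns be used? How? Which problems do they solve, and which do they not solve?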

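   3.1.1 The author's ultimate goal: to give software architects and product managers an overview of today's mainstream architectural patterns.
   3.1.2 The method he proposes: first, understand the basic architectural qualities, namely the Scalability, Availability and Stability patterns;
         next, present the technical ideas behind each pattern, as material to draw on when designing software;
         then, present concrete (open-source) products for each pattern, as references during design.
   3.1.3 When are these patterns used? In the design phase of the development cycle (especially architecture design and coarse-grained technology selection).
   3.1.4 How? This document is only a starting point; actually using the patterns requires studying specific products (which usually come with a demo, a reference and API help).
   3.1.5 Which problems do they solve, and which not? They help with architecture design and technology selection, making it more likely that you find an approach that meets the design goals.
         Not solved: exactly how to design, how to choose technology, how to study the chosen products, how to code, and so on.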

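IV. (4) Once you agree with a practical book, you really do need to act, in the way the author hopes you will. How?

Action: an activity carried out to achieve some purpose. Consider the goal, the method, the start and end times, who acts, where, and in what manner.
When designing an architecture, will taking these factors into account improve the system's Scalability, Availability & Stability?
   4.1 My purpose in reading this material:
       1. In discussions of current mainstream architecture design, understand the concepts others mention, be able to discuss the related technologies, and improve my own design skills
       2. Apply these ideas in my own product designs to build better products
   4.2 What concrete actions I should take:
       1. In technical exchanges, be able to explain the relevant points and the reasons behind them
       2. Apply these patterns to raise design quality in large distributed architectures, network-management architectures, center architectures, and so on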

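Jdon's description of this topic

In this slide deck you will find many terms that have been discussed on this site:
Distributed caching; data grid computing; NoSQL; RDBMS; HTTP caching such as CDN and reverse proxies; the CAP theorem; concurrency models (message passing, software transactional memory, dataflow concurrency, shared-state concurrency); partitioning; replication; EDA (event-driven architecture); load balancing; parallel computing (MapReduce pattern, Fork/Join pattern).

Because such comprehensive coverage is rare, it is worth reading several times to sort out the threads. I will go through it step by step below:

(1) Scalability. Scalability revolves around the keyword "state". In 2006 I wrote an article, "State objects: a replacement for the database"; I had already vaguely sensed that state was the main thread, and it is gratifying to see it fully laid out in this deck. State is further divided into:
partitioning, HTTP caching, RDBMS sharding, NoSQL, distributed caching, data grids, and concurrency.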

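The deck points out that scalability is no free lunch and requires trade-offs among the following:
Performance vs scalability
What is a performance problem? If your system is slow even for a single user, that is a performance problem.
What is a scalability problem? If your system is fast for a single user but slow under heavy load.

Latency vs throughput
Your goal should be maximal throughput with acceptable latency.

Availability vs consistency
This is the CAP theorem; a traditional centralized relational database can only provide CA. Many slides discuss NoSQL, which this site has already introduced and largely covered.

State
On state, the deck first covers HTTP caching and reverse proxies: Varnish, Squid, Nginx and mod_proxy are all popular, and a CDN places state servers as close to the client as possible.

Static page generation falls mainly under "Precompute content". Many people like to turn dynamic pages into static HTML, adding AJAX for the asynchronous parts; this is also a way to improve scalability. Static pages are poor at real-time freshness and suit content that can be computed in advance. Precomputation can use plain crontab or Java's Quartz framework, Gearman, Hadoop, the Google Data Protocol, or Amazon Elastic MapReduce. Setting HTTP headers so that the browser's local cache is used with longer expiry times is usually filed under SEO in China, but it is also a small part of scalability.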

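(2) Availability here appears to mean what we usually call reliability. It includes replication and failover (formerly called clustering).

What is availability? Running online 99.999% of the time, 24x7. The deck covers the complexity of failover as well as fail-back.

Replication is divided into active replication (push) and passive replication (pull), and comes in four topologies: master-slave, master-master, tree and buddy. These are the synchronization strategies used by MySQL, Oracle and CP-oriented NoSQL databases.
Master-slave: the master handles reads and writes, and multiple slaves can serve reads. Master-master: both nodes handle reads and writes. Buddy replication pairs nodes one-to-one, similar to WebLogic's clustering strategy.

(3) Stability includes let-it-crash (the Akka framework), SEDA and throttling.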

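Other
The deck groups traditional relational databases and NoSQL under the Service of Record (SoR) pattern and discusses horizontal and vertical scaling; RDBMS sharding includes partitioning and replication.

The deck regards ORM + rich domain model as an anti-pattern that leaves you spending your effort tending the database.
The way to avoid it: rethink your data. When do you need ACID, and when can you benefit from eventual consistency? Different kinds of data have different needs. See the earlier discussion on this site: "ORM is a thing of the past".

The deck holds that, besides relational databases, object databases such as db4o and clustering products such as Terracotta and Ehcache are also ACID.

On caching, it explains write-through, write-behind, and cache eviction policies such as first-in-first-out (FIFO).

For message passing, it mentions the actor model of Erlang and Scala, first proposed by Carl Hewitt in 1973, which fits OO better than the traditional class concept.

Traits of the actor model: share nothing; isolated lightweight processes that communicate through messages; asynchronous and non-blocking, because with nothing shared there is nothing to synchronize. Each actor has a mailbox.

On dataflow concurrency, it notes that data is loaded on demand and lazily (jdonframework uses domain events to fetch data as needed; see its slides).

The deck explicitly places domain events under EDA, along with the CQRS we have discussed. It also explains Event Sourcing: if events are recorded, there is no need for an ORM; persisting the events is enough.

In short, the deck summarizes recent hot patterns and is well worth a look.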

Scalability, Availability & Stability Patterns

http://www.jdon.com/jivejdon/thread/38928
http://www.slideshare.net/jboner/scalability-availability-stability-patterns

Good links on architecture:

A collection of PPTs and articles on website architecture (updated 2009-7-15), by an expert at Taobao

http://www.blogjava.net/BlueDavy/archive/2009/04/28/267970.html

http://www.blogjava.net/BlueDavy/