usits2001 talk

Published on October 29, 2007

Author: Dabby

Source: authorstream.com

Content

Neptune: Scalable Replication Management and Programming Support for Cluster-based Network Services

Kai Shen, Tao Yang, Lingkun Chu, JoAnne L. Holliday, Douglas K. Kuschner, and Huican Zhu
Department of Computer Science, University of California, Santa Barbara
http://www.cs.ucsb.edu/research/Neptune

Motivations

- Availability, incremental scalability, and manageability are key requirements for building large-scale network services.
- Meeting them is challenging for services with frequent persistent data updates.
- Existing solutions for managing persistent data:
  - Pure data partitioning: no availability guarantee; bad at dealing with runtime hot-spots.
  - Disk sharing: inherently unscalable; single point of failure.
  - Replication provided by database vendors: tied to specific database systems; inflexible in consistency.

Neptune Project Goal

- Design a scalable clustering architecture for aggregating and replicating network services with persistent data.
- Provide a simple and flexible programming model that shields the complexity of data replication, service discovery, load balancing, and failover management.
- Provide flexible replica consistency support to address availability and performance tradeoffs for different services.

Related Work

- TACC, MultiSpace: infrastructure support for cluster-based network services.
- DDS: distributed persistent data structure for network services.
- Porcupine: cluster-based email service (with commutative updates).
- Bayou: weak consistency for wide-area applications.
- BEA Tuxedo: platform middleware supporting transactional RPC.
Outline

- Motivations & Related Work
- System Architecture and Assumptions
- Replica Consistency and Failure Recovery
- System Implementation and Service Deployments
- Experimental Studies

Partitionable Network Services

- Characteristics of network services:
  - Information independence. Service data can be divided into independent categories (e.g. discussion group).
  - User independence. Data accessed by different users tend to be independent (e.g. email service).
- Neptune targets partitionable network services:
  - Service data can be divided into independent partitions.
  - Each service access can be delivered independently on a single partition; or
  - Each service access can be aggregated from sub-services, each of which can be delivered independently on a single partition.

Conceptual Architecture for a Neptune Service Cluster

(Figure slide: architecture diagram, not preserved in this transcript.)

Neptune Components

- Neptune components on the client and server side:
  - Neptune Server Module: starts, regulates, and terminates registered service instances, and maintains replica data consistency.
  - Neptune Client Module: provides location-transparent access to application service clients.

Programming Interfaces

- Request/response communications:
  - Client-side API (called by service clients):
    NeptuneCall(CltHandle, Service, Partition, SvcMethod, Request, Response);
  - Service interface (abstract interface that application services implement):
    SvcMethod(SvcHandle, Partition, Request, Response);
- Stream-based communications: Neptune sets up a bi-directional stream between the service client and the service instance.

Assumptions

- All system modules follow the fail-stop failure model.
- Network partitions do not occur inside the service cluster.
- Neptune does allow persistent data to survive all-node failures.
- Atomic execution is supported if each underlying service module ensures atomicity in a stand-alone configuration.
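The request/response interface on the Programming Interfaces slide can be illustrated with a toy, in-process sketch. Everything below is hypothetical: `ToyNeptuneClient`, `register`, and `neptune_call` are illustrative names rather than the real Neptune API, the client and service handles are omitted, and round-robin replica choice is an assumption made only to keep the example short. The point it shows is location transparency: the caller names a service, a partition, and a method, never a node.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the service interface: (partition, request) -> response.
Handler = Callable[[int, str], str]


class ToyNeptuneClient:
    """Toy in-process stand-in for a client module (not the real Neptune API)."""

    def __init__(self) -> None:
        # (service, partition) -> one method table per registered replica
        self._replicas: Dict[Tuple[str, int], List[Dict[str, Handler]]] = {}
        # (service, partition) -> number of calls routed so far
        self._turn: Dict[Tuple[str, int], int] = {}

    def register(self, service: str, partition: int,
                 methods: Dict[str, Handler]) -> None:
        """Record one replica that hosts `partition` of `service`."""
        self._replicas.setdefault((service, partition), []).append(methods)

    def neptune_call(self, service: str, partition: int,
                     svc_method: str, request: str) -> str:
        """Route a request to some replica; the caller never names a node."""
        key = (service, partition)
        replicas = self._replicas.get(key)
        if not replicas:
            raise LookupError(f"no replica for {service!r} partition {partition}")
        # Assumption: plain round-robin instead of load-aware selection.
        turn = self._turn.get(key, 0)
        self._turn[key] = turn + 1
        return replicas[turn % len(replicas)][svc_method](partition, request)


client = ToyNeptuneClient()
client.register("discussion", 0, {"view": lambda p, req: f"replica-A:{p}:{req}"})
client.register("discussion", 0, {"view": lambda p, req: f"replica-B:{p}:{req}"})
```

Two successive calls to `client.neptune_call("discussion", 0, "view", ...)` are served by different replicas without any change on the caller's side.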
Neptune Replica Consistency Model

- A service access is called a write if it changes the state of persistent data; it is called a read otherwise.
- Level 1: Write-anywhere replication for commutative writes. Writes are accepted at any replica and propagated to peers. E.g. message board (append-only).
- Level 2: Primary-secondary replication for ordered writes. Writes are only accepted at the primary node, then ordered and propagated to secondaries.
- Level 3: Primary-secondary replication with staleness control. Soft time-based staleness bound and progressive version delivery. This is not strong consistency because writes are completed independently at each replica.

Soft Time-based Staleness Bound

- Semantics: each read is serviced at a replica that is at most x seconds stale compared to the primary.
- Important for services such as on-line auction.
- Implementation:
  - Each replica periodically announces its data version;
  - The Neptune client module directs requests only to replicas with a fresh enough version.
- The bound is soft: it depends on network latency, announcement frequency, and intermittent packet losses.

Progressive Version Delivery

- From each client's point of view:
  - Writes are always seen by subsequent reads.
  - Versions delivered for reads are progressive.
- Important for services like on-line auction.
- Implementation:
  - Each replica periodically announces its data version;
  - Each service invocation returns a version number for the service client to keep as a session variable;
  - The Neptune client module directs a read to a replica with an announced version >= all previously returned versions.

Failure Recovery

- A REDO log is maintained for each data partition at each replica. It has two portions:
  - Committed portion: completed writes;
  - Uncommitted portion: writes received but not yet completed.
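A minimal sketch of such a two-portion REDO log follows. It is illustrative only: the class and method names (`RedoLog`, `receive`, `complete`, `missed_since`) are hypothetical, writes are plain strings, and the log lives in memory, whereas a real log would be persisted. The `missed_since` helper assumes writes complete in the same order at every replica, which is an assumption of this sketch and fits the ordered (primary-secondary) levels only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RedoLog:
    """Toy REDO log for one data partition at one replica (in memory only)."""
    committed: List[str] = field(default_factory=list)    # completed writes
    uncommitted: List[str] = field(default_factory=list)  # received, not yet completed

    def receive(self, write: str) -> None:
        """A write arrives: it enters the uncommitted portion."""
        self.uncommitted.append(write)

    def complete(self, write: str) -> None:
        """The write has been applied: move it to the committed portion."""
        self.uncommitted.remove(write)
        self.committed.append(write)

    def missed_since(self, position: int) -> List[str]:
        """Committed writes that a replica holding `position` writes lacks.

        Assumes every replica completes writes in the same order.
        """
        return self.committed[position:]


# Hypothetical example: a primary that has completed three of four writes.
primary = RedoLog()
for w in ["w1", "w2", "w3", "w4"]:
    primary.receive(w)
for w in ["w1", "w2", "w3"]:
    primary.complete(w)
```

A replica that had completed only `w1` before failing could ask for `primary.missed_since(1)` to obtain the writes it needs to replay.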
- Three-phase recovery for primary-secondary replication (level 2 and level 3):
  1. Synchronize with the underlying service module;
  2. Recover missed writes from the current primary;
  3. Resume normal operations.
- Only phase one is necessary for write-anywhere replication (level 1).

Outline

- Motivations & Related Work
- System Architecture and Assumptions
- Replica Consistency and Failure Recovery
- System Implementation and Service Deployments
- Experimental Studies

Prototype System Implementation on a Linux Cluster

- Service availability and node runtime workload are announced through IP multicast: announcements are multicast once a second and kept as soft state that expires in five seconds.
- Service instances can run either as processes or as threads in the Neptune server runtime environment.
- Each Neptune server module maintains a process/thread pool and a waiting queue.

Experience with Service Deployments

- On-line discussion group: view message headers, view message, and add message. All three consistency levels can be applied.
- Auction: level 3 consistency with staleness control is used.
- Persistent cache: stores key-value pairs (e.g. caching query results). Level 2 consistency (primary-secondary) is used.
- Fast prototyping and implementation without worrying about replication/clustering complexities.

Experimental Settings for Performance Evaluation

- Synthetic workloads:
  - 10% and 50% write percentages;
  - Balanced workload to assess best-case scalability;
  - Skewed workload to evaluate the impact of runtime hotspots.
- Metric: maximum throughput at which at least 98% of client requests are completed within 2 seconds.
- Evaluation environment:
  - Linux cluster with dual 400MHz Pentium IIs, 512MB/1GB memory, and dual 100Mb/s Ethernet interfaces.
  - Lucent P550 Ethernet switch with 22Gb/s backplane bandwidth.
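The throughput metric above can be made concrete with a short sketch. The function names and the shape of the input are assumptions for illustration: each run is represented as a list of per-request response times in seconds, with `None` standing for a request that failed or never completed, and "within 2 seconds" is read as less than or equal to the deadline. Integer arithmetic is used for the 98% check to avoid floating-point rounding at the boundary.

```python
from typing import Dict, List, Optional


def meets_criterion(latencies: List[Optional[float]],
                    deadline: float = 2.0, percent: int = 98) -> bool:
    """True if at least `percent`% of requests completed within `deadline` seconds."""
    if not latencies:
        return False
    on_time = sum(1 for t in latencies if t is not None and t <= deadline)
    return on_time * 100 >= percent * len(latencies)


def max_throughput(runs: Dict[float, List[Optional[float]]]) -> float:
    """Highest throughput (requests/second) among runs that meet the criterion.

    `runs` maps a measured throughput to that run's per-request latencies.
    Returns 0.0 if no run qualifies.
    """
    passing = [rate for rate, latencies in runs.items() if meets_criterion(latencies)]
    return max(passing, default=0.0)
```

For example, a run in which 98 of 100 requests finish in half a second qualifies, while a run with 97 of 100 does not, however fast those 97 were.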
Scalability under Balanced Workload

- NoRep is about twice as fast as Rep=4 under 50% writes.
- Insignificant performance difference across the three consistency levels under balanced workload.

Skewed Workload

- Each skewed workload consists of requests chosen from a set of partitions according to a Zipf distribution.
- Define the workload imbalance factor as the proportion of the requests directed to the most popular partition.
- For a 16-partition service, an imbalance factor of 1/16 indicates a completely balanced workload.
- An imbalance factor of 1 means all requests are directed to one partition.

Impact of Workload Imbalance on Replication Degrees

- Replication provides dynamic load-sharing for runtime hot-spots (Rep=4 could be up to 3 times as fast as NoRep).
- 10% writes; level-2 consistency; 8 nodes.

Impact of Workload Imbalance on Consistency Levels

- 10% writes; Rep degree 4; 8 nodes.
- Modest performance difference:
  - Up to 12% between level-2 and level-3;
  - Up to 9% between level-1 and level-2.

Failure Recovery for Primary-secondary Replication

- Graceful performance degradation.
- Performance drop after the three-node failure.
- Errors and timeouts trailing each recovery (write recovery and sync overhead).

Conclusions

- Contributions:
  - Scalable replication for cluster-based network services; multi-level consistency with staleness control.
  - A simple programming model to shield replication and clustering complexities from application service authors.
- Evaluation results:
  - Replication improves performance for runtime hotspots.
  - Performance of level 3 consistency is competitive.
  - Level 2/3 carries extra overhead during failure recovery.

http://www.cs.ucsb.edu/research/Neptune
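As a supplementary note on the Skewed Workload slide, the workload imbalance factor can be computed directly for a Zipf-distributed request stream. The slides do not state which Zipf exponent was used, so the `exponent` parameter below is an assumption of this sketch, as is the function name. Partition popularity at rank r is taken to be proportional to 1 / r^exponent.

```python
def zipf_imbalance_factor(partitions: int, exponent: float) -> float:
    """Share of requests going to the most popular partition.

    Popularity of the rank-r partition is proportional to 1 / r**exponent
    (the exponent is an assumed parameter, not taken from the slides).
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, partitions + 1)]
    return weights[0] / sum(weights)
```

With 16 partitions, an exponent of 0 yields 1/16 (the completely balanced case named on the slide), and the factor approaches 1 as the exponent grows, matching the case where all requests hit one partition.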
