What is RAMCloud?

RAMCloud is a new class of storage for large-scale datacenter applications. It is a key-value store that keeps all data in DRAM at all times (it is not a cache like memcached). Furthermore, it takes advantage of high-speed networking such as Infiniband or 10Gb Ethernet to provide very high performance. Applications running in the same datacenter as a RAMCloud cluster can access small objects in about 5μs, which is 1000x faster than disk-based storage systems. Small writes take about 15μs. At the same time, RAMCloud storage is durable: data is automatically replicated on nonvolatile secondary storage such as disk or flash, so it is not lost when servers crash. One of RAMCloud's unique features is that it recovers very quickly from server crashes (only 1-2 seconds) so the availability gaps after crashes are almost unnoticeable. Finally, RAMCloud is designed to scale: it can support clusters containing thousands of storage servers, with total capacities up to a few petabytes.

From a practical standpoint, RAMCloud enables a new class of applications that manipulate large data sets very intensively. Using RAMCloud, an application can combine tens of thousands of items of data in real time to provide instantaneous responses to user requests.  Unlike traditional databases, RAMCloud scales to support very large applications, while still providing a high level of consistency. We believe that RAMCloud, or something like it, will become the primary storage system for structured data in cloud computing environments such as Amazon's AWS or Microsoft's Azure. We have built the system not as a research prototype, but as a production-quality software system, suitable for use by real applications.
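
As a rough illustration of why low latency matters for this style of application, the following back-of-the-envelope sketch estimates how long it takes to fetch 10,000 small objects one after another. It uses only the figures quoted above (about 5μs per read, and disk-based storage being about 1000x slower). The function name and the assumption that reads are issued strictly sequentially, with no batching or parallelism, are ours for illustration; this is not a measurement.

```python
# Back-of-the-envelope estimate; not a benchmark.
READ_LATENCY_US = 5    # small-object read latency quoted above (microseconds)
DISK_SLOWDOWN = 1000   # "1000x faster than disk-based storage" quoted above

def sequential_read_time_ms(num_reads, latency_us=READ_LATENCY_US):
    """Total time in milliseconds for num_reads reads issued one after another.

    Hypothetical helper: assumes no batching, pipelining, or parallelism.
    """
    return num_reads * latency_us / 1000

ramcloud_ms = sequential_read_time_ms(10_000)
disk_ms = sequential_read_time_ms(10_000, READ_LATENCY_US * DISK_SLOWDOWN)
print(f"RAMCloud: {ramcloud_ms:.0f} ms, disk-based: {disk_ms / 1000:.0f} s")
# prints "RAMCloud: 50 ms, disk-based: 50 s"
```

Under these assumptions, combining ten thousand items fits within an interactive response time on RAMCloud, while the same access pattern against disk-based storage would take close to a minute.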

RAMCloud is also interesting from a research standpoint. Its two most important attributes are latency and scale. The first goal is to provide the lowest possible end-to-end latency for applications accessing the system from within the same datacenter. We currently achieve latencies of around 5μs for reads and 15μs for writes, but hope to improve these in the future. In addition, the system must scale, since no single machine can store enough DRAM to meet the needs of large-scale applications. We have designed RAMCloud to support at least 10,000 storage servers; the system must automatically manage all the information across the servers, so that clients do not need to deal with any distributed systems issues. The combination of latency and scale creates a large number of interesting research issues. To date we have addressed several of these, such as how to ensure data durability without sacrificing the latency of reads and writes, how to take advantage of the scale of the system to recover very quickly after crashes, and how to manage storage in DRAM. Many more issues remain, such as whether we can provide higher-level features such as secondary indexes and multiple-object transactions without sacrificing the latency or scalability of the system. We are currently exploring several of these issues.

The RAMCloud project is based in the Department of Computer Science at Stanford University.

Learning About RAMCloud

General information about RAMCloud, such as talks and papers. Much of the information here is related to the research aspects of the project, as opposed to information on how to use RAMCloud.

How to Deploy and Use RAMCloud

RAMCloud has now reached a level of maturity where it is suitable for production use with real applications.  The links below provide information on how to set up a RAMCloud cluster and on the RAMCloud APIs for applications.

RAMCloud Performance

Measurements of RAMCloud performance, as well as comparisons between RAMCloud and other systems.

Information for RAMCloud Developers

Information for people who are working on the RAMCloud code base; it is intended primarily for the internal use of the RAMCloud team at Stanford, but may be useful to other people as well.

The RAMCloud Test Cluster

Information about the cluster we use for RAMCloud testing at Stanford. Unfortunately not all of this information is completely up to date.

New Cluster

Design Notes

These documents were used at various points in the project to record our early ideas about various parts of the system. Most of these pages are now out of date (they typically are not updated once serious coding begins) but they may still provide useful background information as well as alternatives that we considered.

Project History, Schedules, Milestones

Ideas for Future Work

Related Topics

Miscellaneous Topics

Personal Wikis

2-week Milestones
Amendments to Current Documentation and Testing Guidelines
Ankita's Coordinator Notes List
Application APIs
Applications
Articles about RAMCloud
Assumptions
Back-of-the Envolope Calculations
Backup and Recovery Revisited
Behnam's Notes
Cache Latencies and Sizes on RAMCloud Test Cluster
Cluster Custodian
Cluster Intro
Cluster Inventory
Cluster Tasks
Coding Conventions
Concurrency model
Controlling Machines Remotely via IPMI
Coordinator
Coordinator - Design Discussions
Coordinator - Progress tracking page
Coordinator Refactoring
Copyright Notice
Creating a RAMCloud Client
Current Applications
Data model
Data Operations
Data persistence
DCFT Paper Notes
Dead Machines
Deciding Whether to Use RAMCloud
Design Meetings from Winter Quarter 2010
Design Review
Design Seminars - Spring Quarter 2009
Detecting Incomplete Logs
Directory Structure
Distributed Leases
Distribution of data among servers, replication, locality
Documentation Guidelines
Effect of Profile-Guided Optimization
Energy management
Facebook Information
Failures
FastTransport
For New Developers - Understanding Reads in RAMCloud
Future Projects and PhD Topics
Garbage Collection Resources
General Information for Developers
Glossary of RAMCloud Terms
Hash Table & Multi-Level Lookup Performance
Higher Level Data Model Notes
How To Measure Performance
How To Run Clusterperf
Index API
Infiniband Tools and Debugging
Infolunch Notes
Inf Under Load
Inside Concurrency Primitives
Intel 530 Performance
Interesting Links
Interesting Statistics
Least Usable System
Lights-out automated management
Low latency RPCs
Machine Evaluations
Measuring RAMCloud Performance
Mellanox HW and Infiniband Notes
Mellanox Performance Data
Memory benchmark for last level cache misses
Memory management within a server
Memory Prices
Mfence
Milestones from 2010
multiRead Benchmarking
Multi-tenancy
Naming and Indexing
NetBeans tips
Network substrate
New Contributor Checklist
New Infiniband Fabric Notes
Node architecture
Old Design Documents
Older Performance Measurements
Old Miscellaneous Topics
Online schema changes
Open Questions
Paper Ideas
Planning Meeting June 16, 2009
PNUTS
Primary Keys
Project History
Proposed Server API
Protocol Buffers
Python Bindings
RAMCloud 0.1
RAMCloud 0.2
RAMCloud 0.3
RAMCloud 1.0
RAMCloud Filesystem
RAMCloud Nuts and Bolts Lunch Ideas
RAMCloud Papers
RAMCloud Presentations
RAMCloud Tech Talks
Recovery
Recovery "Bin" Partitioning
Recovery Blitz
Recovery Performance
Recovery Task List
Redis vs. RAMCloud
References
Reliability
Rethinking Tombstones
Rotation and CURIS Ideas
RPC API
RPC Performance Numbers
RPC Protocol
Running LogCabin + Coordinator + Masters
Running Recoveries with
Scalability
Security
Security and access control
SEDCL Retreat 2012 - Industrial Feedback Session
SEDCL Retreat 2013 - Industrial Feedback Session - Outline Scribe
Server Memory Architecture
Server Prices
Service Locators
Setting Up a RAMCloud Cluster
Software Design Philosophy
Split of functionality between servers and clients
Spring Discussion Wrapup
SSD Experiments
Strawman
Supported Platforms
Tablet Migration
Team Members
Technical Support
The ALPO consensus protocol
The role of flash memory and other technologies
Tips from Charlie & Co
Transaction_satoshi
Transactions
Undo, redo, audit trail
Updating BIOS automatically with PXE and FreeDOS
Updating Mellanox NIC Firmware
Usability Features and Research Topics
Version Numbers
Vim Settings
Weird things discovered while running Coordinator Recovery
Wireshark Plugin
Workload Generator