Scaling Your Data Horizontally On OpenStack

MagnetoDB: NoSQL Database As A Service

You’re talking to one of your users about hosting a new application on your OpenStack deployment. One of the first questions they ask you is: “I’m running my SQL database on 64 cores and 256GB of memory… can you host that for me on OpenStack?”

Databases on OpenStack

As I described in my last post, we’re building a huge private cloud at Symantec. Early on, before we had chosen OpenStack as our platform, we recognized a common need for a horizontally scaling database service across many of the Symantec product teams that would eventually become our customers. Some of these teams were already running and operating NoSQL clusters. Others were pushing the upper limits of vertically scaling database technologies, buying larger and larger servers or manually sharding their data across multiple physical servers.

This post isn’t about the pros and cons of relational and NoSQL databases – there are plenty of those out there. Let’s start by agreeing that both types of databases have their sweet spots, and both are necessary in an OpenStack environment. In this post I'll be talking specifically about how to provide your OpenStack users with easy access to powerful NoSQL capabilities.

OpenStack provides some great storage technologies: Swift for large-scale object storage, Trove for database provisioning, Cinder for block storage. However, we saw a need within Symantec for NoSQL as a service: a place where users could store and query lots of table data with very low latency, and without the operational overhead of managing the database cluster. This is where MagnetoDB comes in.

MagnetoDB

MagnetoDB is a fully open source, high-performance, high-throughput NoSQL database service. It satisfies the NoSQL need on OpenStack that DynamoDB fills on AWS. It provides our users with most of the benefits of running their own NoSQL cluster without… well… having to run their own NoSQL cluster. The user manages and populates their data tables via a Web services API, in a secure multi-tenant environment. We worry about how to operate and scale the underlying NoSQL database.

A MagnetoDB table has a very flexible schema: the user defines only a primary key and any secondary indexes; the other row attributes can be defined dynamically. The user can then query that table by either the key or the indexes. Tables can grow to many terabytes and billions of rows while still maintaining excellent performance. MagnetoDB also supports more advanced features like configurable consistency, conditional updates, and row expiration. We’re adding a streaming interface for higher-performing bulk load processes. And naturally MagnetoDB is natively integrated with Keystone for authentication and multi-tenancy.
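To make the schema model concrete, here is a minimal in-memory sketch of a table with a fixed primary key, optional secondary indexes, free-form attributes, and a conditional update. This is not MagnetoDB code or its API: the Table class and its method names are hypothetical, and the index lookup scans rows for brevity where a real backend would maintain an index.

```python
from typing import Any, Dict, List, Optional


class Table:
    """Toy table: only the key and the indexed attributes are declared up front."""

    def __init__(self, key: str, indexes: Optional[List[str]] = None) -> None:
        self.key = key
        self.indexes = list(indexes or [])
        self.rows: Dict[Any, Dict[str, Any]] = {}

    def put(self, row: Dict[str, Any],
            expected: Optional[Dict[str, Any]] = None) -> bool:
        """Store a row. With `expected`, write only if the stored row matches it."""
        if self.key not in row:
            raise ValueError(f"row is missing primary key {self.key!r}")
        current = self.rows.get(row[self.key])
        if expected is not None:
            if current is None:
                return False
            if any(current.get(k) != v for k, v in expected.items()):
                return False
        self.rows[row[self.key]] = dict(row)
        return True

    def get(self, key_value: Any) -> Optional[Dict[str, Any]]:
        """Look a row up by its primary key."""
        return self.rows.get(key_value)

    def query_index(self, attribute: str, value: Any) -> List[Dict[str, Any]]:
        """Look rows up by a declared secondary index (linear scan in this toy)."""
        if attribute not in self.indexes:
            raise KeyError(f"{attribute!r} is not an indexed attribute")
        return [r for r in self.rows.values() if r.get(attribute) == value]


users = Table(key="user_id", indexes=["email"])
# The two rows carry different non-key attributes; no schema change is needed.
users.put({"user_id": "u1", "email": "a@example.com", "plan": "free"})
users.put({"user_id": "u2", "email": "b@example.com", "locale": "en_US"})
# Conditional update: only upgrade u1 if the plan is still "free".
users.put({"user_id": "u1", "email": "a@example.com", "plan": "paid"},
          expected={"plan": "free"})
```

Only the key and the indexed attribute are fixed; everything else varies row by row, which is what lets the table be queried by key or by index without a full relational schema.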

True to the OpenStack ethos, MagnetoDB provides an API layer and a driver abstraction, allowing you to plug in the NoSQL database of your choice. We’ve implemented the Cassandra driver, though HBase, MongoDB, and others may also be good options. Based on our users’ requirements, we’re building our MagnetoDB and Cassandra deployments to handle up to 10 TB of storage, billions of rows, and tens of thousands of requests per second for each tenant.
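The split between API layer and driver can be sketched roughly as follows. The names here (StorageDriver, InMemoryDriver, TableService) are hypothetical and chosen for illustration; the actual MagnetoDB driver interface may look quite different. The point is only that the API layer depends on a small contract, so a backend can be swapped without touching it.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class StorageDriver(ABC):
    """Hypothetical contract that every backend driver would implement."""

    @abstractmethod
    def put_row(self, table: str, key: str, row: Dict[str, Any]) -> None:
        ...

    @abstractmethod
    def get_row(self, table: str, key: str) -> Optional[Dict[str, Any]]:
        ...


class InMemoryDriver(StorageDriver):
    """Stand-in backend; a Cassandra or HBase driver would fill the same role."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, Dict[str, Any]]] = {}

    def put_row(self, table: str, key: str, row: Dict[str, Any]) -> None:
        self._tables.setdefault(table, {})[key] = dict(row)

    def get_row(self, table: str, key: str) -> Optional[Dict[str, Any]]:
        return self._tables.get(table, {}).get(key)


class TableService:
    """API layer: knows the driver contract, not the database behind it."""

    def __init__(self, driver: StorageDriver) -> None:
        self._driver = driver

    def put_item(self, table: str, key: str, row: Dict[str, Any]) -> None:
        self._driver.put_row(table, key, row)

    def get_item(self, table: str, key: str) -> Optional[Dict[str, Any]]:
        return self._driver.get_row(table, key)


service = TableService(InMemoryDriver())
service.put_item("metrics", "cpu-host1", {"value": 0.42})
```

Swapping the backend means passing a different StorageDriver to TableService; callers of put_item and get_item would not change.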

Do You Need MagnetoDB?

Is MagnetoDB something you need? Let’s get back to your user’s question about hosting a vertically scaling relational database on OpenStack. I generally answer this question with one of my own: “Can you help me understand what you’re storing?” Sometimes the answer is that the data is truly relational. However, often much of the data would be better suited to a different type of storage.

Large blob data can go into Swift, with just the object reference stored in the database. If a majority of the remaining data can be stored in tables where joins can be avoided, MagnetoDB is likely a good fit, removing the need for ever larger machine instances. What is left, the truly relational data, is likely appropriate for a small, Trove-provisioned MySQL instance running on a much more manageably sized VM.
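As a rough illustration of the first step, the sketch below keeps a large blob in an object store and only a reference to it in the table row. Two plain dictionaries stand in for a Swift container and a MagnetoDB table, and the function names and object naming scheme are assumptions made for this example, not anything prescribed by either service.

```python
import hashlib
from typing import Any, Dict

object_store: Dict[str, bytes] = {}            # stand-in for a Swift container
profile_table: Dict[str, Dict[str, Any]] = {}  # stand-in for a MagnetoDB table


def save_profile(user_id: str, name: str, avatar: bytes) -> Dict[str, Any]:
    """Write the blob to the object store and a small row to the table."""
    digest = hashlib.sha256(avatar).hexdigest()[:12]
    object_name = f"avatars/{user_id}-{digest}"  # hypothetical naming scheme
    object_store[object_name] = avatar
    row = {
        "user_id": user_id,
        "name": name,
        "avatar_ref": object_name,   # the reference, not the bytes
        "avatar_size": len(avatar),
    }
    profile_table[user_id] = row
    return row


def load_avatar(user_id: str) -> bytes:
    """Follow the reference in the row back to the stored blob."""
    return object_store[profile_table[user_id]["avatar_ref"]]


row = save_profile("u1", "Ada", b"\x89PNG fake image bytes")
```

The row stays a few hundred bytes no matter how large the blob is, which keeps the table data small and fast to query.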

Some users have been interested in accessing the features of the underlying NoSQL database. In order to enforce authentication and multi-tenancy, we require users to interact with MagnetoDB only through the REST API. Folks who need raw Cassandra can deploy it on top of OpenStack, and in fact Trove is adding support for Cassandra provisioning. However, we’ve found that many users who initially want their own NoSQL cluster ultimately use MagnetoDB instead, making the trade-off that gives them the easier operational model.
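One way to see why a single front door helps with multi-tenancy: if every request carries a token, the service can resolve it to a tenant and scope every read and write to that tenant. The sketch below is a toy model with a hard-coded token map and a hypothetical TenantScopedStore class; in a real deployment token validation would be Keystone's job, and none of these names come from MagnetoDB.

```python
from typing import Any, Dict, Optional, Tuple


class TenantScopedStore:
    """Toy API front end: every operation is scoped to the caller's tenant."""

    def __init__(self, tokens: Dict[str, str]) -> None:
        self._tokens = dict(tokens)  # token -> tenant (assumed mapping)
        self._rows: Dict[Tuple[str, str, str], Dict[str, Any]] = {}

    def _tenant(self, token: str) -> str:
        """Resolve a token to a tenant, rejecting unknown tokens."""
        try:
            return self._tokens[token]
        except KeyError:
            raise PermissionError("unknown or expired token") from None

    def put(self, token: str, table: str, key: str, row: Dict[str, Any]) -> None:
        self._rows[(self._tenant(token), table, key)] = dict(row)

    def get(self, token: str, table: str, key: str) -> Optional[Dict[str, Any]]:
        return self._rows.get((self._tenant(token), table, key))


store = TenantScopedStore({"token-a": "tenant-a", "token-b": "tenant-b"})
store.put("token-a", "profiles", "u1", {"name": "Ada"})
# tenant-b asks for the same table and key but sees nothing of tenant-a's data.
other_view = store.get("token-b", "profiles", "u1")
```

Direct access to the underlying database would bypass this scoping step entirely, which is the trade-off behind the REST-only rule.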

Our product teams are already designing their applications around the use of MagnetoDB, and we’re starting to see some patterns:

Storing, retrieving, and searching user profile data
Supporting searchable metadata for objects stored in Swift
Importing results from big data processes, for more interactive data mining
Tracking application metrics in real-time

If you’d like to try MagnetoDB out, we’re integrated with DevStack. You can leave your questions and comments here, or drop me a note privately if you prefer. And look for more details in future posts.

Keith Newstadt

Symantec Cloud Platform Engineering

Follow: @knewstadt
