Come see us at MySQL Conf

Horizontal Scaling with HiveDB

  • Presented by: Britt Crawford (Cafepress.com), Justin McCarthy (Cafepress.com)
  • Time: 10:50am - 11:50am Thursday, 04/17/2008
  • Tracks: Architecture and Technology, General, Java, LAMP, MySQL Cluster and High Availability, Replication and Scale-Out
  • Location: Ballroom D

Massively scalable systems used to be the sole province of large corporations with acronyms for names and budgets the size of a small country’s GDP, but with the explosion of user-created content sites, scaling is now everybody’s problem.

One of the main strategies for scaling databases is to partition the data across many servers. This is a simple enough idea but requires a good deal of expertise in systems design, programming, and MySQL administration to execute successfully. Many horizontal partitioning systems have been built and put into production in corporate environments, but so far none of them have:

  • Built an easily extensible platform
  • Released their system under an open source license

This session introduces HiveDB, an open source framework for partitioning MySQL that implements the core ideas behind horizontal partitioning in as clear and concise a way as possible.

DNS Hiccups

We’re back!

We had some recent DNS difficulties that took us offline over the weekend. The problem was compounded by the fact that I was out of town and not paying attention to email. Sorry about that.


HiveDB 0.9.3 Released

We’ve just tagged HiveDB release 0.9.3 and declared it ready for use. This release is packed with a ton of new features along with a lot of updates to the core HiveDB code. Hopefully this will be our last development release before 1.0. Here’s what’s new:

» Hibernate Support

This is where the bulk of the work for this release has gone. We now directly integrate with Hibernate using Hibernate Shards. Writing an application with HiveDB is as easy as using Hibernate.


// Entity classes that Hibernate should map across the hive.
List entityClasses = Lists.newList(Continent.class, WeatherReport.class, WeatherEvent.class);
ShardAccessStrategy strategy = new SequentialShardAccessStrategy();
HiveSessionFactory factoryBuilder = new HiveSessionFactoryBuilderImpl( hiveUri, entityClasses, strategy);
WeatherReport report = new WeatherReport();
...
Session session = factoryBuilder.openSession();
Transaction tx = null;
try {
	tx = session.beginTransaction();
	session.saveOrUpdate(report);
	// Commit so the save is actually persisted.
	tx.commit();
} catch( RuntimeException e ) {
	if(tx != null)
		tx.rollback();
	// Rethrow so the caller sees the failure instead of a silent rollback.
	throw e;
} finally {
	session.close();
}


» Simple configuration and installation using Java Annotations

We created automated configuration and installation of the Hive. You can decorate your entity classes or interfaces with our annotations, and ConfigurationReader will process them and create the required Hive resources and indexes.


@Resource("WeatherReport")
public interface WeatherReport {
    @PartitionIndex
    String getContinent();

    @EntityId
    Integer getReportId();

    @Index
    Integer getRegionCode();

    @Index(type=IndexType.Data)
    Double getLatitude();

    @Index(type=IndexType.Data)
    Double getLongitude();

    int getTemperature();
}



Just pass the annotated classes into the ConfigurationReader and call install(), and your hive is ready to go.


ConfigurationReader reader = new ConfigurationReader(WeatherReport.class);
reader.install(hiveUri);
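
For readers curious how annotation-driven configuration can work in general, here is a minimal sketch using plain Java reflection. It is not HiveDB's ConfigurationReader: the annotations SketchPartitionIndex and SketchIndex, the Report interface, and the scan() helper are all hypothetical stand-ins invented for this illustration. It only shows the general idea of walking a class's getters and recording which ones are marked for indexing.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: these annotations are stand-ins, not HiveDB's own.
class AnnotationScanSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SketchPartitionIndex {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SketchIndex {}

    // A cut-down entity interface, loosely modelled on the example above.
    interface Report {
        @SketchPartitionIndex
        String getContinent();

        @SketchIndex
        Integer getRegionCode();

        int getTemperature(); // not annotated, so it is not indexed
    }

    /** Maps each annotated getter's property name to its role. */
    static Map<String, String> scan(Class<?> entity) {
        Map<String, String> roles = new TreeMap<>();
        for (Method m : entity.getMethods()) {
            if (!m.getName().startsWith("get")) {
                continue;
            }
            String property = decapitalize(m.getName().substring(3));
            if (m.isAnnotationPresent(SketchPartitionIndex.class)) {
                roles.put(property, "partition");
            } else if (m.isAnnotationPresent(SketchIndex.class)) {
                roles.put(property, "secondary");
            }
        }
        return roles;
    }

    static String decapitalize(String s) {
        return s.isEmpty() ? s : Character.toLowerCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(scan(Report.class));
    }
}
```

A real installer would presumably go on to create directory tables from this information; the sketch stops at discovering which properties are annotated.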


» More compact and flexible directory layout

We have added relationships between entities into the directory, removing the need for redundant secondary indexes.
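
To make the idea concrete, here is a rough in-memory sketch of a directory that chains lookups: a secondary index value points to entity ids, each entity id points to its partition key, and each partition key points to a node. This is an assumption about the general shape of such a directory, not HiveDB's actual schema or API; the class DirectorySketch and all of its method and field names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a chained directory; not HiveDB's real schema.
class DirectorySketch {
    // partition key (continent) -> data node holding that partition
    private final Map<String, String> nodeByPartitionKey = new HashMap<>();
    // entity id (report id) -> partition key it belongs to
    private final Map<Integer, String> partitionKeyByReportId = new HashMap<>();
    // secondary index value (region code) -> entity ids carrying that value
    private final Map<Integer, Set<Integer>> reportIdsByRegionCode = new HashMap<>();

    void assignPartition(String continent, String node) {
        nodeByPartitionKey.put(continent, node);
    }

    void indexReport(int reportId, String continent, int regionCode) {
        partitionKeyByReportId.put(reportId, continent);
        reportIdsByRegionCode
                .computeIfAbsent(regionCode, k -> new TreeSet<>())
                .add(reportId);
    }

    /** Follows secondary index -> entity id -> partition key -> node. */
    Set<String> nodesForRegionCode(int regionCode) {
        Set<String> nodes = new TreeSet<>();
        Set<Integer> ids =
                reportIdsByRegionCode.getOrDefault(regionCode, Collections.emptySet());
        for (int reportId : ids) {
            String continent = partitionKeyByReportId.get(reportId);
            String node = nodeByPartitionKey.get(continent);
            if (node != null) {
                nodes.add(node);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        DirectorySketch directory = new DirectorySketch();
        directory.assignPartition("Asia", "node1");
        directory.assignPartition("Europe", "node2");
        directory.indexReport(1, "Asia", 42);
        directory.indexReport(2, "Europe", 42);
        directory.indexReport(3, "Asia", 7);
        System.out.println(directory.nodesForRegionCode(42));
    }
}
```

Because the secondary index stores only entity ids, the partition key for each entity lives in one place, which is one way a directory can avoid repeating it in every secondary index.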


» Automated Indexing for Hive entities

If you have annotated your entity classes, you can use the HiveIndexer to automatically add or update entries in the directory. If you are using Hibernate, the HiveSessionFactory will automatically index anything that you save or update.



There are many more new features, including automated proxying for entity classes, simple data validation, and serialized blob storage, along with many updates and improvements to the core HiveDB code. We intend for this to be the last development release before HiveDB 1.0. We’ll be updating all of the documentation and posting more about the new features in the coming weeks.

Finally, come and see us at MySQL Conference 2008.

About

HiveDB is an open source framework for building scalable, high-performance, partitioned MySQL systems created and maintained by:


Join us at the HiveDB-Dev Google group.
