Salat

https://github.com/novus/salat




Simple Serialization with MongoDB and Scala





Rose Toomey, Novus Partners
June 2011 @ MongoNYC

Salat: Simple Serialization with MongoDB and Scala

  • What is Salat?
  • Demonstration
  • How does it work?
  • In detail: more advanced usage
  • From zero to DAO in two minutes
  • What next?
  • Other projects of interest
  • More information
  • Credits

What is Scala?



Scala is a concise, elegant object-oriented language that runs on the JVM:

  • statically typed
  • functional
  • scalable
  • makes it easy to create libraries and DSLs
  • interoperable with Java

What is Casbah?

What is Salat?

Salat provides fast, reliable bi-directional serialization between Scala case classes and MongoDB's DBObject format.

Fast

Salat mines pickled Scala signatures, introduced with Scala 2.8.0, for hi-fi type information. The name "Salat" is a transliteration of the Russian word "салат" ("salad"): Salat is lightweight and doesn't slow you down through use of runtime reflection.

Simple

Salat's design is focused: this is a library for serializing and deserializing Scala case classes.

Focused

Salat is not a fully-fledged ORM and does not attempt to match the flexibility, compatibility or functionality of an ORM that would let you define relationships between classes, provide a query language, or serialize many types of classes.

Availability

The latest release, Salat 0.0.7, is available for Scala 2.8.1.

The latest snapshot, Salat 0.0.8-SNAPSHOT, is available for Scala 2.8.1 and 2.9.0-1.

Salat is not available for Scala 2.7.7 because pickled Scala signatures were introduced in Scala 2.8.0.

Salat is not compatible with Java classes for the same reason.

Dependencies

Salat has dependencies on the latest releases of:

  • scalap, a Scala library that provides functionality for parsing Scala-specific information out of classfiles
  • mongo-java-driver, the official Java driver for MongoDB
  • casbah-core, the official Scala toolkit for MongoDB

Getting started

Add the Novus repos and the salat-core dependency to your sbt project

val novusRepo = "Novus Release Repository" at "http://repo.novus.com/releases/"
val novusSnapsRepo = "Novus Snapshots Repository" at "http://repo.novus.com/snapshots/"

val salat = "com.novus" %% "salat-core" % "0.0.8-SNAPSHOT"

Import Salat implicits and default context

import com.novus.salat._
import com.novus.salat.annotations._
import com.novus.salat.global._

Try it out!

The sample code shown in this presentation is available at:
rktoomey/mongonyc2011-salat-examples.

You can build and run the project using simple-build-tool.

The quickest way to get started experimenting is to clone the project and run sbt console to use a Scala interpreter with a classpath that includes compiled sources and managed libs:

~ $ git clone git://github.com/rktoomey/mongonyc2011-salat-examples.git
~ $ cd mongonyc2011-salat-examples
~/mongonyc2011-salat-examples $ sbt console

How to import what you need

You can try out the sample code shown in this presentation by running sbt console with these imports:

Welcome to Scala version 2.9.0.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_24).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import com.novus.salat._

scala> import com.novus.salat.global._

scala> import com.novus.salat.annotations._

scala> import com.mongodb.casbah.Imports._

scala> import prasinous._

Demonstration: there and back again

Given a case class:

package prasinous

case class Alpha(x: String)

Serializing and deserializing is as simple as using the asDBObject and asObject methods:

scala> val a = Alpha(x = "Hello world")
a: prasinous.Alpha = Alpha(Hello world)

scala> val dbo = grater[Alpha].asDBObject(a)
dbo: com.mongodb.DBObject = { "_typeHint" : "prasinous.Alpha" , "x" : "Hello world"}

scala> val a_* = grater[Alpha].asObject(dbo)
a_*: prasinous.Alpha = Alpha(Hello world)

scala> a == a_*
res0: Boolean = true

How does it work?

A case class instance extends Scala's Product trait, which provides a product iterator over its elements.

Salat uses pickled Scala signatures to turn case classes into indexed fields with associated type information.

These fields are then serialized or deserialized using the memoized indexed fields with type information.
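As a rough illustration of the Product side of this, the sketch below uses only the Scala standard library to pair the values from productIterator with field names. The names are passed in by hand, and ProductSketch and toFieldMap are hypothetical helpers for this example; Salat obtains field names and types from the pickled signature instead.

```scala
object ProductSketch {
  case class Alpha(x: String)
  case class Pair(a: Int, b: String)

  // Pairs caller-supplied field names with the values that
  // productIterator yields, in constructor order.
  def toFieldMap(p: Product, names: Seq[String]): Map[String, Any] = {
    require(names.size == p.productArity, "one name per constructor field")
    names.zip(p.productIterator.toSeq).toMap
  }
}
```

For example, ProductSketch.toFieldMap(ProductSketch.Alpha("Hello world"), Seq("x")) yields a map with the single entry "x" -> "Hello world".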

For more information about pickled Scala signatures, see
scala.tools.scalap.scalax.rules.scalasig.ScalaSigParser

In addition, refer to this brief paper:
SID # 10 (draft) - Storage of pickled Scala signatures in class files

Moving parts

  • a Context has global serialization behavior including:

    • how type hinting is handled (always, when necessary or never) - default is always
    • what the type hint is - default is _typeHint
    • how enums are handled (by value or by id) - default is by value
    • math context used for deserializing BigDecimal (default precision is 17)
  • a Grater can serialize and deserialize an individual case class
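The settings listed above can be pictured as a small configuration value. This is only a sketch under assumed names: MiniContext, TypeHintStrategy, EnumStrategy and emitsHint are hypothetical and are not Salat's API. The defaults mirror the ones stated in the list.

```scala
import java.math.MathContext

object ContextSketch {
  sealed trait TypeHintStrategy
  case object Always extends TypeHintStrategy
  case object WhenNecessary extends TypeHintStrategy
  case object Never extends TypeHintStrategy

  sealed trait EnumStrategy
  case object ByValue extends EnumStrategy
  case object ById extends EnumStrategy

  // Hypothetical stand-in for a context: one value holding the
  // global serialization settings, with the defaults from the list.
  case class MiniContext(
    typeHintStrategy: TypeHintStrategy = Always,
    typeHint: String = "_typeHint",
    enumStrategy: EnumStrategy = ByValue,
    mathContext: MathContext = new MathContext(17)
  ) {
    // Decides whether a type hint is written for a given field.
    def emitsHint(fieldTypedToTrait: Boolean): Boolean = typeHintStrategy match {
      case Always        => true
      case WhenNecessary => fieldTypedToTrait
      case Never         => false
    }
  }
}
```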

Keeping things in scope

The context is an implicit supplied by importing Salat's global package object (or a custom context defined in your own package object).

import com.novus.salat.global._

Graters are created on first request. Use the grater method supplied in Salat's top level package object:

import com.novus.salat._
import com.novus.salat.global._

grater[Alpha].asObject(dbo)

There and back again: more detail

scala> val dbo = grater[Alpha].asDBObject(a)

Our method call to grater[Alpha] made the Context either find or create a Grater for Alpha.

The Grater finds the pickled Scala signature for Alpha and uses it to identify the constructor, fields and companion object.

Once a Grater for Alpha exists, we know everything we need to know to do the following two things without any runtime reflection:

  • to serialize an instance of case class Alpha as a DBObject
  • to deserialize a DBObject representing an instance of Alpha into an instance of case class Alpha
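The "find or create" step can be sketched as a memoizing lookup. MiniContext and MiniGrater below are hypothetical stand-ins written against the standard library only; they show the caching idea, not how Salat's Context is implemented.

```scala
import scala.collection.mutable

object GraterCacheSketch {
  // Hypothetical stand-in for a Grater: it only remembers its class name.
  final class MiniGrater(val className: String)

  // Hypothetical stand-in for a Context that keeps one grater per class.
  final class MiniContext {
    private val graters = mutable.Map.empty[String, MiniGrater]
    var created = 0 // how many graters were built, for illustration only

    // Returns the cached grater, building it on the first request.
    def lookupOrCreate(clazz: Class[_]): MiniGrater = synchronized {
      graters.getOrElseUpdate(clazz.getName, {
        created += 1
        new MiniGrater(clazz.getName)
      })
    }
  }
}
```

The expensive work (in Salat's case, parsing the pickled signature) happens once per class; every later request is a map lookup.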

There and back again: more detail

scala> val dbo = grater[Alpha].asDBObject(a)
dbo: com.mongodb.DBObject = { "_typeHint" : "prasinous.Alpha" , "x" : "Hello world"}

Calling asDBObject creates a DBObject representation of Alpha.

What about the type hint?

The type hint, _typeHint, is not essential for deserializing Alpha and could be omitted under some circumstances.

We'll talk more about that later.

There and back again: more detail

scala> val a_* = grater[Alpha].asObject(dbo)
a_*: prasinous.Alpha = Alpha(Hello world)

Turning the DBObject representation of an instance of Alpha back into an object is as easy as calling asObject.

Note that we didn't use _typeHint here - we already told the Grater we were expecting Alpha.

So when do we need type hints?

We need type hints to deal with two types of situations:

  • case classes typed to a trait or an abstract superclass and annotated with @Salat
  • if the context needs to look up a Grater from a raw DBObject



For example:

scala> val dbo = MongoDBObject("_typeHint" -> "prasinous.Alpha", "x" -> "Hello world")
dbo: com.mongodb.casbah.commons.Imports.DBObject = { "_typeHint" : "prasinous.Alpha" ,
  "x" : "Hello world"}

scala> ctx.lookup(dbo)
res1: Option[com.novus.salat.Grater[_ <: com.novus.salat.package.CaseClass]] =
Some(Grater(class prasinous.Alpha @ com.novus.salat.global.package$$anon$1@1a33c91e))

What Scala types can Salat handle?

  • case classes

    • embedded case classes
    • embedded case classes typed to a trait or abstract superclass annotated with @Salat
  • Scala enums
  • Options
  • collections

Options

Any supported type is also supported as an Option.

Collections

Maps are represented as DBObject; all other collections turn into DBList.

In detail: Salat collection support

Salat 0.0.7 and below support the following immutable collections:

  • Map
  • List
  • Seq

Salat 0.0.8-SNAPSHOT and above support the following mutable and immutable collections:

  • Map
  • Lists and linked lists
  • Seqs and indexed seqs
  • Set
  • Buffer
  • Vector

BSON support

Salat delegates serialization for most common types to Casbah's BSON support:

  • String
  • Boolean
  • Numeric types

    • Int, Double, Long
  • ObjectId
  • Date

Make sure that Casbah's BSON encoding hooks are in scope:

com.mongodb.casbah.commons.conversions.scala.RegisterConversionHelpers()

DateTime support

org.joda.time.DateTime support requires registering Casbah's DateTime encoding hook in addition to Casbah's other conversion helpers:

com.mongodb.casbah.commons.conversions.scala.RegisterJodaTimeConversionHelpers()

Salat extensions to BSON support

Salat provides support for converting the following types to something BSON serializes natively:

  • Char is serialized as String
  • Float is serialized as Double
  • BigDecimal is serialized as Double

    • the precision and rounding mode will be preserved as specified in your Context
  • BigInt is serialized as Long
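The mappings in this list can be sketched with a pattern match. BsonFriendlySketch, out and bigDecimalIn are hypothetical names for this example and are not Salat's transformers; the sketch only shows the direction of each conversion and where a MathContext would come into play on the way back.

```scala
import java.math.MathContext

object BsonFriendlySketch {
  // Maps each type from the list onto the type it is stored as.
  // Note: toLong truncates a BigInt that does not fit in 64 bits.
  def out(value: Any): Any = value match {
    case c: Char        => c.toString
    case f: Float       => f.toDouble
    case bd: BigDecimal => bd.toDouble
    case bi: BigInt     => bi.toLong
    case other          => other
  }

  // Rebuilds a BigDecimal from the stored Double with a supplied MathContext.
  def bigDecimalIn(stored: Double, mc: MathContext): BigDecimal =
    BigDecimal(stored, mc)
}
```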

Roll your own: custom BSON encoding hooks

You can support other types by creating custom BSON hooks. For instance, if you needed to serialize a field typed to java.net.URI, you would need to create a custom BSON hook to handle this type.

For more information on how to write and use BSON encoding hooks, see the Casbah API docs and source code:



The Casbah mailing group is another valuable resource

Unsupported types

Salat can't support any of these types right now:

  • Nested inner classes (as used in Cake pattern)
  • A class typed at the top-level to a trait or an abstract superclass
  • com.mongodb.DBRef



Salat can't support these types because the mongo-java-driver doesn't support them:

  • Any type of Map whose key is not a String

    • any type of map whose key is a String containing . or $
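If you want to catch unsupported map keys before they reach the driver, a check along these lines is one option. MapKeySketch and its methods are hypothetical helpers for this example; the rule they encode is the one stated above (String keys only, with no . or $).

```scala
object MapKeySketch {
  // A key is storable only if it is a String without '.' or '$'.
  def isStorableKey(key: Any): Boolean = key match {
    case s: String => !s.contains(".") && !s.contains("$")
    case _         => false
  }

  // Returns the keys that would be rejected, so the caller can report them.
  def invalidKeys[K, V](m: Map[K, V]): Set[K] =
    m.keySet.filterNot(k => isStorableKey(k))
}
```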

In detail: traits and abstract superclasses

With one easy extra step, Salat can handle fields and collections typed to a trait or an abstract superclass.

Without traits

case class Zeta(x: String)
case class Iota(z: Zeta)

With a trait

In this example, Iota's z field is typed to a trait, namely trait Zeta. To avoid performance degradation at run time, you must annotate trait Zeta with the @Salat annotation, as shown below.

@Salat
trait Zeta {
  val x: String
}
case class Eta(x: String) extends Zeta
case class Iota(z: Zeta)

In detail: traits and abstract superclasses

To deserialize from DBObject back to Iota, the _typeHint field is necessary!

scala> val i = Iota(z = Eta("eta"))
i: prasinous.Iota = Iota(Eta(eta))

scala> val dbo = grater[Iota].asDBObject(i)
dbo: com.mongodb.DBObject = { "_typeHint" : "prasinous.Iota" ,
  "z" : { "_typeHint" : "prasinous.Eta" , "x" : "eta"}}

scala> val i_* = grater[Iota].asObject(dbo)
i_*: prasinous.Iota = Iota(Eta(eta))
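Why the hint is necessary here: a field typed to Zeta does not say which concrete class to build, so the stored document has to. The sketch below shows the idea with plain maps in place of DBObject. The builders registry, toMap, fromMap and the second implementation Digamma are all hypothetical and are not part of Salat or of the example project.

```scala
object TypeHintSketch {
  sealed trait Zeta { def x: String }
  case class Eta(x: String) extends Zeta
  case class Digamma(x: String) extends Zeta // hypothetical second implementation

  val TypeHint = "_typeHint"

  // Hypothetical registry: type hint -> function that rebuilds the concrete class.
  private val builders: Map[String, Map[String, Any] => Zeta] = Map(
    "prasinous.Eta"     -> ((m: Map[String, Any]) => Eta(m("x").toString)),
    "prasinous.Digamma" -> ((m: Map[String, Any]) => Digamma(m("x").toString))
  )

  private def hintFor(z: Zeta): String = z match {
    case _: Eta     => "prasinous.Eta"
    case _: Digamma => "prasinous.Digamma"
  }

  // Serializing writes the hint next to the data.
  def toMap(z: Zeta): Map[String, Any] = Map(TypeHint -> hintFor(z), "x" -> z.x)

  // Deserializing reads the hint to pick a builder; without it there is no answer.
  def fromMap(m: Map[String, Any]): Option[Zeta] =
    for {
      hint  <- m.get(TypeHint).map(_.toString)
      build <- builders.get(hint)
    } yield build(m)
}
```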

In detail: key remapping

Use @Key to perform ad hoc key remapping:

scala> val o = Omicron(o = "Same old")
o: prasinous.Omicron = Omicron(4de5df4ce4ffd3ffea79e486,Same old)

scala> val dbo = grater[Omicron].asDBObject(o)
dbo: com.mongodb.DBObject = { "_typeHint" : "prasinous.Omicron" ,
  "_id" : { "$oid" : "4de5df4ce4ffd3ffea79e486"} ,
  "o" : "Same old"}

You can also override keys on a per-class basis or globally - see the custom context wiki page for more information.
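The definition of Omicron is not shown above; the output is consistent with a field remapped to _id, and this sketch assumes that field is named id (as in Kappa later on). Using plain maps and hypothetical names (KeyRemapSketch, overrides), key remapping amounts to a rename on the way out and the inverse rename on the way back:

```scala
object KeyRemapSketch {
  // Hypothetical override table: constructor field name -> stored key.
  val overrides: Map[String, String] = Map("id" -> "_id")

  // Applied when serializing: rename any field that has an override.
  def toStoredKeys(fields: Map[String, Any]): Map[String, Any] =
    fields.map { case (name, value) => overrides.getOrElse(name, name) -> value }

  // Applied when deserializing: invert the table to recover field names.
  def toFieldNames(doc: Map[String, Any]): Map[String, Any] = {
    val inverse = overrides.map(_.swap)
    doc.map { case (key, value) => inverse.getOrElse(key, key) -> value }
  }
}
```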

In detail: serialize a value outside the case class constructor

case class Psi(x: String) {
  @Persist val reversed = x.reverse
}

Values marked with @Persist will be serialized to the DBObject and then discarded when the DBObject is deserialized back to the case class.

scala> val p = Psi(x = "persist me")
p: prasinous.Psi = Psi(persist me)

scala> p.reversed
res0: String = em tsisrep

scala> val dbo = grater[Psi].asDBObject(p)
dbo: com.mongodb.DBObject = { "_typeHint" : "prasinous.Psi" , "x" : "persist me" , "reversed" : "em tsisrep"}

scala> val p_* = grater[Psi].asObject(dbo)
p_*: prasinous.Psi = Psi(persist me)

From zero to DAO in two minutes

SalatDAO makes it simple to start working with your case class objects. Use it as is or as the basis for your own DAO implementation.

By extending SalatDAO, you can do the following out of box:

  • insert and get back an Option with the id
  • findOne and get back an Option typed to your case class
  • find and get back a Mongo cursor typed to your class
  • iterate, limit, skip and sort
  • update with a query and a case class
  • save and remove case classes
  • projections
  • built-in support for child collections
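To get a feel for the shape of these operations without a running MongoDB, here is an in-memory stand-in. InMemoryDAO is hypothetical and is not SalatDAO: it uses a String id instead of ObjectId, a predicate instead of a query DBObject, and covers only a few of the methods listed above.

```scala
import scala.collection.mutable

object DaoShapeSketch {
  case class Omega(_id: String, y: String, z: Int)

  // Hypothetical in-memory stand-in, keyed by the id extracted from each object.
  class InMemoryDAO[T, ID](idOf: T => ID) {
    private val store = mutable.LinkedHashMap.empty[ID, T]

    def insert(t: T): Option[ID] = { store.put(idOf(t), t); Some(idOf(t)) }
    def findOneByID(id: ID): Option[T] = store.get(id)
    def find(p: T => Boolean): List[T] = store.values.filter(p).toList
    def save(t: T): Unit = { store.put(idOf(t), t); () }
    def remove(t: T): Unit = { store.remove(idOf(t)); () }
  }

  object OmegaDAO extends InMemoryDAO[Omega, String](_._id)
}
```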

SalatDAO: getting started

Extend SalatDAO, typing it to your case class and ID, and supply a collection.

package prasinous

import com.mongodb.casbah.Imports._
import com.novus.salat.global._
import com.novus.salat.dao.SalatDAO

case class Omega(_id: ObjectId = new ObjectId, y: String, z: Int)

object OmegaDAO extends SalatDAO[Omega, ObjectId](
  collection = MongoConnection()("mongonyc2011-salat-example")("omega")
)

SalatDAO: insert and find

scala> val o = Omega(y = "E-123", z = 24)
o: prasinous.Omega = Omega(4de5c7e7e4ff62f56ace1ccb,E-123,24)

scala> val _id = OmegaDAO.insert(o)
_id: Option[com.mongodb.casbah.Imports.ObjectId] = Some(4de5c7e7e4ff62f56ace1ccb)

scala> val o_* = OmegaDAO.findOneByID(new ObjectId("4de5c7e7e4ff62f56ace1ccb"))
o_*: Option[prasinous.Omega] = Some(Omega(4de5c7e7e4ff62f56ace1ccb,E-123,24))

scala> val o_* = OmegaDAO.findOne(MongoDBObject("y" -> "E-123"))
o_*: Option[prasinous.Omega] = Some(Omega(4de5c7e7e4ff62f56ace1ccb,E-123,24))

scala> val o_* = OmegaDAO.find(MongoDBObject("y" -> "E-123")).toList
o_*: List[prasinous.Omega] = List(Omega(4de5c7e7e4ff62f56ace1ccb,E-123,24))

SalatDAO: update

You can update using a DBObject:

scala> OmegaDAO.update(MongoDBObject("_id" -> new ObjectId("4de5c7e7e4ff62f56ace1ccb")), MongoDBObject("y" -> "E-124", "z" -> 25))

scala> val o_* = OmegaDAO.findOneByID(new ObjectId("4de5c7e7e4ff62f56ace1ccb"))
o_*: Option[prasinous.Omega] = Some(Omega(4de5c7e7e4ff62f56ace1ccb,E-124,25))

Or a case class, which requires specifying arguments for upsert, multi and a WriteConcern:

scala> OmegaDAO.update(q = MongoDBObject("_id" -> new ObjectId("4de5c7e7e4ff62f56ace1ccb")),
t = o.copy(y = "E-125", z = 26), upsert = false, multi = false, wc = new WriteConcern)

scala> val o_* = OmegaDAO.findOneByID(new ObjectId("4de5c7e7e4ff62f56ace1ccb"))
o_*: Option[prasinous.Omega] = Some(Omega(4de5c7e7e4ff62f56ace1ccb,E-125,26))

SalatDAO: save

scala> OmegaDAO.save(o.copy(y = "E-126", z = 27))

scala> val o_* = OmegaDAO.findOneByID(new ObjectId("4de5c7e7e4ff62f56ace1ccb"))
o_*: Option[prasinous.Omega] = Some(Omega(4de5c7e7e4ff62f56ace1ccb,E-126,27))

SalatDAO: remove

scala> OmegaDAO.remove(o.copy(y = "E-126", z = 27))

scala> val o_* = OmegaDAO.findOneByID(new ObjectId("4de5c7e7e4ff62f56ace1ccb"))
o_*: Option[prasinous.Omega] = None

SalatDAO: primitive projections

case class Theta(_id: ObjectId = new ObjectId, x: String, y: String)

Use projections to bring back a typed list that discards null or None.

scala> val _ids = ThetaDAO.insert(Theta(x = "x1", y = "y1"),
    Theta(x = "x2", y = "y2"), Theta(x = "x3", y = "y3"), Theta(x = "x4", y = "y4"),
    Theta(x = "x5", y = null))
_ids: List[Option[com.mongodb.casbah.Imports.ObjectId]] = List(Some(4de5d418e4ff796559972ad3),
  Some(4de5d418e4ff796559972ad4), Some(4de5d418e4ff796559972ad5), Some(4de5d418e4ff796559972ad6),
  Some(4de5d418e4ff796559972ad7))

scala> ThetaDAO.primitiveProjections[String](MongoDBObject(), "y")
res0: List[String] = List(y1, y2, y3, y4)

scala> ThetaDAO.primitiveProjections[String](MongoDBObject(), "x")
res1: List[String] = List(x1, x2, x3, x4, x5)
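The "discards null or None" behaviour seen in the first result can be mimicked in plain Scala by wrapping each projected value in Option, which turns null into None, and flattening. ProjectionSketch and project are hypothetical names; this works on an in-memory list rather than a Mongo query.

```scala
object ProjectionSketch {
  case class Theta(x: String, y: String)

  // Projects one field out of each row, dropping rows where it is null.
  def project[A](rows: Seq[Theta])(field: Theta => A): List[A] =
    rows.flatMap(r => Option(field(r))).toList
}
```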

SalatDAO: case class projections

case class Nu(x: String, y: String)
case class Kappa(@Key("_id") id: ObjectId = new ObjectId, k: String, nu: Nu)

Projections can also handle case classes.

scala> val _ids = KappaDAO.insert(Kappa(k = "k1", nu = Nu(x = "x1", y = "y1")),
  Kappa(k = "k2", nu = Nu(x = "x2", y = "y2")),
  Kappa(k = "k3", nu = Nu(x = "x3", y = "y3")))
_ids: List[Option[com.mongodb.casbah.Imports.ObjectId]] = List(Some(4de5d5bbe4ff17cca27b2872),
  Some(4de5d5bbe4ff17cca27b2873), Some(4de5d5bbe4ff17cca27b2874))

scala> KappaDAO.projection[Nu](MongoDBObject("k" -> "k1"), "nu")
res0: Option[prasinous.Nu] = Some(Nu(x1,y1))

scala> KappaDAO.projections[Nu](MongoDBObject("k" -> MongoDBObject("$in" -> List("k2", "k3"))), "nu")
res1: List[prasinous.Nu] = List(Nu(x2,y2), Nu(x3,y3))

SalatDAO: practical concerns

Write concerns

The following methods take a WriteConcern parameter that defaults to the collection's own write concern:

  • insert
  • update
  • save
  • remove

What if something goes wrong?

If something goes wrong, these methods will blow up with a detailed runtime exception.

Other projects of interest

Salat is not all things to all people. Some functionality will always be outside of the project vision.

If you need something that Salat just doesn't do, here's some information about other projects that you might be interested in.

Morphia: more of everything

Morphia, a type-safe Java ORM for MongoDB, provides:

  • Fully featured ORM

    • define embedded and semantic relationships between objects
    • optimistic locking
    • inheritance strategies for mapping model objects to collections
    • lifecycle method annotations like @PrePersist, @PostPersist, @PreLoad, @PostLoad
  • type-safe queries
  • validation

If you're interested in more information about Morphia, look on MongoNYC 2011 for slides and video from Scott Hernandez' talk on "Morphia: Easy Java Persistence" this morning.

Spring Data

The Spring Data lead, Mark Pollack from VMware, is presenting "MongoDB for Java Devs with Spring and CloudFoundry" in the Rusack room from 3:00 to 3:30pm.

Spring Data is a project to make it easy for Spring applications to use non-relational databases, map-reduce frameworks and cloud-based data services.

The MongoDB support in spring-data-document is currently in beta.

Examples are available:

Query DSLs

Type-safe MongoDB

  • Rogue - open-sourced by Foursquare, this project provides an internal DSL that works with the Lift web framework

Type-safe Google data store

  • HighChair - created by Chris Lewis, this toolset for developing Google App Engine services and applications in Scala includes a type-safe query DSL that provides a feel intentionally similar to Rogue but for Google data store instead of Lift/MongoDB

Dealing with SQL

Type-safe SQL

Plain SQL but a plusher ride

Who's using Salat?

  • salat-avro, Fast bi-directional Scala case class to Avro serialization from T8Webware and @rubbish
  • smidm, Warren Strange's experimental identity sync manager using Scala and Mongo and the Identity Connector Framework

Projects that make use of Salat's approach to ScalaSig

  • Jerkson, @coda's Scala wrapper for Jackson which brings Scala's ease-of-use to Jackson's features
  • beaucatcher, a Scala MongoDB API with async and BSON AST -> (JSON or CaseClass) pipeline from @havocp



Is your project using Salat? Let us know about it!

What happens next?

We're working to make the code in Salat more modular and general purpose.

  • our tools for working with pickled Scala signatures will be moved to salat-util, a standalone module without dependencies
  • the current salat-core module will contain a generic framework for managing contexts and transformers

    • salat-core will have core JSON and BSON transformers
    • submodules will provide additional Grater capabilities by providing additional transformer implementations
  • the Casbah dependencies will be moved out to salat-casbah in preparation for adding...
  • a new Salat module for using Brendan McAdams' Hammersmith project

Briefly: Hammersmith

Hammersmith is a pure asynchronous MongoDB driver for Scala.

See slides from Hammersmith: Netty, Scala and MongoDB - Brendan's presentation at a recent ny-scala meetup.

Finding out more

Thank you