(c) Oliver M. Bolzer, 2003
$Id: example.html 278 2003-11-22 08:26:58Z bolzer $
In order to use VAPOR, several Ruby libraries need to be installed on the target system. Additionally a PostgreSQL database is required as the backend storage. The following libraries are required:
Before an object can be stored using VAPOR, a database that will hold the repository needs to be created and the repository must be initialized. How the database is created depends on the backend RDBMS used, but it is usually done by issuing a CREATE DATABASE statement or by using an RDBMS-specific tool. Read your RDBMS's documentation on how to create a new database. The database is then initialized as a Repository using the vaporadmin tool.
The following example creates a new PostgreSQL database with its encoding set to Unicode (see PostgreSQL Administrator's Guide Ch. 7) and initializes it for use with Vapor.
foo@bar: ~> psql -h host -U user template1
Password: password
Welcome to psql 7.3.2, the PostgreSQL interactive terminal.
template1=> CREATE DATABASE database_name WITH ENCODING = 'UNICODE';
CREATE DATABASE
template1=> \q
foo@bar: ~> vaporadmin user@host/database_name init
Password: password
foo@bar: ~>
Next, the Repository must be told about the classes that will be stored in it. Metadata that cannot be deduced from the class's definition in Ruby, namely which attributes are to be stored persistently, is written into an XML file. Such an XML file (tutorial.xml) with two simple classes looks like this:
<vapor>
  <class name="Person">
    <attribute name="first_name" type="String" />
    <attribute name="last_name" type="String" />
    <attribute name="inhabits" type="Reference" />
  </class>
  <class name="City">
    <attribute name="name" type="String" />
    <attribute name="altitude" type="Integer" />
  </class>
</vapor>
The information from this XML file is imported into the Repository by again using the vaporadmin tool. Any number of XML files can be specified on the command line. vaporadmin will abort with an error if a class's declared superclass is not already registered with the Repository.
foo@bar: ~> vaporadmin user@host/database_name add tutorial.xml
Password: password
Attempting to add classes to repository: Person City
Added 2 new classes to Repository.
foo@bar: ~>
Now that the database is prepared, the actual Ruby classes need to be made persistence-capable. This is done by including the Vapor::Persistable module. No other change is needed to the class's code itself. The only requirement VAPOR makes is that the class must have a constructor that can be called without any arguments. Otherwise VAPOR will not be able to instantiate an "empty" object to which it feeds its attributes when reinstantiating the object from the Repository.
require 'vapor'

class City
  include Vapor::Persistable

  def initialize( n = nil, a = nil )
    @name = n
    @altitude = a
  end

  attr_reader :name, :altitude
end
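To see why the argument-less constructor matters, the following is a minimal sketch of how a persistence layer could rebuild an object from stored values. It is not Vapor's actual implementation: the reinstantiate helper and the attribute Hash are hypothetical, and the City class is repeated without Vapor so that the sketch runs on its own.

```ruby
# Hypothetical sketch, not Vapor's real code: rebuild an object from stored
# attribute values by creating an "empty" instance and filling it in.
class City
  def initialize( n = nil, a = nil )
    @name = n
    @altitude = a
  end

  attr_reader :name, :altitude
end

def reinstantiate( klass, attributes )
  obj = klass.new   # raises ArgumentError if the constructor requires arguments
  attributes.each do |attr_name, value|
    obj.instance_variable_set( "@#{attr_name}", value )
  end
  obj
end

city = reinstantiate( City, { 'name' => 'Tokyo', 'altitude' => 0 } )
puts city.name
```

If initialize had mandatory parameters, the klass.new call would fail before any attribute could be restored.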
Before any operations on the storage can be executed, a PersistenceManager, the frontend for all operations on the Repository, must be instantiated with the proper access credentials for the backend Datastore. The credentials are passed in the form of a Hash-like object that responds to [] with the proper keys and returns nil for unknown keys. Most often, it will be a real Hash.
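Because only [] is required, the credentials do not have to live in a literal Hash. As a sketch only, the following class reads them from environment variables. The class name EnvProperties and the environment variable names are assumptions made for illustration; the property keys are the ones used in this tutorial.

```ruby
# Hypothetical sketch: a Hash-like credentials object. It only needs to
# respond to [] and return nil for keys it does not know.
class EnvProperties
  # Maps property keys to environment variable names (names are assumptions).
  MAPPING = {
    'Vapor.Datastore.Name'     => 'VAPOR_DB_NAME',
    'Vapor.Datastore.Host'     => 'VAPOR_DB_HOST',
    'Vapor.Datastore.Username' => 'VAPOR_DB_USER',
    'Vapor.Datastore.Password' => 'VAPOR_DB_PASSWORD'
  }.freeze

  def initialize( env = ENV )
    @env = env
  end

  def []( key )
    var = MAPPING[ key ]
    var ? @env[ var ] : nil
  end
end

# A plain Hash stands in for ENV here so the example is reproducible.
props = EnvProperties.new( { 'VAPOR_DB_NAME' => 'tutorial', 'VAPOR_DB_HOST' => 'db' } )
props[ 'Vapor.Datastore.Name' ]   # known key that is set
props[ 'No.Such.Key' ]            # unknown key, returns nil
```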
The following code creates a PersistenceManager with the specified access credentials and keeps it in the variable @pmgr. For this example, Autocommit-Mode is turned off to better illustrate state changes of persistable objects.
require 'vapor'

properties = Hash.new
properties[ 'Vapor.Datastore.Name' ] = 'bolzer_vapor_test'
properties[ 'Vapor.Datastore.Host' ] = 'db'
properties[ 'Vapor.Datastore.Port' ] = 5432
properties[ 'Vapor.Datastore.Username' ] = 'bolzer'
properties[ 'Vapor.Datastore.Password' ] = 'foo02bar'
properties[ 'Vapor.Autocommit' ] = false
@pmgr = Vapor::PersistenceManager.new( properties )
Once the classes are marked as Persistable and the DB schema has been created, instances of Persistable classes can be instantiated and added to the storage.
The following code instantiates two City objects and marks them as persistent inside a transaction. In Vapor, all changes to the state of persistent objects must occur in a transaction. Only when the transaction is committed are the changes written to the Datastore.
tokyo = City.new( 'Tokyo', 0 )
berlin = City.new( 'Berlin', 34 )
tokyo.oid                   # => nil
tokyo.state                 # => Vapor::Persistable::TRANSIENT

tokyo_oid = nil             # declared here so it stays visible after the block
@pmgr.transaction{|tx|      # start transaction
  tokyo.make_persistent
  berlin.make_persistent
  tokyo_oid = tokyo.oid     # => 64212
  tokyo.state               # => Vapor::Persistable::NEW
}                           # commit transaction, writing changes to Datastore
tokyo.state                 # => Vapor::Persistable::PERSISTENT
A single object can be retrieved from storage directly if its OID is known. The following code retrieves the City object for Tokyo that was saved above:
tokyo = @pmgr.get_object( tokyo_oid )
tokyo.name               # => 'Tokyo'
tokyo.altitude           # => 0
tokyo.oid == tokyo_oid   # => true
Knowing the OID of stored objects beforehand is often not practical. If OIDs have to be stored elsewhere, that defeats half the purpose of having a convenient object repository.
Very often, all instances of a specific class have to be processed. For this purpose, Vapor can retrieve all instances of a class at once. Instantiating all of them in memory at once would be very heavy-weight, so Vapor returns a special Enumerable object (Extent) that holds information about multiple Persistables but instantiates them only when they are needed via the Extent#each iterator.
By default, persistent instances of persistent subclasses are also returned. If only instances of the class itself should be returned, specify false as the second argument to get_extent().
The following code reinstantiates the City objects saved earlier:
cities = @pmgr.get_extent( City )
cities.class     # => Vapor::Extent
cities.empty?    # => false
cities.size      # => 2
cities[0].name   # => 'Tokyo'
cities[1].name   # => 'Berlin'
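The idea of instantiating objects only when the iterator reaches them can be sketched without Vapor. The LazyExtent class, its loader block and the STORE Hash below are hypothetical stand-ins and do not reflect how Vapor's Extent is implemented.

```ruby
# Hypothetical sketch of lazy instantiation, not Vapor's Extent class:
# only identifiers are held until an element is actually requested.
class LazyExtent
  include Enumerable

  attr_reader :loaded_count

  # oids:   identifiers of the stored objects
  # loader: block that builds the object for one identifier
  def initialize( oids, &loader )
    @oids = oids
    @loader = loader
    @loaded_count = 0
  end

  def size
    @oids.size
  end

  def empty?
    @oids.empty?
  end

  def each
    return enum_for( :each ) unless block_given?
    @oids.each do |oid|
      @loaded_count += 1
      yield @loader.call( oid )
    end
  end
end

STORE = { 1 => 'Tokyo', 2 => 'Berlin' }
cities = LazyExtent.new( STORE.keys ) { |oid| STORE[ oid ] }
cities.size            # answered from the identifiers, nothing is built
first = cities.first   # builds only the first object
```

Counting and emptiness checks never touch the loader, which is the property that keeps large extents cheap.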
In order to modify a Persistable's content, it first has to be loaded, then modified in the usual way using the object's methods, and then flushed back to the repository. Unfortunately, there is no easy way in Ruby to detect that an object has changed, short of hooking all methods. An object must therefore be explicitly marked as "dirty" with the Persistable#mark_dirty method before the PersistenceManager considers it for flushing. The object should mark itself dirty after changing one of its persistent attributes so that the user of the object does not have to call mark_dirty.
class City                                 # extend class defined above
  def altitude=( x )
    @altitude = x
    self.mark_dirty
  end
end

@pmgr.transaction{|tx|                     # begin transaction
  tokyo = @pmgr.get_object( tokyo_oid )    # load object
  tokyo.altitude = 100                     # object marked as dirty in here
}                                          # commit changes
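Writing such setters by hand becomes repetitive when a class has many persistent attributes. One possible approach, shown here purely as a sketch, is a small class-level helper that generates them. The DirtyWriters module and persistent_attr_writer are hypothetical names and not part of Vapor, and mark_dirty is replaced by a stand-in so the example runs on its own.

```ruby
# Hypothetical helper, not part of Vapor: define setters that mark the
# object dirty after changing a persistent attribute.
module DirtyWriters
  def persistent_attr_writer( *names )
    names.each do |name|
      define_method( "#{name}=" ) do |value|
        instance_variable_set( "@#{name}", value )
        mark_dirty
      end
    end
  end
end

class City
  extend DirtyWriters

  attr_reader :name, :altitude
  persistent_attr_writer :name, :altitude

  # Stand-in for Persistable#mark_dirty so the sketch runs without Vapor.
  def mark_dirty
    @dirty = true
  end

  def dirty?
    @dirty == true
  end
end

city = City.new
city.altitude = 100   # the generated setter also marks the object dirty
```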
Objects that are persistently stored in the repository can be deleted. Deleted objects become transient again and only exist in-memory and not in the repository anymore, losing all persistent identity. If a deleted object is made persistent again, it will have a different OID than before.
tokyo = @pmgr.get_object( tokyo_oid )
@pmgr.transaction{|tx|       # begin transaction
  tokyo.delete_persistent
  tokyo.state                # => Vapor::Persistable::DELETED
  tokyo.oid == tokyo_oid     # => true
}                            # commit changes
tokyo.state                  # => Vapor::Persistable::TRANSIENT
tokyo.oid                    # => nil

@pmgr.transaction{|tx|       # start another transaction
  tokyo.make_persistent
  tokyo.state                # => Vapor::Persistable::NEW
  tokyo.oid == tokyo_oid     # => false
}
Very often, only the subset of a class's instances that match certain search criteria needs to be retrieved. For this purpose, the PersistenceManager#query() method exists.
By default, persistent instances of persistent subclasses that match the query are returned too. If only persistent instances of the class itself should be returned, specify false as the third argument to query().
The following code retrieves a City object whose name is 'Tokyo':
name = "Tokyo"
altitude = 0
cities = @pmgr.query( City, "name = ? AND altitude = ?", [ name, altitude ] )
cities.class             # => Vapor::Extent
cities.candidate_class   # => City
cities.size              # => 1
tokyo = cities[0]
The query string is made up of pairs of attribute-name and place-holders for their values. The values are specified in an array that contains them in their order of appearance. Several pairs can be specified with 'AND' and only objects matching all pairs are returned.
Currently supported comparison operators are "exact match" (=) for all types of attributes and "similar to" (~) for Strings, where a question mark (?) matches any single character and an asterisk (*) matches any string of zero or more characters. The "similar to" operator always covers the entire attribute value and is case-sensitive. To match a pattern anywhere within the attribute value, the pattern must therefore start and end with an asterisk. If the search includes literal asterisks or question marks, they need to be escaped with a backslash (\). Other operators will be supported upon request.
Examples:
Match any City or subclass with an altitude of zero:
@pmgr.query( City, "altitude = ?", [ 0 ] )
Same as above, but only match instances of City:
@pmgr.query( City, "altitude = ?", [ 0 ], false )
Match all City and subclass instances that have "heim" anywhere in their name:
@pmgr.query( City, "name ~ ?", [ "*heim*" ] )
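The matching rules of the "similar to" operator can be made concrete by translating a pattern into a Ruby Regexp. This is only a sketch of the rules as described in this section: the similar_to_regexp function is hypothetical, and the real matching is performed by the backend and may differ in detail.

```ruby
# Hypothetical sketch: translate a "similar to" pattern into a Ruby Regexp.
def similar_to_regexp( pattern )
  source = String.new
  chars = pattern.chars
  i = 0
  while i < chars.size
    c = chars[ i ]
    if c == '\\' && i + 1 < chars.size
      source << Regexp.escape( chars[ i + 1 ] )   # escaped literal character
      i += 1
    elsif c == '*'
      source << '.*'                              # any string, including empty
    elsif c == '?'
      source << '.'                               # exactly one character
    else
      source << Regexp.escape( c )
    end
    i += 1
  end
  # \A and \z anchor the pattern so that it must cover the whole value.
  Regexp.new( "\\A#{source}\\z", Regexp::MULTILINE )
end

anywhere = similar_to_regexp( '*heim*' ).match?( 'Mannheim' )
whole    = similar_to_regexp( 'heim' ).match?( 'Mannheim' )
```

Without the surrounding asterisks the pattern has to equal the entire value, which is why the second call does not match.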
The query language expressed in a BNF-like syntax:
QUERY_STRING    := <SEARCH_CRITERIA> [ AND <SEARCH_CRITERIA> ]
SEARCH_CRITERIA := <ATTRIBUTE_NAME> <COMP_OP> ?
ATTRIBUTE_NAME  := [A-Za-z_]+
COMP_OP         := = | ~
Future plans include the ability to support literal values where variables do not need to be bound to the query.
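As an illustration of the grammar, the following sketch splits a query string into attribute and operator pairs and rejects anything the grammar does not allow. The parse_query function is hypothetical and not part of Vapor. It assumes that AND may be repeated, as the prose above suggests, and that whitespace around the operator is optional.

```ruby
# Hypothetical sketch: split a query string into [attribute, operator] pairs.
CRITERION = /\A([A-Za-z_]+)\s*(=|~)\s*\?\z/

def parse_query( query_string )
  query_string.split( /\s+AND\s+/ ).map do |criterion|
    match = CRITERION.match( criterion.strip )
    raise ArgumentError, "invalid search criterion: #{criterion.inspect}" unless match
    [ match[ 1 ], match[ 2 ] ]
  end
end

pairs = parse_query( 'name = ? AND altitude = ?' )
```

The number of pairs returned is also the number of values that the accompanying array has to supply.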
If some attributes are searched very often, it can be useful to hint that indexes should be created for these attributes to speed up search queries. For single-attribute indexes, just add an index="true" attribute to the <attribute /> tag in the metadata file. For multi-attribute indexes, use a special <index> tag inside the class's <class> tag. A multi-attribute index over the attributes A and B is useful when searching for attribute A or for (A and B). It will not be used for searches involving only B. When queries for A only, for B only, and for A and B are executed equally often, consider creating a single-attribute index for each of them. These will also be used for the A and B case.
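The rule of thumb for multi-attribute indexes can be expressed as a small function: only the leading attributes of the index that also appear in the search can help. This is a simplified sketch of the behaviour described above, not a model of how the backend's query planner actually decides. The usable_prefix name is made up for this example.

```ruby
# Hypothetical sketch: which leading part of a multi-attribute index
# can serve a search over the given attributes?
def usable_prefix( index_attributes, searched_attributes )
  index_attributes.take_while { |attr| searched_attributes.include?( attr ) }
end

index = [ 'a', 'b' ]                        # index over attributes A and B
only_a  = usable_prefix( index, [ 'a' ] )       # leading attribute, index helps
only_b  = usable_prefix( index, [ 'b' ] )       # empty, index is not used
a_and_b = usable_prefix( index, [ 'b', 'a' ] )  # both attributes are covered
```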
Let's imagine that, for the Person class defined above, most of the search queries are either for the inhabits attribute only or for both first_name and last_name. To improve performance, we create a single-attribute index for inhabits and a multi-attribute index over last_name and first_name, assuming that we occasionally also search for last_name alone. The XML metadata definition would now look like this:
<class name="Person">
  <attribute name="first_name" type="String" />
  <attribute name="last_name" type="String" />
  <attribute name="inhabits" type="Reference" index="true" />
  <index>
    <attribute name="last_name" />
    <attribute name="first_name" />
  </index>
</class>
Note: Index hints from superclasses are NOT inherited. If inherited attributes are also searched for often, create indexes for them using <index>. Multi-attribute indexes over attributes from the class itself and attributes from superclasses are also possible.
Limiting the valid range of values for an attribute is the job of the class's methods. But sometimes a constraint has to be satisfied by all instances together, for example that a specific attribute's value must be unique over all instances of the same class. E.g. the student ID should be unique for all students, or no two cities shall have the same name. Of course, the class could search for duplicates in the Repository before setting attribute values, but that would contradict our goal that the classes should know as little as possible about Vapor.
Vapor supports uniqueness constraints over single or multiple attributes of a class. If an object is newly made persistent and another already persistent object has the same value for a unique attribute, a UniquenessError is raised.
Again using our Person and City classes, we add the requirement that no two cities can have the same name and that no two persons can have the same first and last name.
<class name="Person">
  <attribute name="first_name" type="String" />
  <attribute name="last_name" type="String" />
  <attribute name="inhabits" type="Reference" index="true" />
  <index>
    <attribute name="last_name" />
    <attribute name="first_name" />
  </index>
  <unique>
    <attribute name="last_name" />
    <attribute name="first_name" />
  </unique>
</class>
<class name="City">
  <attribute name="name" type="String" unique="true"/>
  <attribute name="altitude" type="Integer" />
</class>
An index is automatically created for the attributes of each uniqueness constraint.
Now, based on a Repository using the above metadata, let's try to save two cities with the same name. When the second city is committed, a UniquenessError will be raised.
tokyo = City.new( 'Tokyo', 0 )
tokyo2 = City.new( 'Tokyo', 8332 )
@pmgr.transaction{|tx|
  tokyo.make_persistent
}                          # tokyo in Repository
@pmgr.transaction{|tx|
  tokyo2.make_persistent
  tokyo2.state             # => Persistable::NEW
}   # => Vapor::Exceptions::UniquenessError, "uniqueness constraint violated"
    # automatic rollback
tokyo2.state               # => Persistable::TRANSIENT
Note: Uniqueness constraints are NOT inherited. If fields that are defined in a superclass should have a uniqueness constraint, specify it using the <unique> tag. Of course, multi-attribute constraints over attributes from the class itself and attributes from superclasses are possible.
Note 2: Currently, attributes that are Arrays can't be part of a uniqueness constraint, because PostgreSQL, which enforces the constraints, cannot create a unique index on arrays.
Vapor caches loaded persistent objects to preserve the identity of in-memory objects and to avoid accessing the Datastore unless necessary. While an object is loaded in memory by one PersistenceManager instance, another PersistenceManager might have changed it in the Datastore. This is detected at commit, but an object can also be refreshed manually to the current data in the Datastore using the Persistable#refresh_persistable method. This sets the values of the object's persistent attributes to those currently in the Datastore and sets the object's state to PERSISTENT. All uncommitted changes are discarded.
However, be aware that right after the object is refreshed, it can be changed again in the Datastore by another PersistenceManager.
munich.altitude             # => 550
sleep 300                   # somebody does something while we rest
munich.refresh_persistent   # retrieve that work
munich.altitude             # => 560
Multiple PersistenceManagers could be accessing the same Repository at the same time. In order to avoid inconsistencies such as lost changes, Vapor implements transactions that guarantee that objects don't change in the repository without the application noticing.
Under normal circumstances, all changes to persistent Persistables must occur inside a transaction. A transaction is started by acquiring it from the PersistenceManager using PersistenceManager#transaction() and finished either by committing it, which means that all changes are written to the Datastore, or by rolling it back, whereby all changes are discarded and the state before the transaction is restored.
Vapor uses optimistic locking to prevent concurrent PersistenceManager instances from overwriting each other's changes. When a transaction is about to be committed, Vapor checks whether any of the objects that are going to be changed has been changed in the Repository by another PersistenceManager instance since it was loaded or last refreshed. If so, a StaleObjectError (when the object has changed) or a DeletedObjectError (when the object has been deleted) is raised.
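The check performed at commit time can be sketched with a tiny in-memory store. Everything in this example is hypothetical and independent of Vapor: the TinyStore class, its methods and the use of a per-record revision counter are assumptions chosen to illustrate optimistic locking in general. Only the two error names are borrowed from the text above.

```ruby
# Hypothetical in-memory sketch of optimistic locking, not Vapor's code.
class StaleObjectError < StandardError; end
class DeletedObjectError < StandardError; end

class TinyStore
  Record = Struct.new( :value, :revision )

  def initialize
    @records = {}
  end

  def insert( oid, value )
    @records[ oid ] = Record.new( value, 1 )
  end

  def delete( oid )
    @records.delete( oid )
  end

  # Returns the value together with the revision it was read at.
  def load( oid )
    record = @records.fetch( oid )
    [ record.value, record.revision ]
  end

  # Writes only if nobody changed the record since loaded_revision.
  def commit( oid, new_value, loaded_revision )
    record = @records[ oid ]
    raise DeletedObjectError, "object #{oid} was deleted" if record.nil?
    raise StaleObjectError, "object #{oid} was changed" if record.revision != loaded_revision
    @records[ oid ] = Record.new( new_value, loaded_revision + 1 )
  end
end

store = TinyStore.new
store.insert( 1, 530 )
_, rev_a = store.load( 1 )      # first manager loads the object
_, rev_b = store.load( 1 )      # second manager loads the same object
store.commit( 1, 550, rev_a )   # first commit succeeds
begin
  store.commit( 1, 560, rev_b ) # second commit is based on outdated data
rescue StaleObjectError => e
  conflict = e.message
end
```

No lock is held between load and commit. The conflict is only detected when the second writer tries to commit, which is what makes the scheme optimistic.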
When an error such as a UniquenessError, DeletedObjectError or StaleObjectError occurs during commit, the transaction is automatically rolled back before the exception is raised to prevent Repository inconsistencies. The object that caused the error can be determined by calling the causing_object() method on one of these errors.
There are two ways to begin and commit or roll back a transaction. One way is to call the appropriate begin(), commit() or rollback() methods on the PersistenceManager's Transaction-object, which can be obtained through PersistenceManager#transaction.
Alternatively, PersistenceManager#transaction() can be called with a block to which the Transaction-object is passed. The transaction is automatically started before the block and automatically committed when the block terminates. If the block raises any exception, the transaction is automatically rolled back.
munich = City.new( 'Munich', 530 )
munich.make_persistent    # => Vapor::NotInTransactionError
munich.state              # => Vapor::Persistable::TRANSIENT

# using a Transaction-object
t = @pmgr.transaction
munich.make_persistent
munich.state              # => Vapor::Persistable::NEW
t.commit
t.commit                  # => Vapor::StaleTransactionError
munich.state              # => Vapor::Persistable::PERSISTENT

# Transaction as a block
@pmgr.transaction{|t|
  munich.altitude         # => 530
  munich.altitude = 550
  munich.altitude         # => 550
  munich.state            # => Vapor::Persistable::DIRTY
  t.rollback
}
munich.altitude           # => 530
munich.state              # => Vapor::Persistable::PERSISTENT

# nested Transactions
@pmgr.transaction{|t|
  t2 = @pmgr.transaction  # => Vapor::NestedTransactionError
}                         # automatic rollback
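The block form follows a common Ruby idiom: begin before yielding, commit afterwards, and roll back if the block raises. The following sketch shows that idiom in isolation. SketchTransaction and its log are hypothetical and only record what would happen; this is not how Vapor implements PersistenceManager#transaction.

```ruby
# Hypothetical sketch of the block form, not Vapor's implementation.
class SketchTransaction
  attr_reader :log

  def initialize
    @log = []
  end

  def begin
    @log << :begin
  end

  def commit
    @log << :commit
  end

  def rollback
    @log << :rollback
  end

  # Begin before the block, commit after it, roll back if it raises.
  def run
    self.begin
    begin
      yield self
    rescue Exception
      rollback
      raise   # re-raise so the caller still sees the error
    end
    commit
  end
end

ok = SketchTransaction.new
ok.run { |t| }                      # block finishes normally

failed = SketchTransaction.new
begin
  failed.run { |t| raise 'boom' }   # block raises
rescue RuntimeError
end
```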
By default, all changes to persistent objects (Persistable#make_persistent, Persistable#delete_persistent, Persistable#mark_dirty) are instantly saved back to the Repository. This can hurt performance and does not guard against Repository inconsistency if the application crashes or is interrupted before making all intended changes. This Autocommit-Mode can be turned off by setting PersistenceManager#autocommit= to false. If Autocommit-Mode is off, changes to persistent objects are only saved when a transaction is committed. All changes will be lost if the transaction aborts or the application terminates without committing its last transaction. Errors like UniquenessError will be raised during commit.
The current status of Autocommit-Mode can be determined through PersistenceManager#autocommit?. When Autocommit-Mode is turned on again after being turned off, the current transaction will be committed.
Autocommit-Mode can also be enabled or disabled by default by setting the Vapor.Autocommit property to true or false (an actual boolean value; a String will be interpreted as true) when creating the PersistenceManager.
@pmgr.autocommit?         # => false; disabled above
@pmgr.autocommit = true   # => enable Autocommit-Mode, commit transaction
                          #    up until now
madrid = City.new( 'Madrid', 650 )
madrid.make_persistent    # immediately saved to Repository
madrid.state              # => Persistable::PERSISTENT (not NEW)
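The remark about Strings matters when the property comes from a configuration file or the environment, because in Ruby every String is truthy, including "false". A small conversion step avoids the surprise. The to_boolean helper below is hypothetical and not part of Vapor, and the accepted spellings are an arbitrary choice.

```ruby
# Hypothetical helper, not part of Vapor: convert configuration text into
# an actual boolean before using it as the Vapor.Autocommit property.
def to_boolean( value )
  case value
  when true, false then value
  when String      then %w[ true yes on 1 ].include?( value.strip.downcase )
  else raise ArgumentError, "cannot convert #{value.inspect} to a boolean"
  end
end

string_is_truthy = ( 'false' ? true : false )   # any String counts as true

properties = {}
properties[ 'Vapor.Autocommit' ] = to_boolean( 'false' )
```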
Each time a transaction is committed, you have the chance to record the committer (who or what initiated the commit) and a message describing the commit by setting them via the Transaction#committer= and Transaction#message= methods. The committer, once set, is recorded for all subsequent transactions until the value is changed or the PersistenceManager instance is discarded. The commit message is cleared each time a transaction is committed, because it is supposed to describe the current transaction specifically.
@pmgr.transaction{ |t|
  t.committer = $0
  t.message = "something"
  ....
}
The TransactionLog-object associated with the transaction that last changed a Persistable object, thereby creating the current version of the object (see Version Management below), can be accessed with the Persistable#vapor_changelog() method.
The TransactionLog object contains information such as the time of the commit, the list of objects modified by the transaction, the committer and the commit message. The last two might be empty if the user didn't supply them. The commit message will be empty if the transaction was triggered by an autocommit.
last_change = munich.vapor_changelog
last_change.class              # => Vapor::TransactionLog
last_change.committer          # => "Somebody"
last_change.message            # => "Modification Test"
last_change.date               # => Thu Oct 30 16:36:38 CET 2003
last_change.modified_objects   # => [munich]
Vapor supports Version Control of objects stored in the Repository. When an object is changed in the Repository by a transaction commit, the old state of the object is not thrown away but archived in the Repository. Each state of the object at the end of a modifying transaction is called a "version" and is given a "revision number" that is unique among all versions of that object. The most recent version, which has not (yet) been modified, is the "current version" of the object. This is the version that applications load by default and that references point to.
An important property of Version Management is stability. Once created, the content of an object version should not change for the entire lifetime of the Repository. Change always creates a new version.
Vapor also keeps information about the transaction that created an object version by modifying the previous version, such as the time and date when the transaction was committed, the name of the user making the changes and an optional message explaining the transaction.
The current revision number of a persistent Persistable object is returned by calling the revision() method on the Persistable object. Transient objects return nil from this method.
Versions of persistent objects other than the current version can be loaded into memory using the methods described below. The loaded objects are normal instances of their class, with two important differences from in-memory instances of current versions.
- Their persistence state is READONLY. All operations on the object that change the object's persistence state will raise a PersistableReadOnlyError. This includes operations that would change the object's persistence state to DIRTY, e.g. setter methods that properly call Persistable#mark_dirty().
- They are not cached. When the same version is requested more than once from the PersistenceManager, different in-memory objects with the same attribute values will be returned. Because non-current versions are never modified, there is no need for all references to a specific non-current version to point to the same in-memory object for consistency. This behaviour might change in the future if a need for caching and consistency arises.
A reference from a non-current version to another persistent object points to the current version of that object, not to the version that was current when the version holding the reference became non-current. In the future, stronger references that point to a specific version of an object might be implemented, should the need arise to support them.
Specific revisions of a persistent object can be loaded and accessed in two different ways. In both cases, nil is returned if a version with the specified revision number does not exist.
- The Persistable#old_version( revision_number ) method returns the object's version with the specified revision number.
- The PersistenceManager#get_object_version( klass, oid, revision ) method can be called directly without requiring the current version to be loaded.
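The bookkeeping behind versions can be pictured with a small in-memory model. The sketch below is hypothetical and unrelated to Vapor's storage format. The VersionedValue class, the use of frozen Hashes for archived states and the choice to start revision numbers at 1 are all assumptions made for illustration.

```ruby
# Hypothetical sketch of version bookkeeping, not Vapor's implementation.
class VersionedValue
  attr_reader :revision

  def initialize( attributes )
    @versions = {}
    @revision = 0
    commit( attributes )
  end

  # Every commit archives a frozen copy under the next revision number.
  def commit( attributes )
    @revision += 1
    @versions[ @revision ] = attributes.dup.freeze
  end

  def current
    @versions[ @revision ]
  end

  # Returns nil when no version with that revision number exists.
  def old_version( revision_number )
    @versions[ revision_number ]
  end
end

munich = VersionedValue.new( { 'name' => 'Munich', 'altitude' => 530 } )
munich.commit( { 'name' => 'Munich', 'altitude' => 550 } )
previous = munich.old_version( 1 )   # archived state, read-only
missing  = munich.old_version( 7 )   # no such revision
```

Freezing the archived copies mirrors the stability property: a change never alters an existing version, it only adds a new one.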
Administration of the Repository itself, like adding or removing classes, is done using the vaporadmin command.
Persistent classes that are not needed anymore can be deleted from the Repository using the remove command.
foo@bar: ~> $ bin/vaporadmin help remove
remove: Remove a class and all it's insstances from the
Repository.
usage: vaporadmin REPOSITORY remove [-r|-f|-rf] KLASSNAME
-r delete recursivly, including all subclasses
-f don't ask for conformation
Caution: All instances (the actual data) of the class will be permanently removed.
By default, classes that have subclasses registered with the Repository will not be deleted. With the -r option, all subclasses are deleted recursively as well.
Usage Example:
foo@bar: ~> vaporadmin user@host/database_name remove City
Password: password
Attempting to remove class(es) from repository:
Really remove `City' including all instances and subclasses from the Repository? (y/N) y
City
foo@bar: ~>
During the development and evolution of an application that uses Vapor, changes to the classes' definitions will become necessary. Most of the time this will be the introduction of new attributes. Instead of reinitializing the Repository and losing all the instances stored in it, the metadata definition can be modified using the update command of vaporadmin. It takes the names of XML files as arguments, just like the add command.
Updating class definitions is (currently) basically limited to adding new attributes. In order to preserve instances, type redefinition is not possible, and attributes that no longer exist in the updated definition will not be deleted.