SPLASH Scripting
The SAP Sybase Event Stream Processor has two basic methods of computing. Familiar features from relational database systems are used in the first method, via stream operators like Compute, Join, and Aggregate. The second method uses a scripting language called SPLASH to process events, via the FlexStream operator. SPLASH contains the language of expressions used in the ordinary stream operators and also contains variables, data structures, looping constructs, conditionals, and control-flow operations.
SPLASH makes certain computations easier to express, particularly those that must "remember old values," that is, those that require "state."
Below is an example, inspired by the trading industry, that maintains the top of an order book. This example will use some of the new data structures in SPLASH as well as variables and loops. Suppose the stream has the fields
int32 Id
string Symbol
double Price
int32 Shares
where Id is the key field, that is, the field that uniquely identifies a bid. Bids can be changed: not only might the stream insert a new bid, but it might also update or delete a previous bid. For instance, the Bid stream might have the following events:
insert: Id=1, Symbol='IBM', Price=43.11, Shares=1000
insert: Id=2, Symbol='ALR', Price=22.08, Shares=200
update: Id=1, Symbol='IBM', Price=43.17, Shares=900
The goal: any time a bid is inserted or changed for a particular stock, output the three highest bids for that stock. The fields in the output are
int32 Position
string Symbol
double Price
int32 Shares
where Position ranges from 1 to 3. (The key fields in the output are Position and Symbol.) For example, if the events in the Bid stream have been
insert: Id=1, Symbol='IBM', Price=43.11, Shares=1000
insert: Id=2, Symbol='IBM', Price=43.17, Shares=900
insert: Id=3, Symbol='IBM', Price=42.66, Shares=800
insert: Id=4, Symbol='IBM', Price=45.81, Shares=50
and the next event is
insert: Id=5, Symbol='IBM', Price=46.41, Shares=75
the stream should output
insert: Position=1, Symbol='IBM', Price=46.41, Shares=75
insert: Position=2, Symbol='IBM', Price=45.81, Shares=50
insert: Position=3, Symbol='IBM', Price=43.17, Shares=900
Note that the newest bid appears at the top because it has the highest price. This type of problem, keeping track of the top values, arises frequently in Complex Event Processing. It's not easy to describe using SQL-like operations, but Release 3.0 provides the means to handle it.
First, a way is needed to remember previous bids. In Release 3.0, there's a data structure called an "event cache" for storing previous events. An event cache holds a number of events grouped into buckets. For the order book problem, we'll use the event cache declaration
eventCache(Bid[Symbol], coalesce, Price desc) previous;
This declares a variable called "previous" to hold the last events from the Bid stream. This event cache declaration specifies:
- The stream of events that the event cache remembers (namely Bid)
- The field or fields on which events will be grouped (namely Symbol)
- The option "coalesce", meaning that inserts and updates should be coalesced into single records
- A means of ordering the events in the group, here by descending order of the Price field
Event caches allow other options too, but these are the only ones needed for the order book example.
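To make the behavior of this declaration concrete, here is a minimal sketch in Python (not SPLASH) of what such an event cache does with the sample Bid events shown earlier. The class name EventCache and its methods apply and group are hypothetical names chosen for illustration; they are not part of the product, and the real event cache has options this model ignores.

```python
class EventCache:
    """Toy model (an assumption, not the product's implementation) of
    eventCache(Bid[Symbol], coalesce, Price desc)."""

    def __init__(self):
        # One bucket per Symbol; each bucket maps the key field Id to a record.
        self._buckets = {}

    def apply(self, opcode, event):
        bucket = self._buckets.setdefault(event["Symbol"], {})
        if opcode in ("insert", "update"):
            # "coalesce": an update replaces the earlier record with the same Id.
            bucket[event["Id"]] = dict(event)
        elif opcode == "delete":
            bucket.pop(event["Id"], None)

    def group(self, symbol):
        # "Price desc": return the bucket ordered by descending Price.
        records = self._buckets.get(symbol, {}).values()
        return sorted(records, key=lambda r: r["Price"], reverse=True)


previous = EventCache()
previous.apply("insert", {"Id": 1, "Symbol": "IBM", "Price": 43.11, "Shares": 1000})
previous.apply("insert", {"Id": 2, "Symbol": "ALR", "Price": 22.08, "Shares": 200})
previous.apply("update", {"Id": 1, "Symbol": "IBM", "Price": 43.17, "Shares": 900})

ibm = previous.group("IBM")
print(ibm)
```

After these three events the IBM bucket holds a single coalesced record for Id 1, carrying the updated price and share count, while the ALR bid sits in its own bucket.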
Second, we need to process events from the Bid stream. The following bit of SPLASH code, which is run automatically for every event, can be used:
{
    int32 i := 0;
    string s := Bid.Symbol;
    while ((i < count(previous.Id)) and (i < 3)) {
        output setOpcode([ Position=i+1; Symbol=s; Price=nth(i,previous.Price);
                           Shares=nth(i,previous.Shares) ], upsert);
        i := i + 1;
    }
    while (i < 3) {
        output setOpcode([ Position=i+1; Symbol=s], safedelete);
        i := i + 1;
    }
}
- The first two lines initialize the local variables "i" to 0 and "s" to the current event's Symbol field.
- The first "while" loop walks over the group associated with the Symbol of the current event. It creates at most three new events, marked as upserts, getting the highest price first, the second highest second, and so forth.
- The second "while" loop handles the case where the group holds fewer than three bids: it deletes any remaining positions, up to Position 3, so that the output doesn't keep stale entries. This corner case arises when there are deletes in the Bid stream.
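The two loops can be traced with a short Python model (again, not SPLASH). The function name top_of_book and the depth parameter are hypothetical, and the sorted list passed in stands in for the event cache group that SPLASH reads through nth(). This is a sketch of the logic under those assumptions, not the product's implementation.

```python
def top_of_book(group, symbol, depth=3):
    """Model of the two SPLASH loops. `group` is the list of bids for
    `symbol`, already ordered by descending Price (assumed to mirror
    what the event cache provides)."""
    out = []
    i = 0
    # First loop: emit up to `depth` upserts, highest price first.
    while i < len(group) and i < depth:
        out.append(("upsert", {"Position": i + 1, "Symbol": symbol,
                               "Price": group[i]["Price"],
                               "Shares": group[i]["Shares"]}))
        i += 1
    # Second loop: clear any positions that no longer have a bid.
    while i < depth:
        out.append(("safedelete", {"Position": i + 1, "Symbol": symbol}))
        i += 1
    return out


# The five IBM bids from the example above.
bids = [
    {"Id": 1, "Symbol": "IBM", "Price": 43.11, "Shares": 1000},
    {"Id": 2, "Symbol": "IBM", "Price": 43.17, "Shares": 900},
    {"Id": 3, "Symbol": "IBM", "Price": 42.66, "Shares": 800},
    {"Id": 4, "Symbol": "IBM", "Price": 45.81, "Shares": 50},
    {"Id": 5, "Symbol": "IBM", "Price": 46.41, "Shares": 75},
]
ibm_group = sorted(bids, key=lambda b: b["Price"], reverse=True)
events = top_of_book(ibm_group, "IBM")

# A thin book: only two bids remain, so Position 3 must be cleared.
thin = top_of_book(ibm_group[:2], "IBM")

for opcode, record in events + thin:
    print(opcode, record)
```

With all five bids, the model emits three upserts whose prices and shares match the expected output listed earlier. With only two bids in the group, it emits two upserts followed by a safedelete for Position 3, which is the corner case the second loop exists for.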
Programming with state in SPLASH can greatly improve CEP applications, and that's why it's an integral part of the SAP Sybase Event Stream Processor.