Compose :: Melbourne Speaker - Luke Stephenson


Compose :: Melbourne will feature many excellent speakers. One of this year's lineup is Luke Stephenson. If you want to see the whole lineup look here!




4:15pm - Luke Stephenson

Reactive Streams for REA feeders

This talk introduces Reactive Streams, specifically the Monix implementation and how at REA Group we have used it to solve real problems.

At REA we have many 'feeders' which extract from one data source, perform some processing and transformations before loading into a target system.

Historically these feeders have been built using Java / Scala concurrency primitives (e.g. Future / ExecutionContext). The typical feeder has a very sequential execution model like:

  1. Read a batch in parallel from the source system. The reading is done in as a parallel batch because the source system may be slow, e.g. an S3 bucket).
  2. Transform the batch
  3. Write the batch in parallel to the target system

And repeat the above until all data is processed.

The very sequential nature means the feeds are not as performant as they could be. The source and target systems were never accessed in parallel. Writing to the target required waiting for the source to read another batch. And the feeder would not read another batch from the source until the target completed.

Also being based on lower level concurrency primitives, the data pipeline is not as easy to visualise from reading the codebase.

Reactive streams allow us to model asynchronous processing pipelines as a data type which can then be executed. The APIs are declarative, as opposed to the legacy solution working with Scala Future and ExecutionContext which resulted in immediate evaluation.

The talk will cover:

  • Background into our feeders and legacy implementation
  • How the latency involved in that implementation cumulates to affect performance
  • Intro to the Monix reactive streams implementation covering the key APIs that we have used
  • Demonstrating how the APIs are just data types and that nothing is executed until the observable is consumed and run.
  • Code examples for a basic feeder structure
  • How the end solution is more performant, but more importantly that the code is more expressive and easy to reason about.

Attendees should leave with a basic understanding of Reactive Streams, and specifically how to get started with the Monix implementation of reactive streams. I'll share a working basic project (not REA production code) which provides a basic 'feeder' structure for attendees to reference and extend upon if they wish.

About Luke Stephenson

Luke Stephenson works as a Lead Developer at REA Group building Scala services to power the consumer facing website. After many years as a Java developer, Luke has enjoyed the transition to Scala and becoming a better dev through many FP learnings.