Core Concepts

etl4s has one core building block:

Node[-In, +Out]
A Node wraps a lazily evaluated function In => Out: nothing runs until the node is given an input. Chain nodes with ~> to build pipelines.
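
To make the idea concrete, here is a minimal toy model of a function-wrapping node with a chaining operator. MiniNode and MiniNodeDemo are hypothetical names used only for illustration; this is a sketch of the concept in plain Scala, not etl4s's actual implementation.

```scala
// Toy model only: MiniNode is a hypothetical stand-in, not etl4s's Node.
final class MiniNode[-In, +Out](f: In => Out) {
  // Nothing runs until the node is applied to an input.
  def apply(in: In): Out = f(in)

  // Chain: feed this node's output into the next node's input.
  def ~>[Next](next: MiniNode[Out, Next]): MiniNode[In, Next] =
    new MiniNode[In, Next](in => next(f(in)))
}

object MiniNodeDemo {
  val lineCount = new MiniNode[String, Int](_.split("\n").length)
  val describe  = new MiniNode[Int, String](n => s"$n lines")

  // Composition builds a new node; no work happens here.
  val pipeline = lineCount ~> describe

  def main(args: Array[String]): Unit =
    println(pipeline("a\nb\nc")) // 3 lines
}
```

Composing two nodes only builds a larger function. The work happens when the resulting node is applied to an input.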

Node types

To improve readability and express intent, etl4s defines four aliases: Extract, Transform, Load and Pipeline. All behave the same under the hood.

type Extract[-In, +Out]   = Node[In, Out]
type Transform[-In, +Out] = Node[In, Out]
type Load[-In, +Out]      = Node[In, Out]
type Pipeline[-In, +Out]  = Node[In, Out]
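
Because the aliases all expand to the same type, a value declared under one alias can be passed wherever another is expected. The sketch below shows this with a stand-in Node class defined locally. AliasDemo, runTwice and the local Node are assumptions for illustration, not part of etl4s.

```scala
object AliasDemo {
  // Hypothetical stand-in for the library's Node; only the aliasing matters here.
  final class Node[-In, +Out](f: In => Out) {
    def apply(in: In): Out = f(in)
  }

  type Transform[-In, +Out] = Node[In, Out]
  type Pipeline[-In, +Out]  = Node[In, Out]

  // Declared against Pipeline, but any alias of Node is accepted.
  def runTwice[A](p: Pipeline[A, A], a: A): A = p(p(a))

  val double: Transform[Int, Int] = new Node[Int, Int](_ * 2)

  val result: Int = runTwice(double, 3) // 12
}
```

The alias you pick documents intent for readers. The compiler treats them all identically.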

Building pipelines

import etl4s._

val A = Extract("alice\nbob\ncarol")  // inline CSV rows standing in for the contents of users.csv
val B = Transform[String, Int](csv => csv.split("\n").length)
val C = Load[Int, Unit](count => println(s"Processed $count users"))

val pipeline = A ~> B ~> C

pipeline.unsafeRun()  // Processed 3 users

Create standalone nodes:

val toUpper = Transform[String, String](_.toUpperCase)
toUpper("hello")  // HELLO

Running pipelines

Call like a function:

pipeline(())

Or be explicit:

pipeline.unsafeRun()

Error handling: safeRun captures exceptions as values instead of throwing:

val risky = Pipeline[String, Int](_.toInt)

risky.safeRun("42")    // Success(42)
risky.safeRun("oops")  // Failure(...)
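
The Success and Failure results above look like scala.util.Try values. Assuming so, they can be handled with ordinary pattern matching. The sketch below uses a plain Try around the same _.toInt logic as a stand-in for risky.safeRun. SafeRunDemo, parse and describe are hypothetical names.

```scala
import scala.util.{Failure, Success, Try}

object SafeRunDemo {
  // Stand-in for risky.safeRun: wraps the same _.toInt logic in a Try.
  def parse(s: String): Try[Int] = Try(s.toInt)

  // Turn either outcome into a message without throwing.
  def describe(s: String): String = parse(s) match {
    case Success(n) => s"parsed $n"
    case Failure(e) => s"failed: ${e.getClass.getSimpleName}"
  }

  def main(args: Array[String]): Unit = {
    println(describe("42"))   // parsed 42
    println(describe("oops")) // failed: NumberFormatException
  }
}
```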

Execution details: the trace variants return the result together with run metadata:

val trace = pipeline.unsafeRunTrace(())
// trace.result, trace.logs, trace.timeElapsedMillis, trace.errors

val safeTrace = pipeline.safeRunTrace(())
// safeTrace.result is a Try[Out]
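
As a rough illustration of what a safe trace carries, the sketch below times a computation and stores its outcome as a Try. MiniTrace and traced are hypothetical. Only the field names result and timeElapsedMillis mirror the ones listed above, and the real trace type may be built differently.

```scala
import scala.util.Try

object TraceDemo {
  // Hypothetical record; field names mirror those mentioned in the text.
  final case class MiniTrace[A](result: Try[A], timeElapsedMillis: Long)

  // Run the body once, capturing both its outcome and how long it took.
  def traced[A](body: => A): MiniTrace[A] = {
    val start  = System.nanoTime()
    val result = Try(body)
    MiniTrace(result, (System.nanoTime() - start) / 1000000L)
  }

  def main(args: Array[String]): Unit = {
    val ok  = traced("42".toInt)
    val bad = traced("oops".toInt)
    println(ok.result.isSuccess)  // true
    println(bad.result.isFailure) // true
  }
}
```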

Note

etl4s also has a Reader type for dependency injection. Use .requires to turn any Node into a Reader[Config, Node]. The ~> operator works between Nodes and Readers. See Configuration for details.