Core Concepts
etl4s has one core building block:
A Node wraps a lazily-evaluated functionIn => Out. Chain them with ~> to build pipelines.
Node types¶
To improve readability and express intent, etl4s defines four aliases: Extract, Transform, Load and Pipeline. All behave the same under the hood.
type Extract[-In, +Out] = Node[In, Out]
type Transform[-In, +Out] = Node[In, Out]
type Load[-In, +Out] = Node[In, Out]
type Pipeline[-In, +Out] = Node[In, Out]
Building pipelines¶
import etl4s._
val A = Extract("users.csv")
val B = Transform[String, Int](csv => csv.split("\n").length)
val C = Load[Int, Unit](count => println(s"Processed $count users"))
val pipeline = A ~> B ~> C
pipeline.unsafeRun()// Processed 3 users
Create standalone nodes:
Running pipelines¶
Call like a function:
Or be explicit:
Error handling:
val risky = Pipeline[String, Int](_.toInt)
risky.safeRun("42") // Success(42)
risky.safeRun("oops") // Failure(...)
Execution details:
val trace = pipeline.unsafeRunTrace(())
// trace.result, trace.logs, trace.timeElapsedMillis, trace.errors
val safeTrace = pipeline.safeRunTrace(())
// safeTrace.result is a Try[Out]
Note
etl4s also has a Reader type for dependency injection. Use .requires to turn any Node into a Reader[Config, Node]. The ~> operator works between Nodes and Readers. See Configuration for details.