- loads data from a variety of sources, including CSV and other types of files, relational database connections, and more
- transforms the data according to instructions in a YAML file
- renders a Jinja template (which can be any text-based data format, including JSON, XML, HTML, YAML, and more) for each row of transformed data, and saves the output to a file
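
The load, transform, and render steps above can be illustrated with a minimal sketch in plain Python. This is not earthmover's API or configuration format: the column names, the `load`/`transform`/`render` helpers, and the sample data are all hypothetical, and the standard library's `string.Template` stands in for a Jinja template so the example runs without extra dependencies.

```python
import csv
import io
import json
from string import Template

# Hypothetical input data; stands in for a CSV file on disk.
CSV_TEXT = """student_id,first_name,last_name
1,ada,lovelace
2,alan,turing
"""

# Stand-in for a Jinja template: $name placeholders instead of {{ name }}.
ROW_TEMPLATE = Template('{"studentUniqueId": "$student_id", "name": "$full_name"}')


def load(text):
    """Step 1: load rows from a source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Step 2: transform each row (here, add a derived full_name column)."""
    transformed = []
    for row in rows:
        new_row = dict(row)
        new_row["full_name"] = (
            f"{row['first_name'].title()} {row['last_name'].title()}"
        )
        transformed.append(new_row)
    return transformed


def render(rows):
    """Step 3: render the template once per transformed row."""
    return [ROW_TEMPLATE.substitute(row) for row in rows]


lines = render(transform(load(CSV_TEXT)))
output = "\n".join(lines)  # one JSON document per line, ready to save to a file

# Each rendered line parses as JSON.
for line in lines:
    json.loads(line)
```

In earthmover itself, the transformations are declared in the YAML file rather than written as Python functions, and the rendered output is saved to a file.
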
earthmover is similar in some ways to dbt, but earthmover is itself the transformation execution engine, whereas dbt issues SQL commands to a database backend that acts as the execution engine.
earthmover is built using a number of Python libraries including Dask and NetworkX.
At EA, we use earthmover to transform various types of flat files into JSON according to the Ed-Fi data standard, which we then send to Ed-Fi APIs using lightbeam. You can learn more about both tools in this presentation.