Helper and utility transformations.
Log(field_filter=None, condition=None, clean=None)
Identity transform that adds a console output side effect, so you can watch what is going through the queues at a given point of an ETL process.
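The idea of an identity transform with a logging side effect can be sketched in plain Python. This is only an illustration of the concept, not the library's implementation: `log_passthrough` and its `printer` parameter are hypothetical names, and the real `field_filter`, `condition` and `clean` arguments are not modelled here.

```python
def log_passthrough(rows, printer=print):
    """Yield every row unchanged, printing it on the way (hypothetical helper)."""
    for row in rows:
        printer(row)  # side effect only: the row itself is not modified
        yield row


# Rows come out exactly as they went in.
rows = [{'name': 'foo'}, {'name': 'bar'}]
result = list(log_passthrough(rows))
```

Because the function is a generator, nothing is printed until a downstream consumer actually pulls rows through it.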
Stop(transform=None, input_channels=None, output_channels=None)
Sink transform that stops anything going through the pipes.
Simple transform that will overwrite some values with constant values provided in a Hash.
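As a rough sketch of what "overwriting values with constants" means for a single row, assuming rows behave like plain dictionaries (the `override` function below is a hypothetical stand-in, not the library's class):

```python
def override(row, constants):
    """Return a copy of `row` where keys from `constants` replace existing values."""
    result = dict(row)        # copy, so the input row is left untouched
    result.update(constants)  # constant values win over the row's own values
    return result


row = {'name': 'foo', 'status': 'draft'}
updated = override(row, {'status': 'published', 'source': 'import'})
```

Keys that are not present in the row are simply added, which is an assumption of this sketch rather than documented behaviour.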
Clean(transform=None, input_channels=None, output_channels=None)
Removes all fields whose keys start with an underscore (_).
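The filtering rule can be illustrated on a plain dictionary. This is a minimal sketch that assumes field keys are strings; `clean_row` is a hypothetical name used only for the example.

```python
def clean_row(row):
    """Return a copy of `row` without the fields whose key starts with '_'."""
    return {key: value for key, value in row.items() if not key.startswith('_')}


row = {'name': 'foo', '_tmp': 42, '_source_line': 7}
cleaned = clean_row(row)
```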
SimpleTransform is an attempt to make trivial transformations easy to build, using a fluent API and a lot of shortcuts to apply filters to some fields.
The API is not stable and this will probably go into an “extra” module later.
>>> t = SimpleTransform()
Apply “upper” method on “name” field, and store it back in “name” field.
>>> t.add('name').filter('upper')
<rdc.etl.extra.simple._SimpleItemTransformationDescriptor object at ...>
Apply the lambda to “description” field content, and store it into the “full_description” field.
>>> t.add('full_description', 'description').filter(lambda v: 'Description: ' + v)
<rdc.etl.extra.simple._SimpleItemTransformationDescriptor object at ...>
Remove the previously defined “useless” descriptor. This does not remove the “useless” field from transformed hashes; it is only useful for overriding something defined by a parent.
>>> t.useless = 'foo'
>>> t.delete('useless')
Mark the “notanymore” field for deletion upon transform. Output hashes will no longer contain this field.
Add a field (output hashes will contain this field, all with the same “foo bar” value).
>>> t.test_field = 'foo bar'
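To make the combined effect of the rules above concrete, here is a plain-Python sketch of what they would do to one input hash. It does not use the library at all. It assumes that the 'upper' filter calls the string's `upper()` method, that the source “description” field is kept in the output, and that rows behave like dictionaries; `apply_example_rules` is a hypothetical name.

```python
def apply_example_rules(row):
    """Apply the four example rules to a copy of `row` (illustrative only)."""
    out = dict(row)
    out['name'] = out['name'].upper()                                # 'upper' filter on "name"
    out['full_description'] = 'Description: ' + out['description']  # lambda filter
    out.pop('notanymore', None)                                     # field marked for deletion
    out['test_field'] = 'foo bar'                                   # constant field
    return out


row = {'name': 'widget', 'description': 'a small part', 'notanymore': 1}
transformed = apply_example_rules(row)
```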