transfer_col_references
- pydiverse.transform.transfer_col_references(table, ref_source)
Transfers the column references from ref_source to table.
The returned table contains all selected columns of table, but its columns are now referenced by the columns from ref_source. All column names selected in table must also be present in ref_source.
- Parameters:
table – The table from which the data is taken.
ref_source – The table from which the column references are taken.
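The name requirement stated above can be checked up front with a few lines of plain Python. This is only a sketch: `missing_reference_names` is a hypothetical helper, not part of pydiverse.transform, and it works on plain lists of column names rather than on transform tables.

```python
def missing_reference_names(table_cols, ref_source_cols):
    """Return the names selected in `table` that do not occur in `ref_source`.

    Hypothetical helper for illustration; both arguments are plain
    sequences of column names, not pydiverse.transform tables.
    """
    ref_names = set(ref_source_cols)
    return [name for name in table_cols if name not in ref_names]


# Every selected name has a counterpart, so nothing is missing.
print(missing_reference_names(["a", "b"], ["a", "b", "c"]))  # []

# "z" is selected in the table but unknown to the reference source.
print(missing_reference_names(["a", "z"], ["a", "b", "c"]))  # ['z']
```

An empty result corresponds to the case where the documented precondition holds.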
Examples
Materialization without breaking the functional flow. Say you have a function your_materialization_fn that writes a transform table to a database and returns a transform table again. Then you can define a custom verb:
>>> @verb
... def materialize(table) -> pdt.Table:
...     new = your_materialization_fn(table)
...     return pdt.transfer_col_references(new, table)
With this verb, it is possible to write:
>>> t = pdt.Table(dict(a=[1, 2, 5], b=["x", "y", "z"]), name="t")
>>> t >> filter(t.a >= 2) >> materialize() >> mutate(z=t.a + t.b.str.len())
Table `t` (backend: polars)
shape: (2, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ z   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ i64 │
╞═════╪═════╪═════╡
│ 2   ┆ y   ┆ 3   │
│ 5   ┆ z   ┆ 6   │
└─────┴─────┴─────┘
Without transfer_col_references, the table returned by your_materialization_fn would no longer be referenced by the columns of t, so t.a and t.b could not be used in the mutate. (Of course, you would normally use a SQL backend when materializing, not a polars backend as in this example.)
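To see why the references need to be transferred, a toy model in plain Python may help. It assumes that a column reference is tied to the table object it came from rather than to its name. `Col`, `ToyTable`, `toy_materialize`, and `toy_transfer_col_references` are all hypothetical names invented for this sketch; it is not how pydiverse.transform is implemented.

```python
class Col:
    """Toy column handle; two handles match only if they are the same object."""

    def __init__(self, name):
        self.name = name


class ToyTable:
    """Toy stand-in for a table: named data plus one Col handle per name."""

    def __init__(self, data, cols=None):
        self.data = data
        # Create fresh handles unless the caller supplies existing ones.
        self.cols = cols if cols is not None else {name: Col(name) for name in data}

    def resolve(self, col):
        """Return the values for `col`; raise KeyError if the handle is foreign."""
        for name, own in self.cols.items():
            if own is col:
                return self.data[name]
        raise KeyError(f"column {col.name!r} does not belong to this table")


def toy_materialize(table):
    """Copy the data into a new table, which gets brand-new column handles."""
    return ToyTable({name: list(values) for name, values in table.data.items()})


def toy_transfer_col_references(table, ref_source):
    """Keep the data of `table`, but reuse the handles of `ref_source` by name."""
    # Raises KeyError if a name in `table` is absent from `ref_source`.
    cols = {name: ref_source.cols[name] for name in table.data}
    return ToyTable(table.data, cols)


t = ToyTable({"a": [2, 5], "b": ["y", "z"]})
a_ref = t.cols["a"]  # plays the role of `t.a`

new = toy_materialize(t)
try:
    new.resolve(a_ref)
    stale = False
except KeyError:
    stale = True  # the old handle is unknown to the materialized copy

fixed = toy_transfer_col_references(new, t)
print(stale)                 # True
print(fixed.resolve(a_ref))  # [2, 5]
```

In this model the materialized copy rejects the old handle, while the table with transferred handles accepts it, which mirrors the role of materialize() in the pipeline above.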