API
Table
- class pydiverse.transform.Table
- __init__(
- resource: polars.DataFrame | polars.LazyFrame | pandas.DataFrame | sqlalchemy.Table | str | dict,
- backend: Target | None = None,
- *,
- name: str | None = None,
- )
Creates a new table.
- Parameters:
resource – The data source to construct the table from. This can be a polars or pandas data frame, a Python dictionary, a SQLAlchemy table, or the name of a table in a SQL database.
backend – The execution backend. This must be one of the pydiverse.transform backend objects, see Backends / Export Targets. It may carry additional information on how to interpret the resource argument, such as a SQLAlchemy engine.
name – The name of the table. Naming the table is optional, but a name can make printed output more readable.
Examples
Python dictionary.
>>> t = pdt.Table(
...     {
...         "a": [4, 3, -35, 24, 105],
...         "b": [4, 4, 0, -23, 42],
...     },
...     name="T",
... )
>>> t >> show()
Table T, backend: PolarsImpl
shape: (5, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 4   ┆ 4   │
│ 3   ┆ 4   │
│ -35 ┆ 0   │
│ 24  ┆ -23 │
│ 105 ┆ 42  │
└─────┴─────┘
Polars data frame.
>>> df = pl.DataFrame(
...     {
...         "a": [4, 3, -35, 24, 105],
...         "b": ["a", "o", "---", "i23", " "],
...     },
... )
>>> t = pdt.Table(df, name="T")
>>> t >> show()
Table T, backend: PolarsImpl
shape: (5, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪═════╡
│ 4   ┆ a   │
│ 3   ┆ o   │
│ -35 ┆ --- │
│ 24  ┆ i23 │
│ 105 ┆     │
└─────┴─────┘
Pandas data frame. Note that the data frame is converted to a polars data frame and the backend is polars.
>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         "a": [4, 3, -35, 24, 105],
...         "b": ["a", "o", "---", "i23", " "],
...     },
... )
>>> t = pdt.Table(df, name="T")
>>> t >> show()
Table T, backend: PolarsImpl
shape: (5, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪═════╡
│ 4   ┆ a   │
│ 3   ┆ o   │
│ -35 ┆ --- │
│ 24  ┆ i23 │
│ 105 ┆     │
└─────┴─────┘
SQL. Assuming you have a SQLAlchemy engine engine that is connected to a database containing a table t1 in a schema s1, you can create a pydiverse.transform Table from it as follows.
>>> t = pdt.Table("t1", SqlAlchemy(engine, schema="s1"))
>>> t >> show()
Table t1, backend: PostgresImpl
shape: (5, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪═════╡
│ 4   ┆ a   │
│ 3   ┆ o   │
│ -35 ┆ --- │
│ 24  ┆ i23 │
│ 105 ┆     │
└─────┴─────┘
Note that the name argument to the pdt.Table constructor was not specified, so transform used the name of the SQL table. This example assumes that a database connection is set up and that the table above is already present in the database. For more information on how to set up a connection, see Database testing.
ColExpr
Col
- class pydiverse.transform.Col
- export(
- target: Target,
- )
Exports a column expression.
- Parameters:
target – The data frame library to export to. Can be a Polars or Pandas object. The lazy kwarg for polars is ignored.
- Returns:
A polars or pandas Series.
Note
Not every column expression can be exported. Unlike mutate or other verbs, there is no ambient table the expression lives in, which is required to resolve C-columns and correctly deal with columns from different tables. Thus, the expression must contain one column whose table contains all other columns appearing in the expression. The table of this column is then used to export the expression.
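The rule in this note can be illustrated with a small toy model. The sketch below is not the library's implementation: tables are modelled as plain sets of column identifiers, and the function name find_export_table and all table and column names are hypothetical. It only shows the selection logic described above, which is to pick a table, among those whose columns appear in the expression, that contains every column of the expression.

```python
from typing import Optional


def find_export_table(
    expr_columns: list[tuple[str, str]],
    tables: dict[str, set[str]],
) -> Optional[str]:
    """Toy model of the export rule (hypothetical helper, not pydiverse API).

    expr_columns: (table_name, column_id) pairs, one per column in the
        expression, where table_name is the table the column was taken from.
    tables: maps each table name to the set of column ids it contains.
    Returns the name of a table that contains all columns, or None.
    """
    needed = {column_id for _, column_id in expr_columns}
    # Only tables of columns that occur in the expression are candidates.
    for table_name, _ in expr_columns:
        if needed <= tables[table_name]:
            return table_name
    return None


# Assumed setup: t2 is derived from t1 and therefore also contains "h";
# t3 is unrelated to both.
tables = {"t1": {"h"}, "t2": {"h", "k"}, "t3": {"x"}}

# An expression over t1.h and t2.k can be resolved through t2.
print(find_export_table([("t1", "h"), ("t2", "k")], tables))  # t2

# An expression over t1.h and t3.x has no table containing both columns.
print(find_export_table([("t1", "h"), ("t3", "x")], tables))  # None
```

In this model, the second expression corresponds to the case the note describes as not exportable, since no single table can serve as the context for all of its columns.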
Examples
>>> t1 = pdt.Table({"h": [2.465, 0.22, -4.477, 10.8, -81.2, 0.0]})
>>> t1.h.export(Polars)
shape: (6,)
Series: 'h' [f64]
[
    2.465
    0.22
    -4.477
    10.8
    -81.2
    0.0
]
>>> t1.h.export(Pandas())
0     2.465
1      0.22
2    -4.477
3      10.8
4     -81.2
5       0.0
Name: h, dtype: double[pyarrow]