There are three ways to concatenate data from separate DataFrames: vertically, horizontally, and diagonally.
In a vertical concatenation you combine all of the rows from a list of DataFrames into a single longer DataFrame.
{{code_block('user-guide/transformations/concatenation','vertical',['concat'])}}
--8<-- "python/user-guide/transformations/concatenation.py:setup"
--8<-- "python/user-guide/transformations/concatenation.py:vertical"
Vertical concatenation fails when the dataframes do not have the same column names.
In a horizontal concatenation you combine all of the columns from a list of DataFrames into a single wider DataFrame.
{{code_block('user-guide/transformations/concatenation','horizontal',['concat'])}}
--8<-- "python/user-guide/transformations/concatenation.py:horizontal"
Horizontal concatenation fails when dataframes have overlapping columns or a different number of rows.
In a diagonal concatenation you combine all of the rows and columns from a list of DataFrames into a single longer and/or wider DataFrame.
{{code_block('user-guide/transformations/concatenation','cross',['concat'])}}
--8<-- "python/user-guide/transformations/concatenation.py:cross"
Diagonal concatenation generates nulls when the column names do not overlap.
When the dataframe shapes do not match but the dataframes share a semantic key, we can join them on that key instead of concatenating them.
Before a concatenation we have two dataframes `df1` and `df2`. Each column in `df1` and `df2` is stored in one or more chunks in memory. By default, during concatenation the chunks in each column are copied to a single new chunk. This is known as rechunking. Rechunking is an expensive operation, but it is often worth it because future operations will be faster.
If you do not want Polars to rechunk the concatenated DataFrame, specify `rechunk=False` when doing the concatenation.