To merge a list of pandas DataFrames into a single DataFrame in Python, you can use the merge() function. merge() joins two DataFrames at a time and returns a new DataFrame, so a list is combined by applying it repeatedly, for example with functools.reduce. In this tutorial, we will learn how to merge a list of pandas DataFrames into a single DataFrame, first with merge() and then with concat().
*The examples were run in a Jupyter Notebook in an Anaconda environment.
When a list of DataFrames with identical column labels is merged, the DataFrames are combined one after another until a single DataFrame remains, holding the information from every DataFrame in the list.
Merge Pandas Dataframes In Python.
First, we need to create our dataset: two DataFrames, which we place in a list. We also need a variable to hold the merged result.
After that, we call the pandas merge() function together with the reduce() function to combine the list of DataFrames into a single DataFrame.
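To make the role of reduce() concrete, here is a minimal sketch with three small frames. The frames, the "key" column, and the merge_pair helper are hypothetical names chosen for illustration, not part of the tutorial's dataset. It suggests that reducing a list with a two-frame merge is the same as nesting the merge calls by hand.

```python
from functools import reduce

import pandas as pd

# Three small hypothetical frames that share a "key" column.
a = pd.DataFrame({"key": [1, 2, 3], "a": ["a1", "a2", "a3"]})
b = pd.DataFrame({"key": [1, 2, 3], "b": ["b1", "b2", "b3"]})
c = pd.DataFrame({"key": [1, 2, 3], "c": ["c1", "c2", "c3"]})


def merge_pair(left, right):
    """Hypothetical helper: merge two DataFrames on the shared "key" column."""
    return pd.merge(left, right, on="key", how="outer")


# reduce() applies merge_pair cumulatively, from left to right.
folded = reduce(merge_pair, [a, b, c])

# The same computation written out as nested calls.
nested = merge_pair(merge_pair(a, b), c)

print(folded.equals(nested))
print(list(folded.columns))
```

Because merge() only ever sees two frames at a time, the list can hold any number of DataFrames as long as each one contains the join column.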
The steps are as follows:
- First, import the pandas library and create two or more DataFrames.
- Put the DataFrames in a list.
- Import the reduce function from the functools module.
- Use reduce together with merge() to combine the list of DataFrames into a single DataFrame.
- The “on” argument names the column or columns on which the DataFrames are joined.
- The “how” argument sets the join type: inner, outer, left, or right.
- Display the merged DataFrame with print() or, in a notebook, by ending the cell with the variable name.
from functools import reduce
import pandas as pd
# Two DataFrames that share the "flower" column
flower = pd.DataFrame({'flower': ['Red Ginger', 'Tree Poppy', 'passion flower', 'water lily'],
                       'test': ['similarities', 'accuracy', 'correctness', 'classification']},
                      index=[0, 1, 2, 3])
test = pd.DataFrame({'flower': ['Red Ginger', 'Tree Poppy', 'rose flower', 'sun flower'],
                     'cluster': ['cluster_1', 'cluster_2', 'cluster_3', 'cluster_4']},
                    index=[4, 5, 6, 7])
merge_a_list = [flower, test]
# Merge the list pairwise on "flower", keeping keys from both sides
merge_df = reduce(lambda left, right: pd.merge(left, right, on=["flower"], how="outer"),
                  merge_a_list)
merge_df
 | flower | test | cluster |
0 | Red Ginger | similarities | cluster_1 |
1 | Tree Poppy | accuracy | cluster_2 |
2 | passion flower | correctness | NaN |
3 | water lily | classification | NaN |
4 | rose flower | NaN | cluster_3 |
5 | sun flower | NaN | cluster_4 |
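As a rough illustration of the “how” argument, the sketch below merges the same two frames with each join type and records how many rows come back. The row_counts dictionary is a name introduced here for illustration. Only the counts are compared, since the row order of an outer merge may depend on the pandas version.

```python
import pandas as pd

flower = pd.DataFrame({
    "flower": ["Red Ginger", "Tree Poppy", "passion flower", "water lily"],
    "test": ["similarities", "accuracy", "correctness", "classification"],
})
test = pd.DataFrame({
    "flower": ["Red Ginger", "Tree Poppy", "rose flower", "sun flower"],
    "cluster": ["cluster_1", "cluster_2", "cluster_3", "cluster_4"],
})

# Merge once per join type and record the number of rows in each result.
row_counts = {}
for how in ["inner", "outer", "left", "right"]:
    merged = pd.merge(flower, test, on="flower", how=how)
    row_counts[how] = len(merged)
    print(how, len(merged))
```

Two flowers appear in both frames, so the inner join returns 2 rows, the outer join returns 6, and the left and right joins return 4 each.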
Using the concat() function
You can also combine the DataFrames by calling pandas.concat(merge_a_list). concat() stacks the DataFrames in the list on top of each other and returns a single DataFrame. Data that belong to the same column label are combined into a single column. A column that is missing from one of the DataFrames is filled with NaN in the rows that come from that DataFrame.
import pandas as pd
flower = pd.DataFrame({'flower': ['Red Ginger', 'Tree Poppy', 'passion flower', 'water lily'],
                       'test': ['similarities', 'accuracy', 'correctness', 'classification']},
                      index=[0, 1, 2, 3])
test = pd.DataFrame({'flower': ['Red Ginger', 'Tree Poppy', 'rose flower', 'sun flower'],
                     'cluster': ['cluster_1', 'cluster_2', 'cluster_3', 'cluster_4']},
                    index=[4, 5, 6, 7])
merge_a_list = [flower, test]
merge = pd.concat(merge_a_list)
print(type(merge))
merge
<class 'pandas.core.frame.DataFrame'>
 | flower | test | cluster |
0 | Red Ginger | similarities | NaN |
1 | Tree Poppy | accuracy | NaN |
2 | passion flower | correctness | NaN |
3 | water lily | classification | NaN |
4 | Red Ginger | NaN | cluster_1 |
5 | Tree Poppy | NaN | cluster_2 |
6 | rose flower | NaN | cluster_3 |
7 | sun flower | NaN | cluster_4 |
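Two concat() options are often useful with a list of frames: ignore_index and join. The sketch below is a minimal illustration, assuming the frames use the default index (unlike the example above, which sets the index explicitly) so that the effect of ignore_index is visible. The names frames, stacked, and shared are introduced here for illustration.

```python
import pandas as pd

# Both frames use the default index 0..3, so their row labels overlap.
flower = pd.DataFrame({
    "flower": ["Red Ginger", "Tree Poppy", "passion flower", "water lily"],
    "test": ["similarities", "accuracy", "correctness", "classification"],
})
test = pd.DataFrame({
    "flower": ["Red Ginger", "Tree Poppy", "rose flower", "sun flower"],
    "cluster": ["cluster_1", "cluster_2", "cluster_3", "cluster_4"],
})
frames = [flower, test]

# ignore_index=True discards the original labels and renumbers the rows.
stacked = pd.concat(frames, ignore_index=True)

# join="inner" keeps only the columns that appear in every frame.
shared = pd.concat(frames, join="inner", ignore_index=True)

print(stacked.shape)
print(list(stacked.index))
print(list(shared.columns))
```

With the default join="outer", all three columns are kept and the gaps are filled with NaN. With join="inner", only the shared "flower" column remains.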
Conclusion
On this page, we discussed two ways, with examples, to merge a list of pandas DataFrames into a single DataFrame in Python: using the merge() function together with reduce(), and using the concat() function.