TheDeveloperBlog.com

Pandas Merge

Pandas DataFrame.merge()

Pandas merge() combines two datasets into one, aligning the rows on common columns or index values. It is the entry point for all standard database-style join operations between DataFrame objects.

Syntax:

pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None,
left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'),
copy=True, indicator=False, validate=None)
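As a minimal sketch of the call shape, assuming two small hypothetical frames that share an `id` column, the function form `pd.merge(left, right, ...)` and the method form `left.merge(right, ...)` can be compared side by side. The frame contents and the column names `name` and `score` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample frames, used only for illustration.
left = pd.DataFrame({'id': [1, 2, 3], 'name': ['Ann', 'Bob', 'Cid']})
right = pd.DataFrame({'id': [2, 3, 4], 'score': [80, 90, 70]})

# Function form: both frames are passed explicitly.
by_function = pd.merge(left, right, how='inner', on='id')

# Method form: the calling frame plays the role of `left`.
by_method = left.merge(right, how='inner', on='id')

print(by_function)
print(by_function.equals(by_method))
```

Only the ids present in both frames (2 and 3) survive the inner join, and the two call forms should give equal results.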

Parameters:

  • left: DataFrame
    The first object to merge.
  • right: DataFrame or named Series
    The object to merge with the left DataFrame.
  • how: {'left', 'right', 'outer', 'inner'}, default 'inner'
    Type of merge to be performed.
    • left: Uses only keys from the left frame, similar to a SQL left outer join; preserves key order.
    • right: Uses only keys from the right frame, similar to a SQL right outer join; preserves key order.
    • outer: Uses the union of keys from both frames, similar to a SQL full outer join; sorts keys lexicographically.
    • inner: Uses the intersection of keys from both frames, similar to a SQL inner join; preserves the order of the left keys.
  • on: label or list
    Column or index level names to join on. They must be found in both the left and right DataFrames. If on is None and the merge is not on indexes, it defaults to the intersection of the columns in both DataFrames.
  • left_on: label or list, or array-like
    Column or index level names from the left DataFrame to use as keys. It can also be an array whose length equals the length of the left DataFrame.
  • right_on: label or list, or array-like
    Column or index level names from the right DataFrame to use as keys. It can also be an array whose length equals the length of the right DataFrame.
  • left_index : bool, default False
    If True, the index of the left DataFrame is used as the join key(s). In the case of a MultiIndex (hierarchical), the number of keys in the other DataFrame (either the index or a number of columns) must match the number of levels.
  • right_index : bool, default False
    If True, the index of the right DataFrame is used as the join key(s). The same rules apply as for left_index.
  • sort: bool, default False
    If True, it sorts the join keys in lexicographical order in the result DataFrame. Otherwise, the order of the join keys depends on the join type (how keyword).
  • suffixes: tuple of (str, str), default ('_x', '_y')
    Suffixes to apply to overlapping column names in the left and right DataFrame, respectively. Use (False, False) to raise an exception on overlapping columns.
  • copy: bool, default True
    If True, a copy of the data is always returned.
    Otherwise, the copy is avoided where possible.
  • indicator: bool or str, default False
    If True, a column named "_merge" is added to the output DataFrame with information on the source of each row. If a string is given, the column is added under that name instead. The information column is of categorical type and takes one of these values:
    • "left_only" for observations whose merge key appears only in the left DataFrame,
    • "right_only" for observations whose merge key appears only in the right DataFrame,
    • "both" for observations whose merge key is found in both.
  • validate: str, optional
    If specified, checks whether the merge is of the given type:
    • "one_to_one" or "1:1": Checks that merge keys are unique in both the left and right datasets.
    • "one_to_many" or "1:m": Checks that merge keys are unique in the left dataset.
    • "many_to_one" or "m:1": Checks that merge keys are unique in the right dataset.
    • "many_to_many" or "m:m": Allowed, but does not result in any checks.
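The how, indicator, and validate parameters described above can be sketched together as follows. This is only an illustration under assumed data: the frames, the column names `key`, `lval`, and `rval`, and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical frames: keys 'b' and 'c' appear on both sides.
left = pd.DataFrame({'key': ['a', 'b', 'c'], 'lval': [1, 2, 3]})
right = pd.DataFrame({'key': ['b', 'c', 'd'], 'rval': [20, 30, 40]})

# Row counts per join type: inner keeps the intersection of keys,
# left/right keep one side's keys, outer keeps the union.
row_counts = {how: len(pd.merge(left, right, on='key', how=how))
              for how in ['inner', 'left', 'right', 'outer']}
print(row_counts)

# indicator=True adds a categorical "_merge" column naming each row's source.
outer = pd.merge(left, right, on='key', how='outer', indicator=True)
print(outer[['key', '_merge']])

# validate raises MergeError when the keys break the declared relationship.
# Here the right frame repeats key 'b', so the merge is not one-to-one.
dup_right = pd.DataFrame({'key': ['b', 'b'], 'rval': [20, 21]})
try:
    pd.merge(left, dup_right, on='key', validate='one_to_one')
    validation_failed = False
except pd.errors.MergeError:
    validation_failed = True
print(validation_failed)
```

Using validate this way is a cheap guard against accidental row duplication when a key that is expected to be unique turns out not to be.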

Example1: Create two DataFrames that share a key column

# import the pandas library
import pandas as pd

# two frames with the same column names; 'id' can serve as the join key
left = pd.DataFrame({
   'id': [1, 2, 3, 4],
   'Name': ['John', 'Parker', 'Smith', 'Parker'],
   'subject_id': ['sub1', 'sub2', 'sub4', 'sub6']})
right = pd.DataFrame({
   'id': [1, 2, 3, 4],
   'Name': ['William', 'Albert', 'Tony', 'Allen'],
   'subject_id': ['sub2', 'sub4', 'sub3', 'sub6']})
print(left)
print(right)

Output

    id      Name     subject_id

0   1       John        sub1
1   2       Parker      sub2
2   3       Smith       sub4
3   4       Parker      sub6

    id      Name     subject_id

0   1       William     sub2
1   2       Albert      sub4
2   3       Tony        sub3
3   4       Allen       sub6 
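When the key column is named differently in each frame, left_on and right_on can be used in place of on, and suffixes controls how the remaining overlapping columns are renamed. The sketch below reuses the data shown above, except that the right frame's key column is renamed to a hypothetical `student_id`; the suffix strings are likewise chosen only for illustration.

```python
import pandas as pd

left = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'Name': ['John', 'Parker', 'Smith', 'Parker'],
    'subject_id': ['sub1', 'sub2', 'sub4', 'sub6']})
# Same data as the right frame above, but the key column is renamed
# to the hypothetical 'student_id' for illustration.
right = pd.DataFrame({
    'student_id': [1, 2, 3, 4],
    'Name': ['William', 'Albert', 'Tony', 'Allen'],
    'subject_id': ['sub2', 'sub4', 'sub3', 'sub6']})

# Join left.id against right.student_id; the overlapping non-key columns
# ('Name', 'subject_id') receive the custom suffixes.
merged = pd.merge(left, right, left_on='id', right_on='student_id',
                  suffixes=('_left', '_right'))
print(merged.columns.tolist())
print(merged)
```

Both key columns are kept in the result because they carry different names.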

Example2: Merge two DataFrames on a key:

import pandas as pd
left = pd.DataFrame({
   'id': [1, 2, 3, 4, 5],
   'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
   'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5']})
right = pd.DataFrame({
   'id': [1, 2, 3, 4, 5],
   'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
   'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5']})
# inner join on 'id'; overlapping columns get the default _x/_y suffixes
print(pd.merge(left, right, on='id'))

Output

    id   Name_x    subject_id_x   Name_y     subject_id_y
0   1     Alex      sub1          Billy       sub2
1   2     Amy       sub2          Brian       sub4
2   3     Allen     sub4          Bran        sub3
3   4     Alice     sub6          Bryce       sub6
4   5     Ayoung    sub5          Betty       sub5
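To merge on multiple keys, a list of column names can be passed to on. The sketch below assumes the same Example2 frames and joins on both 'id' and 'subject_id', so a row is kept only when both values match.

```python
import pandas as pd

left = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5']})
right = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5']})

# Both 'id' and 'subject_id' must match for a row to appear in the result.
merged = pd.merge(left, right, on=['id', 'subject_id'])
print(merged)
```

With this data only ids 4 and 5 have the same subject_id in both frames, so the result should contain two rows. Because 'subject_id' is now a join key, it appears once, and only 'Name' receives the _x/_y suffixes.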




