Pandas Index

The pandas Index is the structure that labels the rows and columns of a DataFrame and lets us select particular rows and columns from it. Its task is to organize the data and to provide fast access to it. Selecting data this way is also called subset selection. In a notebook, the index values are displayed in bold, and each individual value of the index is called a label.

To compare data access time with and without indexing, we can use the %%timeit cell magic in Jupyter to time the various access operations.

We can also think of an index as an address through which any data in a Series or DataFrame can be reached. A DataFrame is a combination of three components: the index, the columns, and the data.

Axis and axes

An axis refers to one dimension of the data: axis 0 runs along the rows and axis 1 runs along the columns. Axes is the plural and refers to the rows and columns together.

Creating index

First, we need a CSV file that contains some data to index.

    # importing pandas package
    import pandas as pd

    data = pd.read_csv("aa.csv")
    data

Output:

                 Name Hire Date   Salary  Leaves Remaining
    0       John Idle  03/15/14  50000.0                10
    1   Smith Gilliam  06/01/15  65000.0                 8
    2  Parker Chapman  05/12/14  45000.0                10
    3     Jones Palin  11/01/13  70000.0                 3
    4   Terry Gilliam  08/12/14  48000.0                 7
    5   Michael Palin  05/23/13  66000.0                 8

Example1:

    # importing pandas package
    import pandas as pd

    # making data frame from csv file, using the "Name" column as the index
    info = pd.read_csv("aa.csv", index_col="Name")

    # retrieving multiple columns by indexing operator
    a = info[["Hire Date", "Salary"]]
    print(a)

Output (because "Name" is now the index, the default integer labels no longer appear):

                   Hire Date   Salary
    Name
    John Idle       03/15/14  50000.0
    Smith Gilliam   06/01/15  65000.0
    Parker Chapman  05/12/14  45000.0
    Jones Palin     11/01/13  70000.0
    Terry Gilliam   08/12/14  48000.0
    Michael Palin   05/23/13  66000.0

Example2:

    # importing pandas package
    import pandas as pd

    # making data frame from csv file, using the "Name" column as the index
    info = pd.read_csv("aa.csv", index_col="Name")

    # retrieving a single column by indexing operator (returns a Series)
    a = info["Salary"]
    print(a)

Output:

    Name
    John Idle         50000.0
    Smith Gilliam     65000.0
    Parker Chapman    45000.0
    Jones Palin       70000.0
    Terry Gilliam     48000.0
    Michael Palin     66000.0
    Name: Salary, dtype: float64

Set index

The 'set_index' method sets the DataFrame index using existing columns. The new index can replace the existing index or, with append=True, extend it. It accepts a column label, an array such as a Series or an Index of the same length as the DataFrame, or a list that combines these. It returns a new DataFrame and leaves the original unchanged.

    import pandas as pd

    info = pd.DataFrame({'Name': ['Parker', 'Terry', 'Smith', 'William'],
                         'Year': [2011, 2009, 2014, 2010],
                         'Leaves': [10, 15, 9, 4]})
    info

    # a single column as the index
    info.set_index('Name')

    # two columns as a two-level index (labels are case-sensitive: 'Year', not 'year')
    info.set_index(['Year', 'Name'])

    # an Index object combined with a column
    info.set_index([pd.Index([1, 2, 3, 4]), 'Year'])

    # two Series as a two-level index
    a = pd.Series([1, 2, 3, 4])
    info.set_index([a, a**2])

Output (of the last expression):

             Name  Year  Leaves
    1 1    Parker  2011      10
    2 4     Terry  2009      15
    3 9     Smith  2014       9
    4 16  William  2010       4

Multiple Index

A DataFrame or Series can also carry several index levels at once. Such an index is represented by a MultiIndex.

Example1:

    import pandas as pd
    import numpy as np

    pd.MultiIndex(levels=[[np.nan, None, pd.NaT, 128, 2]],
                  codes=[[0, -1, 1, 2, 3, 4]])

Output:

    MultiIndex(levels=[[nan, None, NaT, 128, 2]],
               codes=[[0, -1, 1, 2, 3, 4]])

This output reflects older pandas releases. More recent releases reassign codes that point to missing-value levels to -1 and print the index differently.

Reset index

We can also reset the index using the 'reset_index' method. It replaces the current index with the default integer index and, by default, keeps the old index as a column named 'index'. The example below builds an 'info' DataFrame with a custom integer index and then resets it.

Example:

    import pandas as pd
    import numpy as np

    info = pd.DataFrame([('William', 'C'),
                         ('Smith', 'Java'),
                         ('Parker', 'Python'),
                         ('Phill', np.nan)],
                        index=[1, 2, 3, 4],
                        columns=('name', 'Language'))
    info

    # move the old index into a column and restore the default 0..n-1 index
    info.reset_index()

Output:

       index     name Language
    0      1  William        C
    1      2    Smith     Java
    2      3   Parker   Python
    3      4    Phill      NaN
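Since 'set_index' and 'reset_index' are inverse operations, a short round trip may help tie the two sections together. The following is a minimal sketch, not part of the original tutorial: it reuses the names from the set_index example, and the variable names (by_name, restored, dropped) and the use of drop=True are illustrative additions.

```python
import pandas as pd

# Small illustrative frame (values reused from the set_index example)
info = pd.DataFrame({
    "Name": ["Parker", "Terry", "Smith", "William"],
    "Year": [2011, 2009, 2014, 2010],
    "Leaves": [10, 15, 9, 4],
})

# set_index returns a new DataFrame; "Name" moves from the columns to the index
by_name = info.set_index("Name")

# Label-based lookup through the new index
smith_leaves = by_name.loc["Smith", "Leaves"]

# reset_index moves the index back into an ordinary column
restored = by_name.reset_index()

# drop=True discards the old index instead of keeping it as a column
dropped = by_name.reset_index(drop=True)

print(by_name.index.tolist())
print(smith_leaves)
print(restored.columns.tolist())
print(dropped.columns.tolist())
```

Keeping the old index as a column is the default because it preserves information; drop=True is only appropriate when the labels are no longer needed.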