TheDeveloperBlog.com


Pandas DataFrame.sample()


The Pandas sample() function selects rows or columns from a DataFrame at random. When building a model from a large dataset, we often want to work with a smaller random subset of the data, and sample() is the function that draws it.
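As a minimal sketch of that use case, the snippet below draws a 10% random subset from a synthetic DataFrame. The column names feature and target and the 10,000-row size are assumptions made up for illustration; they are not part of the original tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical "large" dataset: 10,000 rows of synthetic values
big = pd.DataFrame({"feature": np.arange(10_000),
                    "target": np.arange(10_000) % 2})

# Keep a random 10% of the rows; random_state makes the draw repeatable
subset = big.sample(frac=0.1, random_state=42)

print(len(subset))              # 1000
print(subset.index.is_unique)   # True: no row is picked twice by default
```

The subset keeps the original index labels, so it can still be traced back to the rows it came from.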

Syntax

DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

Parameters

  • n: An optional integer giving the number of items (rows by default) to return. It cannot be used with frac. If neither n nor frac is given, one item is returned.
  • frac: An optional float giving the fraction of items to return, so the sample holds about frac * (length of the axis) items. It cannot be used with n.
  • replace: A boolean. If True, sampling is done with replacement, so the same item can be selected more than once. The default is False.
  • weights: An optional str or ndarray-like. The default, None, gives every item equal probability.
    If a Series is passed, it is aligned with the target object on the index. Index values in weights that are not found in the sampled object are ignored, and index values in the sampled object that are not in weights are assigned a weight of zero.
    When sample() is called on a DataFrame with axis=0, weights can also be the name of a column.
    Unless weights is a Series, it must have the same length as the axis being sampled.
    If the weights do not sum to 1, they are normalized so that they do.
    Missing values in the weights are treated as zero.
    Infinite values are not allowed in the weights.
  • random_state: An optional int or numpy.random.RandomState. If an int is given, it is used as the seed for the random number generator; if a RandomState object is given, that object is used directly.
  • axis: An optional integer or string: 0 or 'index' to sample rows, 1 or 'columns' to sample columns. If it is None, rows are sampled.
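The weights and axis rules above can be sketched as follows. The DataFrame, its row labels, and the weight column w are hypothetical values chosen only to make the behavior visible.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4],
                   "b": [10, 20, 30, 40],
                   "w": [0, 1, 1, 1]},
                  index=["r0", "r1", "r2", "r3"])

# Column name as weights: r0 has weight 0, so it can never be drawn
picked = df.sample(n=2, weights="w", random_state=0)
print("r0" in picked.index)     # False

# Series weights are aligned on the index; r0 and r1 are missing from
# the Series, so they get weight 0 and only r2 or r3 can be selected
w = pd.Series({"r2": 1.0, "r3": 1.0})
only = df.sample(n=1, weights=w, random_state=0)
print(only.index[0] in {"r2", "r3"})   # True

# axis=1 samples columns instead of rows
one_col = df.sample(n=1, axis=1, random_state=0)
print(one_col.shape)            # (4, 1)
```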

Returns

It returns a new object of the same type as the caller (a Series or a DataFrame) that contains the n items randomly sampled from the caller.
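A short sketch of the return behavior, using a small made-up DataFrame with a single column x. It also shows that n and frac cannot be combined.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4, 5]})

rows = df.sample(n=2, random_state=0)        # DataFrame in, DataFrame out
vals = df["x"].sample(n=2, random_state=0)   # Series in, Series out
print(type(rows).__name__, type(vals).__name__)   # DataFrame Series

# With no arguments, a single item is returned
print(len(df.sample()))                      # 1

# Passing both n and frac is rejected with a ValueError
try:
    df.sample(n=2, frac=0.5)
except ValueError as err:
    print("ValueError:", err)
```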

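Example 1

import pandas as pd
# build a small DataFrame indexed by name
info = pd.DataFrame({'data1': [2, 4, 8, 0],
                     'data2': [2, 0, 0, 0],
                     'data3': [10, 2, 1, 8]},
                    index=['John', 'Parker', 'Smith', 'William'])
# sample 3 values from a single column (a Series)
print(info['data1'].sample(n=3, random_state=1))
# sample 50% of the rows, with replacement
print(info.sample(frac=0.5, replace=True, random_state=1))
# sample 2 rows, using the 'data3' column as sampling weights
print(info.sample(n=2, weights='data3', random_state=1))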

Output

The last statement returns:

         data1  data2  data3
John         2      2     10
William      0      0      8
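The effect of random_state and replace can be checked with a small sketch that reuses the same info DataFrame. The seed value 7 and the sample size 6 are arbitrary choices for illustration.

```python
import pandas as pd

info = pd.DataFrame({'data1': [2, 4, 8, 0],
                     'data2': [2, 0, 0, 0],
                     'data3': [10, 2, 1, 8]},
                    index=['John', 'Parker', 'Smith', 'William'])

# Same seed, same sample
first = info.sample(n=2, random_state=7)
second = info.sample(n=2, random_state=7)
print(first.equals(second))     # True

# With replace=True, n may exceed the number of rows, and rows can repeat
boot = info.sample(n=6, replace=True, random_state=7)
print(len(boot))                # 6

# Without replacement, asking for more rows than exist raises ValueError
try:
    info.sample(n=6)
except ValueError as err:
    print("ValueError:", err)
```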

Example 2

In this example, we read a CSV file and extract random rows from the resulting DataFrame by using sample().

The CSV file, named aa.csv, contains employee records with the columns Name, Hire Date, Salary, and Leaves Remaining.

The following code extracts random rows from this dataset:

# import the pandas package
import pandas as pd
# build a DataFrame from the CSV file
data = pd.read_csv("aa.csv")
# randomly select one row
row1 = data.sample(n=1)
# display the row
print(row1)
# randomly select two rows
row2 = data.sample(n=2)
# display the rows
print(row2)

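Output

Because no random_state is set, the selected rows change from run to run. One possible result for row2:

          Name         Hire Date    Salary      Leaves Remaining
2     Parker Chapman    02/21/14     45000.0      10
5     Michael Palin     06/28/13     66000.0      8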
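To try the same idea without the aa.csv file, the sketch below reads an in-memory CSV instead. The rows are invented placeholders that only reuse the column names from the output above; they are not the contents of the real file. It also shows one way to get the rows that were not sampled.

```python
import io
import pandas as pd

# Hypothetical stand-in for aa.csv; these rows are invented for illustration
csv_text = """Name,Hire Date,Salary,Leaves Remaining
Employee A,01/01/20,50000.0,12
Employee B,02/01/20,52000.0,9
Employee C,03/01/20,48000.0,15
Employee D,04/01/20,61000.0,7
"""
data = pd.read_csv(io.StringIO(csv_text))

# Fixed seed so the same two rows come back on every run
picked = data.sample(n=2, random_state=0)

# Rows that were not picked, found by dropping the sampled index labels
rest = data.drop(picked.index)

print(picked)
print(len(picked), len(rest))   # 2 2
```

Dropping by index works here because the index labels are unique; with duplicate labels, this approach would remove more rows than were sampled.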
