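Pandas DataFrame.sample()

The Pandas sample() function returns a random sample of rows from a DataFrame (or of columns, when axis=1). When a dataset is too large to work with directly, for example while building a model, we can draw a smaller random sample from it with sample().

Syntax
DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

Parameters
n: The number of items to return. It cannot be used together with frac. If both n and frac are None, one item is returned.
frac: The fraction of items to return. It cannot be used together with n.
replace: Whether the same item may be sampled more than once. The default is False.
weights: The sampling weights, given as a column name or an array-like object. The default None gives every item an equal probability. Weights that do not sum to 1 are normalized.
random_state: A seed for the random number generator, used to make the sample reproducible.
axis: The axis to sample from: 0 or 'index' for rows, 1 or 'columns' for columns. The default None samples rows.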
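The parameters in the syntax above can be tried out on a small frame. The following is a minimal sketch, not part of the original tutorial: the DataFrame df and its column names a, b, and c are made up for illustration, and only the shapes of the results are printed because the rows chosen depend on the random seed.

```python
import pandas as pd

# Hypothetical toy frame with 10 rows and 3 columns, used only for illustration
df = pd.DataFrame({"a": range(10), "b": range(10, 20), "c": range(20, 30)})

# n: ask for an exact number of rows
by_count = df.sample(n=3, random_state=0)

# frac: ask for a fraction of the rows (0.2 of 10 rows gives 2 rows)
by_frac = df.sample(frac=0.2, random_state=0)

# replace=True: rows may repeat, so the sample can be larger than the frame
with_replacement = df.sample(n=15, replace=True, random_state=0)

# axis=1: sample columns instead of rows
some_columns = df.sample(n=2, axis=1, random_state=0)

print(by_count.shape)          # (3, 3)
print(by_frac.shape)           # (2, 3)
print(with_replacement.shape)  # (15, 3)
print(some_columns.shape)      # (10, 2)
```

Passing both n and frac in the same call raises a ValueError, as does asking for more rows than the frame holds while replace is False.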

Returns
It returns a new object of the same type as the caller, containing n items randomly sampled from the caller object.

Example 1
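import pandas as pd

# Build a small DataFrame with a named index
info = pd.DataFrame({'data1': [2, 4, 8, 0],
                     'data2': [2, 0, 0, 0],
                     'data3': [10, 2, 1, 8]},
                    index=['John', 'Parker', 'Smith', 'William'])
info
# Sample 3 values from a single column (a Series)
info['data1'].sample(n=3, random_state=1)
# Sample 50% of the rows, with replacement
info.sample(frac=0.5, replace=True, random_state=1)
# Sample 2 rows, using the values in 'data3' as sampling weights
info.sample(n=2, weights='data3', random_state=1)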

Output
         data1  data2  data3
John         2      2     10
William      0      0      8

Example 2
In this example, we read a CSV file and extract random rows from the resulting DataFrame by using sample(). The CSV file, named aa.csv, contains records with the columns Name, Hire Date, Salary, and Leaves Remaining.
Let's write code that extracts random rows from the above dataset:
# import the pandas package
import pandas as pd
# create a DataFrame from the CSV file
data = pd.read_csv("aa.csv")
# randomly select one row
row1 = data.sample(n=1)
# display the row
row1
# randomly select two rows
row2 = data.sample(n=2)
# display the rows
row2

Output
             Name Hire Date   Salary  Leaves Remaining
2  Parker Chapman  02/21/14  45000.0                10
5   Michael Palin  06/28/13  66000.0                 8
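
Because Example 2 does not pass random_state, each run may return different rows. The sketch below shows one way to make such a sample reproducible and to weight it by a column. It is an illustration under stated assumptions: the in-memory DataFrame stands in for aa.csv, and its names and values are invented, not taken from the original file.

```python
import pandas as pd

# Hypothetical stand-in for aa.csv; every value here is made up for illustration
data = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cara", "Dev", "Eli"],
    "Salary": [30000.0, 45000.0, 52000.0, 66000.0, 71000.0],
    "Leaves Remaining": [0, 10, 4, 8, 0],
})

# The same random_state yields the same rows on every call
first = data.sample(n=2, random_state=42)
second = data.sample(n=2, random_state=42)
print(first.equals(second))  # True

# With weights taken from a column, rows whose weight is 0 are never drawn
weighted = data.sample(n=2, weights="Leaves Remaining", random_state=42)
print((weighted["Leaves Remaining"] > 0).all())  # True
```

Fixing random_state is useful when a sample has to be shared or re-created later, for example when splitting data for a model.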