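Pandas DataFrame.sample()

The Pandas sample() function returns a random sample of rows (or, with axis=1, of columns) from a DataFrame. It is useful when a dataset is too large to work with directly, for example when building a model: a smaller random subset can be drawn with sample() instead.

Syntax

DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

Parameters

n: The number of items to return. It cannot be used together with frac. It defaults to 1 when frac is None.
frac: The fraction of items to return. It cannot be used together with n.
replace: Whether the same item may be sampled more than once. The default is False.
weights: The sampling weights. The default, None, gives every item equal probability. When sampling rows of a DataFrame, a column name can be passed as a string. Weights that do not sum to 1 are normalized.
random_state: A seed or random number generator that makes the result reproducible.
axis: The axis to sample from: 0 or 'index' for rows, 1 or 'columns' for columns. For a DataFrame, the default is rows.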
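The parameters in the signature above can be illustrated with a minimal sketch. The DataFrame df and its column names are made up for this illustration and are not part of the original examples. Only the shapes of the results are shown, since the rows actually drawn depend on the random state.

```python
import pandas as pd

# Hypothetical DataFrame used only for this illustration
df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [10, 20, 30, 40], "c": [100, 200, 300, 400]}
)

# n: draw an exact number of rows
two_rows = df.sample(n=2, random_state=0)

# frac: draw a fraction of the rows (0.5 * 4 rows = 2 rows)
half_rows = df.sample(frac=0.5, random_state=0)

# axis=1: draw columns instead of rows
one_col = df.sample(n=1, axis=1, random_state=0)

# replace=True: allows drawing more rows than the DataFrame has
oversampled = df.sample(n=6, replace=True, random_state=0)

print(two_rows.shape)     # (2, 3)
print(half_rows.shape)    # (2, 3)
print(one_col.shape)      # (4, 1)
print(oversampled.shape)  # (6, 3)
```

Without replace=True, asking for more rows than the DataFrame contains raises a ValueError.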
Returns

It returns a new object of the same type as the caller, containing the n items randomly sampled from it.

Example1

    import pandas as pd

    info = pd.DataFrame({'data1': [2, 4, 8, 0],
                         'data2': [2, 0, 0, 0],
                         'data3': [10, 2, 1, 8]},
                        index=['John', 'Parker', 'Smith', 'William'])
    info

    # sample 3 values from the 'data1' column
    info['data1'].sample(n=3, random_state=1)

    # sample half of the rows, with replacement
    info.sample(frac=0.5, replace=True, random_state=1)

    # sample 2 rows, using the 'data3' column as sampling weights
    info.sample(n=2, weights='data3', random_state=1)

Output

             data1  data2  data3
    John         2      2     10
    William      0      0      8

Example2

In this example, we read a csv file and extract random rows from the resulting DataFrame using sample(). The csv file is named aa.csv and contains the columns Name, Hire Date, Salary and Leaves Remaining, as seen in the output below.

Let's write code that extracts random rows from this dataset:

    # importing pandas package
    import pandas as pd

    # define data frame from csv file
    data = pd.read_csv("aa.csv")

    # randomly select one row
    row1 = data.sample(n=1)

    # display the row
    row1

    # randomly select two rows
    row2 = data.sample(n=2)

    # display the rows
    row2

Output

                 Name Hire Date   Salary  Leaves Remaining
    2  Parker Chapman  02/21/14  45000.0                10
    5   Michael Palin  06/28/13  66000.0                 8
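Example1 passes random_state and weights without showing their effect in isolation. The following is a small sketch that reuses the info DataFrame from Example1. The seed values 42 and 0 are arbitrary choices made for this illustration.

```python
import pandas as pd

info = pd.DataFrame(
    {"data1": [2, 4, 8, 0], "data2": [2, 0, 0, 0], "data3": [10, 2, 1, 8]},
    index=["John", "Parker", "Smith", "William"],
)

# The same random_state yields the same sample on every call
first = info.sample(n=2, random_state=42)
second = info.sample(n=2, random_state=42)
print(first.equals(second))  # True

# With weights='data2', only John has a non-zero weight (2, 0, 0, 0),
# so a single draw can only return John
picked = info.sample(n=1, weights="data2", random_state=0)
print(picked.index.tolist())  # ['John']
```

Rows with a weight of zero are never selected, which is why the second call is deterministic regardless of the seed.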