from pandas import DataFrame

# Create the DataFrame
cart = {
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}
df = DataFrame(cart, columns=['Product', 'Price', 'Year'])

# Print the original DataFrame
print("Original DataFrame:\n", df)
Output:
Original DataFrame:
Product Price Year
0 Mobile 20000 2014
1 AC 28000 2015
2 Mobile 22000 2016
3 Sofa 19000 2017
4 Laptop 45000 2018
Get the Descriptive Statistics for Pandas DataFrame
The following examples show how to compute descriptive statistics
in Pandas:
Descriptive Statistics in Pandas of Price Column
Descriptive Statistics in Pandas of Year Column
Descriptive Statistics of Whole DataFrame
Descriptive Statistics in Pandas of Data Individually
Descriptive Statistics in Pandas of Price Column
Using the DataFrame created above, the descriptive statistics of the
‘Price’ column (count, mean, standard deviation, minimum, quartiles,
and maximum) are computed and displayed with the describe() method.
# Describing descriptive statistics of Price
print("\nDescriptive statistics of Price:\n")
stats = df['Price'].describe()
print(stats)
Output:
Descriptive statistics of Price:
count 5.000000
mean 26800.000000
std 10756.393448
min 19000.000000
25% 20000.000000
50% 22000.000000
75% 28000.000000
max 45000.000000
Name: Price, dtype: float64
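The std row reported by describe() is the sample standard deviation: pandas divides by n - 1 by default (ddof=1), whereas NumPy's np.std() defaults to the population form (ddof=0). A minimal sketch of the difference, re-creating only the Price values from above (the variable names are illustrative, not part of the original example):

```python
import numpy as np
import pandas as pd

# Price values from the DataFrame above
price = pd.Series([20000, 28000, 22000, 19000, 45000], name='Price')

# pandas default: sample standard deviation (divides by n - 1)
sample_std = price.std()

# NumPy default: population standard deviation (divides by n)
population_std = np.std(price.to_numpy())

# Passing ddof explicitly makes the two libraries agree
print(sample_std, np.std(price.to_numpy(), ddof=1))
print(population_std, price.std(ddof=0))
```

If the numbers from pandas and NumPy ever disagree for the same data, the ddof default is usually the first thing to check.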
Descriptive Statistics in Pandas of Year Column
Similarly, the descriptive statistics of the ‘Year’ column of the same
DataFrame are computed and printed with describe().
# Describing descriptive statistics of Year
print("\nDescriptive statistics of year:\n")
stats = df['Year'].describe()
print(stats)
Output:
Descriptive statistics of year:
count 5.000000
mean 2016.000000
std 1.581139
min 2014.000000
25% 2015.000000
50% 2016.000000
75% 2017.000000
max 2018.000000
Name: Year, dtype: float64
Descriptive Statistics of Whole DataFrame
Calling describe(include='all') on the same DataFrame computes the
descriptive statistics of every column. The result includes count,
unique values, top value, and frequency for categorical columns, and
mean, standard deviation, and quartile information for numerical
columns. Entries that do not apply to a column are shown as NaN.
# Describing descriptive statistics of whole dataframe
print("\nDescriptive statistics of whole dataframe:\n")
stats = df.describe(include='all')
print(stats)
Output:
Descriptive statistics of whole dataframe:
Product Price Year
count 5 5.000000 5.000000
unique 4 NaN NaN
top Mobile NaN NaN
freq 2 NaN NaN
mean NaN 26800.000000 2016.000000
std NaN 10756.393448 1.581139
min NaN 19000.000000 2014.000000
25% NaN 20000.000000 2015.000000
50% NaN 22000.000000 2016.000000
75% NaN 28000.000000 2017.000000
max NaN 45000.000000 2018.000000
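By default, describe() summarises only the numeric columns; include='all' is what brings the Product column into the result. A short sketch comparing the two calls, rebuilding the same DataFrame so that the snippet runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

# Default call: only the numeric columns are described
numeric_only = df.describe()

# include='all': every column, with unique/top/freq rows added
everything = df.describe(include='all')

print(list(numeric_only.columns))
print(list(everything.index))
```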
Descriptive Statistics in Pandas of Data Individually
Let’s print the descriptive statistics individually. Using the same
DataFrame df, several statistics of the ‘Price’ column (count, mean,
maximum value, and standard deviation) are calculated and printed
one at a time.
# Count of Price
print("\nCount of Price:")
counts = df['Price'].count()
print(counts)
# Mean of Price
print("\nMean of Price:")
m = df['Price'].mean()
print(m)
# Maximum value of Price
print("\nMaximum value of Price:")
mx = df['Price'].max()
print(mx)
# Standard deviation of Price
print("\nStandard deviation of Price:")
sd = df['Price'].std()
print(sd)
Output:
Count of Price:
5
Mean of Price:
26800.0
Maximum value of Price:
45000
Standard deviation of Price:
10756.393447619885
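The remaining rows of the describe() output can be obtained individually in the same way. A sketch using min(), median() and quantile() on the same Price values (the Series is re-created here so that the snippet is self-contained):

```python
import pandas as pd

# Price values from the DataFrame above
price = pd.Series([20000, 28000, 22000, 19000, 45000], name='Price')

minimum = price.min()        # smallest value
median = price.median()      # the 50% row of describe()
q1 = price.quantile(0.25)    # the 25% row of describe()
q3 = price.quantile(0.75)    # the 75% row of describe()

print(minimum, q1, median, q3)
```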
How to Read and Write Files Using Pandas
Pandas is a very popular Python library that offers functions and data
structures for efficient data analysis. The Pandas package is mainly used
for data pre-processing tasks such as data cleaning, manipulation, and
transformation, which makes it a handy tool for data scientists and
analysts. Let’s find out how to read and write files using pandas.
We will cover the following sections:
Data Structures in Pandas
Writing a File Using Pandas
Reading a File Using Pandas
Importing a CSV File into the DataFrame
Endnotes
Data Structures in Pandas
There are two main types of data structures in Pandas:
Pandas Series: 1D labeled homogeneous array, size-immutable
Pandas DataFrame: 2D labeled tabular structure, size-mutable
Mutability refers to whether an object can be changed after it is created.
A size-mutable structure can have elements (here, rows or columns) added or
removed; in a size-immutable structure the values can change but the length
cannot.
#Importing Pandas Library
import pandas as pd
Creating a Pandas DataFrame
#Creating a Sample DataFrame
data = pd.DataFrame({
'id': [ 1, 2, 3, 4, 5, 6, 7],
'age': [ 27, 32, 23, 41, 37, 31, 49],
'gender': [ 'M', 'F', 'F', 'M', 'M', 'M', 'F'],
'occupation': [ 'Salesman', 'Doctor', 'Manager', 'Teacher', 'Mechanic', 'Lawyer', 'Nurse']
})
data
Writing a File Using Pandas
Save the DataFrame we created above as a CSV file using the pandas .to_csv() method,
as shown:
#Writing to CSV file
data.to_csv('data.csv')
We can also save the DataFrame as an Excel file using the pandas .to_excel() method
(writing .xlsx files requires an Excel engine such as openpyxl to be installed), as
shown:
#Writing to Excel file
data.to_excel('data2.xlsx')
The DataFrame can also be saved as a text file with the same .to_csv() method that
we use for CSV files:
#Writing to Text file
data.to_csv('data3.txt')
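Note that to_csv() also writes the row index as an extra first column unless index=False is passed. The sketch below writes a small DataFrame into a temporary directory (a choice made here so that no files are left behind; the file name is illustrative) and reads it back with pd.read_csv() to check the round trip:

```python
import os
import tempfile

import pandas as pd

# A shortened version of the sample DataFrame above
data = pd.DataFrame({
    'id': [1, 2, 3],
    'age': [27, 32, 23],
    'gender': ['M', 'F', 'F'],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.csv')
    # index=False keeps the row labels out of the file
    data.to_csv(path, index=False)
    # Read the file back into a new DataFrame
    restored = pd.read_csv(path)

print(restored)
```

Without index=False, the file read back would contain an additional unnamed column holding the old row labels.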
NumPy ufuncs | Universal functions
Last Updated : 01 Feb, 2024
NumPy universal functions (ufuncs for short) are mathematical
functions that operate on ndarray (N-dimensional array) objects
element by element.
They support array broadcasting, type casting, and several other
standard features. NumPy provides universal functions for
standard trigonometry, arithmetic operations, and
complex-number handling, among others.
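Broadcasting and type casting are easiest to see on a small example. The sketch below uses np.add() with made-up arrays: a 1-D row is combined with a 2-D matrix, and an integer array is combined with a float array.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
row = np.array([10, 20, 30])

# Broadcasting: the 1-D row is applied to each row of the matrix
broadcast_sum = np.add(matrix, row)

# Type casting: int array + float array gives a float64 result
ints = np.array([1, 2, 3])
floats = np.array([0.5, 0.5, 0.5])
mixed = np.add(ints, floats)

print(broadcast_sum)
print(mixed, mixed.dtype)
```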
Characteristics of NumPy ufuncs
They operate on ndarray (N-dimensional array) objects, i.e.
instances of NumPy’s array class.
They perform fast element-wise array operations.
They support features such as array broadcasting and type casting.
NumPy universal functions are objects that belong
to the numpy.ufunc class.
A Python function can also be turned into a universal function
with the numpy.frompyfunc() function.
Some ufuncs are called automatically when the corresponding
arithmetic operator is used on arrays. For example, when two
arrays are added element-wise with the ‘+’ operator, np.add()
is called internally.
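The last two points can be sketched as follows. The helper capped_sum is a hypothetical function invented for this illustration; np.frompyfunc(func, nin, nout) wraps it as a ufunc taking two inputs and producing one output, and the result is an object-dtype array.

```python
import numpy as np

def capped_sum(x, y):
    """Hypothetical helper: add two numbers but cap the result at 10."""
    return min(x + y, 10)

# Wrap the Python function as a ufunc: 2 inputs, 1 output
capped_sum_ufunc = np.frompyfunc(capped_sum, 2, 1)

a = np.array([1, 5, 9])
b = np.array([2, 6, 3])

# Applied element by element; the result has dtype=object
result = capped_sum_ufunc(a, b)
print(type(capped_sum_ufunc), result)

# The '+' operator and np.add() give the same element-wise result
print(np.array_equal(a + b, np.add(a, b)))
```

A ufunc built this way still calls the Python function once per element, so it offers the ufunc interface but not the speed of NumPy's built-in ufuncs.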
NumPy ufuncs are functions that operate on ndarray objects
in an element-by-element fashion. They provide a way to
execute mathematical, logical, and other operations on
arrays efficiently. Ufuncs support a wide range of arithmetic
operations such as addition, subtraction, multiplication,
division, and more.
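Statistical functions
These functions calculate the mean, median, variance, minimum, and other summary
statistics of array elements, either over the whole array or along a given axis.
Strictly speaking, they are ordinary NumPy functions rather than instances of
numpy.ufunc, but they are commonly presented alongside ufuncs.
They include: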
ufunc’s Statistical Functions in NumPy
Function                  Description
amin, amax                returns minimum or maximum of an array or along an axis
ptp                       returns range of values (maximum - minimum) of an array or along an axis
percentile(a, p, axis)    calculate the pth percentile of the array or along a specified axis
median                    compute the median of data along a specified axis
mean                      compute the mean of data along a specified axis
std                       compute the standard deviation of data along a specified axis
var                       compute the variance of data along a specified axis
average                   compute the average of data along a specified axis
import numpy as np
# construct a weight array
weight = np.array([50.7, 52.5, 50, 58, 55.63, 73.25, 49.5, 45])
# minimum and maximum
print('Minimum and maximum weight of the students: ')
print(np.amin(weight), np.amax(weight))
# range of weight i.e. max weight-min weight
print('Range of the weight of the students: ')
print(np.ptp(weight))
# percentile
print('Weight below which 70 % student fall: ')
print(np.percentile(weight, 70))
# mean
print('Mean weight of the students: ')
print(np.mean(weight))
# median
print('Median weight of the students: ')
print(np.median(weight))
# standard deviation
print('Standard deviation of weight of the students: ')
print(np.std(weight))
# variance
print('Variance of weight of the students: ')
print(np.var(weight))
# average
print('Average weight of the students: ')
print(np.average(weight))
Output
Minimum and maximum weight of the students:
45.0 73.25
Range of the weight of the students:
28.25
Weight below which 70 % student fall:
55.317
Mean weight of the students:
54.3225
Median weight of the students:
51.6
Standard deviation of weight of the students:
8.05277397857
Variance of weight of the students:
64.84716875
Average weight of the students:
54.3225
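The example above uses a 1-D array, so the axis argument mentioned in the table never comes into play. A small sketch with made-up 2-D data (rows as students, columns as three hypothetical weigh-ins) shows the effect of axis, and how np.average() differs from np.mean() once weights are supplied:

```python
import numpy as np

# Rows are students, columns are three hypothetical weigh-ins
readings = np.array([[50.0, 52.0, 54.0],
                     [60.0, 61.0, 65.0]])

per_column_mean = np.mean(readings, axis=0)   # one value per weigh-in
per_row_mean = np.mean(readings, axis=1)      # one value per student
per_row_range = np.ptp(readings, axis=1)      # max - min for each student

# np.average() accepts weights; without them it matches np.mean()
weighted = np.average(readings[0], weights=[1, 1, 2])

print(per_column_mean, per_row_mean, per_row_range, weighted)
```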
Simple Arithmetic
ufuncs in NumPy allow for performing simple arithmetic operations on arrays
efficiently.
import numpy as np
array_a = np.array([10, 20, 30, 38])
array_b = np.array([2, 4, 6, 8])
# Addition using the add() ufunc
addition_result = np.add(array_a, array_b)
# Subtraction using the subtract() ufunc
subtraction_result = np.subtract(array_a, array_b)
# Multiplication using the multiply() ufunc
multiplication_result = np.multiply(array_a, array_b)
# Division using the divide() ufunc
division_result = np.divide(array_a, array_b)
# Finding power using the power() ufunc
power_result = np.power(array_a, array_b)
# Finding remainder using the mod() and remainder() ufunc
mod_result = np.mod(array_a, array_b)
remainder_result = np.remainder(array_a, array_b)
# Finding both the quotient and the remainder using the divmod() ufunc
quotient_result = np.divmod(array_a, array_b)
print("Array A:", array_a)
print("Array B:", array_b)
print("Addition Result:", addition_result)
print("Subtraction Result:", subtraction_result)
print("Multiplication Result:", multiplication_result)
print("Division Result:", division_result)
print("Power Result:", power_result)
print("Mod Result:", mod_result)
print("Remainder Result:", remainder_result)
print("Quotient Result:", quotient_result)
Output:
Array A: [10 20 30 38]
Array B: [2 4 6 8]
Addition Result: [12 24 36 46]
Subtraction Result: [ 8 16 24 30]
Multiplication Result: [ 20 80 180 304]
Division Result: [5. 5. 5. 4.75]
Power Result: [ 100 160000 729000000 4347792138496]
Mod Result: [0 0 0 6]
Remainder Result: [0 0 0 6]
Quotient Result: (array([5, 5, 5, 4]), array([0, 0, 0, 6]))
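As the output shows, np.divmod() returns a tuple of two arrays, which is usually unpacked directly. The sketch below does that with the same inputs and also checks that the arithmetic operators //, % and ** produce the same results as the corresponding ufuncs:

```python
import numpy as np

array_a = np.array([10, 20, 30, 38])
array_b = np.array([2, 4, 6, 8])

# Unpack the (quotient, remainder) tuple returned by divmod()
quotient, remainder = np.divmod(array_a, array_b)

# The operators dispatch to the same element-wise operations
same_as_operators = (
    np.array_equal(quotient, array_a // array_b)
    and np.array_equal(remainder, array_a % array_b)
    and np.array_equal(np.power(array_a, array_b), array_a ** array_b)
)

print(quotient, remainder, same_as_operators)
```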
Sorted unique elements: unique()
The Numpy unique() function is used to return the sorted unique elements
of an array. It can also optionally return the indices of the input array that
give the unique values and the counts of each unique value.
This function is useful for removing duplicates from an array and
understanding the frequency of elements.
Syntax
Following is the syntax of the NumPy unique() function:
numpy.unique(arr, return_index=False, return_inverse=False, return_counts=False)
Parameters
Following are the parameters of the NumPy unique() function:
arr: The input array. It is flattened if it is not already 1-D.
return_index: If True, also returns the indices of the first occurrences
of the unique values in the input array.
return_inverse: If True, also returns the indices into the unique array
that can be used to reconstruct the input array.
return_counts: If True, also returns the number of times each unique
value appears in the input array.
Example 1
The following example uses the NumPy unique() function to create an
array of the unique values of the given input array:
import numpy as np
# Create a 1D array
a = np.array([5, 2, 6, 2, 7, 5, 6, 8, 2, 9])
print('First array:')
print(a)
print('\n')
# Get unique values in the array
print('Unique values of first array:')
u = np.unique(a)
print(u)
print('\n')
Output
First array:
[5 2 6 2 7 5 6 8 2 9]
Unique values of first array:
[2 5 6 7 8 9]
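The optional parameters described above are not exercised by Example 1. A sketch that requests all three extra outputs for the same input array (the names on the left-hand side are illustrative):

```python
import numpy as np

a = np.array([5, 2, 6, 2, 7, 5, 6, 8, 2, 9])

# Ask for the unique values plus all three optional outputs
u, first_idx, inverse, counts = np.unique(
    a, return_index=True, return_inverse=True, return_counts=True
)

print(u)           # sorted unique values
print(first_idx)   # index of the first occurrence of each unique value
print(counts)      # how often each unique value occurs
print(u[inverse])  # indexing with the inverse rebuilds the input array
```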
NumPy GCD (Greatest Common Divisor)
GCD, or "Greatest Common Divisor," also known as the greatest common factor or
highest common factor, represents the largest positive integer that can evenly divide
two or more integers without resulting in a remainder.
import numpy as np
n1 = 6
n2 = 9
gcd = np.gcd(n1, n2)
print(gcd) # Output: 3
To find the GCD of all values in an array, the reduce() method can be used.
import numpy as np
arr = np.array([20, 8, 32, 36, 16])
gcd = np.gcd.reduce(arr)
print(gcd) # Output: 4
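Because np.gcd is a ufunc, it also works element-wise on two arrays, and reduce() simply folds it over one array. The sketch below shows both, using made-up values for the pairwise case and cross-checking the reduction against Python's math.gcd:

```python
import math
from functools import reduce

import numpy as np

# Element-wise GCD of two arrays (values chosen for illustration)
x = np.array([12, 18, 25])
y = np.array([8, 27, 10])
pairwise = np.gcd(x, y)

# GCD of all values in one array, checked against math.gcd
arr = np.array([20, 8, 32, 36, 16])
overall = np.gcd.reduce(arr)
check = reduce(math.gcd, arr.tolist())

print(pairwise, overall, check)
```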
NumPy Set Operations
A set is a collection of unique data. That is, elements of a set cannot be
repeated.
NumPy set operations perform mathematical set operations on arrays like
union, intersection, difference, and symmetric difference.
Set Union Operation in NumPy
The union of two sets A and B includes all the elements of set A and B.
Set Union in NumPy
In NumPy, we use the np.union1d() function to perform the set union
operation on two arrays. For example,
import numpy as np
A = np.array([1, 3, 5])
B = np.array([0, 2, 3])
# union of two arrays
result = np.union1d(A, B)
print(result)
# Output: [0 1 2 3 5]
In this example, we have used the np.union1d(A, B) function to compute the
union of two arrays: A and B .
Here, the function returns unique elements from both arrays.
Note: np.union1d(A, B) is equivalent to the set operation A ∪ B.
Set Intersection Operation in NumPy
The intersection of two sets A and B includes the elements common to both
set A and B.
Set Intersection in NumPy
We use the np.intersect1d() function to perform the set intersection
operation on two arrays. For example,
import numpy as np
A = np.array([1, 3, 5])
B = np.array([0, 2, 3])
# intersection of two arrays
result = np.intersect1d(A, B)
print(result)
# Output: [3]
Note: np.intersect1d(A, B) is equivalent to the set operation A ∩ B.
Set Difference Operation in NumPy
The difference between two sets A and B includes the elements of set A that
are not present in set B.
Set Difference in NumPy
We use the np.setdiff1d() function to compute the difference between two
arrays. For example,
import numpy as np
A = np.array([1, 3, 5])
B = np.array([0, 2, 3])
# difference of two arrays
result = np.setdiff1d(A, B)
print(result)
# Output: [1 5]
Note: np.setdiff1d(A, B) is equivalent to the set operation A - B.
Set Symmetric Difference Operation in NumPy
The symmetric difference between two sets A and B includes all elements
of A and B except those common to both.
Set Symmetric Difference in NumPy
In NumPy, we use the np.setxor1d() function to compute the symmetric
difference between two arrays. For example,
import numpy as np
A = np.array([1, 3, 5])
B = np.array([0, 2, 3])
# symmetric difference of two arrays
result = np.setxor1d(A, B)
print(result)
# Output: [0 1 2 5]
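The four operations can be cross-checked against each other and against Python's built-in set type. In the sketch below, A deliberately contains a repeated value (an addition made for this illustration) to show that the NumPy functions return sorted, de-duplicated results:

```python
import numpy as np

A = np.array([1, 3, 5, 3])   # note the repeated 3
B = np.array([0, 2, 3])

union = np.union1d(A, B)
common = np.intersect1d(A, B)
only_a = np.setdiff1d(A, B)
sym_diff = np.setxor1d(A, B)

# Symmetric difference = union minus intersection
consistent = np.array_equal(sym_diff, np.setdiff1d(union, common))

# The same results expressed with Python sets
set_a, set_b = set(A.tolist()), set(B.tolist())
matches_python_sets = (
    set(union.tolist()) == (set_a | set_b)
    and set(common.tolist()) == (set_a & set_b)
    and set(only_a.tolist()) == (set_a - set_b)
    and set(sym_diff.tolist()) == (set_a ^ set_b)
)

print(union, common, only_a, sym_diff)
print(consistent, matches_python_sets)
```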
Unique Values From a NumPy Array
To select the unique elements from a NumPy array, we use
the np.unique() function. It returns the sorted unique elements of an array. It
can also be used to create a set out of an array.
Let's see an example.
import numpy as np
array1 = np.array([1,1, 2, 2, 4, 7, 7, 3, 5, 2, 5])
# unique values from array1
result = np.unique(array1)
print(result)
# Output: [1 2 3 4 5 7]
Here, the resulting array [1 2 3 4 5 7] contains only the unique elements of
the original array array1 .