Python Numpy Pandas1

Uploaded by

Arittra

f=open(r"d:\student.txt","r")
x=f.read()
print(x)
f.close()

f=open(r"d:\ret.txt","r+")
d='hello'
f.write(d)
f.close()
f=open(r"d:\ret.txt","r+")
x=f.read()
print(x)

with open('workfile') as f:
    read_data = f.read()
f.closed    # True: the file is closed automatically when the with block ends
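
As a small illustration of the with statement above, the sketch below checks f.closed inside and after the block. The scratch file name and the use of tempfile are assumptions made only so that the example runs anywhere; they are not part of the original notes.

```python
import os
import tempfile

# Hypothetical scratch file, so the example does not depend on d:\ret.txt.
path = os.path.join(tempfile.mkdtemp(), "workfile.txt")

with open(path, "w") as f:
    f.write("hello")
    inside = f.closed   # the file is still open inside the block

after = f.closed        # the file was closed on leaving the block

with open(path) as f:
    read_data = f.read()

print(inside, after, read_data)
```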

f.readline()    # reads a single line each time it is called

with open(r"d:\ret.txt","r+") as f:
    for x in f:
        print(x,end='')

If you want to read all the lines of a file in a list you can also use list(f) or
f.readlines().
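
The two ways of collecting lines mentioned above can be compared with a short sketch. The file name and its three lines are made up for the illustration.

```python
import os
import tempfile

# Hypothetical three-line file created just for this example.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w") as f:
    f.write("first\nsecond\nthird\n")

with open(path) as f:
    via_list = list(f)            # each item keeps its trailing newline

with open(path) as f:
    via_readlines = f.readlines() # same result as list(f)

with open(path) as f:
    stripped = [line.rstrip("\n") for line in f]  # drop the newlines

print(via_list)
print(stripped)
```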

NumPy is a Python library used for working with arrays.

It also has functions for working in the domains of linear algebra, Fourier transforms,
and matrices.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you
can use it freely.

NumPy stands for Numerical Python.

In Python we have lists that serve the purpose of arrays, but they are slow to
process.

NumPy aims to provide an array object that is up to 50x faster than traditional
Python lists.

The array object in NumPy is called ndarray. It provides a lot of supporting
functions that make working with ndarray very easy.

Arrays are very frequently used in data science, where speed and resources are very
important.

NumPy is used to work with arrays. The array object in NumPy is called ndarray.

We can create a NumPy ndarray object by using the array() function.

import numpy

arr = numpy.array([1, 2, 3, 4, 5])

print(arr)

NumPy is usually imported under the np alias:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

print(type(arr))

Create a 0-D array with value 42

import numpy as np

arr = np.array(42)

print(arr)

Create a 1-D array containing the values 1,2,3,4,5:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)

3-D arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.

import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(arr)

Check how many dimensions the arrays have:

import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])

print(a.ndim)
print(b.ndim)
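
The same check can be extended to the 2-D and 3-D arrays created earlier. This is a minimal sketch; the names c and d are chosen here only to continue the a, b pattern.

```python
import numpy as np

c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(c.ndim)  # 2
print(d.ndim)  # 3
```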

Get third and fourth elements from the following array and add them.

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[2] + arr[3])

2-D arrays

Access the element on the first row, second column:

import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('2nd element on 1st row: ', arr[0, 1])

Slicing arrays
Slicing in Python means taking elements from one given index up to, but not
including, another given index.

We pass a slice instead of an index, like this: [start:end].

We can also define the step, like this: [start:end:step].

If we don't pass start, it is considered 0.

If we don't pass end, it is considered the length of the array in that dimension.

If we don't pass step, it is considered 1.
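
The defaults for start, end, and step can be seen in a short sketch on a seven-element array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

first_four = arr[:4]     # start omitted, so it begins at index 0
from_four = arr[4:]      # end omitted, so it runs to the end of the array
every_other = arr[::2]   # start and end omitted, step of 2
from_end = arr[-3:-1]    # negative indexes count from the end

print(first_four)   # [1 2 3 4]
print(from_four)    # [5 6 7]
print(every_other)  # [1 3 5 7]
print(from_end)     # [5 6]
```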

Slice elements from index 1 up to (but not including) index 5 from the following array:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])

Return every other element from index 1 up to (but not including) index 5:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5:2])

By default Python has these data types:

strings - used to represent text data; the text is given in quote marks, e.g.
"ABCD"
integer - used to represent integer numbers, e.g. -1, -2, -3
float - used to represent real numbers, e.g. 1.2, 42.42
boolean - used to represent True or False
complex - used to represent complex numbers, e.g. 1.0 + 2.0j, 1.5 + 2.5j

NumPy has some extra data types, and refers to data types with one character, like i
for integers, u for unsigned integers, etc.

Below is a list of the data types in NumPy and the characters used to represent
them.

i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
V - fixed chunk of memory for other type ( void )
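
One way to see these characters on real arrays is the kind attribute of a dtype, which holds the single-character code. The sketch below is illustrative only; the sample values are made up.

```python
import numpy as np

# Hypothetical sample arrays, one per data type of interest.
samples = {
    "integer": np.array([1, 2, 3]),
    "float": np.array([1.5, 2.5]),
    "boolean": np.array([True, False]),
    "complex": np.array([1 + 2j]),
    "unicode string": np.array(["apple"]),
    "byte string": np.array([b"apple"]),
}

# dtype.kind is the one-character code from the list above.
kinds = {name: a.dtype.kind for name, a in samples.items()}

for name, kind in kinds.items():
    print(name, "->", kind)
```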

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr.dtype)

import numpy as np

arr = np.array(['apple', 'banana', 'cherry'])

print(arr.dtype)

Create an array with data type string:

import numpy as np

arr = np.array([1, 2, 3, 4], dtype='S')

print(arr)
print(arr.dtype)

For i, u, f, S and U we can define size as well.

Create an array with data type 4 bytes integer:

import numpy as np

arr = np.array([1, 2, 3, 4], dtype='i4')

print(arr)
print(arr.dtype)

A non integer string like 'a' can not be converted to integer (will raise an
error):

import numpy as np

arr = np.array(['a', '2', '3'], dtype='i')
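
The failure can be caught with try/except. This sketch assumes the error raised is a ValueError, and contrasts it with strings that do hold integers, converted here with astype().

```python
import numpy as np

failed = False
try:
    np.array(['a', '2', '3'], dtype='i')
except ValueError as exc:
    # 'a' is not a valid integer literal, so the conversion is rejected.
    failed = True
    print("conversion failed:", exc)

# Strings that hold integers convert without error.
ok = np.array(['1', '2', '3']).astype('i')
print(ok)
```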

The Difference Between Copy and View

The main difference between a copy and a view of an array is that the copy is a new
array, and the view is just a view of the original array.

The copy owns the data and any changes made to the copy will not affect original
array, and any changes made to the original array will not affect the copy.

The view does not own the data and any changes made to the view will affect the
original array, and any changes made to the original array will affect the view.

Make a copy, change the original array, and display both arrays:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
arr[0] = 42

print(arr)
print(x)

Make a view, change the original array, and display both arrays:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.view()
arr[0] = 42

print(arr)
print(x)
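
Whether an array owns its data can be checked with the base attribute: it is None for an array that owns its data and refers to the original array for a view. A minimal sketch:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()   # owns its own data
y = arr.view()   # shares data with arr

print(x.base)         # None
print(y.base is arr)  # True

y[1] = 99        # changing the view changes the original
print(arr)       # [ 1 99  3  4  5]
print(x)         # [1 2 3 4 5], the copy is unaffected
```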

Shape of an Array
The shape of an array is the number of elements in each dimension.

Get the Shape of an Array

NumPy arrays have an attribute called shape that returns a tuple. Each entry of the
tuple is the number of elements along the corresponding dimension.

import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr.shape)

Reshaping arrays
Reshaping means changing the shape of an array.

By reshaping we can add or remove dimensions or change the number of elements in each
dimension.

Convert a 1-D array with 12 elements into a 2-D array with 4 rows of 3 elements:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(4, 3)

print(newarr)

Try converting a 1-D array with 8 elements to a 2-D array with 3 elements in each
dimension (will raise an error, because 3 x 3 needs 9 elements):

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

newarr = arr.reshape(3, 3)
print(newarr)
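
A reshape works only when the total number of elements stays the same. The sketch below catches the error from the example above and shows two shapes that do fit 8 elements; passing -1 lets NumPy work out one dimension.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

try:
    arr.reshape(3, 3)        # needs 9 elements, only 8 available
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print("reshape failed:", exc)

ok = arr.reshape(2, 4)        # 2 x 4 = 8 elements, so this fits
inferred = arr.reshape(2, -1) # -1 asks NumPy to compute the missing size

print(ok.shape)
print(inferred.shape)
```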

Flattening the arrays

Flattening an array means converting a multidimensional array into a 1-D array.

We can use reshape(-1) to do this.

Convert the array into a 1D array:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

newarr = arr.reshape(-1)

print(newarr)

Iterating arrays

Iterating over a 2-D array goes through its rows one at a time:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

for x in arr:
    print(x)
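
To reach the individual values rather than whole rows, one option is a nested loop. The flat attribute and np.nditer() are alternatives that visit every element directly. A sketch of all three:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Nested loop: the outer loop yields rows, the inner loop yields values.
nested = []
for row in arr:
    for value in row:
        nested.append(int(value))

# arr.flat iterates over every element regardless of the number of dimensions.
flat = [int(v) for v in arr.flat]

# np.nditer visits each element as a zero-dimensional array.
via_nditer = [int(v) for v in np.nditer(arr)]

print(nested)
```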

Joining NumPy Arrays

Joining means putting contents of two or more arrays in a single array.

In SQL we join tables based on a key, whereas in NumPy we join arrays along an axis.

We pass a sequence of arrays that we want to join to the concatenate() function,
along with the axis. If the axis is not explicitly passed, it is taken as 0.

import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))

print(arr)

Join two 2-D arrays along axis 1, so that each row is extended with the matching row of the second array:

import numpy as np

arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2), axis=1)

print(arr)
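
The effect of the axis argument is easiest to see by joining the same two arrays both ways and comparing the shapes:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

along_0 = np.concatenate((arr1, arr2), axis=0)  # arr2 placed below arr1
along_1 = np.concatenate((arr1, arr2), axis=1)  # arr2 placed beside arr1

print(along_0.shape)  # (4, 2)
print(along_1.shape)  # (2, 4)
print(along_1)
```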

Joining Arrays Using Stack Functions

Stacking is the same as concatenation, except that stacking is done along a new
axis.

Two 1-D arrays can be stacked along a new first axis, which puts one over the
other, or along a new second axis, which pairs up their elements.

We pass a sequence of arrays that we want to join to the stack() function along with
the axis. If the axis is not explicitly passed, it is taken as 0.

import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.stack((arr1, arr2), axis=1)

print(arr)
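
The difference between concatenate() and stack(), and between the two stacking axes, shows up in the resulting shapes. A short sketch with the same two arrays:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

joined = np.concatenate((arr1, arr2))      # stays 1-D
stacked_0 = np.stack((arr1, arr2), axis=0) # one array over the other
stacked_1 = np.stack((arr1, arr2), axis=1) # elements paired up

print(joined.shape)     # (6,)
print(stacked_0.shape)  # (2, 3)
print(stacked_1.shape)  # (3, 2)
```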

Splitting NumPy Arrays

Splitting is the reverse operation of joining.

Joining merges multiple arrays into one, and splitting breaks one array into
multiple arrays.

We use array_split() for splitting arrays. We pass it the array we want to split
and the number of splits.

Split the array in 3 parts:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr)

If the array cannot be divided evenly into the requested number of parts, the later
parts are made one element shorter.

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 4)

print(newarr)
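
The sizes of the uneven parts can be checked directly. For comparison, the sketch also calls np.split(), which is stricter and raises an error when the division is not even.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

parts = np.array_split(arr, 4)
sizes = [len(p) for p in parts]
print(sizes)  # [2, 2, 1, 1]

try:
    np.split(arr, 4)          # 6 elements cannot be divided evenly by 4
    split_failed = False
except ValueError as exc:
    split_failed = True
    print("np.split failed:", exc)
```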

Splitting 2-D Arrays

Use the same syntax when splitting 2-D arrays.

Use the array_split() method, pass in the array you want to split and the number of
splits you want to do.

Split the 2-D array into three 2-D arrays.

import numpy as np
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])

newarr = np.array_split(arr, 3)

print(newarr)
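
array_split() also accepts an axis argument. As a sketch, the example below uses a made-up array with 2 rows and 6 columns and splits it along axis 1, so each part keeps both rows and gets two columns.

```python
import numpy as np

# Hypothetical 2 x 6 array chosen so that the columns divide evenly by 3.
arr = np.array([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]])

parts = np.array_split(arr, 3, axis=1)

for p in parts:
    print(p)
```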

Pandas is a Python library.

Pandas is used to analyze data.

Basic-- Pandas Series, DataFrames, Read CSV, Read JSON, Analyze Data
Cleaning data-- clean data, clean empty cells, clean wrong format, clean wrong
data, remove duplicates
Advanced-- Correlation, plotting

Load a CSV file into a Pandas DataFrame:

import pandas as pd

df = pd.read_csv('data.csv')

print(df.to_string())

What is Pandas?
Pandas is a Python library used for working with data sets.

It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The
library was created by Wes McKinney in 2008.

Why Use Pandas?

Pandas allows us to analyze big data and draw conclusions based on statistical
theories.

Pandas can clean messy data sets, and make them readable and relevant.

Relevant data is very important in data science.

Pandas gives you answers about the data, such as:

Is there a correlation between two or more columns?
What is the average value?
What is the maximum value?
What is the minimum value?

Pandas is also able to delete rows that are not relevant or that contain wrong
values, such as empty or NULL values. This is called cleaning the data.
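
These questions map onto a handful of method calls. The sketch below uses a small made-up table with one empty cell; the column names and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data: the last row has a missing calories value.
df = pd.DataFrame({
    "calories": [420, 380, 390, None],
    "duration": [50, 40, 45, 60],
})

cleaned = df.dropna()   # remove rows that contain empty cells

mean_cal = cleaned["calories"].mean()
max_cal = cleaned["calories"].max()
min_cal = cleaned["calories"].min()
corr = cleaned["calories"].corr(cleaned["duration"])

print(len(df), "rows before cleaning,", len(cleaned), "after")
print(mean_cal, max_cal, min_cal)
print(corr)
```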

import pandas

mydataset = {
'cars': ["BMW", "Volvo", "Ford"],
'passings': [3, 7, 2]
}

myvar = pandas.DataFrame(mydataset)

print(myvar)

What is a Series?
A Pandas Series is like a column in a table.

It is a one-dimensional array holding data of any type.

Create a simple Pandas Series from a list:

import pandas as pd

a = [1, 7, 2]

myvar = pd.Series(a)

print(myvar)

Labels
If nothing else is specified, the values are labeled with their index number. First
value has index 0, second value has index 1 etc.

This label can be used to access a specified value.

Create your own labels:

import pandas as pd

a = [1, 7, 2]

myvar = pd.Series(a, index = ["x", "y", "z"])

print(myvar)

When you have created labels, you can access an item by referring to the label.

Example

Return the value of "y":

print(myvar["y"])
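
A self-contained version of the lookup, shown as a sketch: the label "y" and position 1 refer to the same value, and a list of labels returns a smaller Series.

```python
import pandas as pd

myvar = pd.Series([1, 7, 2], index=["x", "y", "z"])

by_label = myvar["y"]         # access by label
by_position = myvar.iloc[1]   # access by integer position
subset = myvar[["x", "z"]]    # a list of labels returns a Series

print(by_label, by_position)
print(subset)
```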

Key/Value Objects as Series

You can also use a key/value object, like a dictionary, when creating a Series.

Create a simple Pandas Series from a dictionary:

import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories)

print(myvar)

To select only some of the items in the dictionary, use the index argument and
specify only the items you want to include in the Series.

Example

Create a Series using only data from "day1" and "day2":

import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories, index = ["day1", "day2"])

print(myvar)
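
If the index argument names a key that is not in the dictionary, pandas keeps the label and fills the value with NaN. A short sketch, where "day4" is a deliberately missing key:

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

# "day4" is not a key of the dictionary, so its value becomes NaN.
myvar = pd.Series(calories, index=["day1", "day4"])

print(myvar)
```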

DataFrames

Data sets in Pandas are usually multi-dimensional tables, called DataFrames.

A Pandas DataFrame is a 2-dimensional data structure, like a 2-dimensional array
or a table with rows and columns.

A Series is like a column; a DataFrame is the whole table.

Example

Create a simple Pandas DataFrame:

import pandas as pd

data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}

#load data into a DataFrame object:

df = pd.DataFrame(data)

print(df)

Locate Row
As you can see from the result above, the DataFrame is like a table with rows and
columns.

Pandas uses the loc attribute to return one or more specified rows.

Return row 0:

#refer to the row index:

print(df.loc[0])

Return row 0 and 1:

#use a list of indexes:

print(df.loc[[0, 1]])
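
The type of the result depends on what is passed to loc: a single index returns a Series, while a list of indexes returns a DataFrame, even when the list has one entry. A self-contained sketch:

```python
import pandas as pd

df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
})

row = df.loc[0]          # single index: a Series
rows = df.loc[[0, 1]]    # list of indexes: a DataFrame
one_row = df.loc[[0]]    # one-item list: still a DataFrame

print(type(row).__name__)      # Series
print(type(rows).__name__)     # DataFrame
print(one_row.shape)           # (1, 2)
```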

Named Indexes
With the index argument, you can name your own indexes.

Add a list of names to give each row a name:

import pandas as pd

data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}

df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df)

Locate Named Indexes

Use the named index in the loc attribute to return the specified row(s).

Return "day2":

#refer to the named index:

print(df.loc["day2"])
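
With named indexes, loc selects by label while iloc still selects by position. The sketch below shows both reaching the same row, and loc picking out a single cell:

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

by_label = df.loc["day2"]           # row selected by its name
by_position = df.iloc[1]            # the same row selected by position
cell = df.loc["day2", "calories"]   # row label and column label together

print(by_label)
print(cell)
```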

Load Files Into a DataFrame

If your data sets are stored in a file, Pandas can load them into a DataFrame.

Load a comma separated file (CSV file) into a DataFrame:

import pandas as pd

df = pd.read_csv(r'C:\Users\Student\Desktop\diabetes.csv')

print(df)
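
The path above only exists on the author's machine. As a sketch that runs anywhere, read_csv() can also read from a file-like object; the CSV text below is made up for the illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "calories,duration\n420,50\n380,40\n390,45\n"

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)
print(df.head())
```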
