 # Introduction to Numpy

- 16 mins

This is the second tutorial of the Explained! series.

I will be cataloging all the work I do with regards to PyLibraries and will share it here or on my Github.

I will also be updating this post as and when I work on Numpy.

That being said, Dive in!

## Numpy

NumPy is the fundamental package for scientific computing with Python.

In Python, data is almost universally represented as NumPy arrays. Even newer tools like Pandas are built around the NumPy array.

We will be seeing:

1. 1D array, 2D array
2. Array slices, joins, subsets
3. Arithmetic Operations on 2D arrays
4. Covariance, Correlation

### Initializing Numpy Arrays

1. Using np.array
2. Using np.ndarray
``````import numpy as np
np.random.seed(0)  # seed for reproducibility
x1 = np.random.randint(10, size=6)  # One-dimensional array
x2 = np.random.randint(10, size=(3, 3))  # Two-dimensional array
``````
``````print(x1)
``````
``````[5 0 3 3 7 9]
``````
``````print(x2)
``````
``````[[3 5 2]
[4 7 6]
[8 8 1]]
``````

# Using np.array()

``````x1 = np.array([1,2,3,4])
print(x1)
print(type(x1))
``````
``````[1 2 3 4]
<class 'numpy.ndarray'>
``````
``````x2 = np.array([[1,2,3],[4,5,6]])
print(x2)
print(type(x2))
``````
``````[[1 2 3]
[4 5 6]]
<class 'numpy.ndarray'>
``````

## Using np.ndarray()

``````x = np.ndarray(shape=3,dtype=int, buffer = np.array([1,2,3]))
print(x)
x = np.append(x,5)
print(x)
print(type(x))
``````
``````[1 2 3]
[1 2 3 5]
<class 'numpy.ndarray'>
``````
``````x = np.ndarray(shape=(2,2), dtype=float,buffer = np.array([[1.4,2.5],[1.3,2.4]]))
print(x)
``````
``````[[1.4 2.5]
[1.3 2.4]]
``````
``````x = np.ndarray(shape=(2,2), dtype=int,buffer = np.array([[1,2],[1,2]]))
print(x)
``````
``````[[1 2]
[1 2]]
``````

## Attributes of ndarray

Each array has attributes `ndim` (the number of dimensions), `shape` (the size of each dimension), and `size` (the total size of the array):

``````x2 = np.array([[1,2,3],[4,5,6]])
print("x2.ndim = ",x2.ndim)
print("x2.shape = ",x2.shape)
print("x2.size = ",x2.size)
``````
``````x2.ndim =  2
x2.shape =  (2, 3)
x2.size =  6
``````

Another useful attribute is the `dtype` which tells you about the type of elements in the array:

``````print("x2.dtype = ",x2.dtype)
``````
``````x2.dtype =  int64
``````

Other attributes include `itemsize`, which lists the size (in bytes) of each array element, and `nbytes`, which lists the total size (in bytes) of the array:

``````print("itemsize:", x.itemsize, "bytes")
print("nbytes:", x2.nbytes, "bytes")
``````
``````itemsize: 8 bytes
nbytes: 48 bytes
``````

## Array Indexing: Accessing Single Elements

In a one-dimensional array, the ith value (counting from zero) can be accessed by specifying the desired index in square brackets, just as with Python lists:

``````x1 = np.array([1,2,3,4])
print("x1 = ",x1)
print("x1 = ",x1) # just like arrays in c/c++
print("x1[-1] = ",x1[-1]) # negative indexing just like lists
``````
``````x1 =  [1 2 3 4]
x1 =  1
x1[-1] =  4
``````

In a multi-dimensional array, items can be accessed using a comma-separated tuple of indices:

``````x2 = np.array([[1,2,3],[4,5,6],[7,8,9]])
print("x2 = "); print(x2)
print("x2 = ",x2) # will print the entire 1st row

# to print 1st element of 1st row
print("x2= ", x2)
print("x2[0,0] = ",x2[0,0])

# to print 2nd element of 3rd row
print("x2= ", x2)
print("x2[2,1]= ", x2[2,1])

``````
``````x2 =
[[1 2 3]
[4 5 6]
[7 8 9]]
x2 =  [1 2 3]
x2=  1
x2[0,0] =  1
x2=  8
x2[2,1]=  8
``````

Values can also be modified using any of the above index notation:

``````x2[0, 0] = 12
print(x2)
``````
``````[[12  2  3]
[ 4  5  6]
[ 7  8  9]]
``````
``````x2 = 14
print(x2)
``````
``````[[14  2  3]
[ 4  5  6]
[ 7  8  9]]
``````

NOTE : Unlike Python lists, NumPy arrays have a fixed type. This means, for example, if you attempt to insert a floating-point value to an integer array, the value will be silently truncated.

i.e the float value gets converted to nearest int value.

``````x1 = np.ndarray(5, buffer = np.array([1,2,3,4,5]),dtype = int)
print(x1)
x1 = 5.7
print("x1 after changing : ",x1)
``````
``````[1 2 3 4 5]
x1 after changing :  [1 2 5 4 5]
``````

## Array Slicing and Subsetting : Accessing Subarrays

Just as we can use square brackets to access individual array elements, we can also use them to access subarrays with the slice notation, marked by the colon (:) character. The NumPy slicing syntax follows that of the standard Python list; to access a slice of an array x, use :

``````x[start:stop:step]
``````

NOTE : Default value of `start` is `0`, `stop` is `size of object` and `step` is `1`.

``````x = np.ndarray(10, buffer = np.array([0,1,2,3,4,5,6,7,8,9]),dtype = int)
print(x)
print("x[:] = ",x[:])
print("x[:5] = ",x[:5])
print("x[5:] = ",x[5:])
print("x[1:5] = ",x[1:5])
print("x[1:5:2] = ",x[1:5:2])
print("x[::-1] = ",x[::-1])
``````
``````[0 1 2 3 4 5 6 7 8 9]
x[:] =  [0 1 2 3 4 5 6 7 8 9]
x[:5] =  [0 1 2 3 4]
x[5:] =  [5 6 7 8 9]
x[1:5] =  [1 2 3 4]
x[1:5:2] =  [1 3]
x[::-1] =  [9 8 7 6 5 4 3 2 1 0]
``````
``````np.random.seed(0)  # seed for reproducibility
x2 = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(x2)
print("x2[:2, :3] = "); print(x2[:2, :3])  # first two rows, first three columns
print("x2[:3, ::2] = "); print(x2[:3, ::2]) # all rows, every other column
print("x2[::-1, ::-1] = "); print(x2[::-1, ::-1]) #reversed 2D array
print("x2[:, 0] = ",x2[:, 0])  # first column of x2
``````
``````[[1 2 3]
[4 5 6]
[7 8 9]]
x2[:2, :3] =
[[1 2 3]
[4 5 6]]
x2[:3, ::2] =
[[1 3]
[4 6]
[7 9]]
x2[::-1, ::-1] =
[[9 8 7]
[6 5 4]
[3 2 1]]
x2[:, 0] =  [1 4 7]
``````

## Joining Two Arrays

Joining of two arrays in NumPy, is primarily accomplished using the routine `np.concatenate`.

``````x = np.array([1,2,3])
y = np.array([4,5,6])
z = np.concatenate([x,y]) # Combines x and y to give one array.
# a = np.concatenate([x,y,z])
print(z)
# print(a)
``````
``````[1 2 3 4 5 6]
``````
``````x2 = np.array([[1,2,3],[2,3,4]])
y2 = np.array([[3,4,5],[4,5,6]])
z2 = np.concatenate([x2,y2])
print(z2)
``````
``````[[1 2 3]
[2 3 4]
[3 4 5]
[4 5 6]]
``````

# Arithmetic Operations on 2D arrays

Operator Equivalent Function
- np.subtract
* np.multiply
``````x2 = np.array([[1,2,3],[2,3,4]])
y2 = np.array([[3,4,5],[4,5,6]])
print("x2 + y2 = "); print(np.add(x2,y2))
print("x2 - y2 = "); print(np.subtract(x2,y2))
``````
``````x2 + y2 =
[[ 4  6  8]
[ 6  8 10]]
x2 - y2 =
[[-2 -2 -2]
[-2 -2 -2]]
``````
``````print(x2)
print(y2)
print("x2 * y2 = "); print(np.multiply(x2,y2))
``````
``````[[1 2 3]
[2 3 4]]
[[3 4 5]
[4 5 6]]
x2 * y2 =
[[ 3  8 15]
[ 8 15 24]]
``````

NOTE : Here as you can see, matrix multiplication isn’t actually possible and we actually just get a matrix where `x2[i,j] * y2[1,j]` is the output.

``````x2 = np.array([[1,2,3],[2,3,4]])
y2 = np.array([[3,4,5],[4,5,6]])
print(np.matmul(x2,y2))
# This gives an error. Why?
``````
``````---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-28-389e99023c8e> in <module>
1 x2 = np.array([[1,2,3],[2,3,4]])
2 y2 = np.array([[3,4,5],[4,5,6]])
----> 3 print(np.matmul(x2,y2))
4 # This gives an error. Why?

ValueError: shapes (2,3) and (2,3) not aligned: 3 (dim 1) != 2 (dim 0)
``````
``````x2 = np.array([[1,2,3],[2,3,4]])
y2 = np.array([[3,4],[4,5],[5,6]])
print(np.matmul(x2,y2))
``````
``````[[26 32]
[38 47]]
``````

## Covariance

Covariance indicates the level to which two variables vary together.

If we examine N-dimensional samples, `X = [x_1, x_2, ... x_N]^T`, then the covariance matrix element `C_{ij}` is the covariance of `x_i` and `x_j`. The element `C_{ii}` is the variance of `x_i`.

``````x2 = np.array([[0,1,2],[2,1,0]])
print(x2)
``````
``````[[0 1 2]
[2 1 0]]
``````

Note here how `x` increases and `x` decreases.

This is also shown by the covariance matrix :

``````print(np.cov(x2))
``````
``````[[ 1. -1.]
[-1.  1.]]
``````

Note that again, `C[0,1]` and `C[1,0]` which shows the correlation between `x` and `x`, is negative.

However `C[0,0]` and `C[1,1]` which show the correlation between `x` and `x` and `x` and `x` is 1.

Note how x and y are combined:

``````x = np.array([-2.1, -1,  4.3])
y = np.array([3,  1.1,  0.12])
X = np.stack((x, y))
print(np.cov(X))
# To check
print(np.cov(x))
print(np.cov(y))
``````
``````[[11.71       -4.286     ]
[-4.286       2.14413333]]
11.709999999999999
2.1441333333333334
``````

# Correlation

The term “correlation” refers to a mutual relationship or association between quantities. It is a standardised form of Covariance.

Correlation Coeffecients take values between `[-1,1]`

In Numpy (and in general), Correlation Matrix refers to the normalised version of a Covariance matrix.

``````x = np.array([-2.1, -1,  4.3])
y = np.array([3,  1.1,  0.12])
X = np.stack((x, y))
print(np.corrcoef(X))
``````
``````[[ 1.         -0.85535781]
[-0.85535781  1.        ]]
``````

Find more at my Github repository Explained.

Show some by ing it.