Python : NumPy Arrays

Let me show you how to create a NumPy array from a Python object such as a list.

In [1]: my_list = [1,2,3]
In [2]: my_list
Out[3]: [1,2,3]

The above is a simple list, which I have created several times throughout this blog series. To work with NumPy arrays, you first need to import the NumPy library.

In [4]: import numpy as np
In [5]: arr = np.array(my_list)
In [6]: arr
Out[7]: array([1,2,3])

Now we have created a one dimensional array by converting a normal Python list into an array using NumPy. To create a two dimensional array, we pass a list of lists to np.array.

In [8]: my_mat = [[1,2,3],[4,5,6],[7,8,9]]
In [9]:np.array(my_mat)
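To check how many dimensions an array really has, here is a minimal sketch. It assumes the ndim and shape attributes of a NumPy array, which are not used elsewhere in this post.

```python
import numpy as np

my_list = [1, 2, 3]
my_mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

arr = np.array(my_list)   # built from a flat list: one dimension
mat = np.array(my_mat)    # built from a list of lists: two dimensions

print(arr.ndim, arr.shape)   # 1 (3,)
print(mat.ndim, mat.shape)   # 2 (3, 3)
```

The shape tuple has one entry per dimension, so (3,) means 3 elements in a single dimension and (3, 3) means 3 rows and 3 columns.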

If I want to generate a range of values I can use the np.arange function ( "a range", not "arrange" ), which is a built-in value generation function within the NumPy library.

In [11]:np.arange(0,10)

In the above example 0 is the starting point and 10 is the ending point. The starting point is included and the ending point is excluded, so the values are 0 to 9, which accounts for 10 values. If the starting point is 1 and the ending point is 10, how many values are retrieved ? Leave your answers in the comment section below.
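As a small sketch of the point about the ending value, the snippet below repeats the np.arange(0,10) call from above and checks what it contains. The variable name values is only for illustration.

```python
import numpy as np

values = np.arange(0, 10)    # start at 0, stop before 10

print(values.tolist())       # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(len(values))           # 10
print(10 in values)          # False, the ending point is excluded
```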

If we want to create a one dimensional array of 10 evenly spaced points between 0 and 5, we can make use of the np.linspace function.

In [13]:np.linspace(0,5,10)
Out[14]:array([ 0. , 0.55555556, 1.11111111, 1.66666667, 
        2.22222222, 2.77777778, 3.33333333, 3.88888889, 4.44444444, 5. ])
Though the above output wraps onto two lines and may look like a 2 dimensional array, it is actually one dimensional. There is only one pair of square brackets in the output array.
Note : The difference between the third parameter in arange and linspace is that in arange the third parameter is the step size, while in linspace it is the number of evenly spaced points to return from the start ( first parameter ) to the end ( second parameter ), with both endpoints included.
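To make the note concrete, here is a short sketch that passes a third argument to both functions. The numbers are chosen only for illustration.

```python
import numpy as np

stepped = np.arange(0, 10, 2)     # third argument is the step size
spaced = np.linspace(0, 10, 5)    # third argument is the number of points

print(stepped.tolist())   # [0, 2, 4, 6, 8]
print(spaced.tolist())    # [0.0, 2.5, 5.0, 7.5, 10.0]
```

Notice that arange stops before 10, while linspace includes 10 as its last point and returns floating point values.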
Let’s now create some random numbers using NumPy. Random number generation is one of the most useful and widely used features when working with or testing data science problems.

In [15]:np.random.rand(5,5)
Out[16]:array([[ 0.77137502,  0.28137699,  0.06032922,  0.86411406,  0.18949888],
       [ 0.82909673,  0.49042171,  0.80491907,  0.0319554 ,  0.96599774],
       [ 0.37228731,  0.77284343,  0.99227889,  0.29140901,  0.69968362],
       [ 0.51595908,  0.55724643,  0.11668673,  0.80466184,  0.53639437],
       [ 0.03324634,  0.31724955,  0.88069452,  0.78311061,  0.00182403]])

The distribution above is uniform over the interval from 0 ( included ) to 1 ( excluded ). If we need numbers drawn from a standard normal distribution centered around 0 instead, we can make use of np.random.randn(3,3).

In [17]:np.random.randn(3,3)
Out[18]:array([[-1.73848932,  1.05867812, -0.18784838],
       [ 1.09207573,  0.59035732, -0.29759153],
       [ 0.50540417,  0.21646846, -0.40956973]])
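A rough way to see the difference between the two functions is to draw a larger sample and look at its range and mean. This is only a sketch: the sample size of 1000 and the fixed seed ( np.random.seed(0) ) are my own choices so that the run is repeatable.

```python
import numpy as np

np.random.seed(0)                  # fixed seed, only to make the run repeatable

uniform = np.random.rand(1000)     # uniform values from 0 (included) to 1 (excluded)
normal = np.random.randn(1000)     # standard normal values centered around 0

# Every uniform value stays inside the 0 to 1 interval
print(uniform.min() >= 0.0, uniform.max() < 1.0)   # True True

# Normal values can be negative, and their average sits close to 0
print(normal.min() < 0.0)                          # True
print(abs(normal.mean()) < 0.2)                    # True
```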

In the above examples we generated random floating point numbers from a distribution ( Uniform or Normal, the latter also known as Gaussian ). If we need 10 random integers between 0 and 100 we can make use of np.random.randint().

In [19]: np.random.randint(0,100,10)
Out[20]: array([69,35,34,43,19,18,79,93,71,82])
So we get 10 random integers between 0 ( included ) and 100 ( excluded ). If we want to reshape the array, we can use the reshape method as shown below :
In [21]: arr=np.arange(4)
In [22]: arr
Out[23]: array([0,1,2,3])
In [24]: arr.reshape(2,2)
Out[25]: array([[0,1],
                [2,3]])
Make sure that the new shape accounts for exactly the number of elements in your array. A reshape of (2,1) will not work here: it has room for only 2 elements, so NumPy raises a ValueError rather than leaving the other elements out.
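Here is a minimal sketch of both cases, a reshape that fits and one that does not. The try / except block is only there so that the failing call can be shown without stopping the notebook cell.

```python
import numpy as np

arr = np.arange(4)             # 4 elements: 0, 1, 2, 3

square = arr.reshape(2, 2)     # 2 * 2 = 4, so the shape fits
print(square.tolist())         # [[0, 1], [2, 3]]

try:
    arr.reshape(2, 1)          # 2 * 1 = 2, which does not match 4 elements
    error_raised = False
except ValueError:
    error_raised = True

print(error_raised)            # True
```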
If we want to find out the data type of the elements in our array, what would be the line of code ? Leave the answer in the comment section below.


Python : Introduction To NumPy

NumPy ( pronounced "Num-pie", or sometimes "Num-pee" ) is a linear algebra library for Python. Almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks. NumPy is also very fast, because its core routines are implemented in C.

The NumPy library can be installed using the pip install numpy command. In case you are new to installing libraries in Python, I would recommend visiting this link to get a deeper understanding of how to install libraries in Python.

  • NumPy arrays will be widely used while working with data.
  • NumPy arrays essentially come in two flavors : vectors and matrices.
  • Vectors are strictly 1 dimensional arrays and matrices are 2 dimensional.
  • A matrix however can have just one row or just one column.
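The points above can be sketched with the shape attribute of an array, which is a standard NumPy attribute not covered in this post. The variable names are only for illustration.

```python
import numpy as np

vector = np.array([1, 2, 3])             # 1 dimensional: a vector
row_matrix = np.array([[1, 2, 3]])       # 2 dimensional: a matrix with one row
col_matrix = np.array([[1], [2], [3]])   # 2 dimensional: a matrix with one column

print(vector.shape)       # (3,)
print(row_matrix.shape)   # (1, 3)
print(col_matrix.shape)   # (3, 1)
```

All three hold the same numbers, but only the first one is a vector. The other two are matrices because they have two dimensions.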

This should be sufficient for you to get started with NumPy. Let’s get started with using NumPy arrays in python using jupyter notebook.

Python : Crash Course Part 12 – Methods

The most common and widely used feature of python programming for data science is methods. You will see the usage of methods quite often while you dive further into data science with python.

Let’s go ahead and create a string :

In [1]: s='hello! Welcome To Python Crash Course'
In [2]: s.lower()
Out[3]: 'hello! welcome to python crash course'

It’s so simple, isn’t it ? The lower method on a string converts all the characters to lower case. You can try the s.upper() method and see if it changes your string to upper case. A string has several methods, which you can list by pressing tab in jupyter notebook after the dot symbol ( s. ), and you can try out the various methods available.

Let’s go over the split method.

In [4]: s.split()
Out[5]: ['hello!', 'Welcome', 'To', 'Python', 'Crash', 'Course']

The split method splits the string on white space. If you need to split the string on ‘!’ instead, you can pass the ! mark within quotes to the split method. The method will split the string wherever it finds the ! mark.

In [6]: s.split('!')
Out[7]: ['hello', ' Welcome To Python Crash Course']
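As a small sketch of the two ways of splitting, the snippet below runs both on the same string. The strip method used at the end is a standard string method that is not covered in this post; I use it here only to remove the space that is left over after splitting on the ! mark.

```python
s = 'hello! Welcome To Python Crash Course'

words = s.split()        # no argument: split on white space
parts = s.split('!')     # split wherever a ! mark is found

print(len(words))        # 6
print(parts)             # ['hello', ' Welcome To Python Crash Course']

# The second part keeps the space that followed the ! mark; strip removes it
print(parts[1].strip())  # Welcome To Python Crash Course
```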

You can try and use methods on strings, dictionaries and lists. Let’s try a method on a list.

In [8]: lst=[1,2,3]
In [9]: lst.pop()
Out[10]: 3

The pop method removes the last element from the list permanently and returns it. If you need to pop a particular item of your choice, pass that item’s index within the brackets of the pop method. I have named the list lst rather than list so that it does not shadow Python’s built-in list function.

In [11]: lst
Out[12]: [1,2]
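Popping by index can be sketched as follows. The list name items and its values are only for illustration.

```python
items = [10, 20, 30, 40]

last = items.pop()       # no index: removes and returns the last element
first = items.pop(0)     # index 0: removes and returns the first element

print(last, first)       # 40 10
print(items)             # [20, 30]
```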

Here we come to the end of the Python crash course for data science. These 12 modules / blog posts should help any beginner ( like me ) gain a fair bit of knowledge of Python as we move on to more complex code within data science.

Leave a comment if you really enjoyed this crash course series.


Python : Crash Course Part 11 – Functions

We use the def keyword to define a function in python, followed by a function name and parameters. Let me show you a basic function in python :

In [1]: def my_func (param1) :
               print(param1)
In [2]: my_func('hello')
        hello

Let’s see some more examples. We can use the + operator to concatenate the string 'Hello ' with the input parameter, which is name. In case we do not pass any parameter to the function, name falls back to its default value and the output is Hello Default Name.

In [3]: def my_func (name='Default Name') :
               print('Hello ' + name)
In [4]: my_func('Analytics')
        Hello Analytics
In [5]: my_func()
        Hello Default Name
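Here is a small sketch of a default parameter in action. The function name greet is hypothetical, and it hands back the string with a return statement instead of printing it so that the result can be stored and reused.

```python
def greet(name='Default Name'):
    # name falls back to 'Default Name' when the caller passes nothing
    return 'Hello ' + name

print(greet('Analytics'))      # Hello Analytics
print(greet())                 # Hello Default Name
print(greet(name='NumPy'))     # Hello NumPy, the parameter can also be passed by name
```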

In case we have to return a value from a function after computing the result, we can do that using the return statement within the function as shown below :

In [6]: def square(num):
            """This function squares a number"""
            return num**2
In [7]: output = square(2)
In [8]: output
Out[9]: 4

Note that the double quote is used 3 times on each side of the description text within the function. This is called a docstring, and it documents what the function does. Now that we know the basics of functions, let’s move on to the map function.

Let’s say we have a sequence :

In [10]: seq=[1,2,3,4,5]

Let’s say we have to use the above function square to square every number in our sequence. We can use the map function as shown below :

In [11]: map(square,seq)
Out[12]:<map at 0x17ce8b6d550>

The output above is a map object stored at a memory location, and the results are only computed when we ask for them. To retrieve the output we have to convert the map object to a list as shown below :

In [13]: list(map(square,seq))
Out[14]: [1,4,9,16,25]
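One detail worth knowing about map objects is that they can be read only once. The sketch below reuses the square function and the sequence from above. The names first_pass and second_pass are only for illustration.

```python
def square(num):
    """This function squares a number"""
    return num ** 2

seq = [1, 2, 3, 4, 5]

squares = map(square, seq)      # nothing is computed yet
first_pass = list(squares)      # the values are computed here
second_pass = list(squares)     # the map object is now used up

print(first_pass)               # [1, 4, 9, 16, 25]
print(second_pass)              # []
```

If the results are needed more than once, it is safer to store the list rather than the map object.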

A short function like the ones we defined above with def can also be written in a single line using a lambda expression:

In [15]: t=lambda var:var*2
In [16]: t(6)
Out[17]: 12

But mostly we will be using the lambda expression in our map function as shown below:

In [18]: seq=[1,2,3]
In [19]: list(map(lambda var:var*3,seq))
Out[20]: [3,6,9]

In [19] statement basically multiplies numbers in the sequence ( seq ) by 3.

Let’s take a look at the filter function. As the name suggests, it keeps only the values for which the given condition is true.

In [21]: list(filter(lambda num:num%2 == 0,seq))
Out[22]: [2]

The modulus operator ( % ) above returns a remainder of 0 only for the integer 2 and not for 1 and 3, so the condition is true only for 2 and the output is [2].
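The filter and map functions can also be combined. Here is a minimal sketch on a slightly longer sequence, chosen only for illustration.

```python
seq = [1, 2, 3, 4, 5, 6]

# Keep only the even numbers (remainder 0 when divided by 2)
evens = list(filter(lambda num: num % 2 == 0, seq))

# Multiply each of the even numbers by 3
tripled_evens = list(map(lambda num: num * 3, evens))

print(evens)           # [2, 4, 6]
print(tripled_evens)   # [6, 12, 18]
```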

Let’s move on to methods in python. If you have any questions or suggestions, please let me know in the comments section below.


Python : Crash Course Part 10 – Loops

Let’s first create a sequence and use a FOR loop to print the elements in the sequence. That way it is easy to understand the looping concepts.

In [1]:seq=[1,2,3,4]
In [2]:for item in seq:
           print(item)
1
2
3
4

The temporary variable name ( item ) shown above can be changed to num or any other variable name of your choice, it doesn’t matter to Python. So choose a name that describes the elements you are looping over.
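As a quick sketch of that point, the loop below uses num as the temporary variable and adds up the elements instead of printing them. The variable total is only for illustration.

```python
seq = [1, 2, 3, 4]

total = 0
for num in seq:            # num takes the value of each element in turn
    total = total + num

print(total)               # 10
```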

In [3]:seq=[1,2,3,4]
In [4]:for item in seq:
           print('hello')
hello
hello
hello
hello

The above code printed hello for every element in our sequence. Lets now discuss the WHILE loop in python.

In [5]:i = 1
       while i < 5:
           print('i is : {}'.format(i))
           i = i+1
i is : 1
i is : 2
i is : 3
i is : 4

The above loop keeps executing for as long as the condition is true and stops as soon as it becomes false. You can try some more examples on your own and let me know in the comments.
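A WHILE loop only ends if something inside it eventually makes the condition false. Here is a minimal sketch that collects the values in a list ( the name seen is only for illustration ) so that we can inspect them after the loop has finished.

```python
i = 1
seen = []

while i < 5:
    seen.append(i)     # record the current value of i
    i = i + 1          # without this line the condition would stay true forever

print(seen)            # [1, 2, 3, 4]
print(i)               # 5, the first value for which i < 5 is false
```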

Python : Crash Course Part 9 – Statements

Python uses IF, ELIF and ELSE statements to run code based on the conditions passed to them. Each condition is followed by a colon, and the action to perform comes after it.

In [1]:if 1<2:
            print('perform this action')
       perform this action

Note that we are performing a print action in python and there is no python Out prompt here on the jupyter notebook.

Let’s try some more examples. I have coded the operation in the same line as the condition, but as a best practice and for easier understanding you can code the operation on the second line in your jupyter notebook. You will notice the indentation happening automatically in jupyter notebook when you press the enter key after the colon symbol.

In [2]:if 1<2 : x=2+2
In [3]:x
Out[4]: 4

Let’s now add an ELSE branch to the IF condition and see how it works.

In [5]: if 1==1 :
             print ('First')
        else :
             print ('Last')
        First

If the condition is True ( 1==1) then the code will print First else it will print Last.

In [7]:if 1 == 2 :
                print ('First')
       elif 2==2 :
                print ('Second')
       elif 3==3 :
                print ('Third')
       else :
                print ('Last')
       Second

Notice what happened above: even though 3==3 is also true, it did not print Third, because Python stops checking the remaining conditions as soon as one condition is satisfied. In our example 1==2 is false, so Python moves on to 2==2, which is true, and hence Second was printed.
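The same behaviour can be sketched with a small function. The function name first_match and its labels are hypothetical, and the conditions are deliberately written so that more than one of them can be true for the same value.

```python
def first_match(value):
    # Only the first condition that is true gets to run
    if value < 0:
        return 'negative'
    elif value == 0:
        return 'zero'
    elif value >= 0:
        return 'non-negative'
    else:
        return 'no match'

print(first_match(-3))   # negative
print(first_match(0))    # zero, even though 0 >= 0 is also true
print(first_match(7))    # non-negative
```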

Try out some more examples using these statements and let me know if you found it easy to absorb.


Python : Crash Course Part 8 – Operators

Is 1 > 2 ? Is it True or False ? We know it’s False. Let’s see if Python knows the answer.

In [1]: 1>2
Out[2]: False
In [3]: 1<2
Out[4]: True
In [5]: 1==1
Out[6]: True

Note that equality is represented by two equal signs. If we use just one equal sign, Python thinks you are trying a variable assignment and it will raise an error. Try it out and let me know what the error is.

Python also supports string equality checks:

In [7]: 'hi' == 'bye'
Out[8]: False

Let’s discuss some of the logic operators in Python ( AND, OR )

In  [9]: (1<2) and (2>3)
Out[10]: False

Notice that for the AND operator – both the conditions should be true for the result/output to be True, but for the OR operator any one of the conditions should be true for the output to be true.

In [11]: (1<2) or (2>3)
Out[12]: True
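Here is a short sketch that puts the two operators side by side on the same pair of conditions. The variable names a and b are only for illustration, and the not operator on the last line is a standard Python operator that is not covered in this post.

```python
a = (1 < 2)    # True
b = (2 > 3)    # False

print(a and b)   # False, AND needs both conditions to be true
print(a or b)    # True, OR needs at least one condition to be true
print(not b)     # True, NOT flips a condition
```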

Now that we have some understanding of Boolean operators, let’s move on to statements ( IF, ELSE ) in the next part. Try out some more examples in your jupyter notebook and let me know if you face any errors. It would be interesting to discuss them here.