The Role of Python in AI
In order to put basic AI concepts into practice, we need a programming language that supports AI. In this book, we have chosen Python. There are a few reasons why Python is such a good choice for AI:
- Convenience and Compatibility: Python is a high-level programming language. This means that you don't have to worry about memory allocation, pointers, or machine code in general. You can write code in a convenient fashion and rely on Python's robustness. Python is also cross-platform compatible.
- Popularity: The strong emphasis on developer experience makes Python a very popular choice among software developers. In fact, according to a 2018 developer survey by https://www.hackerrank.com, across all ages, Python ranks as the number one preferred language of software developers. This is because Python is easily readable and simple. Therefore, Python is great for rapid application development.
- Efficiency: Despite being an interpreted language, Python's performance is comparable to that of other languages used in data science, such as R. Its main advantage is memory efficiency, since Python can handle large, in-memory databases.
Note
Python is a multi-purpose language. It can be used to create desktop applications, database applications, mobile applications, and games. The network programming features of Python are also worth mentioning. Furthermore, Python is an excellent prototyping tool.
Why Is Python Dominant in Machine Learning, Data Science, and AI?
To understand the dominant nature of Python in machine learning, data science, and AI, we have to compare Python to other languages that are also used in these fields.
Compared to R, which is a programming language built for statisticians, Python is more versatile and easier to use, as it allows programmers to build a diverse range of applications, from games to AI applications.
Compared to Java and C++, programs can be written significantly faster in Python. Python also provides a high degree of flexibility.
Ruby and JavaScript are similar to Python when it comes to flexibility and convenience. Python has an advantage over these languages because of the AI ecosystem that's available for it. In any field, open source, third-party library support largely determines the success of a language, and Python's third-party AI library support is excellent.
Anaconda in Python
We installed Anaconda in the Preface. Anaconda will be our number one tool when it comes to experimenting with AI.
Anaconda comes with packages, IDEs, data visualization libraries, and high-performance tools for parallel computing in one place. Anaconda hides configuration problems and the complexity of maintaining a stack for data science, machine learning, and AI. This is especially useful on Windows, where version mismatches and configuration problems tend to arise the most.
Anaconda comes with Jupyter Notebook, where you can write code and comments in a documentation style. When you experiment with AI features, the flow of your ideas resembles an interactive tutorial where you run each step of your code.
Note
IDE stands for Integrated Development Environment. While a text editor provides some functionalities to highlight and format code, an IDE goes beyond the features of text editors by providing tools to automatically refactor, test, debug, package, run, and deploy code.
Python Libraries for AI
The list of libraries presented here is not complete, as there are more than 700 available in Anaconda. However, these specific ones will give you a good foundation for implementing the fundamental AI algorithms in Python:
- NumPy: NumPy is a numerical computing library for Python. As Python's built-in lists are not designed for efficient numerical work, we use a library to model vectors and matrices efficiently. In data science, we need these data structures to perform simple mathematical operations. We will use NumPy extensively in future chapters.
- SciPy: SciPy is an advanced library containing algorithms that are used for data science. It is a great complementary library to NumPy because it gives you all the advanced algorithms you need, whether it be a linear algebra algorithm, image processing tool, or a matrix operation.
- pandas: pandas provides fast, flexible, and expressive data structures, such as one-dimensional series and two-dimensional DataFrames. It efficiently loads, formats, and handles complex tables of different types.
- scikit-learn: scikit-learn is Python's main machine learning library. It is based on the NumPy and SciPy libraries. scikit-learn provides the functionality required for data preprocessing and for both supervised learning (classification and regression) and unsupervised learning.
- NLTK: We will not deal with NLP in this book, but NLTK is still worth mentioning because this library is the main natural language toolkit of Python. You can perform classification, tokenization, stemming, tagging, parsing, semantic reasoning, and many other operations using this library.
- TensorFlow: TensorFlow is Google's neural network library, and it is perfect for implementing deep learning AI. The flexible core of TensorFlow can be used to solve a vast variety of numerical computation problems. Some real-world applications of TensorFlow include Google voice recognition and object identification.
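As a minimal sketch of why the NumPy entry in this list matters, the snippet below contrasts a plain Python list with a NumPy array. The data and the variable names (prices, prices_arr) are made up for illustration; the point is only that arithmetic on an array is applied to every element at once, whereas the same operator on a list does something different.

```python
import numpy as np

# Hypothetical data, for illustration only
prices = [1.0, 2.5, 4.0]

# On a list, * repeats the list; it does not scale the values
doubled_list = prices * 2          # [1.0, 2.5, 4.0, 1.0, 2.5, 4.0]

# On a NumPy array, * is applied element by element
prices_arr = np.array(prices)
doubled_arr = prices_arr * 2       # values 2.0, 5.0, 8.0

# Aggregations are available as methods on the array
total = prices_arr.sum()           # 7.5
```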
A Brief Introduction to the NumPy Library
The NumPy library will play a major role in this book, so it is worth exploring it further.
After launching your Jupyter Notebook, you can simply import numpy as follows:
import numpy as np
Once numpy has been imported, you can access it using its alias, np. NumPy contains efficient implementations of some data structures, such as vectors and matrices.
Let's see how we can define vectors and matrices:
np.array([1,3,5,7])
The expected output is this:
array([1, 3, 5, 7])
We can declare a matrix using the following syntax:
A = np.mat([[1,2],[3,3]])
A
The expected output is this:
matrix([[1, 2],
        [3, 3]])
The array function creates an array data structure, while .mat creates a matrix.
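A two-dimensional table of numbers can also be stored in a plain NumPy array by passing nested lists to np.array. The sketch below is an illustration rather than part of the book's code. It uses np.asmatrix, the long-form name of np.mat (an alias that NumPy releases from 2.0 onward no longer provide), and the names v, M_arr, and M_mat are hypothetical.

```python
import numpy as np

v = np.array([1, 3, 5, 7])               # one-dimensional array (a vector)
M_arr = np.array([[1, 2], [3, 3]])       # two-dimensional array
M_mat = np.asmatrix([[1, 2], [3, 3]])    # matrix object, always two-dimensional

print(v.shape)       # (4,)
print(M_arr.shape)   # (2, 2)
print(M_mat.shape)   # (2, 2)

# Indexing a row reduces an array to 1-D but keeps a matrix two-dimensional
print(M_arr[0].shape)   # (2,)
print(M_mat[0].shape)   # (1, 2)
```

The difference in row indexing is one reason why code written for matrix objects does not always carry over unchanged to plain arrays.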
We can perform many operations with matrices. These include addition, subtraction, and multiplication. Let's have a look at these operations here:
Addition in matrices:
A + A
The expected output is this:
matrix([[2, 4],
        [6, 6]])
Subtraction in matrices:
A - A
The expected output is this:
matrix([[0, 0],
        [0, 0]])
Multiplication in matrices:
A * A
The expected output is this:
matrix([[ 7,  8],
        [12, 15]])
Matrix addition and subtraction work cell by cell.
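One caveat is worth sketching: the result of A * A shown earlier relies on A being a matrix object. If the same values are stored in a plain array (an assumption made only for this illustration, under the hypothetical name B), + and - still work cell by cell, but so does *. For plain arrays, the @ operator gives the linear algebra product instead.

```python
import numpy as np

B = np.array([[1, 2], [3, 3]])   # same values as A, stored as a plain array

added = B + B          # cell by cell: [[2, 4], [6, 6]]
subtracted = B - B     # cell by cell: [[0, 0], [0, 0]]
elementwise = B * B    # also cell by cell: [[1, 4], [9, 9]]
product = B @ B        # matrix product: [[7, 8], [12, 15]]

print(elementwise.tolist())   # [[1, 4], [9, 9]]
print(product.tolist())       # [[7, 8], [12, 15]]
```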
Matrix multiplication works according to linear algebra rules. To calculate matrix multiplication manually, you have to align the two matrices, as follows:
To get the (i,j)th element of the product, you compute the dot (scalar) product of the ith row of the first matrix with the jth column of the second matrix. The scalar product of two vectors is the sum of the products of their corresponding coordinates.
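As a worked illustration, written out here for the 2x2 matrix A defined earlier, each entry of the product is the scalar product of one row with one column:

```latex
A \cdot A =
\begin{pmatrix} 1 & 2 \\ 3 & 3 \end{pmatrix}
\begin{pmatrix} 1 & 2 \\ 3 & 3 \end{pmatrix}
=
\begin{pmatrix}
1 \cdot 1 + 2 \cdot 3 & 1 \cdot 2 + 2 \cdot 3 \\
3 \cdot 1 + 3 \cdot 3 & 3 \cdot 2 + 3 \cdot 3
\end{pmatrix}
=
\begin{pmatrix} 7 & 8 \\ 12 & 15 \end{pmatrix}
```

This agrees with the output of A * A shown earlier.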
Another frequent matrix operation is calculating the determinant of the matrix. The determinant is a number associated with square matrices. Calculating the determinant using NumPy's linalg module (linear algebra algorithms) can be seen in the following line of code:
np.linalg.det( A )
The expected output is this:
-3.0000000000000004
Technically, the determinant can be calculated as 1*3 - 2*3 = -3. Notice that NumPy calculates the determinant using floating-point arithmetic, so the accuracy of the result is not perfect. The error is due to the way floating-point numbers are represented in most programming languages.
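Because of this rounding, comparing the result to -3 with == may fail. A hedged sketch of two common workarounds follows. np.isclose and round are standard calls, the variable names are illustrative only, and np.asmatrix is used as the long-form name of np.mat.

```python
import numpy as np

A = np.asmatrix([[1, 2], [3, 3]])

det = np.linalg.det(A)

# Manual 2x2 formula a*d - b*c, which is exact for integers
manual_det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]   # -3

# Compare with a tolerance instead of exact equality
print(np.isclose(det, manual_det))   # True

# Or round away the floating-point noise for display
print(round(det, 10))                # -3.0
```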
We can also transpose a matrix, as shown in the following line of code:
np.matrix.transpose(A)
The expected output is this:
matrix([[1, 3],
        [2, 3]])
When calculating the transpose of a matrix, we flip its values over its main diagonal.
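The same result is available through the .T attribute, which is a common shorthand. The sketch below (with illustrative variable names, and np.asmatrix as the long-form name of np.mat) also checks the defining property that the entry at row j, column i of the transpose equals the entry at row i, column j of the original.

```python
import numpy as np

A = np.asmatrix([[1, 2], [3, 3]])
T = A.T   # shorthand for the transpose

rows, cols = A.shape
# Every entry is mirrored across the main diagonal
mirrored = all(
    T[j, i] == A[i, j]
    for i in range(rows)
    for j in range(cols)
)
print(mirrored)   # True
```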
NumPy has many other important features, so we will use it in most of the chapters in this book.
Exercise 1.01: Matrix Operations Using NumPy
We will be using Jupyter Notebook and the following matrix to solve this exercise.
We will calculate the square of the matrix, the determinant of the matrix, and the transpose of the matrix shown in the following figure, using NumPy:
The following steps will help you to complete this exercise:
- Open a new Jupyter Notebook file.
- Import the numpy library as np:
import numpy as np
- Create a two-dimensional array called A for storing the [[1,2,3],[4,5,6],[7,8,9]] matrix using np.mat:
A = np.mat([[1,2,3],[4,5,6],[7,8,9]])
A
The expected output is this:
matrix([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Note
If you have created an np.array instead of np.mat, the asterisk performs element-wise multiplication rather than matrix multiplication, so the result of the next step will be incorrect.
- Next, we perform matrix multiplication using the asterisk and save the result in a variable called matmult, as shown in the following code snippet:
matmult = A * A
matmult
The expected output is this:
matrix([[ 30,  36,  42],
        [ 66,  81,  96],
        [102, 126, 150]])
- Next, manually calculate the square of A by performing matrix multiplication. For instance, the top-left element of the matrix is calculated as follows:
1 * 1 + 2 * 4 + 3 * 7
The expected output is this:
30
- Use np.linalg.det to calculate the determinant of the matrix and save the result in a variable called det:
det = np.linalg.det( A )
det
The expected output (might vary slightly) is this:
0.0
- Use np.matrix.transpose to get the transpose of the matrix and save the result in a variable called transpose:
transpose = np.matrix.transpose(A)
transpose
The expected output is this:
matrix([[1, 4, 7],
        [2, 5, 8],
        [3, 6, 9]])
If T is the transpose of matrix A, then T[j, i] is equal to A[i, j].
Note
To access the source code for this specific section, please refer to https://packt.live/316Vd6Z.
You can also run this example online at https://packt.live/2BrogHL. You must execute the entire Notebook in order to get the desired result.
By completing this exercise, you have seen that NumPy comes with many useful features for vectors, matrices, and other mathematical structures.
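For comparison, the exercise can also be sketched with a plain two-dimensional array. This is an alternative to the np.mat approach used in the steps, not the book's solution. The helper matmul_manual is hypothetical and written only to mirror the manual row-by-column calculation, and np.linalg.matrix_rank is added here as an extra check that a determinant of (almost) zero goes together with linearly dependent rows.

```python
import numpy as np

def matmul_manual(X, Y):
    """Multiply two 2-D arrays using explicit row-by-column dot products."""
    n_rows, n_inner = X.shape
    n_inner_y, n_cols = Y.shape
    if n_inner != n_inner_y:
        raise ValueError("inner dimensions must match")
    result = np.zeros((n_rows, n_cols), dtype=X.dtype)
    for i in range(n_rows):
        for j in range(n_cols):
            # (i, j) entry: sum of products of row i of X and column j of Y
            result[i, j] = sum(X[i, k] * Y[k, j] for k in range(n_inner))
    return result

A_arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

square_manual = matmul_manual(A_arr, A_arr)
square_numpy = A_arr @ A_arr          # matrix product for plain arrays
elementwise = A_arr * A_arr           # NOT the matrix product: squares each cell

print(np.array_equal(square_manual, square_numpy))   # True
print(square_numpy[0, 0], elementwise[0, 0])         # 30 1

det = np.linalg.det(A_arr)
print(np.isclose(det, 0.0))           # True: the matrix is singular
print(np.linalg.matrix_rank(A_arr))   # 2
print(np.array_equal(A_arr.T, np.transpose(A_arr)))  # True
```

Comparing the determinant with np.isclose rather than == 0.0 allows for the slight variation in the floating-point result mentioned in the exercise.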
In the upcoming section, we will be implementing AI in an interesting tic-tac-toe game using Python.