Book Image

40 Algorithms Every Programmer Should Know

By : Imran Ahmad
5 (2)
Book Image

40 Algorithms Every Programmer Should Know

5 (2)
By: Imran Ahmad

Overview of this book

Algorithms have always played an important role in both the science and practice of computing. Beyond traditional computing, the ability to use algorithms to solve real-world problems is an important skill that any developer or programmer must have. This book will help you not only to develop the skills to select and use an algorithm to solve real-world problems but also to understand how it works. You’ll start with an introduction to algorithms and discover various algorithm design techniques, before exploring how to implement different types of algorithms, such as searching and sorting, with the help of practical examples. As you advance to a more complex set of algorithms, you'll learn about linear programming, page ranking, and graphs, and even work with machine learning algorithms, understanding the math and logic behind them. Further on, case studies such as weather prediction, tweet clustering, and movie recommendation engines will show you how to apply these algorithms optimally. Finally, you’ll become well versed in techniques that enable parallel processing, giving you the ability to use these algorithms for compute-intensive tasks. By the end of this book, you'll have become adept at solving real-world computational problems by using a wide range of algorithms.
Table of Contents (19 chapters)
1
Section 1: Fundamentals and Core Algorithms
7
Section 2: Machine Learning Algorithms
13
Section 3: Advanced Topics

Introducing Python packages

Once designed, algorithms need to be implemented in a programming language as per the design. For this book, I chose the programming language Python. I chose it because Python is a flexible and open source programming language. Python is also the language of choice for increasingly important cloud computing infrastructures, such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).

The official Python home page is available at https://www.python.org/, which also has instructions for installation and a useful beginner's guide. 

If you have not used Python before, it is a good idea to browse through this beginner's guide to self-study. A basic understanding of Python will help you to better understand the concepts presented in this book.

For this book, I expect you to use the recent version of Python 3. At the time of writing, the most recent version is 3.7.3, which is what we will use to run the exercises in this book.

Python packages

Python is a general-purpose language. It is designed in a way that comes with bare minimum functionality. Based on the use case that you intend to use Python for, additional packages need to be installed. The easiest way to install additional packages is through the pip installer program. This pip command can be used to install the additional packages:

pip install a_package

The packages that have already been installed need to be periodically updated to get the latest functionality. This is achieved by using the upgrade flag:

pip install a_package --upgrade

Another Python distribution for scientific computing is Anaconda, which can be downloaded from http://continuum.io/downloads.

In addition to using the pip command to install new packages, for Anaconda distribution, we also have the option of using the following command to install new packages:

conda install a_package

To update the existing packages, the Anaconda distribution gives us the option to use the following command:

conda update a_package

There are all sorts of Python packages that are available. Some of the important packages that are relevant for algorithms are described in the following section.

The SciPy ecosystem

Scientific Python (SciPy)—pronounced sigh pie—is a group of Python packages created for the scientific community. It contains many functions, including a wide range of random number generators, linear algebra routines, and optimizers. SciPy is a comprehensive package and, over time, people have developed many extensions to customize and extend the package according to their needs.

The following are the main packages that are part of this ecosystem:

  • NumPy: For algorithms, the ability to create multi-dimensional data structures, such as arrays and matrices, is really important. NumPy offers a set of array and matrix data types that are important for statistics and data analysis. Details about NumPy can be found at http://www.numpy.org/.

  • scikit-learn: This machine learning extension is one of the most popular extensions of SciPy. Scikit-learn provides a wide range of important machine learning algorithms, including classification, regression, clustering, and model validation. You can find more details about scikit-learn at http://scikit-learn.org/.

  • pandas: pandas is an open source software library. It contains the tabular complex data structure that is used widely to input, output, and process tabular data in various algorithms. The pandas library contains many useful functions and it also offers highly optimized performance. More details about pandas can be found at http://pandas.pydata.org/.

  • Matplotlib: Matplotlib provides tools to create powerful visualizations. Data can be presented as line plots, scatter plots, bacharts, histograms, pie charts, and so on. More information can be found at https://matplotlib.org/.

  • Seaborn: Seaborn can be thought of as similar to the popular ggplot2 library in R. It is based on Matplotlib and offers an advanced interface for drawing brilliant statistical graphics. Further details can be found at https://seaborn.pydata.org/.

  • iPython: iPython is an enhanced interactive console that is designed to facilitate the writing, testing, and debugging of Python code. 

  • Running Python programs: An interactive mode of programming is useful for learning and experimenting with code. Python programs can be saved in a text file with the .py extension and that file can be run from the console.

Implementing Python via the Jupyter Notebook

Another way to run Python programs is through the Jupyter Notebook. The Jupyter Notebook provides a browser-based user interface to develop code. The Jupyter Notebook is used to present the code examples in this book. The ability to annotate and describe the code with texts and graphics makes it the perfect tool for presenting and explaining an algorithm and a great tool for learning.

To start the notebook, you need to start the Juypter-notebook process and then open your favorite browser and navigate to http://localhost:8888:

Note that a Jupyter Notebook consists of different blocks called cells