40 Algorithms Every Programmer Should Know

By: Imran Ahmad

Overview of this book

Algorithms have always played an important role in both the science and practice of computing. Beyond traditional computing, the ability to use algorithms to solve real-world problems is an important skill that any developer or programmer must have. This book will help you not only to develop the skills to select and use an algorithm to solve real-world problems but also to understand how it works. You’ll start with an introduction to algorithms and discover various algorithm design techniques, before exploring how to implement different types of algorithms, such as searching and sorting, with the help of practical examples. As you advance to a more complex set of algorithms, you'll learn about linear programming, page ranking, and graphs, and even work with machine learning algorithms, understanding the math and logic behind them. Further on, case studies such as weather prediction, tweet clustering, and movie recommendation engines will show you how to apply these algorithms optimally. Finally, you’ll become well versed in techniques that enable parallel processing, giving you the ability to use these algorithms for compute-intensive tasks. By the end of this book, you'll have become adept at solving real-world computational problems by using a wide range of algorithms.
Table of Contents (19 chapters)
Section 1: Fundamentals and Core Algorithms
Section 2: Machine Learning Algorithms
Section 3: Advanced Topics

Case study: movie review sentiment analysis 

Let's use NLP to perform sentiment analysis on movie reviews. For this, we will use the open source movie review data available at http://www.cs.cornell.edu/people/pabo/movie-review-data/:

  1. First, we will import the packages needed to load and process the dataset:

import numpy as np
import pandas as pd

  2. Now, let us load the movie review data and print the first few rows to observe its structure:

# Load the tab-separated review file into a DataFrame
df = pd.read_csv("moviereviews.tsv", sep='\t')
df.head()

Note that the dataset has 2,000 movie reviews, half of them negative and half positive.
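
As a rough sketch of how this balance could be checked, the following snippet reads a tiny in-memory stand-in for the TSV file and counts the reviews in each class. The column names `label` and `review`, and the sample rows themselves, are assumptions made for illustration; the actual file may be laid out differently:

```python
import io

import pandas as pd

# Tiny stand-in for moviereviews.tsv. The column names `label` and `review`
# and the rows below are assumptions for illustration only.
sample_tsv = (
    "label\treview\n"
    "neg\tA dull, lifeless plot.\n"
    "pos\tA warm and funny film.\n"
    "neg\tToo long by an hour.\n"
    "pos\tBrilliant performances.\n"
)

# read_csv accepts any file-like object, so StringIO stands in for the file.
sample_df = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# Count how many reviews fall under each label.
label_counts = sample_df["label"].value_counts()

print(len(sample_df))
print(label_counts)
```

On the real dataset, the same `value_counts()` call on the label column would be one way to confirm the even split described above.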

  3. Now, let's start preparing the dataset for training the model. First, let us drop any rows that contain missing values:

df.dropna(inplace=True)
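
To see what this call does, here is a minimal sketch on a small hypothetical DataFrame. The column names and rows are assumed for illustration and are not taken from the dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
sample_df = pd.DataFrame({
    "label": ["pos", "neg", None, "neg"],
    "review": [
        "A warm and funny film.",
        np.nan,
        "No label on this one.",
        "Too long by an hour.",
    ],
})

rows_before = len(sample_df)

# Count the missing values in each column before dropping anything.
missing_per_column = sample_df.isnull().sum()

# dropna() removes every row that has at least one missing value.
sample_df.dropna(inplace=True)
rows_after = len(sample_df)

print(missing_per_column)
print(f"Dropped {rows_before - rows_after} of {rows_before} rows")
```

Checking `isnull().sum()` first is optional, but it shows how many rows are about to be lost and from which column.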

  4. Next, we need to remove whitespace-only entries as well. A string made up only of whitespace is not null, so dropna() leaves it in place, but it still needs to be removed. For...
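
One possible way to handle such entries is sketched below, using a vectorized string check. This is an illustration under assumed column names (`label`, `review`) and made-up rows, not necessarily the approach the text goes on to describe:

```python
import pandas as pd

# Hypothetical sample data; two of the reviews contain only whitespace.
sample_df = pd.DataFrame({
    "label": ["pos", "neg", "pos", "neg"],
    "review": ["A warm and funny film.", "   ", "\t\n", "Too long by an hour."],
})

# A whitespace-only string is not null, so dropna() keeps every row here.
rows_after_dropna = len(sample_df.dropna())

# Flag reviews that are empty once the surrounding whitespace is stripped.
is_blank = sample_df["review"].str.strip() == ""

# Keep only the non-blank reviews and renumber the index.
cleaned_df = sample_df[~is_blank].reset_index(drop=True)

print(rows_after_dropna)
print(cleaned_df)
```

An explicit loop that tests each review with `str.isspace()` would find the same whitespace-only rows; the vectorized form is just shorter, and it also flags empty strings.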