Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Hands-On Data Analysis with NumPy and pandas

You're reading from   Hands-On Data Analysis with NumPy and pandas Implement Python packages from data manipulation to processing

Arrow left icon
Product type Paperback
Published in Jun 2018
Publisher Packt
ISBN-13 9781789530797
Length 168 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
Curtis Miller Curtis Miller
Author Profile Icon Curtis Miller
Curtis Miller
Arrow right icon
View More author details
Toc

Exploring alternatives to Jupyter

Now we will consider alternatives to Jupyter Notebooks. We will look at:

  • Jupyter QT Console
  • Spyder
  • Rodeo
  • Python interpreter
  • ptpython

The first alternative we will consider is the Jupyter QT Console; this is a Python interpreter with added functionality, aimed specifically for data analysis.

The following screenshot shows the Jupyter QT Console:

It is very similar to the Jupyter Notebook. In fact, it is effectively the Console version of the Jupyter Notebook. Notice here that we have some interesting syntax. We have In [1], and then let's suppose you were to type in a command, for example:

print ("Hello, world!")

We see some output and then we see In [2].

Now let's try something else:

1 + 1

Right after In [2], we see Out[2]. What does this mean? This is a way to track historical commands and their outputs in a session. To access, say, the command for In [42], we type _i42. So, in this case, if we want to see the input for command 2, we type in i2. Notice that it gives us a string, 1 + 1. In fact, we can run this string.

If we type in eval and then _i2, notice that it gives us the same output as the original command, In [2], did. Now, how about Out[2]? How can we access the actual output? In this case, all we would do is just _ and then the number of the output, say 2. This should give us 2. So this gives you a more convenient way to access historical commands and their outputs.

Another advantage of Jupyter Notebooks is that you can see images. For example, let's get Matplotlib running. First we're going to import Matplotlib with the following command:

import matplotlib.pyplot as plt
  

After we've imported Matplotlib, recall that we need to run a certain magic, the Matplotlib magic:

%matplotlib inline
  

We need to give it the inline parameter, and now we can create a Matplotlib figure. Notice that the image shows up right below the command. When we type in _8, it shows that a Matplotlib object was created, but it does not actually show the plot itself. As you can see, we can use the Jupyter console in a more advanced way than the typical Python console. For example, let's work with a dataset called Iris; import it using the following line:

from sklearn.datasets import load_iris
  

This is a very common dataset used in data analysis. It's often used as a way to evaluate training models. We will also use k-means clustering on this:

from sklearn.cluster import KMeans
  

The load_Iris function isn't actually the Iris dataset; it is a function that we can use to get the Iris dataset. The following command will actually give us access to that dataset:

iris  = load_iris()
  

Now we will train a k-means clustering scheme on this dataset:

iris_clusters = KMeans(n_clusters = 3, init =  "random").fit(iris.data)
  

We can see the documentation right away when we're typing in a function. For example, I know what the end clusters parameter means; it is actually the original doc string from the function. Here, I want the number of clusters to be 3, because I know that there are actually three real clusters in this dataset. Now that a clustering scheme has been trained, we can plot it using the following code:

plt.scatter(iris.data[:, 0], iris.data[:, 1], c = iris_clusters.labels_)
  

Spyder

Spyder is an IDE unlike the Jupyter Notebook or the Jupyter QT Console. It integrates NumPy, SciPy, Matplotlib, and IPython. It is extensible with plugins, and it is included with Anaconda.

The following screenshot shows Spyder, an actual IDE intended for data analysis and scientific computing:

Spyder Python 3.6

On the right, you can go to File explorer to search for new files to load. Here, we want to open up iris_kmeans.py. This is a file that contains all the commands that we used before in the Jupyter QT Console. Notice on the right that the editor has a console; that is in fact the IPython console, which you saw as the Jupyter QT Console. We can run this entire file by clicking on the Run tab. It will run in the console, shown as follows:

The following screenshot will be the output:

Notice that at the end we see the result of the clustering that we saw before. We can type in commands interactively as well; for example, we can make our computer say Hello, world!.

In the editor, let's type in a new variable, let's say n = 5. Now let's run this file in the editor. Notice that n is a variable that the editor is aware of. Now let's make a change, say n = 6. Unless we were to actually run this file again, the console will be unaware of the change. So if I were to type n in the console again, nothing changes, and it's still 5. You would need to run this line in order to actually see a change.

We also have a variable explorer where we can see the values of variables and change them. For example, I can change the value of n from 6 to 10, shown as follows:

The following screenshot shows the output:

Then, when I go to the console and ask what n is, it will say 10:

n
10
  

That concludes our discussion of Spyder.

Rodeo

Rodeo is a Python IDE developed by Yhat, and is intended for data analysis applications exclusively. It is intended to emulate the RStudio IDE, which is popular among R users, and it can be downloaded from Rodeo's website. The only advantage of the base Python interpreter is that every Python installation includes it, shown as follows:

ptpython

What may be a lesser known console-based Python REPL is ptpython, designed by Jonathan Slenders. It exists only in the console and is an independent project by him. You can find it on GitHub. It has lightweight features, yet it also includes syntax highlighting, autocompletion, and even IPython. It can be installed with the following command:

pip install ptpython
  

That concludes our discussion on alternatives to the Jupyter Notebooks.

You have been reading a chapter from
Hands-On Data Analysis with NumPy and pandas
Published in: Jun 2018
Publisher: Packt
ISBN-13: 9781789530797
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image