Summary
In this chapter, you have learned how to use Zeppelin to query Parquet files and display some charts. Then, you developed a small program to stream transaction data from a WebSocket to a Kafka topic. Finally, you used Spark Streaming inside Zeppelin to query the data arriving in the Kafka topic in real time.
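As a reminder of the shape of that last step, the following is a minimal sketch of how a Kafka topic can be read with Spark Structured Streaming from a Zeppelin paragraph. The broker address, the topic name transactions, and the in-memory table name are placeholder assumptions, not necessarily the ones used in the chapter:

```scala
import org.apache.spark.sql.SparkSession

// In Zeppelin, a SparkSession named `spark` is already provided; it is built
// here only so that the snippet is self-contained.
val spark = SparkSession.builder()
  .appName("TransactionsStream")
  .getOrCreate()

// Subscribe to the Kafka topic; the server address and topic name are placeholders
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "transactions")
  .load()

// Kafka delivers the payload as binary; cast it to a string for inspection
val messages = df.selectExpr("CAST(value AS STRING) AS json")

// Write to an in-memory table that can then be queried with SQL from Zeppelin
val query = messages.writeStream
  .format("memory")
  .queryName("transactions_stream")
  .outputMode("append")
  .start()
```

Once the query is started, the transactions_stream in-memory table can be queried from %sql paragraphs and re-run to follow the stream as new messages arrive.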
With all these building blocks in place, you have all the tools to analyze the Bitcoin transaction data in more detail. You could let the BatchProducerApp
run for several days or weeks to accumulate some historical data. With the help of Zeppelin and Spark, you could then try to detect patterns and come up with a trading strategy. Finally, you could use a Spark Streaming flow to detect in real time when a trading signal arises and perform a transaction automatically.
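As a rough illustration of what such a streaming detection flow could look like, the sketch below computes a one-minute average price over the stream and flags windows that cross an arbitrary threshold. The transactions DataFrame, its timestamp and price columns, and the threshold value are all assumptions made for the sake of the example; a real trading signal would of course be more elaborate:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

// `transactions` is assumed to be a streaming DataFrame parsed from the Kafka
// topic, with a `timestamp` column and a `price` column.
def detectSignals(transactions: DataFrame): Unit = {
  val avgPrices = transactions
    .withWatermark("timestamp", "1 minute")
    .groupBy(window(col("timestamp"), "1 minute"))
    .agg(avg(col("price")).as("avgPrice"))

  // Naive "signal": flag one-minute windows whose average price exceeds an
  // arbitrary threshold. A real strategy would compare indicators instead.
  val signals = avgPrices.filter(col("avgPrice") > 50000)

  signals.writeStream
    .format("console")
    .outputMode("update")
    .start()
}
```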
We have produced streaming data to only one topic, but it would be quite straightforward to add other topics covering other currency pairs, such as BTC/EUR or BTC/ETH. You could also create another program that fetches...