Spark Cookbook

Chapter 3. External Data Sources

One of the strengths of Spark is that it provides a single runtime that can connect with various underlying data sources.

In this chapter, we will connect to different data sources. This chapter is divided into the following recipes:

  • Loading data from the local filesystem
  • Loading data from HDFS
  • Loading data from HDFS using a custom InputFormat
  • Loading data from Amazon S3
  • Loading data from Apache Cassandra
  • Loading data from relational databases