Sunitha Kambhampati is an advisory software engineer at IBM Analytics. She works on Spark SQL in the open source community as part of the Spark Technology Center.
Apache Spark™ provides a pluggable mechanism for integrating with external data sources through the DataSource APIs. These APIs let Spark read data from external data sources and write data that has been analyzed in Spark back out to them. The DataSource APIs also support filter pushdown and column pruning, which can significantly improve query performance.
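To make filter pushdown and column pruning concrete, here is a minimal sketch in plain Python. It does not use Spark or the DataSource APIs; `ToySource`, `EqualTo`, and `scan` are hypothetical names invented for this illustration. The idea it tries to show is that the engine hands the source the columns it needs and the filters it can apply, so the source returns only the matching rows and the requested columns.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Sequence

Row = Dict[str, Any]


@dataclass
class EqualTo:
    """Hypothetical pushed-down filter: keep rows where column == value."""
    column: str
    value: Any


class ToySource:
    """Hypothetical external data source (an assumption, not a Spark class)."""

    def __init__(self, rows: Sequence[Row]) -> None:
        self._rows = list(rows)
        self.rows_returned = 0  # rows that actually left the source

    def scan(self, required_columns: List[str], filters: List[EqualTo]) -> Iterator[Row]:
        for row in self._rows:
            # Filter pushdown: rows are rejected inside the source.
            if all(row[f.column] == f.value for f in filters):
                self.rows_returned += 1
                # Column pruning: only the requested columns are returned.
                yield {c: row[c] for c in required_columns}


rows = [
    {"id": 1, "country": "US", "amount": 10.0, "note": "a"},
    {"id": 2, "country": "IN", "amount": 25.0, "note": "b"},
    {"id": 3, "country": "US", "amount": 7.5, "note": "c"},
]

source = ToySource(rows)
# Roughly: SELECT id, amount FROM source WHERE country = 'US'
result = list(source.scan(["id", "amount"], [EqualTo("country", "US")]))
print(result)
print(source.rows_returned)
```

In this toy setup only two of the three rows, and two of the four columns, cross the boundary between the source and the caller. With a real external system, that reduction in transferred data is where the performance gain would be expected to come from.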