Working with External Data Sources

SnappyData relies on the Spark SQL Data Sources API to load data in parallel from a wide variety of sources. Any data source or database that Spark can load from or save to can be accessed from within SnappyData.

There is built-in support for many data sources and data formats. You can access data from sources such as S3, the file system, HDFS, Hive, and relational databases (RDB). The loaders have built-in support for data formats such as CSV, Parquet, ORC, Avro, JSON, and Java/Scala objects.
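
As a rough illustration of the load path described above, the following Python sketch wraps the standard Spark `DataFrameReader` chain (`read.format(...).options(...).load(...)`). The helper name `load_external` is hypothetical and is not part of SnappyData or Spark; it only assumes that the session object passed in exposes Spark's `read` property.

```python
# Minimal sketch, not SnappyData's own API: `load_external` is a hypothetical
# helper introduced here purely for illustration.
from typing import Any


def load_external(session: Any, fmt: str, path: str, **options: str) -> Any:
    """Load `path` through the Spark SQL Data Sources API.

    `session` is assumed to be a SparkSession (or a compatible session object)
    that exposes the standard `read` property. `fmt` is a data source short
    name such as "csv", "parquet", "orc", or "json". Extra keyword arguments
    are passed to the reader as data source options.
    """
    # Select the data source, apply its options, then load from the given path.
    return session.read.format(fmt).options(**options).load(path)
```

With a live session, a call might look like `load_external(spark, "csv", "/data/customers.csv", header="true")`, where both the `spark` variable and the path are placeholders. Depending on the deployment, some sources and formats may need additional connector jars on the classpath.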

Attention

This section currently covers only the advanced connectors that SnappyData introduced. Refer to the howto section for a brief description of working with external data sources, along with some examples.

SnappyData provides a utility to deploy third-party connectors using the SQL Deploy command. Refer to Deployment of Third Party Connectors.

For more information, see: