How to Use Apache Zeppelin with SnappyData

Step 1: Download, Install and Configure SnappyData

  1. Download and Install SnappyData
    The table below lists the versions of the SnappyData Zeppelin interpreter and the Apache Zeppelin binary package for each supported SnappyData release.

    | SnappyData Zeppelin Interpreter | Apache Zeppelin Binary Package | SnappyData Release |
    |---------------------------------|--------------------------------|--------------------|
    | Version 0.7.3.5                 | Version 0.7.3                  | Release 1.1.0      |
    | Version 0.7.3.4                 | Version 0.7.3                  | Release 1.0.2.1    |
    | Version 0.7.3.2                 | Version 0.7.3                  | Release 1.0.2      |
    | Version 0.7.3                   | Version 0.7.3                  | Release 1.0.1      |
    | Version 0.7.2                   | Version 0.7.2                  | Release 1.0.0      |
  2. Configure the SnappyData Cluster.

  3. In the lead node configuration, set the following properties (an example entry is shown after this list of properties):

    • Enable the SnappyData Zeppelin interpreter by adding -zeppelin.interpreter.enable=true

    • In the classpath option, specify the location of the downloaded SnappyData Zeppelin interpreter jar by adding
      -classpath=/<download_location>/snappydata-zeppelin-<version_number>.jar

    • In the conf/spark-env.sh file, set the SPARK_PUBLIC_DNS property to the public DNS name of the lead node. This enables the Member Logs to be displayed correctly to users accessing the SnappyData Monitoring Console from outside the network.
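    For example, the lead entry in conf/leads and the conf/spark-env.sh setting might look like the following sketch (the hostnames, locator address, jar version, and paths are placeholders; adjust them for your environment):

      # conf/leads -- one line per lead node
      lead1 -locators=locator1:10334 -zeppelin.interpreter.enable=true -classpath=/opt/snappydata/interpreters/snappydata-zeppelin-0.7.3.5.jar

      # conf/spark-env.sh -- public DNS name of the lead node
      export SPARK_PUBLIC_DNS=lead1.example.com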

  4. Start the SnappyData cluster.
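    The cluster is typically started with the snappy-start-all.sh script from the SnappyData product directory, for example:

      ./sbin/snappy-start-all.sh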

  5. Extract the contents of the Zeppelin binary package.
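    For example, assuming the downloaded archive follows the zeppelin-<version_number>-bin-all naming used in the steps below:

      tar -xzf zeppelin-<version_number>-bin-all.tgz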

  6. Install the SnappyData Zeppelin interpreter in Apache Zeppelin by executing the following command from Zeppelin's bin directory:

    ./install-interpreter.sh --name snappydata --artifact io.snappydata:snappydata-zeppelin:<snappydata_interpreter_version_number>
    

    This plugs the SnappyData interpreter into Zeppelin, which lets you run queries against the SnappyData cluster from Zeppelin notebooks.

  7. Rename the zeppelin-site.xml.template file (located in zeppelin-<version_number>-bin-all/conf directory) to zeppelin-site.xml.

  8. Edit the zeppelin-site.xml file:

    In the zeppelin.interpreters property, append the following interpreter class names to the existing comma-separated list (see the sketch below): org.apache.zeppelin.interpreter.SnappyDataZeppelinInterpreter,org.apache.zeppelin.interpreter.SnappyDataSqlZeppelinInterpreter
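    A minimal sketch of the resulting property (the interpreter class names already present in the value are abbreviated here as "..."; keep whatever is already listed and append the two SnappyData classes):

      <property>
        <name>zeppelin.interpreters</name>
        <value>org.apache.zeppelin.spark.SparkInterpreter,...,org.apache.zeppelin.interpreter.SnappyDataZeppelinInterpreter,org.apache.zeppelin.interpreter.SnappyDataSqlZeppelinInterpreter</value>
      </property>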

  9. Download the predefined SnappyData notebooks (notebook.tar.gz).
    Extract the contents of the notebook.tar.gz compressed file and copy them to the notebook folder in the Zeppelin installation on your local machine.
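    For example (a sketch; it assumes the archive extracts into a notebook directory, and the Zeppelin installation path is a placeholder):

      tar -xzf notebook.tar.gz
      cp -R notebook/* /<zeppelin_install_dir>/notebook/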

  10. Start the Zeppelin daemon using the command:
    bin/zeppelin-daemon.sh start

  11. To ensure that the installation is successful, log into the Zeppelin UI (http://localhost:8080) from your web browser.

Step 2: Configure Interpreter Settings

  1. Log on to Zeppelin from your web browser and select Interpreter from the Settings option.

  2. Click Create to add an interpreter.

  3. From the Interpreter group drop-down list, select snappydata.

    Note

    If snappydata is not displayed in the Interpreter group drop-down list, try the following options, and then restart the Zeppelin daemon:

    • Delete the interpreter.json file located in the conf directory (in the Zeppelin home directory).

    • Delete the zeppelin-spark_<version_number>.jar file located in the interpreter/snappydata directory (in the Zeppelin home directory).

  4. Click the Connect to existing process option. The fields Host and Port are displayed.

  5. Specify the host on which the SnappyData lead node is executing, and the SnappyData Zeppelin Port (Default is 3768).

    | Property | Default Value | Description |
    |----------|---------------|-------------|
    | Host     | localhost     | The host on which the SnappyData lead node is running |
    | Port     | 3768          | The port on which the SnappyData Zeppelin interpreter listens on the lead node (zeppelin.interpreter.port) |
  6. Configure the interpreter properties.
    The table below lists the properties required for SnappyData. A sketch that sanity-checks the JDBC values outside Zeppelin follows this list.

    | Property | Value | Description |
    |----------|-------|-------------|
    | default.url | jdbc:snappydata://localhost:1527/ | Specify the JDBC URL for the SnappyData cluster in the format jdbc:snappydata://<locator_hostname>:1527 |
    | default.driver | io.snappydata.jdbc.ClientDriver | Specify the JDBC driver for SnappyData |
    | snappydata.connection | localhost:1527 | Specify the host:clientPort combination of the locator for the JDBC connection |
    | master | local[*] | Specify the URI of the Spark master (only local/split mode) |
    | zeppelin.jdbc.concurrent.use | true | Specify the Zeppelin scheduler to be used: true for Fair, false for FIFO |
  7. If required, edit other properties, and then click Save to apply your changes.
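The default.url and default.driver values can be sanity-checked outside Zeppelin with a plain JDBC client. A minimal Scala sketch, assuming the SnappyData JDBC client jar is on the classpath and a locator is listening on localhost:1527 (the object name and query are illustrative):

    import java.sql.DriverManager

    object VerifySnappyDataJdbc {
      def main(args: Array[String]): Unit = {
        // default.driver from the interpreter settings
        Class.forName("io.snappydata.jdbc.ClientDriver")
        // default.url from the interpreter settings
        val conn = DriverManager.getConnection("jdbc:snappydata://localhost:1527/")
        try {
          // List the cluster members to confirm the connection works
          val rs = conn.createStatement().executeQuery("SELECT * FROM SYS.MEMBERS")
          while (rs.next()) println(rs.getString(1)) // first column is the member id
        } finally {
          conn.close()
        }
      }
    }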

Note

You can change the default port of the SnappyData Zeppelin interpreter by setting the property
-zeppelin.interpreter.port=<port_number> in the lead node configuration.

Additional Settings

  1. Create a note and bind the interpreter by setting SnappyData as the default interpreter.
    The SnappyData Zeppelin interpreter group consists of two interpreters. Click and drag <Interpreter_Name> to the top of the list to set it as the default interpreter. Example paragraphs for both interpreters are shown after this list.

    | Interpreter Name | Description |
    |------------------|-------------|
    | %snappydata.snappydata or %snappydata.spark | Used to write Scala code in a paragraph. A SnappyContext is injected into this interpreter and can be accessed using the variable snc |
    | %snappydata.sql | Used to execute SQL queries on the SnappyData cluster. It also supports executing approximate queries on the SnappyData cluster |
  2. Click Save to apply your changes.
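For example, once SnappyData is the default interpreter for a note, paragraphs like the following can be run (the table name and data are illustrative, not part of the predefined notebooks):

    %snappydata.spark
    // Scala paragraph: the injected SnappyContext is available as snc
    snc.sql("CREATE TABLE quickstart_t1 (id INT, name STRING) USING column")
    snc.sql("INSERT INTO quickstart_t1 VALUES (1, 'one'), (2, 'two')")
    snc.table("quickstart_t1").show()

    %snappydata.sql
    -- SQL paragraph: runs directly on the SnappyData cluster
    SELECT count(*) FROM quickstart_t1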

Known Issue

If you are using SnappyData Zeppelin Interpreter 0.7.1 and Zeppelin Installer 0.7 with SnappyData 0.8 or later releases, the approximate result does not work on the sample table when you execute a paragraph with the %sql show-instant-results-first directive.