Tableau does not read JSON data natively. By using Apache Drill, you can use Tableau to point at JSON data and start analysing it.
Apache Drill is an open source software framework that supports data intensive distributed applications for interactive analysis of large-scale datasets – Wikipedia.
The Apache drill version I am using does not require a lot of infrastructure such as a Hadoop cluster platform behind it. The version I am using is the Apache Drill for Windows. The installation and connection to JSON data into Tableau is quite simple and is done in my laptop.
Apache Drill is a very powerful open source framework and in this blog post I am using Apache Drill on Windows simply to connect my Tableau into JSON data. Yes, this only works in Windows operating system. Sorry non-Windows OS users, you can go here to install Drill on Linux and Mac OS X.
Below are simple, easy steps to get you connecting Tableau into JSON data and start analysing it.
If your computer does not have JAVA installed, please follow this section. If already installed, skip to the next section of Installing Apache Drill on Windows.
1. Download the latest JDK from Java SE Downloads
2. Install Java
3. Once installed, find the folder in which your Java is installed. Mine is located in C:\Program Files\Java\jdk1.8.0_45
4. Copy the location of your Java installation directory into a clipboard – mine is C:\Program Files\Java\jdk1.8.0_45
5. In the Windows search, I typed in System Variables and this should open the System Properties window
- Click on the Advanced tab
- Click on the Environment Variables button
- Under System variables, click the New button
- Enter Variable Name = JAVA_HOME
- Variable Value = the location of your Java, mine is C:\Program Files\Java\jdk1.8.0_45
INSTALLING APACHE DRILL ON WINDOWS
Source: Installing Apache Drill on Windows
1. Download the latest version of Apache Drill from Latest version of Apache Drill
2. Move the apache-drill-1.1.0.tar.gz file to a directory where you want to install Drill. I put mine in C:\Program Files\
3. Unzip the TAR.GZ file using a third party tool. The extraction process creates the installation directory named apache-drill-1.1.0 containing the Drill software. For example:
STARTING DRILL ON WINDOWS
1. Open Command Prompt.
2. Open the apache-drill-1.1.0 folder. For example, my Apache Drill is located on C:\Program Files\apache-drill-1.1.0
Therefore, in my command prompt, I typed in cd C:\Program Files\apache-drill-1.1.0
3. Go to the bin directory. For example, I typed cd bin
4. Type the following command on the command line: sqlline.bat -u “jdbc:drill:zk=local”
INSTALLING MapR Drill ODBC DRIVER
Source: Installing MapR Drill ODBC Driver
1. Download the MapR Drill ODBC Driver from here
I installed the 64-bit driver as my computer is Windows 64-bit.
2. Double-click the installer from the location within which you downloaded the file.
- Click Next
- Select the check box to accept the terms of the License Agreement and click Next
- Verify or change the install location. Then, click Next – I didn’t change the install location, I leave it to the default one.
- Click Install
- When the installation completes, click Finish
3. Lets verify the installation. In the Start menu, type in ODBC and select the 64-bit ODBC Administrator to open ODBC Administrator.
4. Once open, click on Drivers tab and verify that the MapR Drill ODBC Driver appears in the list of drivers that are installed on your computer.
5. Let’s test the drivers. Whilst still having your ODBC Administrator open, select the System DSN tab.
- Click the Add button
- Select the MapR Drill ODBC Driver
- In the Data Source Name, enter MapR ODBC Driver for Drill DSN
- Click on the Test button on the bottom and you should see a SUCCESS! message as per the below. Click OK several times to close the ODBC Administrator
STARTING DRILL EXPLORER ON WINDOWS
1. If your ODBC Administrator is not yet open, go to Start menu and type in ODBC Administrator and click Enter to open. Select System DSN tab and double click on MapR ODBC Driver for Drill DSN.
2. Click on the Drill Explorer button to open Drill Explorer.
3. Expand the plus sign next to dfs_default until you get into a folder for your sample JSON data. My own sample JSON data is located in C:\Data\JSON so I browse from dfs_default\Data\JSON
4. Click on the sample JSON file – mine is my_json.json and I can see the preview of the JSON data.
5. Select the SQL tab (next to Browse tab) and click Preview button. You should see the preview of JSON data. Copy the SQL query into the clipboard.
6. Open Tableau, select connecting via ODBC Driver. Once the ODBC Driver is opened, connect using DSN and select your MapR ODBC Driver for Drill DSN that you created earlier. Click Connect. Click OK.
7. Click on the Schema drop down menu to select a schema, click on the magnifying glasses to show all schema, select dfs.default schema.
8. Double Click on the New Custom SQL option to open a custom SQL window.
9. Paste the SQL Query that you copied from the Drill Explorer in Step 5 above. Click on the Preview Results button.
10. This is where I spent a bit of time noting down which individual column fields I would like to include in my SQL query. Unfortunately, you have to explicitly define the individual fields you would like to bring into Tableau. I noted in my sample JSON data that I would like to bring 8 fields – address, balance, company, eyeColor, favoriteFruit, gender, name, age.
So I close the preview data window, and modify my Tableau Custom SQL Query into the below then I clicked the Preview Results button.
SELECT address, balance, company, eyeColor, favoriteFruit, gender, name, age FROM `dfs`.`default`.`./Data/JSON/my_json.json`
11. Close the Preview Results window and Click OK to close my Edit Custom SQL window. You can now can see the selected fields in the Tableau data explorer.
12. Select Go to Worksheet and start using the fields from my JSON sample data.
13. Create a Tableau Data Extract to create a portable offline copy in Tableau inline columnar format. This is to avoid the ODBC driver limitation. In my Tableau Data Window, I right click into my Custom SQL_Query data source then select Extract Data.
14. Continue with creating an extract and save the extract in a desired location in your computer.
15. Now you can start analysing your JSON data in Tableau. Voila!!
JSON documentation – https://drill.apache.org/docs/
Using Apache Drill with Tableau Desktop 9 – https://drill.apache.org/docs/using-apache-drill-with-tableau-9-desktop/