TABLUM.IO
.:.

How to Easily Flatten JSON data into a SQL Database and Visualize It with DoubleCloud

This article demonstrates a fast and effortless way to flatten JSON data, ingest it into a relational database, and visualize it in DoubleCloud, without requiring any coding skills. This method is not limited to JSON datasets; it can also be used with CSV, XML, and other unstructured and semi-structured data formats.

Dealing with messy and unstructured data is a common issue when working with data analytics. This is where TABLUM.IO SaaS tool can be a game-changer. By simply ingesting any file, whether it is stored locally or remotely, TABLUM.IO automatically transforms it into an analytics-ready SQL database, making it easy to share and analyze data with any BI software like DoubleCloud. With TABLUM.IO, you can streamline your data analytics workflow and quickly go from raw flat files to creating gorgeous charts and dashboards with DoubleCloud. In this article, I will show you how it's done.

This diagram depicts the steps we will take:

  1. Importing the data from the source into TABLUM.IO to convert it into a relational database.
  2. Transforming and cleansing the data in TABLUM.IO.
  3. Connect DoubleCloud to the TABLUM.IO account using a ClickHouse connector and visualize the data in DoubleCloud for easy analysis and presentation.


Data Source

The data source for our example is an API endpoint provided by Wikimedia, which contains metrics related to the number of pageviews for a specific Wikipedia article. Let take the one that has recently become popular — “Silicon Valley Bank” page.
The data is broken down into daily counts, and the URL specifies the time range for which the data is requested (from September 1, 2021, to March 12, 2023).
The data is returned in a JSON format:


Data Import (Data Ingestion)

To initiate the process of importing data via URL into TABLUM.IO, click on the dropdown menu and choose ‘URL Downloader’:

Copy and paste the URL of the data source into the designated field and click the ‘Run Query’ button to import the data into TABLUM.IO. TABLUM.IO will automatically convert the dataset into a ready-to-use SQL table:

After that, click on the ‘Save’ button to store the dataset in TABLUM.IO:


Data Transformation and Data Cleansing

We want to analyze pageviews of the article ‘Silicon Valley Bank’ over time, including the period when the bank went bankrupt on March 10, 2023. So, we are interested in two columns in our table: Date (timestamp) and Views (views):

We want to focus on the information we need by deleting all unnecessary columns. To achieve this, first, we will create a new view (SQL table) in TABLUM.IO by clicking on the ‘Plus’ button:

A new view has been created for us to continue working with our dataset:

Now we want to access the first view’s dataset by selecting ‘Ingested Data’ from the dropdown menu:

We can either press ‘Play’ icon next to the V_1001 view or just refer to it in the SQL SELECT query as DS_.V_1001 (DS_ — the current dataset, V_1001 — the first view in this dataset)

Let’s write a SQL query to select necessary columns from the first view:
SELECT
timestamp,
views
FROM
DS_.V_1001

After selecting the necessary columns, we get a new SQL table containing the required information:

However, the Date column is currently in an integer format and is not readable as a date. To fix this, we will change timestamp to parseDateTime32BestEffort(toString(timestamp/100)) that will remove trailing zeros and parse integer timestamp to DateTime field.
SELECT
parseDateTime32BestEffort(toString(timestamp/100)),
views
FROM
DS_.V_1001

Now we are happy with our result table:

As you may have noticed, TABLUM.IO performs many transformations and type recognitions automatically.

If you right-click on the header of the Date column, you can access the column metadata, which includes information such as the total number of values, the first and last date, the interval in days, the number of unique values, and the number of null cells.


Connecting DoubleCloude to TABLUM.IO (ClickHouse)

Once the data is ingested, you can share it through the common ClickHouse interface (over https). To do this, go to the “Ingested Data” tab and click on the “Enable DB Sharing” button to obtain credentials for your account:

The ClickHouse connection parameters:

  1. Host: go.tablum.io
  2. Port: 8443
  3. User: your username
  4. Password: your password
  5. Database: your db name

Next, navigate to your DoubleCloud account to connect it to TABLUM.IO. Create workbook, and in the Connection tab select ‘ClickHouse’:

Select ‘Databases’ and then ‘Add a Database’. From the dropdown menu, choose ‘ClickHouse’ to input TABLUM.IO (ClickHouse) credentials. Then check the connection, and click ‘Create Connection’ to TABLUM.IO:

Finally, synchronize the data. Now that the data stored in TABLUM.IO can be accessed through DoubleCloud, you can visualize it and gain deeper insights into your data:

Let’s explore our new dataset in more detail by going to the ‘Fields’ tab in DoubleCloud. Here, you’ll see that TABLUM.IO has automatically recognized the data types without any additional input from the user. It has accurately identified all the data types, including float, integer, text, and date/time. Additionally, TABLUM.IO has generated the data schema automatically, allowing you to quickly transform raw data into structured data that’s ready for analysis:


Final thoughts

TABLUM.IO is an invaluable tool for data engineers and analysts looking to quickly transform raw data files into SQL databases. With TABLUM.IO, there’s no need to spend time implementing complex programming solutions to parse, load, and clean data from a variety of formats such as CSV, TSV, JSON, or XML files, or Google Sheets. Instead, you can get your data ready for analysis and visualization in just a few seconds.
By importing multiple flat files into TABLUM.IO, you can also eliminate the need to configure various third-party connectors in DoubleCloud and prepare the data beforehand. This not only saves time but also enhances the capabilities of DoubleCloud when it comes to analyzing and visualizing unstructured data. With TABLUM.IO, you can improve your productivity and focus on what really matters — deriving insights and making informed decisions based on your data.
Thank you for reading! If you want to simplify your data preparation process and boost your productivity, give TABLUM.IO a try for free.

Stay tuned for our next post, where we'll share more helpful tips on streamlining your data preparation workflow.
Product Use Cases
Made on
Tilda