Data Pipeline

This document is a draft and is subject to further review and revision. Please do not distribute or rely on its contents as final.

A data pipeline is a structured process for moving, transforming, and managing data from one stage to another within a data processing workflow.

To Create a Data Pipeline:

  1. Click the + icon next to the New Data Pipeline option in the Process Dashboard.
  2. Add the Name, Description, and Thumbnail for the data pipeline.
  3. Add the Source Folder: Specify the folder where the data file is stored. In this example, the file is located in Jiffy Drive, so make sure to provide the correct file path.
  4. Configure the Batch Connector: Select the Batch Connector you configured in the previous steps. This connector handles the transfer and processing of the data.
  5. Use the Debug option in the top-right corner to verify that the pipeline runs correctly and to troubleshoot any issues.
  6. Incorporate ETL Nodes: Customize your data pipeline by adding ETL nodes as required. These nodes let you perform the data transformations and computations your processing workflow calls for (see the sketch after this list for the kind of logic a transformation node typically applies).
  7. After setting up and verifying the pipeline, start execution to transfer and process the data according to the defined workflow.