
Introduction to IBM InfoSphere Data Stage: a powerful ETL Tool

Before going any further, I assume you have at least heard the term Data Stage. Data Stage is an ETL tool from the software giant IBM. I will not go deep into its history, such as where Data Stage originated and how it passed from one vendor to another. I will simply introduce you to this wonderful ETL tool, which brings together some of the most powerful features any ETL tool should have (with a quick glimpse of its history).

For now, just understand that Data Stage is part of the IBM Information Platforms Solutions suite and of IBM InfoSphere. It was originally developed by a company called VMark (in 1996). VMark later merged with Unidata under the name Ardent Software. In short, VMark + Unidata = Ardent Software. This happened in 1997. Ardent Software was in turn acquired by another company, Informix, in 1999. In 2001 IBM acquired Informix, but it took over only the database side of the business. The data integration products were left at that time to a separate software company called Ascential Software.

In March 2005 IBM acquired Ascential Software, and Data Stage became an integral part of IBM. Phew, too much history and too many dates, isn't it? Now let us leave the history behind and focus on our main topic, i.e. Data Stage as an ETL tool.

If you want to learn Data Stage, you must be comfortable with ETL and Data Warehousing concepts, because they act as the foundation of Data Stage, and without them you won't be able to use DS efficiently.

Data Warehousing is the process of collecting huge amounts of data in one place (the data may or may not come from similar source systems, which are basically databases). Data Warehousing has made data analysis much easier and more powerful than ever.

ETL stands for Extract, Transform and Load. Its function can be described with just these 3 words:
Extract: To pull data from different source systems.
Transform: To apply business rules to the extracted data (data cleansing comes into play here).
Load: To load the data into the Data Warehouse.
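
To make the three steps concrete, here is a tiny sketch in plain Python. It is not Data Stage code and says nothing about how Data Stage works internally; it only illustrates the idea. The two CSV "source systems", the `customers` table and all function names are made up for this example, and an in-memory SQLite database stands in for the Data Warehouse.

```python
import csv
import io
import sqlite3

# Two hypothetical source systems, represented here as in-memory CSV text.
CRM_CSV = "id,name,country\n1, alice ,in\n2,Bob,US\n,ghost,us\n"
BILLING_CSV = "id,name,country\n3,carol,uk\n2,Bob,US\n"


def extract(*sources):
    """Extract: pull raw rows from every source system."""
    rows = []
    for text in sources:
        rows.extend(csv.DictReader(io.StringIO(text)))
    return rows


def transform(rows):
    """Transform: apply simple business rules and cleanse the data."""
    seen = set()
    clean = []
    for row in rows:
        raw_id = row["id"].strip()
        if not raw_id:
            continue  # rule 1: drop records without an id
        cust_id = int(raw_id)
        if cust_id in seen:
            continue  # rule 2: drop duplicates across sources
        seen.add(cust_id)
        # rule 3: normalise spacing and letter case
        clean.append(
            (cust_id, row["name"].strip().title(), row["country"].strip().upper())
        )
    return clean


def load(rows, conn):
    """Load: write the cleansed rows into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run extract, transform and load in order and return the loaded rows."""
    load(transform(extract(CRM_CSV, BILLING_CSV)), conn)
    return conn.execute(
        "SELECT id, name, country FROM customers ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    for record in run_etl(warehouse):
        print(record)
```

With the sample data above, the record without an id and the repeated Bob record are dropped, and the three remaining customers end up in the table with tidy names and country codes.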

This is exactly the job Data Stage does, and as ETL developers we are supposed to extract, transform and load data into the respective systems. Data Stage is a powerful tool that even lets the user connect to newer technologies such as Hadoop and Big Data platforms, through JSON support and a new JDBC connector.

Data Stage started with v6 and has now reached v11, which shows the immense trust of its users and the zeal of its developers to keep Data Stage updated with timely enhancements (be it in terms of new connectors or functionality).