What is DataStage?
- An ETL tool used to extract, transform, and load data into a data mart or data warehouse
- Used for data integration projects such as data warehouses and ODS (Operational Data Store) builds; it can connect to major databases such as Teradata, Oracle, DB2, and SQL Server
- ETL jobs can be migrated across environments, such as Dev, UAT, and Prod, by importing and exporting DataStage components
- You can manage job metadata
- You can schedule, run, and monitor jobs in DataStage
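Scheduling and running jobs is not limited to the GUI: DataStage also ships with the `dsjob` command-line client, so jobs can be driven from shell scripts or an external scheduler. A minimal sketch; the project name `MyProject` and job name `LoadWarehouse` are placeholders, and `dsjob` must be run on a machine with the DataStage engine available:

```shell
# Run the job, wait for it to finish, and return the job's
# final status as the exit code (placeholder project/job names).
dsjob -run -mode NORMAL -wait -jobstatus MyProject LoadWarehouse
echo "dsjob exit status: $?"
```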
DataStage architecture:
DataStage lets you develop jobs in the Server or Parallel edition. The Parallel edition uses parallel processing to process data and is ideal for large data volumes.
Components:
- Designer
- Director
- Administrator
Administrator:
The following tasks are performed with the Administrator client:
- Add, delete and move projects
- Set user permissions for projects
- Purge job log files
- Set the engine timeout interval
- Trace engine activity
- Set job parameter defaults
- Issue WebSphere DataStage engine commands from the Administrator client
- Configure parallel processing job settings
- Create and set environment variables
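Several of these Administrator tasks can also be scripted with the `dsadmin` command-line tool. Exact flags vary by version, so treat the following as a sketch rather than a definitive reference; `MyProject` and the variable name are placeholders:

```shell
# List the environment variables defined for a project
dsadmin -listenv MyProject

# Add a new string environment variable to the project
# (verify the -envadd/-type/-prompt/-value options against
#  your installed DataStage version)
dsadmin -envadd STAGING_DIR -type STRING -prompt "Staging directory" \
        -value /data/staging MyProject
```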
Enabling job management on the Director client:
These functions allow WebSphere DataStage operators to release the resources of a job that has aborted or hung, returning the job to a state in which it can be run again.
This procedure enables two commands on the Director menu.
- Cleanup Resources
- Clear Status File
Designer:
- Design and develop jobs using the graphical design tool
- Various stages (General, Database, File, Processing, and so on) are used when developing jobs
- Table definitions can be imported directly from data source or data warehouse tables
- Jobs are compiled with the designer, and the designer checks main inputs, reference outputs, key expressions, transformations, and so on for compile errors.
- Import and/or export projects from different environments
- Server, mainframe and parallel jobs can be created using the designer
- Define parameters on the Parameters page under job properties; they can then be used throughout the job during development
- You can create custom routines
- Multiple jobs can be selected for compilation, and a report is produced after the build completes
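Parameters defined on the Parameters page can also be inspected and overridden outside the Designer via the `dsjob` client. A hedged sketch with placeholder project, job, and parameter names:

```shell
# List the parameters a job expects
dsjob -lparams MyProject LoadWarehouse

# Run the job, overriding a parameter default for this run only
dsjob -run -param TargetDB=DWPROD -wait -jobstatus MyProject LoadWarehouse
```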
Director:
- Validate, schedule, run, and monitor jobs run by the DataStage server
- The job status view displays the current status, such as Running, Compiled, Finished, Aborted, or Not Compiled
- Job Log displays the log file for the selected job
- Reset a job whose status is Aborted or Stopped before running it again
- Provides the execution times of the jobs.
- Ability to clean up resources (if administrator has enabled this option)
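Much of the status and log information the Director shows can also be pulled from the command line with `dsjob`, which is useful in monitoring scripts. A sketch with placeholder names:

```shell
# Current status, start/end times, and last-run details for a job
dsjob -jobinfo MyProject LoadWarehouse

# Summary of the job log; -max limits the number of entries returned
dsjob -logsum -max 20 MyProject LoadWarehouse
```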
Along with these jobs, DataStage provides containers (local containers and shared containers) and sequence jobs, which let you specify a sequence of server or parallel jobs to run.