Academic publication created and shared by Mary Galicia on the topic of Data Sources.
Field of study: SAP BI Analyst program / Unit 2: ETL Process.
Data Flow in Business Warehouse
The data flow in the Business Warehouse (BW) defines which objects are needed at design time and which processes are needed at runtime. These objects and processes are needed to transfer data from a source to BW, to cleanse, consolidate and integrate the data, so that it can be used for analysis, reporting and planning. The individual requirements of your company processes are supported by numerous options for designing the data flow. You can use any data sources that transfer the data to BW or access the source data directly, apply simple or complex cleansing and consolidating methods, and define data repositories that correspond to the requirements of your layer architecture.
With SAP NetWeaver 7.0, the concepts and technologies for certain elements in the data flow were changed. The most important components of the new data flow are explained below, whereby the changes to the previous data flow are also mentioned. To distinguish them from the new objects, the objects previously used are appended with 3.x.
In BW, the metadata description of the source data is modeled with DataSources. A DataSource is a set of fields that are used to extract data of a business unit from a source system and transfer it to the entry layer of the BW system or provide it for direct access.
There is a new object concept available for DataSources in BW. In BW, the DataSource is edited or created independently of 3.x objects on a unified user interface. When the DataSource is activated, the system creates a PSA table in the Persistent Staging Area (PSA), the entry layer of BW. In this way the DataSource represents a persistent object within the data flow.
Before data can be processed in BW, it has to be loaded into the PSA using an InfoPackage. In the InfoPackage, you specify the selection parameters for transferring data into the PSA. In the new data flow, InfoPackages are only used to load data into the PSA.
Using the transformation, data is copied from a source format to a target format in BW. Transformation thereby allows you to consolidate and cleanse data from multiple sources. You can perform semantic synchronization of data from various sources. You integrate the data into the BW system by assigning fields from the DataSource to InfoObjects. In the data flow, the transformation replaces the update and transfer rules, including transfer structure maintenance.
InfoObjects are the smallest information units in BW. They structure the information in the form needed to build up InfoProviders.
InfoProviders consist of several InfoObjects. They are persistent data repositories that are used in the layer architecture of the Data Warehouse or in data views. They provide data for analysis, reporting and planning. You also have the option of writing the data to other InfoProviders.
Using an InfoSource (optional in the data flow), you can connect multiple sequential transformations. You therefore only require an InfoSource for complex transformations (multistep procedures).
You use the data transfer process (DTP) to transfer the data within BW from one persistent object to another object, in accordance with certain transformations and filters. Possible sources for the data transfer include DataSources and InfoProviders; possible targets include InfoProviders and open hub destinations. To distribute data within BW and in downstream systems, the DTP replaces the InfoPackage, the Data Mart Interface (export DataSources) and the InfoSpoke.
You can also distribute data to other systems using an open hub destination.
In BW, process chains are used to schedule the processes associated with the data flow, including InfoPackages and data transfer processes.
The complexity of data flows varies. As an absolute minimum, you need a DataSource, a transformation, an InfoProvider, an InfoPackage and a data transfer process.
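The minimal flow above can be pictured with a small Python sketch. It is not SAP code: the classes `DataSource` and `InfoProvider`, the functions `run_infopackage` and `run_dtp`, and all field and object names are hypothetical stand-ins chosen only to show how the five elements relate to each other.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass
class DataSource:
    """Hypothetical stand-in for a DataSource and its PSA table."""
    name: str
    fields: List[str]
    psa: List[Record] = field(default_factory=list)


@dataclass
class InfoProvider:
    """Hypothetical stand-in for a persistent data target."""
    name: str
    rows: List[Record] = field(default_factory=list)


def run_infopackage(source_rows: List[Record], datasource: DataSource,
                    selection: Callable[[Record], bool]) -> int:
    """Load the selected source rows into the PSA, keeping only DataSource fields."""
    loaded = 0
    for row in source_rows:
        if selection(row):
            datasource.psa.append({f: row.get(f, "") for f in datasource.fields})
            loaded += 1
    return loaded


def run_dtp(datasource: DataSource, transformation: Callable[[Record], Record],
            target: InfoProvider) -> int:
    """Move PSA rows into the target, applying the transformation to each row."""
    for row in datasource.psa:
        target.rows.append(transformation(row))
    return len(datasource.psa)


# Illustrative data only
source_rows = [
    {"CUSTOMER": "c-100", "AMOUNT": "25.50", "REGION": "EU"},
    {"CUSTOMER": "c-200", "AMOUNT": "10.00", "REGION": "US"},
]
ds = DataSource("ZSALES_DS", ["CUSTOMER", "AMOUNT", "REGION"])
cube = InfoProvider("ZSALES_CUBE")

run_infopackage(source_rows, ds, lambda r: r["REGION"] == "EU")
run_dtp(ds, lambda r: {"0CUSTOMER": r["CUSTOMER"].upper(), "0AMOUNT": r["AMOUNT"]}, cube)
```

The InfoPackage step only fills the PSA, and the DTP step is the only place where the transformation runs, which mirrors the division of work described above.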
DataSource object type RSDS (new DataSource):
The new DataSource of object type RSDS enables real-time data acquisition, as well as direct access to source systems of type File and DB Connect.
Data Transfer Process:
The data transfer process (DTP) makes the transfer processes in the data warehousing layers more transparent. The performance of the transfer processes increases when you optimize parallelization. With the DTP, delta processes can be separated for different targets and filtering options can be used for the persistent objects on different levels. Error handling can also be defined for DataStore objects with the DTP. The ability to sort out incorrect records in an error stack and to write the data to a buffer after the processing steps of the DTP simplifies error handling. When you use a DTP, you can also directly access each DataSource in the SAP source system that supports the corresponding mode in the metadata (also master data and text DataSources).
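The error stack idea can be illustrated with a short Python sketch. This is an analogy under stated assumptions, not the DTP implementation: `run_dtp_with_error_stack` and `to_target` are hypothetical names, and a failed type conversion stands in for any incorrect record.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def run_dtp_with_error_stack(
    records: List[Record],
    transform: Callable[[Record], Dict[str, object]],
) -> Tuple[List[Dict[str, object]], List[Tuple[Record, str]]]:
    """Apply the transformation; park records that fail instead of aborting the load."""
    loaded: List[Dict[str, object]] = []
    error_stack: List[Tuple[Record, str]] = []
    for record in records:
        try:
            loaded.append(transform(record))
        except (KeyError, ValueError) as exc:
            # Keep the original record so it can be corrected and reprocessed later
            error_stack.append((record, str(exc)))
    return loaded, error_stack


def to_target(record: Record) -> Dict[str, object]:
    """Hypothetical transformation: QUANTITY must be an integer."""
    return {"MATERIAL": record["MATERIAL"], "QUANTITY": int(record["QUANTITY"])}


records = [
    {"MATERIAL": "M-01", "QUANTITY": "5"},
    {"MATERIAL": "M-02", "QUANTITY": "five"},  # incorrect record
    {"MATERIAL": "M-03", "QUANTITY": "7"},
]
loaded, error_stack = run_dtp_with_error_stack(records, to_target)
```

The correct records reach the target while the incorrect one is kept aside together with the reason for the failure.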
Transformations simplify the maintenance of rules for cleansing and consolidating data. Instead of two rules (transfer rules and update rules), as in the past, only the transformation rules are still needed. You edit the transformation rule in an intuitive graphic user interface. InfoSources are no longer mandatory; they are optional and are only required for certain functions. Transformations also provide additional functions - such as quantity conversion, performance-optimized reading of master data and DataStore objects - as well as the option to create an end routine or expert routine.
SAP HANA-optimized DataStore object as update source:
Important if you are using the SAP HANA database: the update rules cannot read from an SAP HANA-optimized DataStore object due to its architecture. Therefore, a DataStore object used as a source of update rules in a 3.x data flow cannot be converted to an SAP HANA-optimized object. Once you have migrated the 3.x data flow with the DataStore object used as the source for the update rules, you can perform the conversion and use the SAP HANA-optimized DataStore object.
You model data flows and their elements in the Modeling functional area of the Data Warehousing Workbench. The graphical user interface helps you to create top-down models and to use best-practice models (data flow templates provided by SAP). With top-down modeling, you create a model blueprint in the BW system, which you can use later to create a persistent data flow.
Data Warehousing Workbench Purpose
The Data Warehousing Workbench (DWB) is the central tool for performing the tasks in the data warehousing process. It provides data modeling functions as well as functions for control, monitoring and maintenance of all processes in SAP NetWeaver BI having to do with data procurement, data retention, and data processing.
DB Connect is used to define database connections in addition to the default connection. These connections are used to transfer data into the BI system from tables or views.
To connect an external database, you need knowledge of the following −
- Source Application knowledge
- SQL syntax in Database
- Database functions
If the source database management system is different from the BI DBMS, you need to install the database client for the source DBMS on the BI application server.
A key feature of DB Connect is loading data into BI from a database that is supported by SAP. When you connect a database to BI, you create a source system that serves as a direct point of access to the external relational database management system.
DB Architecture
The multiconnect function of the SAP NetWeaver component allows you to open extra database connections in addition to the SAP default connection, and you can use these connections to connect to external databases.
DB Connect can be used to establish a connection of this type as a source system connection to BI. The DB Connect enhancements to the database allow you to load data into BI from the database tables or views of external applications.
For the default connection, the DB Client and DBSL are preinstalled for the database management system (DBMS). To use DB Connect to transfer data into the BI system from other database management systems, you need to install the database-specific DB Client and the database-specific DBSL on the BI application server that you use to run DB Connect.
Creating DBMS as Source System
Go to RSA1 → Administration workbench. Under the Modeling Tab → Source Systems
Go to DB Connect → Right click → Create.
Enter the logical system name (DB Connect) and description. Click on Continue.
Enter the database management system (DBMS) that you want to use to manage the database. Then enter the database user under whose name the connection is to be opened, and the DB password used for authentication by the database.
In Connection Info, enter the technical information required to open the database connection.
Permanent Indicator
You can set this indicator to keep a permanent connection with the database. If the first transaction ends, then each transaction is checked to see if the connection has been reinitiated. You can use this option if the DB connection has to be accessed frequently.
Save this configuration. You can then click Back to see it in the table.
SAP BW - Flat File Data Transfer
You can load data from an external system to BI using flat files. SAP BI supports data transfer using flat files in ASCII format or in CSV format.
The data from a flat file can be transferred to BI from a workstation or from an application server.
Following are the steps involved in a Flat File Data Transfer −
Define a file source system.
Create a DataSource in BI, defining the metadata for your file in BI.
Create an InfoPackage that includes the parameters for data transfer to the PSA.
Fields that are not filled in a CSV file are filled with a blank space if they are character fields and with a zero (0) if they are numerical fields.
If separators are used inconsistently in a CSV file, the incorrect separator is read as a character and both fields are merged into one field and may be shortened. Subsequent fields are then no longer in the correct order.
A line break cannot be used as part of a value, even if the value is enclosed with an escape character.
The conversion routines that are used determine whether you have to specify leading zeros. More information − Conversion Routines in the BI-System.
For dates, you usually use the format YYYYMMDD, without internal separators. Depending on the conversion routine being used, you can also use other formats.
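The rules above for empty fields and dates can be sketched in Python. This only imitates the described behavior outside of BI: the field list `FIELDS`, the function `parse_flat_file`, the `;` separator and the sample rows are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical field list: (field name, kind)
FIELDS = [("CUSTOMER", "CHAR"), ("QUANTITY", "NUM"), ("ORDER_DATE", "DATE")]


def parse_flat_file(text, separator=";"):
    """Parse CSV text, applying the default values described for empty fields."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    for raw in reader:
        row = {}
        for (name, kind), value in zip(FIELDS, raw):
            if kind == "NUM":
                # Empty numerical fields become zero
                row[name] = int(value) if value else 0
            elif kind == "DATE":
                # Dates are expected as YYYYMMDD without internal separators
                row[name] = datetime.strptime(value, "%Y%m%d").date()
            else:
                # Empty character fields become a blank space
                row[name] = value if value else " "
        rows.append(row)
    return rows


sample = "ACME;12;20190218\n;;20190301\n"
rows = parse_flat_file(sample)
```

In the second sample row, the empty customer becomes a blank space and the empty quantity becomes 0, while the date is read from the separator-free YYYYMMDD form.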
Before you can transfer data from a file source system, the metadata must be available in BI in the form of a DataSource. Go to Modeling tab → DataSources.
Right click in context area → Create DataSource.
Enter the technical name of the data source, type of data source and then click on Transfer.
Go to the General tab. Enter descriptions for the DataSource (short, medium, long).
If required, specify whether the DataSource is initial non-cumulative and might produce duplicate data records in one request.
You can specify whether you want to generate the PSA for the DataSource in the character format. If the PSA is not typed it is not generated in a typed structure but is generated with character-like fields of type CHAR only.
The next step is to click on the Extraction tab page and enter the following details −
Define the delta process for the DataSource. Specify whether you want the DataSource to support direct access to data (Real-time data acquisition is not supported for data transfer from files).
Select the adapter for the data transfer. You can load text files or binary files from your local work station or from the application server. Select the path to the file that you want to load or enter the name of the file directly.
If required, you can create a routine to determine the name of your file. If no routine is defined, the system reads the file name directly from the file name field.
Depending on the adapter and the file to be loaded, the following settings have to be made −
Binary files − Specify the character record settings for the data that you want to transfer.
Text-type files − For text files, specify which rows in your file are header rows so that they can be ignored when the data is transferred. Specify the character record settings for the data that you want to transfer.
For ASCII files − To load the data from an ASCII file, the data is requested with a fixed data record length.
For CSV files − To load data from an Excel CSV file, specify the data separator and the escape character.
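The difference between the two text formats can be shown with a short Python sketch. The fixed-width layout `LAYOUT`, the function names and the sample lines are hypothetical, and mapping the escape character to Python's `quotechar` is an assumption about how enclosed values behave.

```python
import csv
import io

# Hypothetical fixed-length layout: (field name, width in characters)
LAYOUT = [("MATERIAL", 6), ("PLANT", 4), ("QUANTITY", 5)]


def parse_fixed_length(line):
    """Cut a fixed-length ASCII record into fields by position."""
    record = {}
    offset = 0
    for name, width in LAYOUT:
        record[name] = line[offset:offset + width].strip()
        offset += width
    return record


def parse_csv_line(line, separator=";", escape_char='"'):
    """Split a CSV record; values enclosed in the escape character may contain the separator."""
    reader = csv.reader(io.StringIO(line), delimiter=separator, quotechar=escape_char)
    return next(reader)


fixed = parse_fixed_length("M-0001P100   42")
separated = parse_csv_line('M-0001;"Bolt; hex";42')
```

A fixed-length record needs only the field widths, whereas a CSV record needs the separator and the escape character so that a separator inside an enclosed value is not read as a field boundary.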
The next step is to go to the Proposal tab page; this is required only for CSV files. For files in different formats, define the field list on the Fields tab page.
The next step is to go to Fields tab −
You can edit the fields that you transferred to the field list of the DataSource from the Proposal tab. If you did not transfer the field list from a proposal, you can define the fields of the DataSource here.
You can then check, save and activate the DataSource.
You can also select the Preview tab. If you select Read Preview Data, the number of data records you specified in your field selection is displayed in a preview.
Universal Data Connect (UDC) allows you to access relational and multidimensional data sources and transfer the data in the form of flat data. Multidimensional data is converted to a flat format when the Universal Data Connect is used for data transfer.
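What converting multidimensional data to a flat format means can be sketched in Python. This is only an illustration of the idea, not of UD Connect itself: the nested dictionary `cube`, its dimensions and the function `flatten_cube` are made up for the example.

```python
from typing import Dict, List

# Hypothetical two-dimensional data: region -> year -> revenue
cube: Dict[str, Dict[str, float]] = {
    "EU": {"2018": 120.0, "2019": 150.0},
    "US": {"2018": 200.0},
}


def flatten_cube(data: Dict[str, Dict[str, float]]) -> List[Dict[str, object]]:
    """Turn each cell of the nested structure into one flat record."""
    rows: List[Dict[str, object]] = []
    for region, by_year in data.items():
        for year, revenue in by_year.items():
            rows.append({"REGION": region, "YEAR": year, "REVENUE": revenue})
    return rows


flat_rows = flatten_cube(cube)
```

Each cell becomes one record that carries its dimension values as ordinary fields, which is the shape a flat data transfer expects.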
UD Connect uses a J2EE connector to allow reporting on SAP and non-SAP data. Different BI Java connectors are available as resource adapters for various drivers and protocols, some of which are as follows −
- BI ODBO Connector
- BI JDBC Connector
- BI SAP Query Connector
- XMLA Connector
To set up the connection to a data source with a source object (relational/OLAP) on the J2EE engine, you first have to enable communication between the J2EE engine and the BI system by creating an RFC destination from J2EE to BI. Then model the InfoObjects in BI according to the source object elements, and determine the data source in the BI system.
Creating a UD Connect Source System
As mentioned above, you have created an RFC destination that allows communication between the J2EE engine and BI.
Go to Administration workbench, RSA1 → Go to Modeling tab → Source Systems.
Right click on the UD Connect → Create. Then in the next window, enter the following details −
- RFC Destination for the J2EE Engine
- Specify a logical system name
- Type of connector
Then enter the following −
- Name of the connector.
- Name of the source system if it was not determined from the logical system name.
Once you fill in all these details → Choose Continue.
Overview of Data Flow
Data flow in data acquisition involves the transformation, the InfoPackage for loading to the PSA, and the data transfer process for distributing data within BI. In SAP BI, you determine which DataSource fields are required for decision making and should be transferred.
When you activate the DataSource, a PSA table is generated in SAP BW and data can then be loaded.
In the transformation process, fields are determined for InfoObjects and their values. This is done using the DTP, which transfers data from the PSA to the various target objects.
The transformation process involves the following different steps −
- Data Consolidation
- Data Cleansing
- Data Integration
When you move data from one BI object to another BI object, the data passes through a transformation. This transformation converts the source fields into the format of the target. A transformation is created between a source object and a target object.
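The conversion of source fields into the target format can be sketched in Python. The rule set `RULES`, the function `apply_transformation` and the field and InfoObject names are illustrative assumptions. The three rules shown (a direct assignment, a constant and a small routine) are examples rather than a complete list.

```python
from typing import Callable, Dict, Union

Record = Dict[str, str]
Rule = Union[str, Callable[[Record], str]]

# Hypothetical rule set: target InfoObject -> rule
RULES: Dict[str, Rule] = {
    "0CUSTOMER": "KUNNR",                       # direct assignment from a source field
    "0CURRENCY": lambda rec: "EUR",             # constant value
    "0CALMONTH": lambda rec: rec["BUDAT"][:6],  # routine: YYYYMMDD -> YYYYMM
}


def apply_transformation(record: Record, rules: Dict[str, Rule]) -> Record:
    """Build the target record by evaluating one rule per target InfoObject."""
    target: Record = {}
    for info_object, rule in rules.items():
        target[info_object] = rule(record) if callable(rule) else record[rule]
    return target


result = apply_transformation({"KUNNR": "0000100", "BUDAT": "20190218"}, RULES)
```

Keeping one rule per target field makes it easy to see which source field, if any, feeds each InfoObject.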
BI Objects − InfoSource, DataStore objects, InfoCube, InfoObjects, and InfoSet act as the source objects and these same objects serve as target objects.
A transformation should consist of at least one transformation rule. You can use different rule types from the list of available rules to create simple to complex transformations.
Directly Accessing Source System Data
This allows you to access data in the BI source system directly. You can directly access the source system data in BI without extraction using Virtual Providers. These Virtual providers can be defined as InfoProviders where transactional data is not stored in the object. Virtual providers allow only read access on BI data.
There are different types of Virtual Providers that are available and can be used in various scenarios −
- VirtualProviders based on DTP
- VirtualProviders with function modules
- VirtualProviders based on BAPIs
These VirtualProviders are based on the data source or an InfoProvider and they take the characteristics and key figures of the source. The same extractors are used to select data in a source system as you use to replicate data into the BI system.
Virtual Providers based on DTP should be used in the following conditions −
- When only a small amount of data is used.
- When you need to access up-to-date data from an SAP source system.
- When only a few users execute queries simultaneously on the database.
Virtual Providers based on DTP shouldn’t be used in the following conditions −
- When multiple users are executing queries together.
- When the same data is accessed multiple times.
- When a large amount of data is requested and no aggregations are available in the source system.
To go to the Administration Workbench, use transaction RSA1.
In the Modeling tab → go to Info Provider tree → In Context menu → Create Virtual Provider.
Under Type, select Virtual Provider based on Data Transfer Process for direct access. You can also link a Virtual Provider to an SAP source using an InfoSource 3.x.
A Unique Source System Assignment Indicator is used to control the source system assignment. If you select this indicator, only one source system can be used in the assignment dialog. If this indicator is not checked, you can select more than one source system and a Virtual Provider can be considered as a multi-provider.
Click on Create (F5) at the bottom. You can define the virtual provider by copying objects. Then activate the Virtual Provider.
To define Transformation, right click and go to Create Transformation.
Define the Transformation rules and activate them.
The next step is to create a Data Transfer Process. Right click → Create Data Transfer Process
The default type of DTP is DTP for Direct Access. You have to select the source for the Virtual Provider and activate the DTP.
To activate direct access, open the context menu → Activate Direct Access.
Select one or more Data transfer processes and activate the assignment.
Virtual Providers with BAPIs
This is used for reporting on the data in external systems and you don’t need to store transaction data in the BI system. You can connect to non-SAP systems like hierarchical databases.
When this Virtual Provider is used for reporting, it calls the Virtual Provider BAPI.
Virtual Provider with Function Module
This Virtual Provider is used to display data from a non-BI data source in BI without copying the data to a BI structure. The data can be local or remote. This is primarily used for SEM applications.
Compared with other Virtual Providers, this one is more generic and offers more flexibility; however, implementing it requires considerable effort.
Enter the name of the Function Module that you want to use as data source for Virtual Providers.