Interpreting a Banking Institution's Data Governance Case: Building a Bank-wide Data Asset System

In recent years, with the rapid development of informatization, digitization, and intelligence, data has gradually transformed from a resource into an asset. The financial industry has the advantage of massive data resources and rich application scenarios, and data plays an important role in key links such as business management, product research and development, and technological innovation. As the foundation for unlocking the value of data, data governance plays a key role in promoting the digital transformation of banks.

Through data governance consulting, combined with data platform construction and data application implementation, the bank builds a scientific organizational structure, improves its data governance processes, systems, and norms, establishes a digital collaboration mechanism, integrates and connects the multi-source, scattered, heterogeneous data systems within the bank, and builds a bank-wide data asset system. It also improves the management of data permissions, data lineage, data quality, and data classification and grading, uses data products to deeply process data assets and mine their value, and outputs more comprehensive data and statistical indicators for retail, risk, finance, and other domains, thereby promoting the digital transformation of the banking business.

Data Governance Solutions for Banking Institutions

Build a bank-level data resource catalog based on data inventory

Data asset inventory includes entity inventory. The first step is to design a classification framework. To facilitate the management of data assets, they need to be classified; according to industry practice, a three-level classification is generally used. The classification can draw on the enterprise's process framework and follow its business taxonomy.

The second step is to clarify the scope of the inventory. Taking stock of the business objects in the human resources, finance, and marketing systems requires the participation of system developers and business personnel to resolve the business and technical problems encountered during the sorting process.

The third step is to focus on the content of the inventory, that is, the business entity objects themselves.

In addition to entity inventory, there is data item inventory. The inventory of data items depends on the inventory of entities: each entity has several data items, which are the basic attributes generated as the business runs. These attributes can be obtained from the business system interface or from the backend database.
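
As a minimal sketch (assuming a dedicated metadata store is not yet in place), the inventory result can be organized as catalog entries in which each business entity carries its data items, so the entity inventory and the data item inventory stay linked. All names and labels below are illustrative, not taken from the source:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataItem:
    """One attribute of a business entity, captured during data item inventory."""
    name: str           # business name of the attribute
    source_field: str   # physical field in the source system
    data_type: str      # declared type in the source system
    sensitivity: str    # classification level, e.g. "C2" (labels are illustrative)

@dataclass
class EntityAsset:
    """One business entity captured during entity inventory."""
    category_l1: str    # first-level classification
    category_l2: str    # second-level classification
    category_l3: str    # third-level classification
    entity_name: str
    source_system: str  # owning business system
    items: List[DataItem] = field(default_factory=list)

# Illustrative catalog entry (names are not taken from the source document)
customer = EntityAsset(
    category_l1="Retail", category_l2="Customer", category_l3="Personal customer",
    entity_name="personal_customer", source_system="core_banking",
    items=[DataItem("reserved mobile number", "MOBILE_NO", "VARCHAR(11)", "C2")],
)
print(customer.entity_name, len(customer.items))
```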


Standardized ETL data modeling system based on data standards

Data standards are the main reference and basis for data standardization and for disambiguating data services. Classifying data standards facilitates their compilation, querying, implementation, and maintenance. There are many ways to classify data standards; whichever is chosen, a data standard system can be constructed with the data element as the basic unit of standard formulation.

Data can be divided into basic data and indicator data. Basic data is the business information generated directly in the business process that has not yet been processed. Indicator data is data with statistical significance, usually calculated from one or more pieces of basic data according to defined statistical rules; for example, a month-end deposit balance indicator is derived by aggregating account-level balances.

Correspondingly, data standards can be divided into basic data standards and indicator data standards. Basic data standards are formulated under the data standard management process to ensure the consistency and accuracy of data across all of an enterprise's business activities and to solve cross-business data consistency and integration problems. Indicator data standards are generally divided into basic indicator standards and calculated indicator (also called composite indicator) standards.

Both kinds of standards are implemented by establishing basic data elements and indicator data elements respectively, and then mapping those data elements to the actual data.

Specifically, for any field in structured data that does not have indicator characteristics, it can be mapped directly to a basic data element under a given business category (covering attributes such as naming rules, data types, and value domains) to standardize the field, i.e. to make it comply with the naming rules, data types, and value domain requirements. When the field does have indicator characteristics, it can be mapped to an indicator data element under a given business category (covering naming rules, constraint rules, data types, and value domains) to standardize it in the same way.
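
As a minimal sketch of this mapping check (the element definition, naming convention, and value domain are assumptions for illustration, not the bank's actual standards), a field can be validated against its mapped data element's naming rule, data type, and value domain:

```python
import re

# A data element definition under some business category (illustrative values)
data_element = {
    "name": "customer_gender",
    "naming_rule": r"^[a-z][a-z_]*$",   # snake_case field names (assumed convention)
    "data_type": "CHAR(1)",
    "value_domain": {"M", "F", "U"},    # allowed codes (assumed)
}

def check_field(field_name: str, declared_type: str, sample_values: set) -> list:
    """Return the list of standard violations for a field mapped to this data element."""
    issues = []
    if not re.match(data_element["naming_rule"], field_name):
        issues.append("naming rule violated")
    if declared_type.upper() != data_element["data_type"]:
        issues.append(f"data type {declared_type} != {data_element['data_type']}")
    out_of_domain = sample_values - data_element["value_domain"]
    if out_of_domain:
        issues.append(f"values outside domain: {sorted(out_of_domain)}")
    return issues

print(check_field("CustGender", "VARCHAR(2)", {"M", "F", "X"}))
# -> ['naming rule violated', 'data type VARCHAR(2) != CHAR(1)', "values outside domain: ['X']"]
```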


Build a data classification and grading system based on laws and regulations

According to the "Technical Specification for Personal Financial Information Protection" (JR/T 0171-2020) and the bank's own data security management measures, when personal customer C3 and C2 information is displayed in background management and business support systems, C3 information must not be displayed in plain text except for the bank card validity period; C2 payment account numbers, customer legal names, mobile phone numbers reserved for payment, and certificate or other identification numbers must be masked, and batch query and batch download functions must not be provided.

If the above C2 sensitive fields still need to be displayed in full or downloaded in plain text under special circumstances, the following control measures must be in place:

1) The requesting department must limit usage to the minimum scope of users, and the request must be approved and authorized by the department's general manager (or, for a branch, the person principally in charge of the branch). The approval uses the bank's "Personal Classified and Graded Data Authorization Approval Form", which must be attached when the OA project requirement is initiated.

2) The relevant role permissions must be properly configured and controlled in the system.

3) C2 payment account numbers, customer legal names, mobile phone numbers reserved for payment, and certificate or other identification numbers displayed on front-end pages must be watermarked, with copying and pasting prohibited. Files downloaded in batch must be encrypted with DRM and must pass the approval and authorization process, with authorization by at least the general manager of the responsible department or the person principally in charge of the branch.

4) Batch queries or downloads of personal customer C2 sensitive information must be covered by an audit log that records who used the data, when, in what scenario, with what scope of permissions, and under which approval process; the log must be queryable from a front-end page.

5) All data usage follows the principle of "whoever uses it is responsible".

6) If the above control measures involve application system development, a closed-loop control mechanism must cover the whole development process, from the proposal of the business requirements through business acceptance testing. Development that displays the above sensitive information must not start without a leader-approved "Personal Classified and Graded Data Authorization Approval Form".

7) Where the process involves authorization and approval by a business department, the requesting department must archive the relevant approval and authorization documents for subsequent review.
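
As a minimal sketch of the masking requirement described above (field names and masking formats are illustrative assumptions, not the bank's prescribed rules), C2 fields such as payment account numbers and reserved mobile numbers can be masked before display:

```python
def mask_middle(value: str, keep_head: int, keep_tail: int) -> str:
    """Mask all but the first keep_head and last keep_tail characters."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    return value[:keep_head] + "*" * (len(value) - keep_head - keep_tail) + value[-keep_tail:]

def mask_record(record: dict) -> dict:
    """Apply illustrative masking formats to C2 fields (formats assumed, not from the source)."""
    masked = dict(record)
    masked["payment_account_no"] = mask_middle(record["payment_account_no"], 4, 4)
    masked["reserved_mobile_no"] = mask_middle(record["reserved_mobile_no"], 3, 4)
    masked["id_card_no"] = mask_middle(record["id_card_no"], 1, 1)
    return masked

print(mask_record({
    "payment_account_no": "6222020200112233445",
    "reserved_mobile_no": "13812345678",
    "id_card_no": "110101199001011234",
}))
# e.g. reserved_mobile_no becomes '138****5678'
```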


Data Governance Deliverables for Banking Institutions

Data Governance Consulting

Assist in improving the data governance organizational structure (covering the decision-making, management, and execution levels) and the related systems, processes, and evaluation standards. Improve data standards, complete data standard alignment for 10 systems, and complete the sorting out of master data and metadata. Establish work processes for data quality inspection, rectification, and evaluation to improve data quality. Develop a working mechanism for data asset inventory, evaluation, and management, and complete the data asset inventory of 10 systems. Establish an indicator management system and assist in sorting out the basic indicators of the whole bank.

Data platform construction

The data platform includes a data governance platform and a data application platform, realizing the "management, governance, and use" of data. The data governance platform provides functions such as offline development, data standards, data modeling, data quality, master data, metadata, indicator management, data assets, and data security, realizing integrated data development and governance. The data application platform includes a data intelligence analysis module that provides effective data analysis tools while meeting requirements for flexibility, security, and convenience.

Data application implementation

Complete the sorting out of data lineage for the big data platform. Restructure the data model of the big data platform's middle layer (200 tables) and implement the data standards against it. Audit source data and result data on the data platform according to the data quality inspection rules.

Combining marketing and risk control application scenarios, complete the processing of at least 200 basic indicators. Provide the data support required by no fewer than 5 modeling scenarios and functional support for no fewer than 5 application scenarios (independent data analysis, report development, management cockpit, etc.).

Data Governance Construction Achievements of Banking Institutions

Data Governance Results

The first is strategy: strategic planning and design materials covering data governance strategic goals, top-level design, governance objectives, governance operations, and governance results.

The second is governance organization and regulations: designing the bank's data governance organizational structure and tailoring relevant rules and regulations to the data governance objectives to ensure that governance is implemented.

The third is the governance implementation path: clarifying the implementation path of the bank's data governance work, detailing its key steps, and delivering the relevant documents to the bank.

The fourth is governance assessment and operation: formulating a data governance assessment plan in line with the bank's digital assessment objectives and providing data governance methodology materials to support operation and optimization.

The fifth is the data asset catalog: sorting out the data resources of the bank's business systems and data warehouse, enriching the metadata of the various data attributes, and building a bank-level data asset catalog.

The sixth is the data standard system: sorting out the bank's existing data, building two standard systems around basic data and indicator data, and completing the application of the standards to historical data.

The seventh is the classification and grading system: in accordance with the People's Bank of China's data classification and grading guidelines, the bank's data classification and grading system was constructed and the classification and grading of historical data was completed.

The eighth is the data governance system: building a bank-level data quality monitoring rule system around the data submission and verification rules of the People's Bank of China and the China Banking and Insurance Regulatory Commission, combined with the verification rules of the bank's own data processing workflow.

Platform implementation results

The first is lineage analysis on the big data platform: data lineage analysis is performed on the bank's Impala SQL ETL tasks, covering table-level lineage, field-level lineage, and other information, while the Kangaroo Cloud data platform is used to capture the lineage of cross-system data exchange.
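
As a minimal sketch of table-level lineage extraction (a deliberately simplified regex-based approach, not how the Kangaroo Cloud platform actually parses SQL), the source and target tables of an INSERT ... SELECT statement can be pulled out as one lineage edge:

```python
import re

def table_lineage(sql: str) -> dict:
    """Extract a rough table-level lineage edge from an INSERT ... SELECT statement.

    This naive approach ignores subqueries, CTEs, and comments; a real
    implementation would parse the SQL into an AST.
    """
    target = re.search(r"insert\s+(?:into|overwrite)\s+(?:table\s+)?([\w.]+)", sql, re.I)
    sources = re.findall(r"(?:from|join)\s+([\w.]+)", sql, re.I)
    return {
        "target": target.group(1) if target else None,
        "sources": sorted(set(sources)),
    }

sql = """
INSERT OVERWRITE TABLE dw.customer_summary
SELECT c.cust_id, SUM(t.amount)
FROM ods.customer c
JOIN ods.transaction t ON c.cust_id = t.cust_id
GROUP BY c.cust_id
"""
print(table_lineage(sql))
# {'target': 'dw.customer_summary', 'sources': ['ods.customer', 'ods.transaction']}
```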

The second is the establishment of a data quality rule system: centered on the China Banking and Insurance Regulatory Commission's EAST 4.0/5.0 data quality specifications, regulatory data verification rules are built, including single-report verification, cross-system data verification (between the 1104 reports and EAST), and traceability data quality checks. At the same time, in line with the PBOC's anti-money-laundering regulatory requirements, counterparty information checks are built to meet the PBOC's anti-money-laundering inspection requirements.
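
As a minimal sketch of a cross-system verification rule (table names, columns, and the tolerance are illustrative assumptions; the actual EAST/1104 rules are defined by the regulator), a check can compare an aggregate reported in one system against the detail data in another:

```python
import sqlite3

# Illustrative rule: the loan balance total in the regulatory submission
# should equal the sum of detail records in the core system (tolerance assumed).
RULE = {
    "name": "loan_balance_crosscheck",
    "report_sql": "SELECT total_loan_balance FROM regulatory_report WHERE report_date = ?",
    "detail_sql": "SELECT SUM(balance) FROM core_loan_detail WHERE as_of_date = ?",
    "tolerance": 0.01,
}

def run_crosscheck(conn: sqlite3.Connection, report_date: str) -> dict:
    """Run the cross-system consistency check for one reporting date."""
    reported = conn.execute(RULE["report_sql"], (report_date,)).fetchone()[0]
    detailed = conn.execute(RULE["detail_sql"], (report_date,)).fetchone()[0]
    passed = abs(reported - detailed) <= RULE["tolerance"]
    return {"rule": RULE["name"], "reported": reported, "detailed": detailed, "passed": passed}

# Usage with an in-memory database standing in for the two systems
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE regulatory_report (report_date TEXT, total_loan_balance REAL)")
conn.execute("CREATE TABLE core_loan_detail (as_of_date TEXT, balance REAL)")
conn.execute("INSERT INTO regulatory_report VALUES ('2023-06-30', 300.0)")
conn.executemany("INSERT INTO core_loan_detail VALUES (?, ?)",
                 [("2023-06-30", 100.0), ("2023-06-30", 200.0)])
print(run_crosscheck(conn, "2023-06-30"))
```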

The third is big data model reconstruction: according to the project requirements, the data model of the big data platform's middle layer (nearly 200 tables) is reconstructed, and existing data is used to validate the model.

The fourth is the design and construction of the indicator system: the indicator system was researched against the bank's application scenarios in credit marketing and risk control, and at least 200 basic indicators plus complex derived indicators were processed according to actual business needs; data assets were sorted out (covering at least 10 systems such as core banking, credit, and online lending) and a data asset catalog was built; the data standard system was sorted out according to regulatory and internal management needs and implemented through the platform; and the bank's data classification and grading system was built around the People's Bank of China's financial data classification and grading standards, which, combined with the CBIRC's data encryption specifications, realizes the data security guarantee system.
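
As a minimal sketch of how a derived (composite) indicator can be computed from basic indicators (the indicator names, values, and formula are illustrative, not the bank's actual definitions):

```python
# Basic indicators already produced by upstream processing (illustrative values)
basic_indicators = {
    "loan_balance": 5_000_000.0,               # total loan balance
    "non_performing_loan_balance": 75_000.0,   # non-performing portion
}

# Derived indicator definitions: name -> formula over basic indicators
derived_definitions = {
    "non_performing_loan_ratio": lambda b: b["non_performing_loan_balance"] / b["loan_balance"],
}

derived = {name: rule(basic_indicators) for name, rule in derived_definitions.items()}
print(derived)  # {'non_performing_loan_ratio': 0.015}
```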

The fifth is the data analysis scenario service: according to the bank's application requirements, it provides data support for no fewer than 5 modeling scenarios and functional support for no fewer than 5 application scenarios (independent data analysis, report development, management cockpit, etc.).

"Dutstack Product White Paper": https://www.dtstack.com/resources/1004?src=szsm

"Data Governance Industry Practice White Paper" download address: https://www.dtstack.com/resources/1001?src=szsm If you want to know or consult more about Kangaroo Cloud big data products, industry solutions, and customer cases, visit Kangaroo Cloud official website: https://www.dtstack.com/?src=szkyzg

Those interested in big data open source projects are also welcome to join the "Kangaroo Cloud Open Source Framework DingTalk Technology Group" to exchange the latest open source technology information, group number: 30537511, project address: https://github.com/DTStack
