Learning Atlas: A Quick Introduction

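Atlas is a data governance and metadata framework for Hadoop.

Atlas is a scalable and extensible set of core foundational governance services. It enables enterprises to meet their compliance requirements within Hadoop effectively, and it allows integration with the whole enterprise data ecosystem.

Apache Atlas provides organizations with open metadata management and governance capabilities to build a catalog of their data assets, classify and govern these assets, and give data scientists, analysts, and the data governance team a way to collaborate around these data assets.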

Features:

Metadata types & instances:

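  • Pre-defined types for various Hadoop and non-Hadoop metadata
  • Ability to define new types for the metadata to be managed
  • Types can have primitive attributes, complex attributes, and object references, and can inherit from other types
  • Instances of types, called entities, capture metadata object details and their relationships
  • REST APIs to work with types and instances allow easier integration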
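
To make the type and entity model a little more concrete, here is a minimal sketch that builds the JSON bodies the Atlas v2 REST API accepts for a type definition (POST /api/atlas/v2/types/typedefs) and for an entity (POST /api/atlas/v2/entity). The type name `sample_table`, its attributes `owner_team` and `row_count`, and the `qualifiedName` value are hypothetical, and nothing is sent to a server.

```python
import json


def build_entity_typedef(name, super_types, attributes):
    """Build a typedefs payload containing one entity type definition.

    `attributes` maps attribute name -> Atlas type name (e.g. "string").
    """
    attribute_defs = [
        {
            "name": attr_name,
            "typeName": attr_type,
            "isOptional": True,
            "cardinality": "SINGLE",
        }
        for attr_name, attr_type in attributes.items()
    ]
    return {
        "entityDefs": [
            {
                "name": name,
                "superTypes": list(super_types),  # inheritance from other types
                "attributeDefs": attribute_defs,
            }
        ]
    }


def build_entity(type_name, attributes):
    """Build an entity payload: one instance of a type."""
    return {"entity": {"typeName": type_name, "attributes": dict(attributes)}}


# Hypothetical type that inherits from the built-in DataSet type.
typedef_payload = build_entity_typedef(
    "sample_table",
    ["DataSet"],
    {"owner_team": "string", "row_count": "long"},
)

# Hypothetical instance (entity) of that type.
entity_payload = build_entity(
    "sample_table",
    {
        "qualifiedName": "sales.orders@cluster1",
        "name": "orders",
        "owner_team": "analytics",
        "row_count": 1200,
    },
)

print(json.dumps(typedef_payload, indent=2))
print(json.dumps(entity_payload, indent=2))
```

In a real deployment these bodies would be posted with an HTTP client and valid credentials; the sketch stops at the payloads so that it runs without an Atlas server.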

Classification:

  • Ability to dynamically create classifications, such as PII, EXPIRES_ON, DATA_QUALITY, SENSITIVE
  • Classifications can include attributes, such as the expiry_date attribute in the EXPIRES_ON classification
  • Entities can be associated with multiple classifications, enabling easier discovery and security enforcement
  • Propagation of classifications via lineage automatically ensures that classifications follow the data as it goes through various processing steps
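
As an illustration of a classification with an attribute, the sketch below builds two payloads: a classification definition for EXPIRES_ON with an expiry_date attribute, and the body used to attach that classification to an entity (POST /api/atlas/v2/entity/guid/{guid}/classifications). The guid is a placeholder, and sending the date as epoch milliseconds is an assumption about how the value is supplied; no request is made.

```python
import json
from datetime import datetime, timezone


def build_classification_def(name, attributes=None):
    """Build a typedefs payload holding one classification definition."""
    attribute_defs = [
        {
            "name": attr_name,
            "typeName": attr_type,
            "isOptional": True,
            "cardinality": "SINGLE",
        }
        for attr_name, attr_type in (attributes or {}).items()
    ]
    return {
        "classificationDefs": [
            {"name": name, "superTypes": [], "attributeDefs": attribute_defs}
        ]
    }


def build_classification(name, attributes=None, propagate=True):
    """Build one classification instance to attach to an entity.

    `propagate=True` asks Atlas to carry the classification along lineage.
    """
    return {
        "typeName": name,
        "attributes": dict(attributes or {}),
        "propagate": propagate,
    }


def classifications_path(guid):
    """REST path for adding classifications to the entity with this guid."""
    return f"/api/atlas/v2/entity/guid/{guid}/classifications"


expires_on_def = build_classification_def("EXPIRES_ON", {"expiry_date": "date"})

# Assumption: the date attribute is supplied as epoch milliseconds.
expiry_ms = int(datetime(2025, 12, 31, tzinfo=timezone.utc).timestamp() * 1000)

# An entity can carry several classifications at once.
body = [
    build_classification("EXPIRES_ON", {"expiry_date": expiry_ms}),
    build_classification("PII"),
]

print(classifications_path("placeholder-guid"))  # placeholder, not a real guid
print(json.dumps(body, indent=2))
```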

Lineage:

  • Intuitive UI to view lineage of data as it moves through various processes
  • REST APIs to access and update lineage
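
A small sketch of reading lineage through the REST API: it builds the URL for GET /api/atlas/v2/lineage/{guid} and walks the `relations` list of a lineage response to find everything downstream of an entity. The base URL, the guids, and the sample response are made up for illustration; only the `fromEntityId` and `toEntityId` fields of each relation are used.

```python
from collections import deque
from urllib.parse import urlencode


def lineage_url(base_url, guid, direction="BOTH", depth=3):
    """Build the lineage request URL for one entity."""
    query = urlencode({"direction": direction, "depth": depth})
    return f"{base_url}/api/atlas/v2/lineage/{guid}?{query}"


def downstream_guids(lineage, start_guid):
    """Return guids reachable from start_guid, in breadth-first order."""
    children = {}
    for relation in lineage["relations"]:
        children.setdefault(relation["fromEntityId"], []).append(
            relation["toEntityId"]
        )

    seen = {start_guid}
    order = []
    queue = deque([start_guid])
    while queue:
        current = queue.popleft()
        for nxt in children.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# Hypothetical response: table t0 -> process p0 -> table t1 -> process p1 -> table t2
sample_lineage = {
    "baseEntityGuid": "t1",
    "relations": [
        {"fromEntityId": "t0", "toEntityId": "p0"},
        {"fromEntityId": "p0", "toEntityId": "t1"},
        {"fromEntityId": "t1", "toEntityId": "p1"},
        {"fromEntityId": "p1", "toEntityId": "t2"},
    ],
}

print(lineage_url("http://localhost:21000", "t1"))
print(downstream_guids(sample_lineage, "t1"))
```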

Search/Discovery:

  • Intuitive UI to search entities by type, classification, attribute value or free-text
  • Rich REST APIs to search by complex criteria
  • SQL-like query language to search entities - Domain Specific Language (DSL)
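
To illustrate the two search styles, the sketch below prepares a DSL request (GET /api/atlas/v2/search/dsl with a `query` parameter) and a basic search body (POST /api/atlas/v2/search/basic) that filters by type, classification, and free text. The base URL, the `hive_table` type, and the search values are assumptions chosen for the example, and no request is sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def dsl_search_url(base_url, dsl_query, limit=25, offset=0):
    """Build a DSL search URL; the query text is URL-encoded."""
    params = urlencode({"query": dsl_query, "limit": limit, "offset": offset})
    return f"{base_url}/api/atlas/v2/search/dsl?{params}"


def basic_search_body(type_name=None, classification=None, text=None, limit=25):
    """Build a basic search body, leaving out criteria that were not given."""
    body = {"excludeDeletedEntities": True, "limit": limit, "offset": 0}
    if type_name is not None:
        body["typeName"] = type_name
    if classification is not None:
        body["classification"] = classification
    if text is not None:
        body["query"] = text  # free-text part of the search
    return body


# SQL-like DSL: find tables with a given name (values are illustrative).
url = dsl_search_url("http://localhost:21000", 'hive_table where name = "orders"')

# Basic search: PII-classified tables that match the free text "customer".
body = basic_search_body(type_name="hive_table", classification="PII", text="customer")

print(url)
print(body)
```

Building the query string with `urlencode` avoids hand-escaping the spaces and quotes that DSL queries usually contain.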

Security & Data Masking:

  • Fine-grained security for metadata access, enabling controls on access to entity instances and operations like add/update/remove classifications
  • Integration with Apache Ranger enables authorization/data-masking on data access based on classifications associated with entities in Apache Atlas. For example:
    • who can access data classified as PII, SENSITIVE
    • customer-service users can only see last 4 digits of columns classified as NATIONAL_ID
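
The NATIONAL_ID example can be pictured with a toy model of classification-based masking. This is not how Apache Ranger is implemented or configured; the policy dictionary, the group names, and the function names are all hypothetical, and the sketch only shows the idea that the masking decision follows from the classification attached to a column rather than from the column name.

```python
def mask_show_last_4(value, mask_char="x"):
    """Hide everything except the last 4 characters of a value."""
    text = str(value)
    if len(text) <= 4:
        return text
    return mask_char * (len(text) - 4) + text[-4:]


def apply_masking(row, column_tags, user_group, policy):
    """Return a copy of `row` with masking applied for `user_group`.

    column_tags: column name -> list of classifications on that column
    policy: classification -> {group name -> masking action}
    """
    masked = {}
    for column, value in row.items():
        action = None
        for tag in column_tags.get(column, []):
            action = policy.get(tag, {}).get(user_group, action)
        if action == "show_last_4":
            masked[column] = mask_show_last_4(value)
        else:
            masked[column] = value
    return masked


# Hypothetical policy: customer-service sees only the last 4 digits of NATIONAL_ID data.
policy = {"NATIONAL_ID": {"customer-service": "show_last_4"}}
column_tags = {"national_id": ["NATIONAL_ID"]}
row = {"name": "Alice", "national_id": "123456789"}

print(apply_masking(row, column_tags, "customer-service", policy))
print(apply_masking(row, column_tags, "auditor", policy))
```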

Reposted from blog.csdn.net/wjandy0211/article/details/103636834