【Apache nifi 介绍】

一、官方介绍

Apache nifi is an easy to use, powerful, and reliable system to process and distribute data.

Apache NiFi是由美国过国家安全局(NSA)贡献给Apache基金会的开源项目,其设计目标是自动化系统间的数据流。基于其工作流式的编程理念,NiFi非常易于使用,强大,可靠及高可配置。两个最重要的特性是其强大的用户界面及良好的数据回溯工具。



 Put simply NiFi was built to automate the flow of data between systems. While the term dataflow is used in a variety of contexts, we’ll use it here to mean the automated and managed flow of information between systems. This problem space has been around ever since enterprises had more than one system, where some of the systems created data and some of the systems consumed data. The problems and solution patterns that emerged have been discussed and articulated extensively. A comprehensive and readily consumed form is found in the Enterprise Integration Patterns [eip].

Some of the high-level challenges of dataflow include:

Systems fail

Networks fail, disks fail, software crashes, people make mistakes.

Data access exceeds capacity to consume

Sometimes a given data source can outpace some part of the processing or delivery chain - it only takes one weak-link to have an issue.

Boundary conditions are mere suggestions

You will invariably get data that is too big, too small, too fast, too slow, corrupt, wrong, or in the wrong format.

What is noise one day becomes signal the next

Priorities of an organization change - rapidly. Enabling new flows and changing existing ones must be fast.

Systems evolve at different rates

The protocols and formats used by a given system can change anytime and often irrespective of the systems around them. Dataflow exists to connect what is essentially a massively distributed system of components that are loosely or not-at-all designed to work together.

Compliance and security

Laws, regulations, and policies change. Business to business agreements change. System to system and system to user interactions must be secure, trusted, accountable.

Continuous improvement occurs in production

It is often not possible to come even close to replicating production environments in the lab.

Over the years dataflow has been one of those necessary evils in an architecture. Now though there are a number of active and rapidly evolving movements making dataflow a lot more interesting and a lot more vital to the success of a given enterprise. These include things like; Service Oriented Architecture [soa], the rise of the API [api][api2], Internet of Things [iot], and Big Data [bigdata]. In addition, the level of rigor necessary for compliance, privacy, and security is constantly on the rise. Even still with all of these new concepts coming about, the patterns and needs of dataflow are still largely the same. The primary differences then are the scope of complexity, the rate of change necessary to adapt, and that at scale the edge case becomes common occurrence. NiFi is built to help tackle these modern dataflow challenges.

二、特点

关键特性包括:

  • 基于web的用户界面

    • 无缝体验设计、控制和监视

  • 高度可配置的

    • 数据丢失容错和保证交付

    • 低延迟和高吞吐量

    • 动态优先级

    • 流可以在运行时修改

    • 背压 Back presure

  • 数据来源

    • 从始至终跟踪数据流

  • 为扩展设计

    • 构建自己数据处理器

    • 支持快速开发和有效的测试

  • 安全

    • SSL,SSH,HTTPS加密内容,等等……

    • 可插拔的基于角色的验证/授权

猜你喜欢

转载自gaojingsong.iteye.com/blog/2317325