Hands-on with the latest version of the DataOps big data platform: StreamSets Control Hub

This is the 9th day of my participation in the Gengwen Challenge.

DataOps, as the name suggests, derives from the DevOps concept: it bundles fully automated, integrated data collection and analysis in one place. A long time ago, my company intended to buy the Control Hub edition, so I contacted StreamSets, but unfortunately the person in charge replied by email that there were no sales channels in China at the time. Now an online Beta has arrived. Follow along with me to see what this platform has to offer.

1. The upcoming 4.0 version

I saw the StreamSets 4.0 documentation online a long time ago, but no downloadable release included it, which left me very curious: what big move was StreamSets preparing?

Yes, after trying it out, I discovered the secret: this version goes fully cloud-native and provides the following powerful features:

  • Job management
  • Scheduled job management
  • Load balancing and dynamic scaling
  • Pipeline fragment (function) support
  • Cloud platform integration
  • Distributed computing power
  • Solid monitoring and user management


2. Experience entrance

StreamSets has launched a third-quarter trial event, which is a rare opportunity. Friends who want to try it out should give it a go.

2.1 Registration

Register at the entrance. A VPN is required to log in. After entering, follow the guide and you can finish the setup within 5 minutes.

2.2 Build the deployment script


2.3 Copy the deployment script

It looks something like this:

curl -s https://dev.hub.streamsets.com/streamsets-engine-install.sh | bash -s -- --deployment-id="1b72d612-b533-48f0-966b-927b488231a7:cd534f44-cf0f-11eb-a0cd-b3e334979695" --deployment-token="eyJ0eXAiOiJKV1QiLCJhbGciOiJub25lIn0.eyJzIjoiMTBjNGFmMTdlNWIwYzUwOGM4MGZhZmY3MjI4NjAzZDZmZDIwNGY4MmMwYzliYWY2MjQ5MDZmZjdiZWM0NmMyNWI1YjA4N2Q0MGM1Mjc3Y2E4YmQ0NGQ2MThmNTI3MDI1ZGE3ZTFlMGI0NTg2OTZkNzU2M2U3MGJiZjQ5NGE0MzIiLCJ2IjoxLCJpc3MiOiJkZXYiLCJqdGkiOiI5YmFiMDk1MS1mM2JhLTQxYTYtYjk0NC00ZTE4NzVlZDEwZTciLCJvIjoiY2Q1MzRmNDQtY2YwZi0xMWViLWEwY2QtYjNlMzM0OTc5Njk1In0." --sch-url="https://dev.hub.streamsets.com"

If you copy my script as-is, it will register a compute engine under my account. You can contact me and I will open an account for you to try out. Of course, if you copy the script generated under your own account, you can experience it directly.
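The install command is easier to manage if the credentials are split out as variables. A minimal sketch, where the placeholder values must be replaced with the deployment ID, token, and URL that Control Hub generated for your own account:

```shell
#!/bin/sh
# Placeholders: substitute the values Control Hub generated for YOUR deployment.
DEPLOYMENT_ID="<your-deployment-id>"
DEPLOYMENT_TOKEN="<your-deployment-token>"
SCH_URL="https://dev.hub.streamsets.com"

# Assemble the same install command shown above, with your own credentials.
INSTALL_CMD="curl -s ${SCH_URL}/streamsets-engine-install.sh | bash -s -- \
  --deployment-id=\"${DEPLOYMENT_ID}\" \
  --deployment-token=\"${DEPLOYMENT_TOKEN}\" \
  --sch-url=\"${SCH_URL}\""

# Review the command first; pipe it to sh only once you are satisfied.
echo "$INSTALL_CMD"
```

Echoing the command before running it also makes it obvious when you are about to install against someone else's deployment ID.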

2.4 Add the compute engine

First we need a cloud host. Then install the Java JDK and execute the script above.

# 1. Install the Java JDK (OpenJDK 8)
yum -y install java-1.8.0-openjdk*

# 2. Run the deployment script you copied in section 2.3

Note that the compute engine requires 1 GB+ of memory, so make sure your host has enough. Press Y through the prompts, and StreamSets 4.0 is deployed and connected to your cloud platform.
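Before running the installer, you can quickly check that the host meets the 1 GB+ memory requirement. A small sketch using standard Linux tools (`free` is available on most distributions):

```shell
#!/bin/sh
# Read total memory in MB from the "Mem:" row of free's output.
MEM_MB=$(free -m | awk '/^Mem:/ {print $2}')

# The engine needs roughly 1 GB (1024 MB) or more.
if [ "$MEM_MB" -ge 1024 ]; then
    echo "Memory OK: ${MEM_MB} MB"
else
    echo "Insufficient memory: ${MEM_MB} MB (need 1024+)" >&2
fi
```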

2.5 Check the computing engine

Click Set Up - Engines on the Control Hub platform, and you should see the engine you just added.

3 Experience Pipeline

Click to build a pipeline and open one: each component icon has a fresh look, and the color scheme is easy on the eyes.

3.1 Let's build an ingestion pipeline

Drag the components into place, and a pipeline is built in minutes.

3.2 Version management

The cloud platform provides a Check In function, which handles version management very well.


3.3 Running the preview

Click the little eye icon to see a preview of the data.

4 Experience Fragments (Functions)

The old SDC platform could not create functions, which meant we could not reuse code. How does this fragment feature compare?


4.1 Create a new fragment

We build a simple HTTP request fragment, as follows. Note that fragments need no source or destination: the source and destination act as the function's parameters and return value.


4.2 Debug the fragment

Because there is no source, debugging requires selecting a test source.

4.3 Version Management

Fragments also come with version management.

4.4 Referencing fragments

Create a new pipeline and reference the fragment function we just built. Very satisfying!

5 Jobs

The newly added Job is an upgraded version of the old simple "run pipeline" operation, with comprehensive monitoring information.

5.1 Create a job


5.2 Create a scheduled job

With scheduled jobs, you no longer have to worry about starting pipelines on a regular schedule.

6 Data and computing power monitoring


7 User management

Say goodbye to bare-bones user management: here you get the usual users, groups, auditing, API authentication keys, and more.
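The API authentication keys suggest the platform can be scripted against. As a sketch only: the endpoint path and header names below follow the Control Hub 3.x REST API and are an assumption about the Beta (verify against the current docs), and the credential values are placeholders:

```shell
#!/bin/sh
# Hypothetical credentials created under the API keys page (placeholders).
CRED_ID="<your-credential-id>"
CRED_TOKEN="<your-auth-token>"

# Header names and path are assumptions based on the Control Hub 3.x REST API.
API_CALL="curl -s \
  -H \"X-SS-App-Component-Id: ${CRED_ID}\" \
  -H \"X-SS-App-Auth-Token: ${CRED_TOKEN}\" \
  -H \"X-Requested-By: curl\" \
  https://dev.hub.streamsets.com/security/rest/v1/currentUser"

# Echo the command for review instead of calling the live service here.
echo "$API_CALL"
```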


8 Summary

Impressed?

A powerful integrated platform is exactly what we have been picturing!

During use, no VPN is required and everything runs super smoothly. It is currently in Beta and may become a paid service later; I hope it will not be too expensive.

If you like it, click follow and favorite! Your clicks are my motivation!


Origin juejin.im/post/6976059535966339086